2017-10-08 76 views
-1

Getting the value of a specific key in a dictionary using lambda?

I have a products table that looks like this:

+-----------------------+-----------------------------+--------------------------------+
| name                  | review                      | word_count                     |
+-----------------------+-----------------------------+--------------------------------+
| Planetwise            | These flannel wipes are OK, | {'and': 5, 'wipes': 1,         |
| Flannel Wipes         | but in my opinion ...       | 'stink': 1, 'because' : 2, ... |
+-----------------------+-----------------------------+--------------------------------+
| Planetwise            | it came early and was not   | {'and': 3, 'love': 1,          |
| Wipes Pouch           | disappointed. i love ...    | 'it': 2, 'highly': 1, ...      |
+-----------------------+-----------------------------+--------------------------------+
| A Tale of Baby's Days | Lovely book, it's bound     | {'shop': 1, 'noble': 1,        |
| with Peter Rabbit ... | tightly so you may no ...   | 'is': 1, 'it': 1, 'as': ...    |
+-----------------------+-----------------------------+--------------------------------+

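Basically, the word_count column contains a dictionary (key: value) recording how many times each word occurs in the sentence in the review column.

Now I want to create a new column named and. It should contain the value stored under and in the word_count dictionary: if and exists as a key in the word_count column, use that value; if it does not exist as a key, use 0.

For the first 3 rows, the new and column would look like this: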

+-----+
| and |
+-----+
| 5   |
+-----+
| 3   |
+-----+
| 0   |
+-----+

I wrote this code and it works fine:

def wordcount(x):
    # x is the word_count dictionary of a single row
    if 'and' in x:
        return x['and']
    else:
        return 0

products['and'] = products['word_count'].apply(wordcount)

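My question: is there a way to do this using a lambda?

What I have done so far is:

products['and'] = products['word_count'].apply(lambda x: 'and' in x.keys())

This only returns a column of 0s and 1s. What can I add to the line above so that products['and'] contains the value stored under the key and whenever that key exists in products['word_count']?

I am using IPython Notebook and GraphLab.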

Answers

1

You have the right idea. Just return the value x['and'] if the key exists, otherwise 0.

For example:

import pandas as pd

data = {"word_count": [{"foo": 1, "and": 5},
                       {"foo": 1}]}
df = pd.DataFrame(data)
# Return the count stored under 'and' when the key exists, otherwise 0
df.word_count.apply(lambda x: x['and'] if 'and' in x.keys() else 0)

Output:

0 5 
1 0 
Name: word_count, dtype: int64 
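
The same conditional expression can also be tried without pandas. The following is only a minimal sketch on a plain list of dictionaries: the sample rows are invented for illustration, and and_count and and_column are hypothetical names. Note that for a dictionary, 'and' in x is equivalent to 'and' in x.keys().

```python
# Hypothetical sample rows standing in for the word_count column.
word_counts = [
    {"and": 5, "wipes": 1},
    {"and": 3, "love": 1},
    {"shop": 1, "noble": 1},
]

# Same conditional expression as in the answer; `in x` tests the dict's keys.
and_count = lambda x: x["and"] if "and" in x else 0

# Apply the lambda to every row, as apply() would do for a column.
and_column = [and_count(wc) for wc in word_counts]
print(and_column)  # [5, 3, 0]
```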
1

I'm not sure what products['word_count'].apply(wordcount) does, but going by the rest of your question, you could do something like the following with a lambda:

products['and'] = (
    lambda p: p['and']['and'] if 'and' in p['and'] else 0)(products) 

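This is rather ugly and awkward, so I would suggest using the built-in dictionary get() method instead, because it is shorter, faster, and easier to debug and maintain: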

products['and'] = products['and'].get('and', 0) 
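
Since get() is a method of an individual dictionary, on a column of dictionaries like the one in the question it would presumably still need to be called once per row. A minimal sketch under that assumption, using made-up rows and the hypothetical names rows, get_and and and_values:

```python
# Hypothetical rows, each one a word_count dictionary.
rows = [{"and": 5, "stink": 1}, {"it": 2, "and": 3}, {"shop": 1}]

# dict.get(key, default) returns the default when the key is missing.
get_and = lambda wc: wc.get("and", 0)

# Call the lambda once per row.
and_values = list(map(get_and, rows))
print(and_values)  # [5, 3, 0]
```

With the table from the question, the same lambda could likely be passed to apply, as in products['word_count'].apply(get_and), though this has not been checked against GraphLab.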

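Your fixation on using lambda reminds me of the so-called Law of the Instrument: "...it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail".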