DataFrame: add a boolean column by checking multiple parameters

I am looking for something like this:

tweets = pd.DataFrame() 

tweets['worldwide'] = [tweets['user.location'] == ["Worldwide", "worldwide", "WorldWide"]]

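The new column 'worldwide' should hold boolean values (True, False) obtained by checking the column tweets['user.location'], which contains three different spellings of "worldwide".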

I want True to be returned for all three spellings of "worldwide".

Source

2016-04-15 ambrish dhaka

IIUC, you want isin:

tweets['worldwide'] = tweets['user.location'].isin(["Worldwide", "worldwide", "WorldWide"])

This returns True for every row whose value matches any of the listed values:

In [229]:
df = pd.DataFrame({'Tweets':['worldwide', 'asdas', 'Worldwide', 'WorldWide']})
df

Out[229]:
      Tweets
0  worldwide
1      asdas
2  Worldwide
3  WorldWide

In [230]:
df['Worldwide'] = df['Tweets'].isin(["Worldwide", "worldwide", "WorldWide"])
df

Out[230]:
      Tweets  Worldwide
0  worldwide       True
1      asdas      False
2  Worldwide       True
3  WorldWide       True
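
As a self-contained sketch of the same isin approach applied to the column from the question, the snippet below uses made-up sample data (the five locations, including the missing one, are assumptions for illustration only). It also shows that isin is an exact match and treats a missing location as False:

```python
import pandas as pd

# Hypothetical sample data; the real frame would hold the actual tweets.
tweets = pd.DataFrame(
    {"user.location": ["Worldwide", "worldwide", "WorldWide", "London", None]}
)

spellings = ["Worldwide", "worldwide", "WorldWide"]

# isin tests each row for exact membership in the list and
# returns one boolean per row, so no extra list brackets are needed.
tweets["worldwide"] = tweets["user.location"].isin(spellings)

print(tweets)
```

Because the match is exact, a value such as "Worldwide, USA" would not be flagged by this version.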

However, I personally think there is more mileage in normalising the tweets: lowercase them with str.lower, then use str.contains to test whether each one contains your word:

In [231]:
df['Worldwide'] = df['Tweets'].str.lower().str.contains("worldwide")
df

Out[231]:
      Tweets  Worldwide
0  worldwide       True
1      asdas      False
2  Worldwide       True
3  WorldWide       True
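
The difference between the two approaches can be sketched as follows. The sample values are invented for illustration, and passing case=False and na=False to str.contains is one possible variation on the str.lower() chain above, not necessarily what the answer intended:

```python
import pandas as pd

# Hypothetical locations, including an upper-case variant, a longer string
# that merely contains the word, and a missing value.
locations = pd.Series(["Worldwide", "WORLDWIDE", "Worldwide, USA", "London", None])

# Exact membership: only the listed spellings match.
exact = locations.isin(["Worldwide", "worldwide", "WorldWide"])

# Case-insensitive substring test; na=False maps missing values to False
# instead of leaving them as missing in the result.
substring = locations.str.contains("worldwide", case=False, na=False)

print(pd.DataFrame({"location": locations, "exact": exact, "substring": substring}))
```

The substring test also flags "WORLDWIDE" and "Worldwide, USA", which the exact list misses. Whether that is desirable depends on how the locations should be classified.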

Source

2016-04-15 08:24:56 EdChum

I ended up with this as the final form:

tweets['worldwide'] = tweets['user.location'].str.lower().str.contains("worldwide")

and the final counts came out as:

tweets['worldwide'].value_counts()

False    4998
True      185
Name: worldwide, dtype: int64
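
For reference, a minimal sketch of counting a boolean column in a few equivalent ways. The five-row series is a made-up stand-in for tweets['worldwide'], so the numbers do not correspond to the counts above:

```python
import pandas as pd

# Hypothetical boolean flags standing in for tweets['worldwide'].
flags = pd.Series([True, False, False, True, False], name="worldwide")

# Frequency of each value, as in the post above.
counts = flags.value_counts().to_dict()

# True counts as 1 and False as 0, so sum gives the number of matches
# and mean gives the fraction of rows that matched.
n_true = int(flags.sum())
share_true = float(flags.mean())

print(counts, n_true, share_true)
```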

Source

2016-04-15 08:39:51
