2015-05-26 153 views

Pandas groupby, filter by condition

I have a dataframe:

import pandas as pd 

df = pd.DataFrame({
    'First': ['Sam', 'Greg', 'Steve', 'Sam',
              'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
    'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens',
             'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
    'Address': ['112 Fake St',
                '13 Crest St',
                '14 Main St',
                '112 Fake St',
                '2 Morningwood',
                '7 Cotton Dr',
                '14 Main St',
                '20 Main St',
                '7 Cotton Dr',
                '7 Cotton Dr'],
    'Status': ['Infected', '', 'Infected', '', '', '', '', '', '', 'Infected'],
})

I apply the following groupby code:

df_index = df.groupby(['Address', 'Last']).filter(lambda x: (x['Status'] == 'Infected').any()).index 
df.loc[df_index, 'Status'] = 'Infected' 

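This marks every row in a matching group as 'Infected'. Is there a way to select only the values that would be updated, so that they can be marked as something else instead? For example: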

df2 = df.copy(deep=True) 
df2['Status'] = ['Infected', '', 'Infected', 'Infected2', '', 'Infected2', '', '', 'Infected2', 'Infected'] 

Sorry, but what is your expected output? Is it `df2['Status']`? – Zero


@JohnGalt `df2['Status'] = ['Infected', '', 'Infected', 'Infected2', '', 'Infected2', '', '', 'Infected2', 'Infected']` – ccsv

Answer


I think this achieves the result you want, although it goes about it slightly differently:

def infect_new_people(group):
    # Work on a copy so the frame handed in by groupby is not mutated
    group = group.copy()
    if (group['Status'] == 'Infected').any():
        # Only affect people not already infected
        group.loc[group['Status'] != 'Infected', 'Status'] = 'Infected2'
    return group['Status']

# Need group_keys=False so that each group has the same index 
# as the original dataframe 
df['Status'] = df.groupby(['Address', 'Last'], group_keys=False).apply(infect_new_people) 

df 
Out[36]: 
     Address First  Last  Status 
0 112 Fake St  Sam  Stevens Infected 
1 13 Crest St  Greg Hamcunning   
2  14 Main St Steve  Strange Infected 
3 112 Fake St  Sam  Stevens Infected2 
4 2 Morningwood  Jill  Vargas   
5 7 Cotton Dr  Bill  Simon Infected2 
6  14 Main St  Nod  Purple   
7  20 Main St Mallory  Green   
8 7 Cotton Dr  Ping  Simon Infected2 
9 7 Cotton Dr Lamar  Simon Infected 

Is there a way to do this without a function? – ccsv
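
One possible way to do this without a custom function is to build a boolean mask with `groupby(...).transform('any')` and assign through `.loc`. This is only a sketch under the assumption that "group" means the same `['Address', 'Last']` pair used in the question; the names `infected` and `group_has_infected` are illustrative and do not come from the posts above.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'First': ['Sam', 'Greg', 'Steve', 'Sam',
              'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
    'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens',
             'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
    'Address': ['112 Fake St', '13 Crest St', '14 Main St', '112 Fake St',
                '2 Morningwood', '7 Cotton Dr', '14 Main St', '20 Main St',
                '7 Cotton Dr', '7 Cotton Dr'],
    'Status': ['Infected', '', 'Infected', '', '', '', '', '', '', 'Infected'],
})

# True for rows that are already marked as infected
infected = df['Status'].eq('Infected')

# True for every row whose (Address, Last) group contains at least one
# infected row; transform('any') broadcasts the per-group result back
# onto the original index
group_has_infected = infected.groupby([df['Address'], df['Last']]).transform('any')

# Mark only the rows that would be newly updated
df.loc[group_has_infected & ~infected, 'Status'] = 'Infected2'

print(df['Status'].tolist())
```

Because the mask is computed separately from the assignment, the same `group_has_infected & ~infected` expression can also be used just to inspect the rows that would change before writing anything.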