2017-05-22 117 views
2

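Aggregating one column based on another in pandas

This should technically be simple, but unfortunately it still has me stumped.

I want to count the values of one column, grouped by the values of another column. For example: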

Column 1   | target_variable
'potato'   | 1
'potato'   | 0
'tomato'   | 1
'brocolli' | 1
'tomato'   | 0

The expected output is:

column 1 | target = 1 | target = 0 | total_count 
'potato' |  1  |  1  |  2 
'tomato' |  1  |  1  |  2 
'brocolli' |  1  |  0  |  1 

However, I think I am using aggregation incorrectly, so I fell back on the following naive approach:

z = {}
for i in train.index:
    fruit = train["fruit"][i]
    l = train["target"][i]
    if fruit not in z:
        # First time this fruit is seen: start its counters
        if l == 1:
            z[fruit] = {1: 1, 0: 0, 'count': 1}
        else:
            z[fruit] = {1: 0, 0: 1, 'count': 1}
    else:
        if l == 1:
            z[fruit][1] += 1
        else:
            z[fruit][0] += 1
        z[fruit]['count'] += 1

This gives similar output, but as a dictionary instead.

Can anyone enlighten me on the correct syntax for doing this the pandas way? :)

Thanks! :)

+0

Is the output correct? – jezrael

+0

@jezrael Oops, sorry, fixed! :) Thanks for pointing that out! – Wboy

+0

If you add another row `'potato', 1`, does the output change? – jezrael

Answers

4

You need groupby + size + unstack + add_prefix + sum:

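# Parentheses replace the backslash continuations, which break if followed by a trailing space
df1 = (df.groupby(['Column 1', 'target_variable'])
         .size()                     # rows per (Column 1, target_variable) pair
         .unstack(fill_value=0)      # target values become columns; missing pairs count as 0
         .add_prefix('target = '))
df1['total_count'] = df1.sum(axis=1)
print(df1)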
target_variable  target = 0  target = 1  total_count
Column 1
brocolli                  0           1            1
potato                    1           1            2
tomato                    1           1            2
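
For anyone who wants to run the snippet above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the five sample rows in the question, which is an assumption about the real data; the variable name `counts` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: the real frame has more rows).
df = pd.DataFrame({
    "Column 1": ["potato", "potato", "tomato", "brocolli", "tomato"],
    "target_variable": [1, 0, 1, 1, 0],
})

# Count rows per (Column 1, target_variable) pair, then pivot the
# target values into columns, filling missing pairs with 0.
counts = (
    df.groupby(["Column 1", "target_variable"])
      .size()
      .unstack(fill_value=0)
      .add_prefix("target = ")
)

# Row-wise sum of the two count columns gives the total per category.
counts["total_count"] = counts.sum(axis=1)
print(counts)
```

The result is indexed by `Column 1`, so a single value can be looked up with something like `counts.loc["potato", "target = 1"]`.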

Or use crosstab:

df1 = pd.crosstab(df['Column 1'], df['target_variable'], margins=True)
print(df1)
target_variable  0  1  All
Column 1
brocolli         0  1    1
potato           1  1    2
tomato           1  1    2
All              2  3    5

df1 = df1.rename(columns={'All': 'total_count'}).iloc[:-1]
print(df1)
target_variable  0  1  total_count
Column 1
brocolli         0  1            1
potato           1  1            2
tomato           1  1            2
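
A possible variation on the crosstab approach, sketched here as an alternative rather than as part of the original answer: `pd.crosstab` accepts a `margins_name` argument, so the totals column can be named directly instead of renamed afterwards. The sample DataFrame is again rebuilt from the question, and the name `table` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: the real frame has more rows).
df = pd.DataFrame({
    "Column 1": ["potato", "potato", "tomato", "brocolli", "tomato"],
    "target_variable": [1, 0, 1, 1, 0],
})

# margins=True adds a totals column AND a totals row, both labelled margins_name.
table = pd.crosstab(
    df["Column 1"],
    df["target_variable"],
    margins=True,
    margins_name="total_count",
)

# Drop the totals row so that only one row per category remains.
table = table.drop(index="total_count")
print(table)
```

Dropping the row by label avoids relying on the totals row being last, which is what `.iloc[:-1]` assumes.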
+0

Works like a charm, thank you very much! :) – Wboy

1

Let's use get_dummies, add_prefix and groupby:

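# One indicator column per target value, named 'target = 0' and 'target = 1'
df = df.assign(**df['target_variable'].astype(str).str.get_dummies().add_prefix('target = '))
# numeric_only=True keeps the string column 'Column 1' out of the row-wise sum
df['total_count'] = df.drop('target_variable', axis=1).sum(axis=1, numeric_only=True)
df.groupby('Column 1').sum()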

Output:

            target_variable  target = 0  target = 1  total_count
Column 1
'brocolli'                1           0           1            1
'potato'                  1           1           1            2
'tomato'                  1           1           1            2
+0

Hey Scott, thank you very much for your help! :) – Wboy