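Apply several functions in pandas transform

After a groupby, when using agg, if a dict of columns:functions is passed, each function is applied to its corresponding column. However, this syntax does not work with transform. Is there another way to apply several functions in transform?

Let's take an example: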

import numpy as np
import pandas as pd
df_test = pd.DataFrame([[1,2,3],[1,20,30],[2,30,50],[1,2,33],[2,4,50]],columns = ['a','b','c'])
df_test
Out[1]:
   a   b   c
0  1   2   3
1  1  20  30
2  2  30  50
3  1   2  33
4  2   4  50

def my_fct1(series): 
    return series.mean() 

def my_fct2(series): 
    return series.std() 

df_test.groupby('a').agg({'b':my_fct1,'c':my_fct2}) 

Out[2]:
           c   b
a
1  16.522712   8
2   0.000000  17

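The previous example shows how to apply different functions to different columns with agg, but if we want to transform the columns without aggregating them, agg can no longer be used. How can we perform such an operation to obtain the following expected output?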

   a   b     c
0  1   2     3
1  1  22    90
2  2  30    50
3  1  24  2970
4  2  34  2500

Source: ysearka, 2017-06-21

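I think that for now (pandas 0.20.2), transform does not support a dict mapping column names to functions, the way agg does.

If the functions return a Series of the same length: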

df1 = df_test.set_index('a').groupby('a').agg({'b':np.cumsum,'c':np.cumprod}).reset_index() 
print (df1) 
   a     c   b
0  1     3   2
1  1    90  22
2  2    50  30
3  1  2970  24
4  2  2500  34
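The call above relies on agg accepting functions that do not reduce, which was observed on pandas 0.20.2 and may behave differently in other versions. As a hedged alternative, the same result can be sketched by calling the built-in cumulative methods on each grouped column. The names grouped and result are illustrative only and are not part of the original answer.

```python
import pandas as pd

df_test = pd.DataFrame(
    [[1, 2, 3], [1, 20, 30], [2, 30, 50], [1, 2, 33], [2, 4, 50]],
    columns=['a', 'b', 'c'],
)

# Group once, then apply a different cumulative method to each column.
grouped = df_test.groupby('a')
result = df_test.assign(
    b=grouped['b'].cumsum(),   # running sum of b within each group of a
    c=grouped['c'].cumprod(),  # running product of c within each group of a
)
print(result)
```

Because the grouped results keep the original row index, assign lines them up with df_test without any set_index/reset_index round trip.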

But if the aggregation returns a different length, a join is needed:

df2 = df_test[['a']].join(df_test.groupby('a').agg({'b':my_fct1,'c':my_fct2}), on='a') 
print (df2) 
   a          c   b
0  1  16.522712   8
1  1  16.522712   8
2  2   0.000000  17
3  1  16.522712   8
4  2   0.000000  17
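When each function reduces a group to a single value, the join can also be avoided, since transform broadcasts that value back to every row of the group. A minimal sketch, selecting one column at a time and assuming the string names 'mean' and 'std' stand in for my_fct1 and my_fct2 (the name broadcast is illustrative):

```python
import pandas as pd

df_test = pd.DataFrame(
    [[1, 2, 3], [1, 20, 30], [2, 30, 50], [1, 2, 33], [2, 4, 50]],
    columns=['a', 'b', 'c'],
)

grouped = df_test.groupby('a')

# Each transform returns one value per original row, aligned on the index.
broadcast = df_test[['a']].assign(
    b=grouped['b'].transform('mean'),  # group mean of b, repeated per row
    c=grouped['c'].transform('std'),   # group standard deviation of c
)
print(broadcast)
```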

Source: jezrael, 2017-06-21 13:18:44

You can still use a dictionary, but with a bit of a hack:

df_test.groupby('a').transform(lambda x: {'b': x.cumsum(), 'c': x.cumprod()}[x.name]) 
Out[427]: 
    b     c
0   2     3
1  22    90
2  30    50
3  24  2970
4  34  2500

By contrast, passing the dict directly to transform, as attempted in the question, fails:

df_test.groupby('a').transform({'b':np.cumsum,'c':np.cumprod})
Out[3]:
TypeError: unhashable type: 'dict'

If you need to keep column a, you can do this:

(df_test.set_index('a')
    .groupby('a')
    .transform(lambda x: {'b': x.cumsum(), 'c': x.cumprod()}[x.name])
    .reset_index())
Out[429]:
   a   b     c
0  1   2     3
1  1  22    90
2  2  30    50
3  1  24  2970
4  2  34  2500

Another approach is to use if/else to check the column name:

(df_test.set_index('a')
    .groupby('a')
    .transform(lambda x: x.cumsum() if x.name=='b' else x.cumprod())
    .reset_index())
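Both variants above dispatch on x.name inside a single lambda. If more columns are involved, the same idea can be sketched as a small loop that runs one transform per column. The helper transform_columns below is hypothetical (it is not a pandas function), and it assumes the grouping key is a single column name.

```python
import pandas as pd


def transform_columns(df, by, funcs):
    """Apply a different groupby transform to each column (hypothetical helper).

    funcs maps a column name to anything SeriesGroupBy.transform accepts,
    such as the string 'cumsum' or a callable.
    """
    grouped = df.groupby(by)
    out = df[[by]].copy()  # keep the grouping column in the result
    for col, func in funcs.items():
        out[col] = grouped[col].transform(func)
    return out


df_test = pd.DataFrame(
    [[1, 2, 3], [1, 20, 30], [2, 30, 50], [1, 2, 33], [2, 4, 50]],
    columns=['a', 'b', 'c'],
)

result = transform_columns(df_test, 'a', {'b': 'cumsum', 'c': 'cumprod'})
print(result)
```

Unlike the dictionary hack, this only evaluates the function that belongs to each column, at the cost of one transform call per column.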

Source: Allen, 2017-06-21 13:13:00
