Pandas: apply a custom function on a melted dataframe
I have a melted dataframe that looks like this:
         date     group   metric   n_events  total_users
0  2017-01-01   control  metric1  33.919910   827.416818
27 2017-01-01  variant1  metric1  55.141467   780.840083
54 2017-01-01  variant2  metric1  63.045587   436.381533
1  2017-01-02   control  metric2  74.013340   145.551779
28 2017-01-02  variant1  metric2  78.539663   553.410827
I want to compute some uplift metrics on the melted dataframe. So far I have been pivoting the dataframe to do this, which is not ideal.
import numpy as np
import pandas as pd

# Example data: 3 groups x 27 days, one metric per day
df = pd.DataFrame(
    {'group': sorted(['control','variant1','variant2']*27),
     'metric': ['metric1', 'metric2', 'metric3']*27,
     'n_events': np.random.uniform(20, 100, size=81),
     'total_users': np.random.uniform(20, 1000, size=81),
     'date': list(pd.date_range('1/1/2017', periods=27, freq='D'))*3
    })
df = df.sort_values(['date','group','metric'])
# Pivot so that each group becomes a column
t = pd.pivot_table(df, values=['n_events','total_users'],
                   index=['date','metric'],
                   columns=['group'],
                   aggfunc='sum').reset_index()
for var in ['variant1','variant2']:
    uplift_colname = var + "_standard_uplift"
    # adding daily uplift: variant event rate minus control event rate
    t[uplift_colname] = (t['n_events'][var]/t['total_users'][var]) - \
                        (t['n_events']['control']/t['total_users']['control'])
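I am looking for a better way to get the uplift without pivoting the dataframe, so that the data stays in the melted format. I tried using groupby together with apply and a custom function, i.e.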
df.groupby(['date','metric'])[['n_events','group','total_users']].apply(myfxn)
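For illustration, here is a rough sketch of one way the uplift might be computed while staying in the melted format. It does not use apply with a custom row function; instead it broadcasts the control rate to every row of the same (date, metric) cell with groupby and transform. The helper name add_standard_uplift, its baseline parameter, and the added column names rate, control_rate and standard_uplift are made up for this example, and it assumes exactly one row per (date, metric, group), as in the sample data above.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (seeded so the run is repeatable).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'group': sorted(['control', 'variant1', 'variant2'] * 27),
    'metric': ['metric1', 'metric2', 'metric3'] * 27,
    'n_events': rng.uniform(20, 100, size=81),
    'total_users': rng.uniform(20, 1000, size=81),
    'date': list(pd.date_range('1/1/2017', periods=27, freq='D')) * 3,
})


def add_standard_uplift(frame, baseline='control'):
    """Hypothetical helper: add a per-row uplift column to a melted frame.

    Assumes one row per (date, metric, group), as in the example data.
    """
    out = frame.copy()
    # Per-row event rate.
    out['rate'] = out['n_events'] / out['total_users']
    # Keep the rate only on baseline rows (NaN elsewhere), then broadcast it
    # to every row of the same (date, metric) cell; 'first' skips NaN.
    baseline_rate = out['rate'].where(out['group'] == baseline)
    out['control_rate'] = baseline_rate.groupby(
        [out['date'], out['metric']]).transform('first')
    # Uplift of each row's rate over the baseline rate of its cell.
    out['standard_uplift'] = out['rate'] - out['control_rate']
    return out


result = add_standard_uplift(df)
print(result.sort_values(['date', 'group']).head(6))
```

With this layout the control rows carry an uplift of zero and every variant row carries its own uplift, so no pivot or per-variant column is needed. If a (date, metric, group) cell could hold several rows, the counts would have to be summed per cell first, as the pivot_table call does.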
Can you provide an example of the desired result? – greole