I think it is better to use DataFrameGroupBy.agg to aggregate by size, which gives the length of each group, and by mean, with the rows grouped by the date_x column:
d = {'mean':'Class probability','size':'Frequency'}
df = df_train.groupby('date_x')['outcome'].agg(['mean','size']).rename(columns=d)
df.plot(secondary_y='Frequency',figsize=(22, 10))
For more information, see "Applying multiple functions at once" in the pandas groupby documentation.
Sample:
import pandas as pd

d = {'date_x': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-01',
                               '2015-01-02', '2015-01-02']),
     'outcome': [20, 30, 40, 50, 60]}
df_train = pd.DataFrame(d)
print(df_train)
      date_x  outcome
0 2015-01-01       20  ->1.group
1 2015-01-01       30  ->1.group
2 2015-01-01       40  ->1.group
3 2015-01-02       50  ->2.group
4 2015-01-02       60  ->2.group
d = {'mean':'Class probability','size':'Frequency'}
df = df_train.groupby('date_x')['outcome'].agg(['mean','size']).rename(columns=d)
print (df)
            Class probability  Frequency
date_x
2015-01-01                 30          3
2015-01-02                 55          2
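As a variation on the sample above, and assuming pandas 0.25 or later, named aggregation can probably produce the renamed columns in one step, without the separate rename call. This is only a sketch of that alternative, not a replacement for the approach above; the variable name summary is purely illustrative.

```python
import pandas as pd

# Same sample data as above.
df_train = pd.DataFrame({
    'date_x': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-01',
                              '2015-01-02', '2015-01-02']),
    'outcome': [20, 30, 40, 50, 60],
})

# Named aggregation: each keyword is an output column name and each value
# is the function to apply. The names contain spaces, so they are passed
# through a dict unpacked with **. (Assumes pandas >= 0.25.)
summary = df_train.groupby('date_x')['outcome'].agg(
    **{'Class probability': 'mean', 'Frequency': 'size'}
)
print(summary)
```

The result should have the same two columns as the rename version, so the plot call with secondary_y='Frequency' can be used on it unchanged.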
You can find it in the pandas documentation. The pandas tutorial on grouping may help: https://pandas.pydata.org/pandas-docs/stable/groupby.html