2013-05-26 39 views
5

Pandas: exponential smoothing function

I have a DataFrame of trade data as follows:

import pandas as pd
import datetime as DT

df = pd.DataFrame({
    'Trader': 'Carl Mark Carl Joe Mark Carl Max Max'.split(),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 1],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 2, 5, 20, 0),
        DT.datetime(2013, 2, 6, 10, 0),
        DT.datetime(2013, 2, 8, 12, 0),
        DT.datetime(2013, 3, 7, 14, 0),
        DT.datetime(2013, 6, 4, 14, 0),
        DT.datetime(2013, 7, 4, 14, 0),
    ]})

df.index = [df.Date, df.Trader] 

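I would like to compute statistics such as the average order quantity per trading week. To do this, I currently unstack the Trader column and resample the data: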

df.unstack('Trader').resample('1W', how='mean').fillna(0) 

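Is it also possible to compute, for each trader, a column with a trend function of the traded quantity (ideally based on an exponential smoothing function over the trader's previous trades)?

Thanks

Andy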

+0

`df.unstack('Trader').fillna(0).resample('1W', how='mean')` raises an error. Could you fix the example so that we can understand your situation more clearly? – unutbu

+0

Hi unutbu, thanks for your comment. Sorry, I forgot to set the index separately. Please try the updated DataFrame. Thanks – Andy

Answers

8

Perhaps you are looking for an exponentially weighted moving average:

import pandas as pd 
import datetime as DT 

df = pd.DataFrame({
    'Trader': 'Carl Mark Carl Joe Mark Carl Max Max'.split(),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 1],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 2, 5, 20, 0),
        DT.datetime(2013, 2, 6, 10, 0),
        DT.datetime(2013, 2, 8, 12, 0),
        DT.datetime(2013, 3, 7, 14, 0),
        DT.datetime(2013, 6, 4, 14, 0),
        DT.datetime(2013, 7, 4, 14, 0),
    ]})

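# Index by (Date, Trader) so that Trader can be unstacked into columns.
df.index = [df.Date, df.Trader]
# One column per trader, weekly mean quantity, weeks without trades set to 0.
df2 = df.unstack('Trader').resample('1W', how='mean').fillna(0)
# Exponentially weighted moving average over the weekly values.
# Note: pd.ewma and resample(how=...) were removed in later pandas releases.
print(pd.ewma(df2, span=7))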

yields

           Quantity
Trader         Carl      Joe     Mark      Max
Date
2013-01-06 5.000000 0.000000 2.000000 0.000000 
2013-01-13 2.142857 0.000000 0.857143 0.000000 
2013-01-20 1.216216 0.000000 0.486486 0.000000 
2013-01-27 0.771429 0.000000 0.308571 0.000000 
2013-02-03 0.518566 0.000000 0.207426 0.000000 
2013-02-10 1.881497 3.041283 0.448470 0.000000 
2013-02-17 1.338663 2.163837 0.319081 0.000000 
2013-02-24 0.966766 1.562696 0.230437 0.000000 
2013-03-03 0.705454 1.140307 0.168151 0.000000 
2013-03-10 1.843158 0.838219 0.123605 0.000000 
2013-03-17 1.362049 0.619423 0.091341 0.000000 
2013-03-24 1.010398 0.459502 0.067759 0.000000 
2013-03-31 0.751651 0.341831 0.050407 0.000000 
2013-04-07 0.560329 0.254823 0.037576 0.000000 
2013-04-14 0.418350 0.190254 0.028055 0.000000 
2013-04-21 0.312703 0.142209 0.020970 0.000000 
2013-04-28 0.233936 0.106388 0.015688 0.000000 
2013-05-05 0.175120 0.079640 0.011744 0.000000 
2013-05-12 0.131154 0.059645 0.008795 0.000000 
2013-05-19 0.098261 0.044687 0.006590 0.000000 
2013-05-26 0.073637 0.033488 0.004938 0.000000 
2013-06-02 0.055195 0.025101 0.003701 0.000000 
2013-06-09 0.041378 0.018818 0.002775 0.500670 
2013-06-16 0.031023 0.014108 0.002080 0.375377 
2013-06-23 0.023261 0.010579 0.001560 0.281462 
2013-06-30 0.017443 0.007933 0.001170 0.211057 
2013-07-07 0.013080 0.005949 0.000877 0.408376 
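
The numbers in the table can be checked by hand. With `span=7` the smoothing factor is alpha = 2 / (span + 1) = 0.25. Assuming the "adjusted" weighting (which the values above are consistent with), each entry is a weighted mean of all weekly values so far, with weight (1 - alpha)**i for the value i weeks back. A minimal sketch in plain Python that reproduces Carl's first three values; the helper name `ewma_adjusted` is made up for this illustration and is not part of pandas:

```python
def ewma_adjusted(values, span):
    """Exponentially weighted mean using adjusted weights (1 - alpha)**i."""
    alpha = 2.0 / (span + 1)
    out = []
    for t in range(len(values)):
        # Weight 1 for the newest value, (1 - alpha)**i for the value i steps back.
        weights = [(1 - alpha) ** i for i in range(t + 1)]
        newest_first = values[t::-1]
        weighted_sum = sum(w * x for w, x in zip(weights, newest_first))
        out.append(weighted_sum / sum(weights))
    return out


carl_weekly = [5, 0, 0]  # Carl's first three weekly means, taken from df2
smoothed = ewma_adjusted(carl_weekly, span=7)
print([round(v, 6) for v in smoothed])  # [5.0, 2.142857, 1.216216]
```

For example, the second value is (1 * 0 + 0.75 * 5) / (1 + 0.75) = 2.142857, which matches the 2013-01-13 row.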

To concatenate this result with df2:

df3 = pd.ewma(df2, span=7) 
df3.columns = pd.MultiIndex.from_tuples([('EWMA', item[1]) for item in df3.columns]) 
df2 = pd.concat([df2, df3], axis=1) 

print(df2) 

produces

          Quantity                      EWMA
Trader      Carl   Joe  Mark   Max      Carl       Joe      Mark       Max
Date
2013-01-06     5     0     2     0  5.000000  0.000000  2.000000  0.000000
2013-01-13     0     0     0     0  2.142857  0.000000  0.857143  0.000000
2013-01-20     0     0     0     0  1.216216  0.000000  0.486486  0.000000
2013-01-27     0     0     0     0  0.771429  0.000000  0.308571  0.000000
2013-02-03     0     0     0     0  0.518566  0.000000  0.207426  0.000000
2013-02-10     5    10     1     0  1.881497  3.041283  0.448470  0.000000
2013-02-17     0     0     0     0  1.338663  2.163837  0.319081  0.000000
2013-02-24     0     0     0     0  0.966766  1.562696  0.230437  0.000000
2013-03-03     0     0     0     0  0.705454  1.140307  0.168151  0.000000
2013-03-10     5     0     0     0  1.843158  0.838219  0.123605  0.000000
2013-03-17     0     0     0     0  1.362049  0.619423  0.091341  0.000000
2013-03-24     0     0     0     0  1.010398  0.459502  0.067759  0.000000
2013-03-31     0     0     0     0  0.751651  0.341831  0.050407  0.000000
2013-04-07     0     0     0     0  0.560329  0.254823  0.037576  0.000000
2013-04-14     0     0     0     0  0.418350  0.190254  0.028055  0.000000
2013-04-21     0     0     0     0  0.312703  0.142209  0.020970  0.000000
2013-04-28     0     0     0     0  0.233936  0.106388  0.015688  0.000000
2013-05-05     0     0     0     0  0.175120  0.079640  0.011744  0.000000
2013-05-12     0     0     0     0  0.131154  0.059645  0.008795  0.000000
2013-05-19     0     0     0     0  0.098261  0.044687  0.006590  0.000000
2013-05-26     0     0     0     0  0.073637  0.033488  0.004938  0.000000
2013-06-02     0     0     0     0  0.055195  0.025101  0.003701  0.000000
2013-06-09     0     0     0     2  0.041378  0.018818  0.002775  0.500670
2013-06-16     0     0     0     0  0.031023  0.014108  0.002080  0.375377
2013-06-23     0     0     0     0  0.023261  0.010579  0.001560  0.281462
2013-06-30     0     0     0     0  0.017443  0.007933  0.001170  0.211057
2013-07-07     0     0     0     1  0.013080  0.005949  0.000877  0.408376
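
The code above uses `pd.ewma` and `resample(..., how='mean')`, which are not available in recent pandas releases. The following is a sketch of an equivalent for newer versions, not the original answer's code. It assumes that `DataFrame.ewm(span=7).mean()` with its default settings matches the old `pd.ewma` behaviour (the numbers above are consistent with that), and the variable names `wide`, `weekly`, `trend` and `combined` are chosen here for illustration only:

```python
import datetime as DT

import pandas as pd

df = pd.DataFrame({
    'Trader': 'Carl Mark Carl Joe Mark Carl Max Max'.split(),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 1],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 2, 5, 20, 0),
        DT.datetime(2013, 2, 6, 10, 0),
        DT.datetime(2013, 2, 8, 12, 0),
        DT.datetime(2013, 3, 7, 14, 0),
        DT.datetime(2013, 6, 4, 14, 0),
        DT.datetime(2013, 7, 4, 14, 0),
    ]})

# Move Date and Trader into the index and pivot Trader into columns,
# keeping only the numeric Quantity values.
wide = df.set_index(['Date', 'Trader'])['Quantity'].unstack('Trader')

# Weekly mean per trader; weeks without trades are filled with 0.
weekly = wide.resample('W').mean().fillna(0)

# Exponentially weighted moving average of the weekly values.
trend = weekly.ewm(span=7).mean()

# Actual values and trend side by side under two top-level column labels.
combined = pd.concat([weekly, trend], axis=1, keys=['Quantity', 'EWMA'])
print(combined)
```

Passing `keys=` to `pd.concat` builds the two-level column index directly, so the manual `MultiIndex.from_tuples` step is not needed in this variant. A single trader's trend can then be selected with, for example, `combined[('EWMA', 'Carl')]`.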
+0

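Thank you for your answer, unutbu; this is exactly what I was looking for. Is it possible to integrate this new DataFrame into the original one? I would like to have both the actual values and the trend in the same DataFrame. – Andy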

+0

Thank you very much – Andy