2012-10-26

How do I do a rolling_mean() on a pandas DataFrame?

I have a pandas DataFrame df:

    sales net_pft 
STK_ID RPT_Date     
600141 201.780 1.833 
     20110331 13.725 0.384 
     20110630 32.733 1.132 
     20110930 50.386 1.923 
     20111231 65.685 2.325 
     20120331 21.088 0.656 
     20120630 46.952 1.591 
600809 201.166 4.945 
     20110331 18.724 5.061 
     20110630 28.948 6.586 
     20110930 35.637 7.075 
     20111231 44.882 7.805 
     20120331 22.140 4.925 
     20120630 38.157 7.868 

I want to compute a rolling mean of all columns after grouping by STK_ID. The rule, expressed as pseudocode, is:

if RPT_Date[4:8] == '0331': 
    all_column = rolling_mean(all_column,2) 

if RPT_Date[4:8] == '0630': 
    all_column = rolling_mean(all_column,3) 

if RPT_Date[4:8] == '0930': 
    all_column = rolling_mean(all_column,4) 

if RPT_Date[4:8] == '1231': 
    all_column = rolling_mean(all_column,5) 

if is_the_first_row(): 
    keep_original_values() 

Here all_column stands for 'sales' and 'net_pft'. The final result should look like this:

    sales net_pft 
STK_ID RPT_Date     
600141 201.780 1.833 # same as original value 
     20110331 30.253 1.109 # average of row1&row2 
     20110630 31.079 1.116 # average of row1&row2&row3 
...... 
600809 201.166 4.945 # same as original value 
     20110331 24.445 5.003 # average of row1&row2 
..... 

How can I write this as a neat pandas expression?

It's not clear to me what you want. Do you mean some kind of "cumulative mean"? – joris

Answers

I think you want this?

In [29]: df.groupby(level='STK_ID').apply(lambda x: pd.expanding_mean(x)) 
Out[29]: 
        sales net_pft 
STK_ID RPT_Date      
600141 201.780000 1.833000 
     20110331 30.252500 1.108500 
     20110630 31.079333 1.116333 
     20110930 35.906000 1.318000 
     20111231 41.861800 1.519400 
     20120331 38.399500 1.375500 
     20120630 39.621286 1.406286 
600809 201.166000 4.945000 
     20110331 24.445000 5.003000 
     20110630 25.946000 5.530667 
     20110930 28.368750 5.916750 
     20111231 31.671400 6.294400 
     20120331 30.082833 6.066167 
     20120630 31.236286 6.323571 
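
For readers on a current pandas release, where the top-level pd.expanding_mean() is no longer available, the same per-stock cumulative mean can be sketched with the .expanding() method. This is only an illustration, not the answerer's code. The first RPT_Date of each stock is garbled in the post, so "20101231" below is a placeholder, and the first-row sales values are back-computed from the means printed above; only the first three rows of each stock are used.

```python
import pandas as pd

# Assumption: "20101231" is a placeholder for the garbled first RPT_Date, and
# the first-row sales values (46.780, 30.166) are back-computed from the
# means printed in the answer's output.
rows = [
    ("600141", "20101231", 46.780, 1.833),
    ("600141", "20110331", 13.725, 0.384),
    ("600141", "20110630", 32.733, 1.132),
    ("600809", "20101231", 30.166, 4.945),
    ("600809", "20110331", 18.724, 5.061),
    ("600809", "20110630", 28.948, 6.586),
]
df = pd.DataFrame(rows, columns=["STK_ID", "RPT_Date", "sales", "net_pft"])
df = df.set_index(["STK_ID", "RPT_Date"])

# Expanding (cumulative) mean restarted for every STK_ID.
# transform() returns a frame with the same index as df.
result = df.groupby(level="STK_ID").transform(lambda g: g.expanding().mean())
print(result)
```

Using transform() rather than apply() is a design choice here: it keeps the original (STK_ID, RPT_Date) index without adding an extra group level.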

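That's not exactly the same as expanding_mean(), because the rolling window depends on RPT_Date and cycles through the range 2-5. But expanding_mean() is a very powerful function. Thanks for the tip. – bigbug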
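
One possible way to get the date-dependent window described in the comment above is to look up the window size from the month-day part of RPT_Date and average the trailing rows explicitly. The sketch below is not from the thread. The names WINDOW_BY_MMDD and variable_window_mean are hypothetical, and it assumes that a window reaching past the start of a group is simply truncated, which leaves the first row unchanged. The first RPT_Date is garbled in the question, so "20101231" is a placeholder, and the first sales value is back-computed from the answer's output.

```python
import numpy as np
import pandas as pd

# Window size keyed by the MMDD part of RPT_Date, as in the question's pseudocode.
WINDOW_BY_MMDD = {"0331": 2, "0630": 3, "0930": 4, "1231": 5}

def variable_window_mean(group):
    """Trailing mean whose window depends on each row's RPT_Date (hypothetical helper)."""
    mmdd = group.index.get_level_values("RPT_Date").astype(str).str[4:8]
    values = group.to_numpy(dtype=float)
    out = np.empty_like(values)
    for i, key in enumerate(mmdd):
        # Assumption: truncate the window at the start of the group.
        start = max(0, i - WINDOW_BY_MMDD[key] + 1)
        out[i] = values[start:i + 1].mean(axis=0)
    return pd.DataFrame(out, index=group.index, columns=group.columns)

# Assumption: "20101231" and 46.780 stand in for the garbled first row.
rows = [
    ("600141", "20101231", 46.780, 1.833),
    ("600141", "20110331", 13.725, 0.384),
    ("600141", "20110630", 32.733, 1.132),
    ("600141", "20110930", 50.386, 1.923),
    ("600141", "20111231", 65.685, 2.325),
    ("600141", "20120331", 21.088, 0.656),
    ("600141", "20120630", 46.952, 1.591),
]
df = pd.DataFrame(rows, columns=["STK_ID", "RPT_Date", "sales", "net_pft"])
df = df.set_index(["STK_ID", "RPT_Date"])

# Apply the helper per stock and stitch the pieces back together.
result = pd.concat(variable_window_mean(g) for _, g in df.groupby(level="STK_ID"))
print(result)
```

For the first five rows this coincides with the expanding mean; it starts to differ at 20120331, where the window shrinks back to two rows.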