2017-09-19 36 views
Pandas: resample quarterly and display the start and end month

My df looks like this:

            Total
language    Julia  Python      R  SQLite
date
2015-03-01    NaN     NaN   17.0     NaN
2015-04-01    NaN   156.0  189.0     NaN
2015-05-01   13.0   212.0  202.0     NaN

The index is monthly, and I want it to be quarterly:

df.resample("Q").sum() 

This gives me:

            Total
language    Julia  Python       R  SQLite
date
2015-03-31    NaN     NaN    17.0     NaN
2015-06-30   22.0   677.0   594.0    26.0
2015-09-30   37.0  1410.0  1250.0   146.0

But I want the index to show "Start month - End month, year" (for example "Jan - Mar, 2015") instead of the quarter-end date. Desired df:

                 Total
language         Julia  Python       R  SQLite
Jan - Mar, 2015    NaN     NaN    17.0     NaN
Apr - Jun, 2015   22.0   677.0   594.0    26.0
Jul - Sep, 2015   37.0  1410.0  1250.0   146.0

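Is there a pandas way to do this? I did it as follows, but it is very dirty, and I am sure there is a better way (the documentation for the resample method is short on examples...):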

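def quarterly_month_names(row):
    # row.name is the quarter-end timestamp; stepping back three month starts
    # lands on the first day of the quarter's first month.
    start_date = row.name - pd.offsets.MonthBegin(3)
    return start_date.strftime('%b') + " - " + row.name.strftime('%b, %Y')

# Builds one label per row (returned as a Series; the index itself is unchanged).
df["Total"].apply(quarterly_month_names, axis=1)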

Answer

Use periods:

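# df is the quarterly (resampled) frame with a quarter-end DatetimeIndex
idx = df.index.to_period('Q')
# First and last month of each quarter, e.g. 'Jan' and 'Mar 2015'
starts = idx.asfreq('M', 's').strftime('%b')
ends = idx.asfreq('M', 'e').strftime('%b %Y')
df.index = ['{}-{}'.format(start, end) for start, end in zip(starts, ends)]
print(df)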

       Total 
       language Julia Python  R SQLite 
Jan-Mar 2015  NaN  NaN 17.0 NaN  NaN 
Apr-Jun 2015  22.0 677.0 594.0 26.0  NaN 
Jul-Sep 2015  37.0 1410.0 1250.0 146.0  NaN 
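
To make the period conversion easier to follow, here is a minimal standalone sketch of the same idea. It does not use the asker's data: the three quarter-end dates are typed in by hand, and the variable names (`quarters`, `first_month`, `last_month`, `labels`) are illustrative only. The label format follows the one requested in the question ("Jan - Mar, 2015"), and month abbreviations assume an English locale.

```python
import pandas as pd

# Hand-written quarter-end dates, standing in for a resampled index.
dates = pd.DatetimeIndex(["2015-03-31", "2015-06-30", "2015-09-30"])

# Convert timestamps to calendar quarters (2015Q1, 2015Q2, 2015Q3).
quarters = dates.to_period("Q")

# Re-express each quarter as a monthly period: "S" picks the first
# month of the quarter, "E" picks the last one.
first_month = quarters.asfreq("M", "S")
last_month = quarters.asfreq("M", "E")

# Combine the abbreviated start month with the end month and year.
labels = [
    f"{start} - {end}"
    for start, end in zip(first_month.strftime("%b"), last_month.strftime("%b, %Y"))
]
print(labels)
```

Because the labels come from the quarter itself rather than from date arithmetic, this variant should not depend on whether the index holds quarter-end or quarter-start timestamps.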

Or, more simply:

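# End month with year, e.g. 'Mar 2015'
idx2 = df.index.strftime('%b %Y')
# Start month: step back three month starts from each quarter-end date
idx1 = (df.index - pd.offsets.MonthBegin(3)).strftime('%b')
df.index = ['{}-{}'.format(start, end) for start, end in zip(idx1, idx2)]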
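
For anyone who wants to try the offset-based variant end to end, the following is a rough sketch under a few assumptions: the monthly numbers are made up (they are not the asker's data), the columns are flat rather than a MultiIndex, and `quarter_labels` is a hypothetical helper name. It passes a `pd.offsets.QuarterEnd()` object to `resample` instead of the `"Q"` string, since the spelling of that alias differs between pandas versions, and it uses `min_count=1` so that a quarter with no data stays NaN rather than summing to 0.

```python
import numpy as np
import pandas as pd

# Made-up monthly counts on a month-start index (Mar 2015 .. Aug 2015).
monthly = pd.DataFrame(
    {
        "Python": [np.nan, 1.0, 2.0, 3.0, 4.0, 5.0],
        "R": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    },
    index=pd.date_range("2015-03-01", periods=6, freq="MS"),
)

# Sum per calendar quarter; min_count=1 keeps all-NaN quarters as NaN.
quarterly = monthly.resample(pd.offsets.QuarterEnd()).sum(min_count=1)


def quarter_labels(index):
    """Build 'Jan-Mar 2015'-style labels from quarter-end timestamps."""
    # Subtracting MonthBegin(3) from a quarter-end date first rolls back to
    # the start of that month, then two more months: 2015-03-31 -> 2015-01-01.
    starts = (index - pd.offsets.MonthBegin(3)).strftime("%b")
    ends = index.strftime("%b %Y")
    return [f"{start}-{end}" for start, end in zip(starts, ends)]


quarterly.index = quarter_labels(quarterly.index)
print(quarterly)
```

Note that the subtraction trick relies on the index holding quarter-end dates; with quarter-start timestamps it would point at the previous quarter, in which case the period-based version is the safer choice.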