Calculation using pandas

I have used a DataFrame groupby on (Code, ID, Date) to compute event totals, as shown below:

Code ID  Date       Sum
100  200 2012-05-31 50
         2012-06-07 60
         2012-06-25 70
         2012-06-26 80
         2013-06-27 85
         2013-06-28 90

I would like to create a DataFrame that shows the data grouped by (Code, ID, Month/Year):

Code ID  Month/Year Sum
100  200 May/2012   50
         June/2012  210
         June/2013  175

Please advise.


2014-01-07 Madhup Srivastava

See http://stackoverflow.com/questions/17450313/summing-over-months-with-pandas – cyborg

You can resample to monthly frequency within each group.

So first convert the 'Date' column to datetime:

df['Date'] = pd.to_datetime(df['Date'])

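Then set it as the index, group by ['Code', 'ID'], and apply resample to each group. (Passing the aggregation as a second argument, as in resample('M', 'sum'), is the older pandas API; newer releases no longer accept it and expect .resample('M').sum() instead.)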

df.set_index('Date').groupby(['Code', 'ID']).resample('M', 'sum')

In [6]: df = pd.DataFrame({'Code':100, 'ID':200, 'Date':pd.date_range("2012-01-01", periods=10, freq='10D'), 'Sum':np.random.randint(10, size=10)}) 

In [7]: df 
Out[7]: 
   Code                Date   ID  Sum
0   100 2012-01-01 00:00:00  200    1
1   100 2012-01-11 00:00:00  200    9
2   100 2012-01-21 00:00:00  200    5
3   100 2012-01-31 00:00:00  200    9
4   100 2012-02-10 00:00:00  200    8
5   100 2012-02-20 00:00:00  200    3
6   100 2012-03-01 00:00:00  200    9
7   100 2012-03-11 00:00:00  200    8
8   100 2012-03-21 00:00:00  200    3
9   100 2012-03-31 00:00:00  200    5

In [8]: df.set_index('Date').groupby(['Code', 'ID']).resample('M', 'sum') 
Out[8]: 
                     Code   ID  Sum
Code ID  Date
100  200 2012-01-31   400  800   24
         2012-02-29   200  400   11
         2012-03-31   400  800   25
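
For the data in the question itself, a minimal sketch of the same idea is shown here. It is not the code from the answer above: it assumes a reasonably recent pandas and swaps resample for grouping on the calendar month obtained with `dt.to_period('M')`, which also avoids summing the Code and ID columns. The `Month` column name and the `%B/%Y` label format are choices made for this illustration.

```python
import pandas as pd

# Data copied from the question; the column names are the question's own.
df = pd.DataFrame({
    "Code": 100,
    "ID": 200,
    "Date": ["2012-05-31", "2012-06-07", "2012-06-25",
             "2012-06-26", "2013-06-27", "2013-06-28"],
    "Sum": [50, 60, 70, 80, 85, 90],
})
df["Date"] = pd.to_datetime(df["Date"])

# Group by Code, ID and the calendar month of each date, then add up 'Sum'.
# 'Month' is a name chosen for this sketch, not one used in the thread.
month = df["Date"].dt.to_period("M").rename("Month")
monthly = df.groupby(["Code", "ID", month])["Sum"].sum().reset_index()

# Build the text label only after aggregating, so rows stay in date order
# rather than in alphabetical order of the month names.
monthly["Month/Year"] = monthly["Month"].dt.strftime("%B/%Y")
print(monthly[["Code", "ID", "Month/Year", "Sum"]])
```

The three monthly totals come out as 50, 210 and 175, matching the table the question asks for.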

To plot it, something like this should work:

fig, ax = plt.subplots() 

for name, group in df.set_index('Date').groupby(['Code', 'ID']): 
    group['Sum'].resample('M', 'sum').plot(ax=ax, label=name)

But you can also continue from the result above, "unstack" it (move the index levels into columns), and then plot:

df2 = df.set_index('Date').groupby(['Code', 'ID']).resample('M', 'sum') 
df2['Sum'].unstack([0,1]).plot()
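
To make the unstack step concrete, the sketch below shows what the reshaped frame looks like with two Code/ID pairs. The data is made up for illustration, the monthly totals are computed by grouping on `dt.to_period('M')` rather than with the resample call above, and `plot_monthly` is a hypothetical helper name, not part of pandas or of the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two Code/ID pairs with one value every 10 days.
dates = pd.date_range("2012-01-01", periods=10, freq="10D")
df = pd.concat([
    pd.DataFrame({"Code": 100, "ID": 200, "Date": dates, "Sum": np.arange(10)}),
    pd.DataFrame({"Code": 100, "ID": 201, "Date": dates, "Sum": np.arange(10) * 2}),
], ignore_index=True)

# Monthly totals per (Code, ID), indexed by Code, ID and month.
month = df["Date"].dt.to_period("M").rename("Month")
monthly = df.groupby(["Code", "ID", month])["Sum"].sum()

# Move Code and ID from the row index into the columns:
# one column per (Code, ID) pair, one row per month.
wide = monthly.unstack(["Code", "ID"])
print(wide)


def plot_monthly(wide):
    """Draw one line per (Code, ID) column; requires matplotlib."""
    import matplotlib.pyplot as plt  # imported here so the reshaping runs without it

    ax = wide.plot(marker="o")
    ax.set_xlabel("Month")
    ax.set_ylabel("Sum")
    plt.show()
```

Calling `plot_monthly(wide)` then draws one line per Code/ID pair, because DataFrame.plot treats each column as a separate series.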


2014-01-07 08:04:48 joris

Thanks. Is there a way to plot the data above as a time-series graph with matplotlib, with the date on the x-axis and the sum for each ID on the y-axis? –

And a separate line for each Code/ID? – joris

That's correct, Joris –
