2016-05-10 63 views
1

pandas datetime: groupby every hour on every Monday

I'm new to pandas/Python: I have a DataFrame (events.number) indexed by datetime objects.

I want to extract the number of events per hour on every Monday (or some other specific weekday). I wrote:

hour_tally_monday = events.number.groupby(lambda x: (x.hour & x.weekday==0)).count() 

But this does not work properly.

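I can drop the "& x.weekday==0" and it works, but presumably it then counts over all days in the frame. What is the correct (simplest) syntax to average over Mondays only?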

+0

Try using a comma "," instead of "&".

+0

The docs are always useful: http://pandas.pydata.org/pandas-docs/stable/groupby.html
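A minimal sketch of why the "&" version misbehaves and of one possible reading of the comma suggestion. In Python, `&` binds tighter than `==`, so `x.hour & x.weekday==0` is evaluated as `(x.hour & x.weekday) == 0`; in addition, `weekday` on a single Timestamp is a method and needs parentheses. The sample frame and the names `tally` and `hour_tally_monday` are illustrative assumptions, and grouping on index arrays instead of a lambda is a substitution made here for clarity, not something the commenter spelled out.

```python
import pandas as pd

# Plain integers stand in for x.hour and the weekday number.
hour, weekday = 12, 1              # noon on a Tuesday
key = hour & weekday == 0          # parsed as (hour & weekday) == 0
print(key)                         # True, although the day is not a Monday

# On a single Timestamp, weekday is a method and must be called.
ts = pd.Timestamp('2016-02-01 12:00')
monday_check = ts.weekday() == 0

# Illustrative sample data: one event row every 12 hours.
rng = pd.date_range('2016-02-01', '2016-02-25', freq=pd.Timedelta(hours=12))
events = pd.DataFrame({'number': [1] * 20 + [2] * 15 + [3] * 14}, index=rng)

# Group on two keys at once: the hour and an "is Monday" flag.
tally = events.number.groupby(
    [events.index.hour, events.index.weekday == 0]
).count()

# Keep only the groups whose Monday flag is true.
hour_tally_monday = {
    int(h): int(n) for (h, is_monday), n in tally.items() if is_monday
}
print(hour_tally_monday)           # {0: 4, 12: 4}
```

Grouping on two keys still computes counts for the non-Monday rows, which are then thrown away, so filtering first is usually the simpler route.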

Answers

2

I think you need to filter the DataFrame with boolean indexing first, and then use groupby with size:

import pandas as pd 

start = pd.to_datetime('2016-02-01') 
end = pd.to_datetime('2016-02-25') 
rng = pd.date_range(start, end, freq='12H') 

events = pd.DataFrame({'number': [1] * 20 + [2] * 15 + [3] * 14}, index=rng) 
print(events)
        number 
2016-02-01 00:00:00  1 
2016-02-01 12:00:00  1 
2016-02-02 00:00:00  1 
2016-02-02 12:00:00  1 
2016-02-03 00:00:00  1 
2016-02-03 12:00:00  1 
2016-02-04 00:00:00  1 
2016-02-04 12:00:00  1 
2016-02-05 00:00:00  1 
2016-02-05 12:00:00  1 
2016-02-06 00:00:00  1 
2016-02-06 12:00:00  1 
2016-02-07 00:00:00  1 
... 
... 
filtered = events[events.index.weekday == 0] 
print(filtered)
        number 
2016-02-01 00:00:00  1 
2016-02-01 12:00:00  1 
2016-02-08 00:00:00  1 
2016-02-08 12:00:00  1 
2016-02-15 00:00:00  2 
2016-02-15 12:00:00  2 
2016-02-22 00:00:00  3 
2016-02-22 12:00:00  3 

In version 0.18.1 you can use the new DatetimeIndex.weekday_name attribute:

filtered = events[events.index.weekday_name == 'Monday'] 
print(filtered)
        number 
2016-02-01 00:00:00  1 
2016-02-01 12:00:00  1 
2016-02-08 00:00:00  1 
2016-02-08 12:00:00  1 
2016-02-15 00:00:00  2 
2016-02-15 12:00:00  2 
2016-02-22 00:00:00  3 
2016-02-22 12:00:00  3 
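
For readers on newer pandas: weekday_name was later deprecated and removed in pandas 1.0, and DatetimeIndex.day_name() is its replacement. A minimal sketch of the same filter, assuming pandas 0.23 or later and the same sample frame (the Timedelta frequency is just a version-neutral way to write "every 12 hours"):

```python
import pandas as pd

# Same illustrative data: one event row every 12 hours.
rng = pd.date_range('2016-02-01', '2016-02-25', freq=pd.Timedelta(hours=12))
events = pd.DataFrame({'number': [1] * 20 + [2] * 15 + [3] * 14}, index=rng)

# day_name() is a method and returns English day names by default.
filtered = events[events.index.day_name() == 'Monday']
print(filtered)
```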

print(filtered.groupby(filtered.index.hour).size())
0  4 
12 4 
dtype: int64
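
The question also mentions averaging, while size() only counts rows. If the average of the number column per hour on Mondays is what is wanted, a hedged sketch (assuming the same sample frame; the names `mondays` and `per_hour` are illustrative) could aggregate both at once:

```python
import pandas as pd

# Same illustrative data: one event row every 12 hours.
rng = pd.date_range('2016-02-01', '2016-02-25', freq=pd.Timedelta(hours=12))
events = pd.DataFrame({'number': [1] * 20 + [2] * 15 + [3] * 14}, index=rng)

# Keep Mondays only (weekday 0), then group the remaining rows by hour.
mondays = events[events.index.weekday == 0]

# 'count' gives the number of rows per hour, 'mean' the average of `number`.
per_hour = mondays.groupby(mondays.index.hour)['number'].agg(['count', 'mean'])
print(per_hour)
```

With this sample data each hour (0 and 12) has four Monday rows with values 1, 1, 2 and 3, so the mean is 1.75 for both.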