import datetime as dt
import pandas as pd
from pandas import Timestamp
df = pd.DataFrame(
{'action': ['C', 'C', 'C', 'C', 'C', 'A', 'C'],
'bike': [89, 89, 57, 29, 76, 69, 17],
'cust_id': [6, 6, 30, 30, 30, 30, 30],
'date': [Timestamp('2010-02-02 00:00:00'),
Timestamp('2010-02-02 00:00:00'),
Timestamp('2010-02-05 00:00:00'),
Timestamp('2010-02-05 00:00:00'),
Timestamp('2010-02-05 00:00:00'),
Timestamp('2010-02-05 00:00:00'),
Timestamp('2010-02-05 00:00:00')],
'date_arrived': [Timestamp('2010-02-02 14:27:00'),
Timestamp('2010-02-02 15:42:00'),
Timestamp('2010-02-05 12:06:00'),
Timestamp('2010-02-05 12:07:00'),
Timestamp('2010-02-05 13:11:00'),
Timestamp('2010-02-05 13:14:00'),
Timestamp('2010-02-05 13:45:00')],
'date_removed': [Timestamp('2010-02-02 13:57:00'),
Timestamp('2010-02-02 15:12:00'),
Timestamp('2010-02-05 11:36:00'),
Timestamp('2010-02-05 11:37:00'),
Timestamp('2010-02-05 12:41:00'),
Timestamp('2010-02-05 12:44:00'),
Timestamp('2010-02-05 13:15:00')],
'hour': [14, 15, 12, 12, 13, 13, 13],
'station_arrived': [56, 56, 85, 85, 85, 85, 85],
'station_removed': [56, 56, 85, 85, 85, 85, 85]})
First, create an hourly index that covers the full date range:
idx = pd.date_range(df.date.min(), df.date.max() + dt.timedelta(days=1), freq='H')
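As a quick sanity check on that range, here is a minimal sketch. The two timestamps are assumed stand-ins for `df.date.min()` and `df.date.max()` in the sample frame, and the lowercase `'h'` alias is used because newer pandas releases prefer it over `'H'`.

```python
import datetime as dt
import pandas as pd

# Assumed stand-ins for df.date.min() and df.date.max() in the sample data.
first_day = pd.Timestamp('2010-02-02')
last_day = pd.Timestamp('2010-02-05')

# Adding one day to the end makes the range cover every hour of the last day.
idx = pd.date_range(first_day, last_day + dt.timedelta(days=1), freq='h')

print(len(idx))         # 4 full days * 24 hours + the closing midnight = 97
print(idx[0], idx[-1])  # 2010-02-02 00:00:00 2010-02-06 00:00:00
```

Without the extra day, the range would stop at midnight on 2010-02-05 and the arrivals later that day would be dropped by the reindex.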
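Now you want a datetime index, so set it to 'date_arrived'. Then use groupby with a time grouper (pd.TimeGrouper in older pandas, pd.Grouper(freq='H') in current versions) to group on the hour and on station_arrived. count the non-null values of station_arrived, then unstack the result to get the data in pivot-table format.
Finally, use reindex to align the result to the new hourly index idx, and fill the missing values with zero.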
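>>> (df
     .set_index('date_arrived')
     # pd.TimeGrouper was removed from later pandas releases;
     # pd.Grouper(freq='H') is the equivalent.
     .groupby([pd.Grouper(freq='H'), 'station_arrived'])
     .station_arrived
     .count()
     .unstack()
     .reindex(idx)
     .fillna(0)
    )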
station_arrived 56 85
2010-02-02 00:00:00 0 0
2010-02-02 01:00:00 0 0
2010-02-02 02:00:00 0 0
2010-02-02 03:00:00 0 0
2010-02-02 04:00:00 0 0
2010-02-02 05:00:00 0 0
2010-02-02 06:00:00 0 0
2010-02-02 07:00:00 0 0
2010-02-02 08:00:00 0 0
2010-02-02 09:00:00 0 0
2010-02-02 10:00:00 0 0
2010-02-02 11:00:00 0 0
2010-02-02 12:00:00 0 0
2010-02-02 13:00:00 0 0
2010-02-02 14:00:00 1 0
2010-02-02 15:00:00 1 0
2010-02-02 16:00:00 0 0
...
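One detail worth knowing: `unstack()` and `reindex()` introduce NaN, so after `fillna(0)` the counts are typically stored as floats. The sketch below is one possible variant, not the original answer's code. It runs on a hypothetical three-row frame (`small`) with an assumed hourly range (`hourly`), and passes `fill_value=0` to both steps so the counts stay integers.

```python
import pandas as pd

# Hypothetical miniature frame; the values are illustrative only.
small = pd.DataFrame({
    'date_arrived': pd.to_datetime(['2010-02-02 14:27',
                                    '2010-02-02 15:42',
                                    '2010-02-02 15:50']),
    'station_arrived': [56, 56, 85],
})

# Assumed target index: every hour from 13:00 to 16:00.
hourly = pd.date_range('2010-02-02 13:00', '2010-02-02 16:00', freq='h')

table = (small
         .set_index('date_arrived')
         .groupby([pd.Grouper(freq='h'), 'station_arrived'])
         .size()                         # rows per (hour, station) pair
         .unstack(fill_value=0)          # stations become columns, gaps -> 0
         .reindex(hourly, fill_value=0)  # add the hours with no arrivals
         )
print(table)
```

Here `size()` counts rows per group, which matches `count()` on `station_arrived` as long as that column has no nulls.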
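I think you need to create the empty rows before creating the pivot table. That would involve a method that checks, for each index, which hours are missing, and then generates rows of 0/null values for the missing hours of that index. Then create the pivot. – Sam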
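A rough sketch of what this comment describes, under stated assumptions: the frame `arrivals`, the hour range `all_hours`, and the level name `hour` are all hypothetical. Instead of checking each index for missing hours one by one, it builds every (hour, station) combination with `pd.MultiIndex.from_product` and reindexes the long-format counts against it, so the empty rows exist before the pivot.

```python
import pandas as pd

# Hypothetical miniature of the arrivals data (not the original frame).
arrivals = pd.DataFrame({
    'date_arrived': pd.to_datetime(['2010-02-02 14:27',
                                    '2010-02-02 15:42',
                                    '2010-02-02 15:50']),
    'station_arrived': [56, 56, 85],
})

# Truncate each arrival to the start of its hour, then count per pair.
hours = arrivals['date_arrived'].dt.floor('h')
counts = arrivals.groupby([hours, 'station_arrived']).size()

# Every (hour, station) combination that should appear in the result.
all_hours = pd.date_range('2010-02-02 13:00', '2010-02-02 16:00', freq='h')
stations = sorted(arrivals['station_arrived'].unique())
full_index = pd.MultiIndex.from_product(
    [all_hours, stations], names=['hour', 'station_arrived'])

# Reindexing creates the missing rows and fills them with 0.
long_counts = counts.reindex(full_index, fill_value=0)

# Pivot only after the empty rows exist.
pivot = long_counts.unstack('station_arrived')
print(pivot)
```

The end result should match the reindex-after-unstack approach; the difference is only that the zero rows are also available in long format before pivoting.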