2016-04-07 51 views

Answers


I think you can use Grouper with the base parameter:

print(df)
                 date  name
0 2015-06-13 00:21:25     1
1 2015-06-14 01:00:25     2
2 2015-06-14 02:54:48     3
3 2015-06-15 14:38:15     2
4 2015-06-15 15:29:28     1

# base=8 shifts the start of each 24-hour bin to 08:00
# (base was deprecated in pandas 1.1 and removed in 2.0)
print(df.groupby(pd.Grouper(key='date', freq='24h', base=8)).sum())
                     name
date
2015-06-12 08:00:00   1.0
2015-06-13 08:00:00   5.0
2015-06-14 08:00:00   NaN
2015-06-15 08:00:00   3.0
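
On pandas versions where `base` is not available (it was removed in 2.0), the same 08:00-to-08:00 binning can probably be written with the `offset` parameter of `pd.Grouper`. The following is a minimal sketch that rebuilds the sample frame from the printed output above; it assumes pandas 1.1 or later.

```python
import pandas as pd

# Sample data rebuilt from the printed DataFrame in the answer
df = pd.DataFrame({
    'date': pd.to_datetime([
        '2015-06-13 00:21:25',
        '2015-06-14 01:00:25',
        '2015-06-14 02:54:48',
        '2015-06-15 14:38:15',
        '2015-06-15 15:29:28',
    ]),
    'name': [1, 2, 3, 2, 1],
})

# offset='8h' moves the edge of each 24-hour bin from midnight to 08:00
result = df.groupby(pd.Grouper(key='date', freq='24h', offset='8h')).sum()
print(result)
```

Recent pandas versions report 0 rather than NaN as the sum of the empty 2015-06-14 08:00 bin; passing `min_count=1` to `sum()` should bring back the NaN if that is preferred.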

As an alternative to @jezrael's method, you can use a custom grouper function:

import pandas as pd

start_ts = '2016-01-01 07:59:59'
df = pd.DataFrame({'Date': pd.date_range(start_ts, freq='10min', periods=1000)})

def my_grouper(df, idx):
    # .loc replaces .ix, which was removed in pandas 1.0
    ts = df.loc[idx, 'Date']
    # Timestamps before 08:00 are counted with the previous day
    return ts.date() if ts.hour >= 8 else ts.date() - pd.Timedelta('1day')

df.groupby(lambda x: my_grouper(df, x)).size()

Test:

In [468]: df.head() 
Out[468]: 
       Date 
0 2016-01-01 07:59:59 
1 2016-01-01 08:09:59 
2 2016-01-01 08:19:59 
3 2016-01-01 08:29:59 
4 2016-01-01 08:39:59 

In [469]: df.tail() 
Out[469]: 
        Date 
995 2016-01-08 05:49:59 
996 2016-01-08 05:59:59 
997 2016-01-08 06:09:59 
998 2016-01-08 06:19:59 
999 2016-01-08 06:29:59 

In [470]: df.groupby(lambda x: my_grouper(df, x)).size() 
Out[470]: 
2015-12-31  1 
2016-01-01 144 
2016-01-02 144 
2016-01-03 144 
2016-01-04 144 
2016-01-05 144 
2016-01-06 144 
2016-01-07 135 
dtype: int64
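
The custom grouper above is called once per row, which can get slow on large frames. A vectorized sketch of the same idea, shown here as an assumption rather than as part of the original answer, is to shift every timestamp back by 8 hours and group on the resulting calendar date. The names `shifted` and `counts` are introduced only for illustration.

```python
import pandas as pd

start_ts = '2016-01-01 07:59:59'
df = pd.DataFrame({'Date': pd.date_range(start_ts, freq='10min', periods=1000)})

# Shifting back by 8 hours makes 08:00 behave like midnight,
# so anything before 08:00 lands on the previous calendar date
shifted = df['Date'] - pd.Timedelta(hours=8)

counts = df.groupby(shifted.dt.date).size()
print(counts)
```

For this sample data the group sizes should match the output of `my_grouper` shown above.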