2015-12-14 81 views

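Pandas pivot table with mean of time differences

I have been analysing time-series data with pandas and am stuck on getting it into a pivot table. My data is a CSV: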

gov start end 
a 2015-12-08T16:05:00.980+03 2015-12-08T16:14:31.765+03 
a 2015-12-08T16:07:53.356+03 2015-12-08T16:34:43.413+03 
b 2015-12-08T16:08:43.371+03 2015-12-08T16:54:32.257+03 
b 2015-12-08T15:56:12.006+03 2015-12-08T17:35:04.499+03 

It is a simple dataset with start and end times, and I work out the time difference between the two:

import numpy as np
import pandas as pd

piv_t_subset = pd.read_csv('time_test.csv', parse_dates=['start','end']) 

piv_t_subset['time_diff'] = piv_t_subset['end'] - piv_t_subset['start'] 

I can compute the overall mean of the time differences on its own:

t = piv_t_subset['time_diff'].mean() 
print(t)

0 days 00:18:53.703286 

I would like to build a pivot table from this time information, but when I try:

pd.pivot_table(piv_t_subset,index=["gov"],values=['time_diff'],aggfunc=[np.mean]) 

I get the error:

DataError: No numeric types to aggregate

Do I need to do more preprocessing to convert the time differences to floats?

Answer


This is not supported at the moment (link).

But you can convert the timedelta64 Series to a float Series with total_seconds:

piv_t_subset['time_diff1'] = [td.total_seconds() for td in piv_t_subset['time_diff']] 

print(piv_t_subset)
  gov                   start                     end
0   a 2015-12-08 13:05:00.980 2015-12-08 13:14:31.765
1   a 2015-12-08 13:07:53.356 2015-12-08 13:34:43.413
2   b 2015-12-08 13:08:43.371 2015-12-08 13:54:32.257
3   b 2015-12-08 12:56:12.006 2015-12-08 14:35:04.499

piv_t_subset['time_diff'] = piv_t_subset['end'] - piv_t_subset['start'] 

piv_t_subset['time_diff1'] = [td.total_seconds() for td in piv_t_subset['time_diff']] 
print(piv_t_subset)
  gov                   start                     end       time_diff  \
0   a 2015-12-08 13:05:00.980 2015-12-08 13:14:31.765 00:09:30.785000
1   a 2015-12-08 13:07:53.356 2015-12-08 13:34:43.413 00:26:50.057000
2   b 2015-12-08 13:08:43.371 2015-12-08 13:54:32.257 00:45:48.886000
3   b 2015-12-08 12:56:12.006 2015-12-08 14:35:04.499 01:38:52.493000

   time_diff1
0     570.785
1    1610.057
2    2748.886
3    5932.493

print(piv_t_subset.groupby('gov').agg({'time_diff1':np.mean}))
     time_diff1
gov
a     1090.4210
b     4340.6895

# aggfunc can be omitted: pivot_table defaults to the mean (numpy.mean)
print(pd.pivot_table(piv_t_subset,index=["gov"],values=['time_diff1']))
     time_diff1
gov
a     1090.4210
b     4340.6895
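
As a self-contained variation on the steps above, here is a minimal sketch that uses the vectorized `.dt.total_seconds()` accessor instead of the list comprehension, and then turns the per-group mean back into a timedelta for display. Several things here are assumptions made for the illustration and are not from the original post: the frame is built inline rather than read from time_test.csv, the `+03` offset is dropped (both columns share it, so the differences are unchanged), and the names `df`, `time_diff_s`, `mean_seconds` and `mean_time_diff` are made up.

```python
import pandas as pd

# Sample rows from the question, built inline (assumption: no CSV file,
# and the shared +03 offset is omitted because it cancels in the difference).
df = pd.DataFrame({
    "gov": ["a", "a", "b", "b"],
    "start": pd.to_datetime([
        "2015-12-08 16:05:00.980",
        "2015-12-08 16:07:53.356",
        "2015-12-08 16:08:43.371",
        "2015-12-08 15:56:12.006",
    ]),
    "end": pd.to_datetime([
        "2015-12-08 16:14:31.765",
        "2015-12-08 16:34:43.413",
        "2015-12-08 16:54:32.257",
        "2015-12-08 17:35:04.499",
    ]),
})

# timedelta64 column, as in the question.
df["time_diff"] = df["end"] - df["start"]

# Vectorized conversion to float seconds (hypothetical column name).
df["time_diff_s"] = df["time_diff"].dt.total_seconds()

# Pivot on the numeric column; the mean is taken per 'gov' group.
mean_seconds = pd.pivot_table(df, index="gov", values="time_diff_s", aggfunc="mean")

# Convert the mean seconds back to timedeltas for easier reading.
mean_seconds["mean_time_diff"] = pd.to_timedelta(mean_seconds["time_diff_s"], unit="s")

print(mean_seconds)
```

The float column is what gets aggregated; the timedelta column is added afterwards purely for presentation, so the pivot itself never has to handle a non-numeric dtype.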

Perfect! A shame it isn't supported directly, but this works nicely! – DGraham