2016-03-10 46 views
1

Accessing a dictionary with a pandas date range

I have a bunch of dates:

0 2016-01-01 
1 2016-01-02 
2 2016-01-03 
3 2016-01-04 
4 2016-01-05 
5 2016-01-06 
6 2016-01-07 
7 2016-01-08 
8 2016-01-09 
9 2016-01-10 
10 2016-01-11 
11 2016-01-12 
12 2016-01-13 
13 2016-01-14 
14 2016-01-15 
15 2016-01-16 
16 2016-01-17 
17 2016-01-20 
18 2016-01-21 
19 2016-01-22 
20 2016-01-24 
21 2016-01-25 
22 2016-01-27 
23 2016-01-28 
24 2016-01-29 
25 2016-01-30 
26 2016-01-31 

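These dates are in a DataFrame. I want to group it by date using r = df.groupby('time') and then iterate over the keys to get some statistics. The thing is, the days are not complete (you will see I am missing January 18th and 19th). So what I want to do is create a date range and then iterate over that range. But when I try this, I get a KeyError when I pass elements of the date range to the dictionary.

Any ideas on how I can do this?

Here is some code: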

import pandas as pd

doi = (df.time <= '2016-01-31') & (df.time >= '2016-01-01')
oil = df[doi]


# Trouble here.
r = oil.groupby(by='time')
D = oil.time
dates = pd.date_range(D.min(), D.max())
frames = []

for d in dates:
    # The idea here is that if the date in the date range is not in the dataframe,
    # then there is no sum to compute: return 0.
    try:
        # .loc replaces the removed .ix indexer; r.groups[d] holds index labels
        sum_of_oil = oil.loc[r.groups[d]].capacity.sum()
    except KeyError:
        sum_of_oil = 0
    frames.append([d, sum_of_oil])

frames = pd.DataFrame(frames, columns=['time', 'volume'])

It may be worth pointing out that the elements of oil.time are Timestamps.

+0

Please show some code. – roadrunner66

+0

@roadrunner66 See the new edit. –

+0

So you just need an aggregate for one month? If so, the missing dates in such a range would be treated as 0. – Parfait

Answers

2

You can resample even when the time series is incomplete.

  date qty 
0 2015-01-01 123 
1 2015-01-02 213 
2 2015-01-03 41234 
3 2015-01-04 12342 
4 2015-01-05  32 
5 2015-01-06  3 
6 2015-01-07  24 
7 2015-01-08 23423 
8 2015-01-09  4 
9 2015-01-10 234 
10 2015-01-12 234 
11 2015-01-13 324 
12 2015-01-17 123 
13 2015-01-18  5 
14 2015-01-19 3454 
15 2015-01-20 574 
16 2015-01-21  51 
17 2015-01-22  56 

Try

# A resample needs an explicit aggregation in current pandas; empty days sum to 0
print(df.set_index('date').resample('D').sum().reset_index())

which yields:

  date qty 
0 2015-01-01 123 
1 2015-01-02 213 
2 2015-01-03 41234 
3 2015-01-04 12342 
4 2015-01-05  32 
5 2015-01-06  3 
6 2015-01-07  24 
7 2015-01-08 23423 
8 2015-01-09  4 
9 2015-01-10 234 
10 2015-01-11  0 
11 2015-01-12 234 
12 2015-01-13 324 
13 2015-01-14  0 
14 2015-01-15  0 
15 2015-01-16  0 
16 2015-01-17 123 
17 2015-01-18  5 
18 2015-01-19 3454 
19 2015-01-20 574 
20 2015-01-21  51 
21 2015-01-22  56 
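
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the resampling idea. The three-row frame is made-up sample data, not the asker's, and it assumes the date column is already a datetime type. It uses sum() as the aggregation, which suits the per-day totals asked about; days with no rows come out as 0.

```python
import pandas as pd

# Hypothetical sample data: 2015-01-03 and 2015-01-04 are missing.
df = pd.DataFrame({
    'date': pd.to_datetime(['2015-01-01', '2015-01-02', '2015-01-05']),
    'qty': [123, 213, 32],
})

# Resample to calendar-day frequency; days without any rows sum to 0.
daily = df.set_index('date').resample('D').sum().reset_index()
print(daily)
```

If a day can have several rows, sum() adds them up, which matches the per-day total the question is after.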
+0

Gorgeous, thank you. –

1

Consider a merge against a complete set of the month's days:

import datetime
import pandas as pd

startdate = datetime.datetime.strptime('2015-01-01', '%Y-%m-%d')
# One entry per day in January
jandates = [startdate + datetime.timedelta(days=i) for i in range(31)]

datesdf = pd.DataFrame({'date': jandates})
# actualdf is the original, incomplete DataFrame; the left merge keeps every date
mergedf = pd.merge(datesdf, actualdf, on='date', how='left').fillna(0)
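
The same merge idea can be sketched as a runnable example. Here actualdf is a made-up stand-in for the incomplete frame, and pd.date_range is swapped in for the datetime list purely as a shorter way to build the calendar; neither detail comes from the answer itself.

```python
import pandas as pd

# Hypothetical stand-in for actualdf: two days are missing.
actualdf = pd.DataFrame({
    'date': pd.to_datetime(['2015-01-01', '2015-01-02', '2015-01-05']),
    'qty': [123, 213, 32],
})

# Every calendar day in the span of interest.
datesdf = pd.DataFrame({'date': pd.date_range('2015-01-01', '2015-01-05')})

# The left merge keeps all calendar days; unmatched days get NaN, then 0.
mergedf = pd.merge(datesdf, actualdf, on='date', how='left').fillna(0)
print(mergedf)
```

Note that qty ends up as a float column, because the merge introduces NaN before fillna(0) runs; add .astype(int) if integers are needed.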