2015-09-15 153 views

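Pandas: add a different number of days to each date in a DataFrame

I have a DataFrame with a date column, and I want to add a number of days (taken from another column) to that date column. I want to store the resulting values in a new column, 'Recency_Date'.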

DF:

    fan         Community Name  Count  Mean_Days    Date_Min
0   855              AAA Games      6        353  2013-04-16
1   855  First Person Shooters      2        420  2012-10-16
2   855            Playstation      3        108  2014-06-12
3  3148              AAA Games      1          0  2015-04-17
4  3148          Mobile Gaming      1          0  2013-01-19

DF info:

merged.info() 
<class 'pandas.core.frame.DataFrame'> 
Int64Index: 4627415 entries, 0 to 4627414 
Data columns (total 5 columns): 
fan    int64 
Community Name object 
Count    int64 
Mean_Days   int32 
Date_Min   datetime64[ns] 
dtypes: datetime64[ns](1), int32(1), int64(2), object(1) 
memory usage: 194.2+ MB 

Sample data as CSV:

fan,Community Name,Count,Mean_Days,Date_Min 
855,AAA Games,6,353,2013-04-16 00:00:00 
855,First Person Shooters,2,420,2012-10-16 00:00:00 
855,Playstation,3,108,2014-06-12 00:00:00 
3148,AAA Games,1,0,2015-04-17 00:00:00 
3148,Mobile Gaming,1,0,2013-01-19 00:00:00 
3148,Power PCs,2,0,2014-06-17 00:00:00 
3148,XBOX,1,0,2009-11-12 00:00:00 
3860,AAA Games,1,0,2012-11-28 00:00:00 
3860,Minecraft,3,393,2011-09-07 00:00:00 
4044,AAA Games,5,338,2010-11-15 00:00:00 
4044,Blizzard Games,1,0,2013-07-12 00:00:00 
4044,Geek Culture,1,0,2011-06-03 00:00:00 
4044,Indie Games,2,112,2013-01-09 00:00:00 
4044,Minecraft,1,0,2014-01-02 00:00:00 
4044,Professional Gaming,1,0,2014-01-02 00:00:00 
4044,XBOX,2,785,2010-11-15 00:00:00 
4827,AAA Games,1,0,2010-08-24 00:00:00 
4827,Gaming Humour,1,0,2012-05-05 00:00:00 
4827,Minecraft,2,10,2012-03-21 00:00:00 
5260,AAA Games,4,27,2013-09-17 00:00:00 
5260,Indie Games,8,844,2011-06-08 00:00:00 
5260,MOBA,2,0,2012-10-27 00:00:00 
5260,Minecraft,5,106,2012-02-17 00:00:00 
5260,XBOX,1,0,2011-06-15 00:00:00 
5484,AAA Games,21,1296,2009-08-01 00:00:00 
5484,Free to Play,1,0,2014-12-08 00:00:00 
5484,Indie Games,1,0,2014-05-28 00:00:00 
5484,Music Games,1,0,2012-09-12 00:00:00 
5484,Playstation,1,0,2012-02-22 00:00:00 

I have tried:

merged['Recency_Date'] = merged['Date_Min'] + timedelta(days=merged['Mean_Days']) 

and:

merged['Recency_Date'] = pd.DatetimeIndex(merged['Date_Min']) + pd.DateOffset(merged['Mean_Days']) 

But I'm having trouble finding something that works against a Series rather than a single int value. Any and all help would be greatly appreciated.


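You need to post more information: ideally the raw input data, your code, and the output from `df.info()`. If the 'Date_Min' column is already a datetime dtype, then `pd.TimedeltaIndex(merged['Mean_Days'], unit='D')` will construct a timedelta index that you can use to offset the 'Date_Min' column. – EdChum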

Answer


If the dtype of 'Date_Min' is already datetime, then you can construct a TimedeltaIndex from your 'Mean_Days' column and add the two together:

In [174]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'Date_Min':[dt.datetime.now(), dt.datetime(2015,3,4), dt.datetime(2011,6,9)], 'Mean_Days':[1,2,3]})
df

Out[174]: 
        Date_Min Mean_Days 
0 2015-09-15 14:02:37.452369   1 
1 2015-03-04 00:00:00.000000   2 
2 2011-06-09 00:00:00.000000   3 

In [175]: 
df['Date_Min'] + pd.TimedeltaIndex(df['Mean_Days'], unit='D') 

Out[175]: 
0 2015-09-16 14:02:37.452369 
1 2015-03-06 00:00:00.000000 
2 2011-06-12 00:00:00.000000 
Name: Date_Min, dtype: datetime64[ns] 
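
For reference, here is a minimal sketch of the same idea applied to the first few rows of the sample CSV from the question. It swaps in `pd.to_timedelta` in place of `pd.TimedeltaIndex`, since newer pandas releases may warn about the `unit` argument to `pd.TimedeltaIndex`; the variable name `csv_text` is just for illustration, and `merged` here is a small stand-in for the real 4.6-million-row frame.

```python
import io

import pandas as pd

# First four rows of the sample CSV posted in the question.
csv_text = """fan,Community Name,Count,Mean_Days,Date_Min
855,AAA Games,6,353,2013-04-16 00:00:00
855,First Person Shooters,2,420,2012-10-16 00:00:00
855,Playstation,3,108,2014-06-12 00:00:00
3148,AAA Games,1,0,2015-04-17 00:00:00
"""

# parse_dates makes Date_Min a datetime64[ns] column, as in the question.
merged = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date_Min"])

# Convert the integer day counts to timedeltas element-wise, then add
# them to the dates. Both operations are vectorised over the column.
merged["Recency_Date"] = merged["Date_Min"] + pd.to_timedelta(
    merged["Mean_Days"], unit="D"
)

print(merged[["Date_Min", "Mean_Days", "Recency_Date"]])
```

Because the addition is done column-wise rather than row by row, it should avoid a Python-level loop even on a frame as large as the one in the question.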

Thanks, this worked perfectly and is exactly what I was looking for. –