2017-08-24 130 views
4

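Python pandas time series: interpolating datetime data

From this question I know how to interpolate a time series at given timestamps. I would like to know how to do the reverse and interpolate the timestamps at given values, for example to estimate the NaT entries in the sample below.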

interval   datetime    
0.782296 2012-11-19 12:40:10 
0.795469     NaT 
0.821426 2012-11-19 12:35:10 
0.834957     NaT 
0.864383 2012-11-19 12:30:10 
0.906240 2012-11-19 12:25:10 

P.S. I tried using df['datetime'].interpolate() directly, but it failed.

Answers

1

This seems to work. The code could probably be cleaned up a bit, but you get the gist.

from datetime import datetime 
import pandas as pd 
import time 

#Create data 
df = pd.DataFrame({ 'interval' : [0.782296, 0.795469, 0.821426, 0.834957, 
            0.864383, 0.906240], 
        'datetime' : [datetime(2012, 11, 19, 12, 40, 10), pd.NaT, 
            datetime(2012, 11, 19, 12, 35, 10), pd.NaT, 
            datetime(2012, 11, 19, 12, 30, 10), 
            datetime(2012, 11, 19, 12, 25, 10) 
            ]}) 


#Cast date to seconds since the epoch (also recast NaT to NaN so it can be interpolated)
#Note: time.mktime interprets the timestamp as local time
df['seconds'] = [time.mktime(t.timetuple()) if pd.notnull(t) else float('nan') for t in df['datetime']]

#Set the interval as the index, as interpolation uses the index
df.set_index('interval', inplace=True)
#Use method='values' to interpolate on the actual index values and not on the row spacing
df['interpolated'] = df['seconds'].interpolate(method='values')
#Cast the interpolated seconds back to datetime. fromtimestamp is the local-time
#inverse of mktime (utcfromtimestamp would shift the result by the UTC offset)
df['datetime2'] = [datetime.fromtimestamp(t) for t in df['interpolated']]

#Clean up
df.reset_index(inplace=True)
df = df[['interval', 'datetime2']]

>>> df
   interval                  datetime2
0  0.782296 2012-11-19 12:40:10.000000
1  0.795469 2012-11-19 12:38:29.005878
2  0.821426 2012-11-19 12:35:10.000000
3  0.834957 2012-11-19 12:33:35.503178
4  0.864383 2012-11-19 12:30:10.000000
5  0.906240 2012-11-19 12:25:10.000000

Hope this is what you were looking for.
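
For what it's worth, the same idea (datetime to seconds, interpolate over the interval values, seconds back to datetime) can also be sketched without the time module, which sidesteps the local-time/UTC round trip. This is only a rough variant, not the code above: the helper name interpolate_datetimes and the column datetime_filled are made up for illustration, and it assumes plain linear interpolation via numpy.interp is acceptable.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "interval": [0.782296, 0.795469, 0.821426, 0.834957, 0.864383, 0.906240],
    "datetime": pd.to_datetime([
        "2012-11-19 12:40:10", None, "2012-11-19 12:35:10", None,
        "2012-11-19 12:30:10", "2012-11-19 12:25:10",
    ]),
})

def interpolate_datetimes(frame, x_col="interval", t_col="datetime"):
    """Hypothetical helper: fill NaT in t_col by linear interpolation over x_col."""
    epoch = pd.Timestamp("1970-01-01")
    # Timezone-naive seconds since the epoch; NaT becomes NaN
    seconds = (frame[t_col] - epoch).dt.total_seconds().to_numpy()
    x = frame[x_col].to_numpy(dtype=float)
    known = ~np.isnan(seconds)
    # np.interp needs increasing x-coordinates, so sort the known points
    order = np.argsort(x[known])
    # Note: np.interp clamps to the end values outside the known range
    filled = np.interp(x, x[known][order], seconds[known][order])
    # Seconds back to datetime, keeping the original row labels
    stamps = epoch + pd.to_timedelta(filled, unit="s")
    return pd.Series(stamps.to_numpy(), index=frame.index, name=t_col)

df["datetime_filled"] = interpolate_datetimes(df)
print(df[["interval", "datetime_filled"]])
```

The known timestamps come back unchanged and the two NaT rows are filled with values between their neighbours, as in the answer's output.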

+0

Thanks for your answer. I was thinking about converting the datetime to float. – natsuapo

+0

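No problem. I edited the answer because it wasn't really correct the first time: I had left out the 'values' argument in the interpolate call. – mortysporty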
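
To illustrate the point about the 'values' argument, here is a small toy example (made-up numbers, not the data from the question). With the default linear method pandas treats the rows as equally spaced, while method='values' uses the numeric index as the x-coordinate:

```python
import pandas as pd

# Toy series: the gap sits at index 1.0, much closer to 0.0 than to 10.0
s = pd.Series([0.0, float("nan"), 10.0], index=[0.0, 1.0, 10.0])

# Default 'linear' ignores the index and assumes equal spacing between rows
by_position = s.interpolate()

# 'values' interpolates against the actual index values
by_index = s.interpolate(method="values")

print(by_position.iloc[1])  # 5.0
print(by_index.iloc[1])     # 1.0
```

In the question the interval values are unevenly spaced, so leaving out the argument would place each missing timestamp exactly halfway between its neighbours.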