2013-12-08 65 views
3

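Python Numpy or Pandas linear interpolation for datetime-related values

I have data that looks like the following, and I also control how it is formatted. Basically, I want to use Python with Numpy or Pandas to interpolate the dataset down to one-second resolution, so that I end up with much higher-resolution data.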

So I want to interpolate linearly, producing new values between each pair of real values while keeping the original values.

How can I do this with Pandas or Numpy?

As an example, I have this type of data:

 TIME    ECI_X   ECI_Y   ECI_Z 
2013-12-07 00:00:00, -7346664.77912, -13323447.6311, 21734849.5263,@ 
2013-12-07 00:01:00, -7245621.40363, -13377562.35, 21735850.3527,@ 
2013-12-07 00:01:30, -7142326.20854, -13432541.9267, 21736462.4521,@ 
2013-12-07 00:02:00, -7038893.48454, -13487262.8599, 21736650.3293,@ 
2013-12-07 00:02:30, -6935325.24526, -13541724.0946, 21736413.9937,@ 
2013-12-07 00:03:00, -6833738.23865, -13594806.9333, 21735778.2218,@ 
2013-12-07 00:03:30, -6729905.37597, -13648746.6281, 21734705.6406,@ 
2013-12-07 00:04:00, -6625943.01291, -13702423.5112, 21733208.9233,@ 
2013-12-07 00:04:30, -6521853.17291, -13755836.5481, 21731288.1125,@ 
2013-12-07 00:05:00, -6419753.85176, -13807871.3011, 21729016.1386,@ 
2013-12-07 00:05:30, -6315415.32918, -13860754.6497, 21726259.4135,@ 
2013-12-07 00:06:00, -6210955.33186, -13913371.1187, 21723078.7695,@ 
... 

And I would like it to be second by second, i.e.

2013-12-07 00:00:00, -7346664.77912, -13323447.6311, 21734849.5263,@ 
2013-12-07 00:00:01, -7346665.10000, -13323448.1000, 21734850.1000,@ 
... 
2013-12-07 00:00:59, -7346611.10000, -13323461.1000, 21734850.1000,@ 
2013-12-07 00:01:00, -7245621.40363, -13377562.3500, 21735850.3527,@ 

Please show me how this can be done with my example. Thank you!

I have tried this:

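#!/usr/bin/python

from datetime import datetime

import numpy as np
import pandas as pd

first = datetime(2013, 12, 8, 0, 0, 0)
second = datetime(2013, 12, 8, 0, 2, 0)
dates = [first, second]
x = np.array([617003.390723, 884235.38059])
# One timestamp per second between the two known samples (not used yet)
newRange = pd.date_range(first, second, freq='s')
ts = pd.Series(x, index=dates)
# Nothing to fill here: the index only holds the two known timestamps
ts.interpolate()
print(ts.head())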

#2013-12-08 00:00:00, 617003.390723, -26471116.2566, 3974868.93334,@ 
#2013-12-08 00:02:00, 884235.38059, -26519366.9219, 3601627.52947,@ 

How can I use "newRange" to build a linear interpolation between the real values of "x"?

Take a look at [this method](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.interpolate.html?highlight=interpolate#pandas.interies.html), which will get a big upgrade when pandas 0.13 is released any day now... –

http://pandas.pydata.org/pandas-docs/dev/generated/pandas.Series.interpolate.html?highlight=interpolate#pandas.Series.interpolate for 0.13 – Jeff

Answers

5

Using pandas master from Git (98e48ca), you can do the following:

In [27]: n = 4 

In [28]: df = DataFrame(randn(n, 2), index=date_range('1/1/2001', periods=n, freq='30S')) 

In [29]: resampled = df.resample('S') 

In [30]: resampled.head() 
Out[30]: 
         0  1 
2001-01-01 00:00:00 -1.045 -1.067 
2001-01-01 00:00:01 NaN NaN 
2001-01-01 00:00:02 NaN NaN 
2001-01-01 00:00:03 NaN NaN 
2001-01-01 00:00:04 NaN NaN 

[5 rows x 2 columns] 

In [31]: interp = resampled.interpolate() 

In [32]: interp.head() 
Out[32]: 
         0  1 
2001-01-01 00:00:00 -1.045 -1.067 
2001-01-01 00:00:01 -1.014 -1.042 
2001-01-01 00:00:02 -0.983 -1.018 
2001-01-01 00:00:03 -0.952 -0.993 
2001-01-01 00:00:04 -0.921 -0.969 

[5 rows x 2 columns] 

In [33]: interp.tail() 
Out[33]: 
         0  1 
2001-01-01 00:01:26 0.393 0.622 
2001-01-01 00:01:27 0.337 0.571 
2001-01-01 00:01:28 0.281 0.519 
2001-01-01 00:01:29 0.225 0.468 
2001-01-01 00:01:30 0.169 0.416 

[5 rows x 2 columns] 

By default, Series.interpolate() performs linear interpolation. You can also use DataFrame.resample() with irregularly sampled data.
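For readers on a later pandas release, here is a minimal sketch of the same idea applied to the first three rows of the question's sample data. It assumes a pandas version in which resample() returns a deferred Resampler object, so an explicit .asfreq() is needed before interpolating; the session above, run against the 2013 master branch, got a DataFrame back directly. The column name END (for the trailing "@" field) and the variable names raw and per_second are made up for illustration.

```python
import io

import pandas as pd

# First three rows of the question's sample data, including the trailing ",@"
raw = """\
2013-12-07 00:00:00, -7346664.77912, -13323447.6311, 21734849.5263,@
2013-12-07 00:01:00, -7245621.40363, -13377562.35, 21735850.3527,@
2013-12-07 00:01:30, -7142326.20854, -13432541.9267, 21736462.4521,@
"""

df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    # "END" is a hypothetical name for the trailing "@" marker column
    names=["TIME", "ECI_X", "ECI_Y", "ECI_Z", "END"],
    usecols=["TIME", "ECI_X", "ECI_Y", "ECI_Z"],
    parse_dates=["TIME"],
    index_col="TIME",
    skipinitialspace=True,  # drop the space that follows each comma
)

# Upsample to a one-second grid (new rows are NaN), then fill linearly.
# The original rows fall on whole seconds, so they are kept unchanged.
per_second = df.resample("s").asfreq().interpolate()

print(per_second.head())
print(len(per_second))
```

Because the upsampled grid is evenly spaced at one second, the default linear method is also linear in time here. If the original timestamps did not fall on whole seconds, they would be dropped by the resampling step, so this sketch only suits data like the sample above.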

0

OK, this is what I did:

from datetime import datetime

import numpy as np
import pandas as pd

first = datetime(2013, 12, 8, 0, 0, 0)
second = datetime(2013, 12, 8, 0, 2, 0)
x = np.array([617003.390723, 884235.38059])
newRange = pd.date_range(first, second, freq='s')
# Known values at both ends, NaN placeholders for every second in between
z = np.full(len(newRange), np.nan)
z[0] = x[0]
z[-1] = x[1]
print(len(z))
print(len(newRange))
ts = pd.Series(z, index=newRange)
ts = ts.interpolate()  # linear fill of the NaN gaps
print(ts.head())
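
The manual NaN padding above can probably be left to pandas. As a rough sketch, assuming a reasonably recent pandas and NumPy, Series.reindex() places the two known values on the one-second index and fills every other timestamp with NaN, after which interpolate() does the linear fill. Since the question also asks about NumPy, the same values are cross-checked with np.interp on elapsed seconds. The names new_range, ts_interp, known_sec, new_sec and x_interp are illustrative only.

```python
from datetime import datetime

import numpy as np
import pandas as pd

first = datetime(2013, 12, 8, 0, 0, 0)
second = datetime(2013, 12, 8, 0, 2, 0)
x = np.array([617003.390723, 884235.38059])

ts = pd.Series(x, index=[first, second])
new_range = pd.date_range(first, second, freq="s")

# reindex() keeps the known values and inserts NaN at every new timestamp;
# interpolate() then fills the NaN gaps linearly.
ts_interp = ts.reindex(new_range).interpolate()

# NumPy-only route: interpolate against elapsed seconds since the first sample.
known_sec = (ts.index - ts.index[0]).total_seconds()
new_sec = (new_range - new_range[0]).total_seconds()
x_interp = np.interp(new_sec, known_sec, x)

print(ts_interp.head())
print(np.allclose(ts_interp.to_numpy(), x_interp))
```

The np.interp route works on the actual time differences, so it should also cope with unevenly spaced samples, whereas the default interpolate() call treats rows as equally spaced and is only linear in time because the new index is a uniform one-second grid.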