2013-08-23 65 views
2

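Speeding up a moving time delta over irregular time intervals in NumPy arrays

I want to compute a 10-second difference over a dataset whose time increments are irregular. The data lives in two 1-D arrays of equal length: one holds the times and the other the data values.

After some digging I was able to come up with a solution, but it is too slow because (I suspect) it has to iterate over every item in the array.

My general approach is to loop over the time array and, for each time value, find the index of the time value that is x seconds earlier. I then use those indices on the data array to compute the difference.

The code is shown below.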

First, the find_closest function, from Bi Rico:

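import numpy as np

def find_closest(A, target):
    # A must be sorted in ascending order
    idx = A.searchsorted(target)           # first index with A[idx] >= target
    idx = np.clip(idx, 1, len(A) - 1)      # keep both idx-1 and idx inside the array
    left = A[idx - 1]
    right = A[idx]
    idx -= target - left < right - target  # step back if the left neighbour is closer
    return idx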

I then use find_closest in the following way:

def trailing_diff(time_array, data_array, seconds):
    trailing_list = []
    for i in range(len(time_array)):  # xrange on Python 2
        now = time_array[i]
        if now < seconds:
            # not enough history yet, so report no difference
            trailing_list.append(0)
        else:
            then = find_closest(time_array, now - seconds)
            trailing_list.append(data_array[i] - data_array[then])
    return np.asarray(trailing_list)

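Unfortunately this doesn't scale particularly well, and I would like to be able to compute (and plot) it on the fly.

Any ideas on how I could make it more efficient?

Edit: input/output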

In [48]:time1 
Out[48]: 
array([ 0.57200003, 0.579  , 0.58800006, 0.59500003, 
     0.5999999 , 1.05999994, 1.55900002, 2.00900006, 
     2.57599998, 3.05599999, 3.52399993, 4.00699997, 
     4.09599996, 4.57299995, 5.04699993, 5.52099991, 
     6.09299994, 6.55999994, 7.04099989, 7.50900006, 
     8.07500005, 8.55799985, 9.023  , 9.50699997, 
     9.59399986, 10.07200003, 10.54200006, 11.01999998, 
     11.58899999, 12.05699992, 12.53799987, 13.00499988, 
     13.57599998, 14.05599999, 14.52399993, 15.00199985, 
     15.09299994, 15.57599998, 16.04399991, 16.52199984, 
     17.08899999, 17.55799985, 18.03699994, 18.50499988, 
     19.0769999 , 19.5539999 , 20.023  , 20.50099993, 
     20.59099984, 21.07399988]) 

In [49]:weight1 
Out[49]: 
array([ 82.268, 82.268, 82.269, 82.272, 82.275, 82.291, 82.289, 
     82.288, 82.287, 82.287, 82.293, 82.303, 82.303, 82.314, 
     82.321, 82.333, 82.356, 82.368, 82.386, 82.398, 82.411, 
     82.417, 82.419, 82.424, 82.424, 82.437, 82.45 , 82.472, 
     82.498, 82.515, 82.541, 82.559, 82.584, 82.607, 82.617, 
     82.626, 82.626, 82.629, 82.63 , 82.636, 82.651, 82.663, 
     82.686, 82.703, 82.728, 82.755, 82.773, 82.8 , 82.8 , 
     82.826]) 

In [50]:trailing_diff(time1,weight1,10) 
Out[50]: 
array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 
     0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 
     0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 
     0. , 0.169, 0.182, 0.181, 0.209, 0.227, 0.254, 0.272, 
     0.291, 0.304, 0.303, 0.305, 0.305, 0.296, 0.274, 0.268, 
     0.265, 0.265, 0.275, 0.286, 0.309, 0.331, 0.336, 0.35 , 
     0.35 , 0.354]) 

Can you show some (small) input and output? – Daniel

@Ophion. Embarrassing omission. Fixed. – Chris

How large are time_array and data_array? – tom10

Answers

1

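Use an off-the-shelf interpolation routine. If you really want nearest-neighbour behaviour, I think it will have to be SciPy's scipy.interpolate.interp1d, but linear interpolation seems like a better choice, and for that you can use NumPy's numpy.interp: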

import numpy as np

def trailing_diff(time, data, diff):
    ret = np.zeros_like(data) 
    mask = (time - time[0]) >= diff 
    ret[mask] = data[mask] - np.interp(time[mask] - diff, 
             time, data) 
    return ret 

time = np.arange(10) + np.random.rand(10)/2 
weight = 82 + np.random.rand(10) 

>>> time 
array([ 0.05920317, 1.23000929, 2.36399981, 3.14701595, 4.05128494, 
     5.22100886, 6.07415922, 7.36161563, 8.37067107, 9.11371986]) 
>>> weight 
array([ 82.14004969, 82.36214992, 82.25663272, 82.33764514, 
     82.52985723, 82.67820915, 82.43440796, 82.74038368, 
     82.84235675, 82.1333915 ]) 
>>> trailing_diff(time, weight, 3) 
array([ 0.  , 0.  , 0.  , 0.18093749, 0.20161107, 
     0.4082712 , 0.10430073, 0.17116831, 0.20691594, -0.31041841]) 
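
Because the example above uses random data, a small deterministic case may be easier to check by hand. The sketch below repeats the same function and feeds it a made-up dataset (the sample values are assumptions, not from the question) in which the data grows by exactly 10 per second. With linear interpolation, every sample that has at least 1.5 s of history should then report a difference of 15.

```python
import numpy as np

def trailing_diff(time, data, diff):
    # Same function as in the answer, repeated so this snippet runs on its own.
    ret = np.zeros_like(data)
    mask = (time - time[0]) >= diff          # samples with enough history
    ret[mask] = data[mask] - np.interp(time[mask] - diff, time, data)
    return ret

# Hypothetical sample: unevenly spaced times, data rising 10 units per second.
time = np.array([0.0, 1.0, 2.0, 4.0])
data = 10.0 * time

result = trailing_diff(time, data, 1.5)
print(result)  # first two entries are 0 (no history), last two are 15
```

Note that the irregular spacing (the jump from 2.0 to 4.0) does not affect the result here, since np.interp evaluates the piecewise-linear curve at exactly 1.5 s before each sample.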

To get the nearest neighbour instead, you would do:

from scipy.interpolate import interp1d 

def trailing_diff(time, data, diff): 
    ret = np.zeros_like(data) 
    mask = (time - time[0]) >= diff 
    interpolator = interp1d(time, data, kind='nearest') 
    ret[mask] = data[mask] - interpolator(time[mask] - diff) 
    return ret 
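
If SciPy is not available, a similar nearest-neighbour lookup can probably be done with NumPy alone, since searchsorted accepts a whole array of targets at once. The following is only a sketch under a few assumptions: the name trailing_diff_nearest is made up for illustration, time is sorted ascending with at least two samples, and ties between two equally distant neighbours go to the right-hand one (as in the question's find_closest), which may differ from how interp1d breaks ties.

```python
import numpy as np

def trailing_diff_nearest(time, data, diff):
    """Hypothetical helper: data[i] minus the sample nearest to time[i] - diff."""
    time = np.asarray(time, dtype=float)
    data = np.asarray(data, dtype=float)
    ret = np.zeros_like(data)
    mask = (time - time[0]) >= diff              # samples with enough history
    target = time[mask] - diff
    # First index whose time is >= target, clipped so idx-1 and idx are both valid.
    idx = np.clip(np.searchsorted(time, target), 1, len(time) - 1)
    left = time[idx - 1]
    right = time[idx]
    # Step back one position where the left neighbour is strictly closer.
    idx = idx - (target - left < right - target)
    ret[mask] = data[mask] - data[idx]
    return ret

# Made-up sample values, small enough to verify by hand.
t = np.array([0.0, 1.0, 2.1, 3.0, 4.2])
w = np.array([10.0, 11.0, 13.0, 16.0, 20.0])
out = trailing_diff_nearest(t, w, 2)
print(out)
```

For the sample above, the nearest earlier samples to the last three points are at indices 0, 1 and 2, so the non-zero differences are 13 - 10, 16 - 11 and 20 - 13.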

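Nailed it. The nearest-neighbour implementation works perfectly and took me from ~35 ms down to ~580 µs. It is also very readable and easy to understand. Thanks a lot. – Chris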