Starting approach
We can create sliding windows along the first axis and then use tensor multiplication with the wtd values for those sum-reductions. The implementation would look something like this -
# Get all wtd values in an array
wtds = np.exp(-(np.arange(length) - m)**2/dss)
# Get the sliding windows for input array along first axis
pnp_array3D = strided_axis0(pnp_array,len(wtds))
# Initialize o/p array
out = np.zeros(pnp_array.shape)
# Get sum-reductions for the windows which don't need wrapping over
out[length:] = np.tensordot(pnp_array3D,wtds,axes=((1),(0)))[:-1]
# Last element of the output needed wrapping. So, do it separately.
out[length-1] = wtds.dot(pnp_array[np.r_[-1,range(length-1)]])
# Finally perform the divisions
out /= wtds.sum()
The function strided_axis0 to get those sliding windows is taken from here.
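The linked helper isn't reproduced in this excerpt; a minimal sketch of such a strided_axis0, assuming it returns the non-wrapping length-L windows along axis=0 as a (rows-L+1, L, cols) view:

```python
import numpy as np

def strided_axis0(a, L):
    # View of all length-L sliding windows of `a` along axis 0,
    # shaped (a.shape[0] - L + 1, L, a.shape[1]); no data is copied.
    nrows = a.shape[0] - L + 1
    s0, s1 = a.strides
    return np.lib.stride_tricks.as_strided(
        a, shape=(nrows, L, a.shape[1]), strides=(s0, s0, s1))
```

Being a strided view into a, the result should be treated as read-only; it is only fed into np.tensordot here.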
Boost with 1D convolution
Those multiplications with the wtds values and their subsequent sum-reductions are basically convolution along that first axis. So, we can use scipy.ndimage.convolve1d along axis=0. This would be much faster, and memory-efficient too, as we won't be creating huge sliding windows.
The implementation would be -
from scipy.ndimage import convolve1d as conv
avgs = conv(pnp_array, weights=wtds/wtds.sum(),axis=0, mode='wrap')
Thus, out[length-1:], i.e. the nonzero rows, would be the same as avgs[:-length+1].
There could be some precision differences if we are working with really small kernel numbers from wtds. So, keep that in mind when using this convolution method.
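To see how convolve1d places its kernel and wraps at the array ends, a tiny toy example (the symmetric, already-normalized smoothing kernel here is just for illustration, not the wtds from the answer):

```python
import numpy as np
from scipy.ndimage import convolve1d

a = np.array([[1.], [2.], [3.], [4.]])
w = np.array([0.25, 0.5, 0.25])  # symmetric, sums to 1

# With an odd-length kernel, output row j mixes rows j-1, j, j+1;
# mode='wrap' makes out-of-range indices wrap around the array ends.
res = convolve1d(a, weights=w, axis=0, mode='wrap')
print(res.ravel())  # → [2. 2. 3. 3.]
```

Row 0, for instance, is 0.25*a[-1] + 0.5*a[0] + 0.25*a[1] = 1 + 0.5 + 0.5 = 2 because of the wrap.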
Runtime test
Approaches -
def original_app(pnp_array, length, m, dss):
    alma = np.zeros(pnp_array.shape)
    wtd_sum = np.zeros(pnp_array.shape)
    for l in range(len(pnp_array)):
        if l >= asize:  # asize (= length - 1) comes from the enclosing scope
            for i in range(length):
                im = i - m
                wtd = np.exp(-(im * im)/dss)
                alma[l] += pnp_array[l - length + i] * wtd
                wtd_sum[l] += wtd
            alma[l] = alma[l]/wtd_sum[l]
    return alma
def vectorized_app1(pnp_array, length, m, dss):
    wtds = np.exp(-(np.arange(length) - m)**2/dss)
    pnp_array3D = strided_axis0(pnp_array, len(wtds))
    out = np.zeros(pnp_array.shape)
    out[length:] = np.tensordot(pnp_array3D, wtds, axes=((1),(0)))[:-1]
    out[length-1] = wtds.dot(pnp_array[np.r_[-1, range(length-1)]])
    out /= wtds.sum()
    return out
def vectorized_app2(pnp_array, length, m, dss):
    wtds = np.exp(-(np.arange(length) - m)**2/dss)
    return conv(pnp_array, weights=wtds/wtds.sum(), axis=0, mode='wrap')
Timings -
In [470]: np.random.seed(0)
...: m,n = 1000,100
...: pnp_array = np.random.rand(m,n)
...:
...: length = 6
...: sigma = 0.3
...: offset = 0.5
...:
...: asize = length - 1
...: m = np.floor(offset * asize)
...: s = length/sigma
...: dss = 2 * s * s
...:
In [471]: %timeit original_app(pnp_array, length, m, dss)
...: %timeit vectorized_app1(pnp_array, length, m, dss)
...: %timeit vectorized_app2(pnp_array, length, m, dss)
...:
10 loops, best of 3: 36.1 ms per loop
1000 loops, best of 3: 1.84 ms per loop
1000 loops, best of 3: 684 µs per loop
In [472]: np.random.seed(0)
...: m,n = 10000,1000 # rest same as previous one
In [473]: %timeit original_app(pnp_array, length, m, dss)
...: %timeit vectorized_app1(pnp_array, length, m, dss)
...: %timeit vectorized_app2(pnp_array, length, m, dss)
...:
1 loop, best of 3: 503 ms per loop
1 loop, best of 3: 222 ms per loop
10 loops, best of 3: 106 ms per loop
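As a correctness sanity check on the windowed approach, one can compare the double loop against the tensordot version. The sketch below uses NumPy's own sliding_window_view (NumPy >= 1.20) in place of the linked strided_axis0 helper, and the names loopy_alma/windowed_alma are mine, not from the answer; note that sliding_window_view appends the window axis last:

```python
import numpy as np

def loopy_alma(pnp_array, length, m, dss):
    # Direct port of the original double loop; row index l - length + i
    # wraps to the end of the array (negative indexing) when l == length - 1.
    alma = np.zeros(pnp_array.shape)
    for l in range(length - 1, len(pnp_array)):
        wtd_sum = 0.0
        for i in range(length):
            wtd = np.exp(-((i - m) ** 2) / dss)
            alma[l] += pnp_array[l - length + i] * wtd
            wtd_sum += wtd
        alma[l] /= wtd_sum
    return alma

def windowed_alma(pnp_array, length, m, dss):
    wtds = np.exp(-(np.arange(length) - m) ** 2 / dss)
    # Window axis is appended last -> shape (rows - length + 1, cols, length)
    win = np.lib.stride_tricks.sliding_window_view(pnp_array, length, axis=0)
    out = np.zeros(pnp_array.shape)
    out[length:] = np.tensordot(win, wtds, axes=((2), (0)))[:-1]
    out[length - 1] = wtds.dot(pnp_array[np.r_[-1, range(length - 1)]])
    return out / wtds.sum()

np.random.seed(0)
a = np.random.rand(40, 5)
print(np.allclose(loopy_alma(a, 6, 2.0, 800.0),
                  windowed_alma(a, 6, 2.0, 800.0)))  # → True
```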
I checked your second approach with the 1D convolution. Something looks wrong with it, and I can't get the exact result. My example: pnp_array = np.array([3924.00752506, 5774.30335369, 5519.40734814, 4931.71344059]), offset = 0.85, sigma = 6, length = 3, m = 1.7, dss = 0.5. The expected result should be [0, 0, 5594.17030842, 5115.59420056], but the second app returns [0, 0, 5693.3358598, 5333.61073335]. So the cumulative error is -317.182084168. Is this because of the small kernel numbers you mentioned? – Prokhozhii
@Prokhozhii For that set, what are the values of wtds from wtds = np.exp(-(np.arange(length) - m)**2/dss)? – Divakar
They are correct in the second app: array([0.00308872, 0.3753111, 0.83527021]) – Prokhozhii