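One trick would be to take the last L-1 rows of the array and prepend them to the start of the array. Then it becomes a simple case of using the very efficient NumPy strides. For people wondering about the cost of this trick: as we will see later through the timing tests, it's as good as nothing.

The trick, leading up to the final goal of supporting both forward and backward striding in the code, would look something like this -

Backward striding: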
import numpy as np

def strided_axis0_backward(inArr, L = 2):
    # INPUTS :
    # inArr : Input 2D array
    # L : Length along rows to be cut to create per subarray
    #     (assumes L >= 2; for L = 1, inArr[-L+1:] would select every row)

    # Prepend the last L-1 rows to the start. It just helps in keeping a view output.
    a = np.vstack((inArr[-L+1:], inArr))

    # Store shape and strides info
    m,n = a.shape
    s0,s1 = a.strides

    # Length of 3D output array along its axis=0
    nd0 = m - L + 1

    # Start at the first original row and step backwards (-s0) inside each window
    strided = np.lib.stride_tricks.as_strided
    return strided(a[L-1:], shape=(nd0,L,n), strides=(s0,-s0,s1))

Forward striding:
def strided_axis0_forward(inArr, L = 2):
    # INPUTS :
    # inArr : Input 2D array
    # L : Length along rows to be cut to create per subarray

    # Append the first L-1 rows to the end. It just helps in keeping a view output.
    a = np.vstack((inArr , inArr[:L-1]))

    # Store shape and strides info
    m,n = a.shape
    s0,s1 = a.strides

    # Length of 3D output array along its axis=0
    nd0 = m - L + 1

    # Start at the first row and step forwards (+s0) inside each window
    strided = np.lib.stride_tricks.as_strided
    return strided(a, shape=(nd0,L,n), strides=(s0,s0,s1))

Sample run -

In [42]: inArr
Out[42]:
array([[1, 2],
       [3, 4],
       [5, 6]])

In [43]: strided_axis0_backward(inArr, 2)
Out[43]:
array([[[1, 2],
        [5, 6]],

       [[3, 4],
        [1, 2]],

       [[5, 6],
        [3, 4]]])

In [44]: strided_axis0_forward(inArr, 2)
Out[44]:
array([[[1, 2],
        [3, 4]],

       [[3, 4],
        [5, 6]],

       [[5, 6],
        [1, 2]]])
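
If you want to convince yourself that the forward windows really do wrap around, here is a rough cross-check against a plain loop. The names `wrapped_windows_loop` and `wrapped_windows_strided` are made up for this sketch (they are not from the question or the answer), and the loop version is only an assumed reference that uses modular row indexing.

```python
import numpy as np

def wrapped_windows_loop(arr, L=2):
    # Hypothetical reference helper: builds each forward window with
    # modular row indexing, which copies the data.
    m = arr.shape[0]
    return np.stack([arr[(i + np.arange(L)) % m] for i in range(m)])

def wrapped_windows_strided(arr, L=2):
    # Same idea as the forward striding above: pad with the first L-1 rows,
    # then let as_strided lay overlapping windows over the padded array.
    a = np.vstack((arr, arr[:L-1]))
    s0, s1 = a.strides
    return np.lib.stride_tricks.as_strided(
        a, shape=(arr.shape[0], L, arr.shape[1]), strides=(s0, s0, s1))

arr = np.arange(1, 7).reshape(3, 2)
loop_out = wrapped_windows_loop(arr, 2)
strided_out = wrapped_windows_strided(arr, 2)
print(np.array_equal(loop_out, strided_out))
```

Both versions should agree for any window length from 1 up to the number of rows plus one; the loop is only there as a readable reference, not as something to use on large inputs.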
Runtime test -
In [53]: inArr = np.random.randint(0,9,(1000,10))
In [54]: %timeit make_timesteps(inArr, 2)
...: %timeit strided_axis0_forward(inArr, 2)
...: %timeit strided_axis0_backward(inArr, 2)
...:
10 loops, best of 3: 33.9 ms per loop
100000 loops, best of 3: 12.1 µs per loop
100000 loops, best of 3: 12.2 µs per loop
In [55]: %timeit make_timesteps(inArr, 10)
...: %timeit strided_axis0_forward(inArr, 10)
...: %timeit strided_axis0_backward(inArr, 10)
...:
1 loops, best of 3: 152 ms per loop
100000 loops, best of 3: 12 µs per loop
100000 loops, best of 3: 12.1 µs per loop
In [56]: 152000/12.1 # Speedup figure
Out[56]: 12561.98347107438
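The timings for the strided_axis0 functions stay the same even as we increase the length of the subarrays in the output. That just goes to show the massive benefit of strides, and of course the crazy speedups over the original loopy version.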
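
The reason the timings do not grow with L is that `as_strided` only builds a view over the padded array; nothing is copied per window. A small sketch to illustrate that, assuming a helper `forward_windows` (a made-up name mirroring the forward striding function) that returns both the padded array and the windowed view:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def forward_windows(arr, L):
    # Hypothetical helper mirroring the forward striding approach;
    # returns the padded copy and the strided view built on top of it.
    a = np.vstack((arr, arr[:L-1]))
    s0, s1 = a.strides
    out = as_strided(a, shape=(arr.shape[0], L, arr.shape[1]),
                     strides=(s0, s0, s1))
    return a, out

inArr = np.random.randint(0, 9, (1000, 10))
report = []
for L in (2, 10, 100):
    a, out = forward_windows(inArr, L)
    a[0, 0] = 99                           # write to the padded copy ...
    sees_write = bool(out[0, 0, 0] == 99)  # ... the windows see it, so no data was copied
    # a.nbytes is what is actually allocated; out.nbytes is the size the
    # windows would need if they were materialised as a real copy.
    report.append((L, out.shape, a.nbytes, out.nbytes, sees_write))

for row in report:
    print(row)
```

Since the windows overlap in memory, it is safest to treat the strided output as read-only and call `.copy()` on it before modifying anything.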
As promised at the start, here's a timing of the stacking cost with np.vstack -
In [417]: inArr = np.random.randint(0,9,(1000,10))
In [418]: L = 10
In [419]: %timeit np.vstack((inArr[-L+1:], inArr))
100000 loops, best of 3: 5.41 µs per loop
The timings support the idea that stacking is a pretty efficient approach.
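
As a side note, and not what the answer above uses: on NumPy 1.20 or newer, the same padded-then-windowed idea can probably be written without hand-computed strides, using `np.pad` with `mode='wrap'` and `np.lib.stride_tricks.sliding_window_view`. This is a swapped-in alternative, sketched under the assumption that forward windows are wanted; `wrapped_windows_swv` is a made-up name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def wrapped_windows_swv(arr, L=2):
    # Hypothetical alternative to the as_strided version (needs NumPy >= 1.20).
    # Pad L-1 wrapped rows at the end, i.e. the first L-1 rows repeated.
    a = np.pad(arr, ((0, L - 1), (0, 0)), mode='wrap')
    # sliding_window_view puts the window axis last: shape (m, n, L).
    windows = sliding_window_view(a, L, axis=0)
    # Move the window axis to the middle to get shape (m, L, n).
    return windows.transpose(0, 2, 1)

arr = np.arange(1, 7).reshape(3, 2)
swv_out = wrapped_windows_swv(arr, 2)
print(swv_out.shape)
print(swv_out)
```

The result is a read-only view by default, which guards against accidental writes into overlapping windows; the padding step still makes one copy, just like the np.vstack call timed above.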
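Thanks very much - this really helps. I had looked at as_strided before, but couldn't make sense of it until your example and link! To get the same order, I appended the first row after the last one and then applied np.flip on the first row. I edited the question to show my final code. – nickyzee
@nickyzee I don't think I understand why you would need `flip`. Is your `make_timesteps` correct? I ask because I coded this trying to produce the same result as `make_timesteps`. With your flip suggestion, my code produces different results from `make_timesteps`. Could you clarify this? – Divakar
Interesting - the flip was needed to produce the same output as I got from my original. Visual inspection of the output confirms this as well. Not sure why - numpy @ latest and py3.5 - my original returns `array([[[1,2], [3,4]], [[3,4], [5,6]], [[5,6], [7,8]], [[7,8], [1,2]]])` – nickyzee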