We can use the numexpr module to perform all of the later arithmetic operations efficiently as a single evaluated expression.
Thus, these steps:
C = cv2.subtract(ml, mrd)
C = cv2.pow(C,2)
C = np.divide(C, sigma_m)
C = p0 + (1-p0)**(-C)
can be replaced by a single expression:
import numexpr as ne
C = ne.evaluate('p0 +(1-p0)**(-((ml-mrd)**2)/sigma_m)')
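Written out as a formula, both versions compute the following for each element, where $m_l$ and $m_{rd}$ stand for corresponding entries of `ml` and `mrd` (the symbol names are chosen here for illustration only):

```latex
C = p_0 + (1 - p_0)^{-\frac{(m_l - m_{rd})^2}{\sigma_m}}
```

The four separate steps each allocate a full-size temporary array, whereas the single expression is evaluated in one pass.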
Let's verify this. The original approach as a function:
import cv2
import numpy as np

def original_app(ml, mrd, sigma_m, p0):
    C = cv2.subtract(ml, mrd)      # elementwise difference
    C = cv2.pow(C, 2)              # squared difference
    C = np.divide(C, sigma_m)      # scale by sigma_m
    C = p0 + (1 - p0)**(-C)
    return C
Verification, followed by timings across different dataset sizes:
In [28]: # Setup inputs
...: S = 1024 # Size parameter
...: ml = np.random.randint(0,255,(S,S))/255.0
...: mrd = np.random.randint(0,255,(S,S))/255.0
...: sigma_m = 0.45
...: p0 = 0.56
...:
In [29]: out1 = original_app(ml, mrd, sigma_m, p0)
In [30]: out2 = ne.evaluate('p0 +(1-p0)**(-((ml-mrd)**2)/sigma_m)')
In [31]: np.allclose(out1, out2)
Out[31]: True
Timings:
In [19]: # Setup inputs
...: S = 1024 # Size parameter
...: ml = np.random.randint(0,255,(S,S))/255.0
...: mrd = np.random.randint(0,255,(S,S))/255.0
...: sigma_m = 0.45
...: p0 = 0.56
...:
In [20]: %timeit original_app(ml, mrd, sigma_m, p0)
10 loops, best of 3: 67.1 ms per loop
In [21]: %timeit ne.evaluate('p0 +(1-p0)**(-((ml-mrd)**2)/sigma_m)')
100 loops, best of 3: 12.9 ms per loop
In [22]: # Setup inputs
...: S = 512 # Size parameter
In [23]: %timeit original_app(ml, mrd, sigma_m, p0)
100 loops, best of 3: 15.3 ms per loop
In [24]: %timeit ne.evaluate('p0 +(1-p0)**(-((ml-mrd)**2)/sigma_m)')
100 loops, best of 3: 3.39 ms per loop
In [25]: # Setup inputs
...: S = 256 # Size parameter
In [26]: %timeit original_app(ml, mrd, sigma_m, p0)
100 loops, best of 3: 3.65 ms per loop
In [27]: %timeit ne.evaluate('p0 +(1-p0)**(-((ml-mrd)**2)/sigma_m)')
1000 loops, best of 3: 878 µs per loop
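That is roughly a 4x to 5x speedup across the sizes tested (about 4.2x at S = 256, 4.5x at S = 512, and 5.2x at S = 1024), with the speedup improving for larger arrays.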
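Also, as a side note, I would suggest using an initialized array instead of appending as you do in the last step. We could initialize something like out = np.zeros((len(d), width, height)) (or np.empty) before entering the loop, and in the last step assign into the output array: out[iteration_ID] = C.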
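The preallocation idea could look roughly like the sketch below. It is only an illustration: the loop values in `d`, the small `width` and `height`, the use of `np.roll` to produce `mrd`, and the helper name `compute_C` are all hypothetical stand-ins, since the original loop is not shown here. The arithmetic is written in plain NumPy so the example runs without OpenCV or numexpr.

```python
import numpy as np

def compute_C(ml, mrd, sigma_m, p0):
    # Same arithmetic as the steps above, written with plain NumPy
    return p0 + (1 - p0) ** (-((ml - mrd) ** 2) / sigma_m)

# Hypothetical stand-ins for the real loop inputs
rng = np.random.default_rng(0)
d = [1, 2, 3]               # hypothetical values iterated over in the loop
width, height = 4, 5        # hypothetical image dimensions
ml = rng.random((width, height))
sigma_m, p0 = 0.45, 0.56

# Allocate the full output once, before the loop, instead of appending
out = np.empty((len(d), width, height))

for iteration_ID, shift in enumerate(d):
    # Hypothetical way of producing mrd for this iteration
    mrd = np.roll(ml, shift, axis=1)
    # Write the result straight into its slot of the preallocated array
    out[iteration_ID] = compute_C(ml, mrd, sigma_m, p0)
```

Using np.empty is safe here because every slot of out is overwritten inside the loop; np.zeros would be the safer choice if some iterations might be skipped.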
It also depends on how you built OpenCV, so could you post the output of 'getBuildInformation()'? –
@MarkSetchell The output of 'cv2.getBuildInformation()' is too large to fit in a comment. Is there a specific part of that output you have in mind? – Mira