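I was curious about the speedup numbers, so I am posting this as a solution. As stated and discussed in the comments, if the inputs are NumPy arrays, you can use native NumPy tools, in this case ndarray.sum(), like so -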
out = ((Orginal - Mutated)**2).sum()
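To see that this one-liner computes the same quantity as an explicit per-pixel loop, here is a minimal sketch on a tiny hand-checkable input. The arrays `original` and `mutated` and their values are made up for illustration; they are not from the question.

```python
import numpy as np

# Hypothetical 1x2x3 "images" with values chosen for easy hand-checking.
original = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]])
mutated = np.array([[[0.0, 2.0, 1.0], [4.0, 3.0, 6.0]]])

# Explicit loop over rows, columns and channels, accumulating squared differences.
loop_total = 0.0
for x in range(original.shape[0]):
    for y in range(original.shape[1]):
        for c in range(original.shape[2]):
            d = original[x, y, c] - mutated[x, y, c]
            loop_total += d * d

# Vectorized form: elementwise difference, square, then sum over all axes.
vectorized_total = ((original - mutated) ** 2).sum()

# The differences are (1, 0, 2) and (0, 2, 0), so both totals are 1 + 4 + 4 = 9.
print(loop_total, vectorized_total)
```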
You can also use the very efficient np.einsum for the same task, like so -
sub = Orginal - Mutated
out = np.einsum('ijk,ijk->',sub,sub)
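The subscript string `'ijk,ijk->'` multiplies the two operands elementwise and, because no index appears after the arrow, sums over all three axes. A small sketch checking this against the plain multiply-and-sum form, using a random array whose shape and seed are arbitrary choices for illustration:

```python
import numpy as np

# Arbitrary small 3-D array standing in for `Orginal - Mutated`.
rng = np.random.default_rng(0)
sub = rng.random((4, 5, 3))

# einsum: multiply matching elements, then reduce over i, j and k.
via_einsum = np.einsum('ijk,ijk->', sub, sub)

# Same reduction written as an explicit elementwise product and sum.
via_sum = (sub * sub).sum()

# np.vdot flattens its inputs, so it yields the same scalar here.
via_vdot = np.vdot(sub, sub)

print(np.isclose(via_einsum, via_sum), np.isclose(via_einsum, via_vdot))
```

The results may differ in the last few bits because the summation order differs, which is why the comparison uses `np.isclose` rather than `==`.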
Runtime tests
Define functions -
import numpy as np

def org_app(Orginal,Mutated):
    # Loop-based approach from the question: sum of squared per-channel differences
    Fittnes = 0
    for x in range(0, Orginal.shape[0]):
        for y in range(0, Orginal.shape[1]):
            DR = (Orginal[x][y][0] - Mutated[x][y][0])
            DG = (Orginal[x][y][1] - Mutated[x][y][1])
            DB = (Orginal[x][y][2] - Mutated[x][y][2])
            Fittnes += (DR * DR + DG * DG + DB * DB)
    return Fittnes

def einsum_based(Orginal,Mutated):
    sub = Orginal - Mutated
    return np.einsum('ijk,ijk->',sub,sub)

def dot_based(Orginal,Mutated): # @ali_m's suggestion
    sub = Orginal - Mutated
    return np.dot(sub.ravel(), sub.ravel())

def vdot_based(Orginal,Mutated): # variant of @ali_m's suggestion
    sub = Orginal - Mutated
    return np.vdot(sub, sub)
Timings -
In [14]: M,N = 100,100
...: Orginal = np.random.rand(M,N,3)
...: Mutated = np.random.rand(M,N,3)
...:
In [15]: %timeit org_app(Orginal,Mutated)
...: %timeit ((Orginal - Mutated)**2).sum()
...: %timeit einsum_based(Orginal,Mutated)
...: %timeit dot_based(Orginal,Mutated)
...: %timeit vdot_based(Orginal,Mutated)
...:
10 loops, best of 3: 54.9 ms per loop
10000 loops, best of 3: 112 µs per loop
10000 loops, best of 3: 69.8 µs per loop
10000 loops, best of 3: 86.2 µs per loop
10000 loops, best of 3: 85.3 µs per loop
In [16]: # Inputs
...: M,N = 1000,1000
...: Orginal = np.random.rand(M,N,3)
...: Mutated = np.random.rand(M,N,3)
...:
In [17]: %timeit org_app(Orginal,Mutated)
...: %timeit ((Orginal - Mutated)**2).sum()
...: %timeit einsum_based(Orginal,Mutated)
...: %timeit dot_based(Orginal,Mutated)
...: %timeit vdot_based(Orginal,Mutated)
...:
1 loops, best of 3: 5.49 s per loop
10 loops, best of 3: 63 ms per loop
10 loops, best of 3: 23.9 ms per loop
10 loops, best of 3: 24.9 ms per loop
10 loops, best of 3: 24.9 ms per loop
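To turn the timings above into speedup factors over org_app, here is a small sketch. The numbers are copied from the %timeit output and converted to seconds; the dictionary layout and labels are only illustrative.

```python
# Best-of-3 timings from the runs above, in seconds.
timings = {
    "100x100": {
        "org_app": 54.9e-3, "sum": 112e-6, "einsum": 69.8e-6,
        "dot": 86.2e-6, "vdot": 85.3e-6,
    },
    "1000x1000": {
        "org_app": 5.49, "sum": 63e-3, "einsum": 23.9e-3,
        "dot": 24.9e-3, "vdot": 24.9e-3,
    },
}

# Speedup of each vectorized variant relative to the loop-based org_app.
speedups = {
    size: {name: t["org_app"] / value for name, value in t.items() if name != "org_app"}
    for size, t in timings.items()
}

for size, row in speedups.items():
    for name, factor in row.items():
        print(f"{size:>9} {name:>6}: {factor:.0f}x")
```

On these numbers the vectorized versions are roughly 490x to 790x faster than the loop for the 100x100 case and roughly 87x to 230x faster for the 1000x1000 case, with einsum the fastest in both.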
Where is your Cython code? –
This is the Cython code. It is called from my main.py. – Funktiona
If the input arrays are NumPy arrays, you can simply use `((Orginal - Mutated)**2).sum()` for a speedup. – Divakar