2015-11-14 114 views

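Efficiently computing the sum of squared differences

This is a loop that extracts the RGB values of two images and computes the sum of squared differences over all three channels. Running this code directly in my main.py takes 0.07 seconds. If I run it from this .pyx file instead, it slows down to 1 second. I have read about cdef functions, but I have not managed to pass arrays to one. Any help converting this function into a cdef function would be much appreciated. I really need this loop to run as fast as possible.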

from cpython cimport array 
import array 
import numpy as np 
cimport numpy as np 

def fittnes(Orginal, Mutated):
    Fittnes = 0

    for x in range(0, 299):
        for y in range(0, 299):
            DeltaRed = (Orginal[x][y][0] - Mutated[x][y][0])
            DeltaGreen = (Orginal[x][y][1] - Mutated[x][y][1])
            DeltaBlue = (Orginal[x][y][2] - Mutated[x][y][2])

            Fittnes += (DeltaRed * DeltaRed + DeltaGreen * DeltaGreen + DeltaBlue * DeltaBlue)

    return Fittnes

The call from my main.py:

NewScore = cythona.fittnes(numpy.array(Orginal), numpy.array(MutatedImage)) 

Where is your Cython code? –

This is the Cython code. It is called from my main.py. – Funktiona

If the inputs are NumPy arrays, you can simply use '((Orginal - Mutated)**2).sum()' for a speedup. – Divakar

Answer

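I was curious about the actual speedup numbers, so I am posting this as a solution. As discussed in the comments, if the inputs are NumPy arrays you can use native NumPy tools, in this case ndarray.sum(), like so: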

out = ((Orginal - Mutated)**2).sum() 
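
One caveat worth checking, under the assumption that the images come from an 8-bit loader (so that numpy.array(...) yields dtype uint8): unsigned subtraction wraps around instead of going negative, which would silently change the result. A minimal sketch that widens the dtype first; the helper name ssd and the tiny sample arrays are made up for illustration:

```python
import numpy as np

def ssd(original, mutated):
    """Sum of squared differences, safe for unsigned integer inputs."""
    # Widen to a signed 64-bit type so differences can be negative
    # and the squares of 8-bit differences cannot overflow.
    diff = original.astype(np.int64) - mutated.astype(np.int64)
    return int((diff * diff).sum())

# Two hypothetical 1x2 RGB "images" stored as uint8.
a = np.array([[[10, 0, 0], [0, 0, 0]]], dtype=np.uint8)
b = np.array([[[30, 0, 0], [0, 0, 5]]], dtype=np.uint8)

wrapped = int(((a - b) ** 2).sum())  # computed in uint8, so values wrap
safe = ssd(a, b)                     # (10 - 30)**2 + (0 - 5)**2 = 425
print(wrapped, safe)
```

If the arrays are already floating point, as in the random test data used for the timings, the cast is unnecessary.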

You can also use the very efficient np.einsum for the same task, like so:

sub = Orginal - Mutated 
out = np.einsum('ijk,ijk->',sub,sub) 
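
To unpack the subscript string: 'ijk,ijk->' multiplies the two operands element by element and, because no index appears after the arrow, sums over all three axes. A small check on made-up data (the array names and shape are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
orig = rng.random((4, 5, 3))   # stand-in for Orginal
mut = rng.random((4, 5, 3))    # stand-in for Mutated

sub = orig - mut
via_einsum = np.einsum('ijk,ijk->', sub, sub)  # multiply, then sum over i, j, k
via_sum = (sub ** 2).sum()                     # same quantity, explicit square

print(np.isclose(via_einsum, via_sum))
```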

Runtime tests

Function definitions:

def org_app(Orginal,Mutated):
    Fittnes = 0
    for x in range(0, Orginal.shape[0]):
        for y in range(0, Orginal.shape[1]):
            DR = (Orginal[x][y][0] - Mutated[x][y][0])
            DG = (Orginal[x][y][1] - Mutated[x][y][1])
            DB = (Orginal[x][y][2] - Mutated[x][y][2])
            Fittnes += (DR * DR + DG * DG + DB * DB)
    return Fittnes

def einsum_based(Orginal,Mutated): 
    sub = Orginal - Mutated 
    return np.einsum('ijk,ijk->',sub,sub) 

def dot_based(Orginal,Mutated): # @ali_m's suggestion 
    sub = Orginal - Mutated 
    return np.dot(sub.ravel(), sub.ravel()) 

def vdot_based(Orginal,Mutated): # variant of @ali_m's suggestion 
    sub = Orginal - Mutated 
    return np.vdot(sub, sub) 
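
The dot-based variants rely on the fact that the sum of squares of an array equals the dot product of its flattened form with itself. A small sketch on a made-up array (standing in for Orginal - Mutated) that also checks that ravel() does not copy when the array is C-contiguous:

```python
import numpy as np

# Hypothetical difference array with known contents 0..23.
sub = np.arange(24, dtype=float).reshape(2, 4, 3)

flat = sub.ravel()                      # a view here, since sub is C-contiguous
is_view = np.shares_memory(sub, flat)

via_dot = np.dot(flat, flat)            # 1-D dot product of the vector with itself
via_vdot = np.vdot(sub, sub)            # vdot flattens its arguments itself
expected = float((np.arange(24) ** 2).sum())

print(is_view, via_dot, via_vdot, expected)
```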

Timings:

In [14]: M,N = 100,100 
    ...: Orginal = np.random.rand(M,N,3) 
    ...: Mutated = np.random.rand(M,N,3) 
    ...: 

In [15]: %timeit org_app(Orginal,Mutated) 
    ...: %timeit ((Orginal - Mutated)**2).sum() 
    ...: %timeit einsum_based(Orginal,Mutated) 
    ...: %timeit dot_based(Orginal,Mutated) 
    ...: %timeit vdot_based(Orginal,Mutated) 
    ...: 
10 loops, best of 3: 54.9 ms per loop 
10000 loops, best of 3: 112 µs per loop 
10000 loops, best of 3: 69.8 µs per loop 
10000 loops, best of 3: 86.2 µs per loop 
10000 loops, best of 3: 85.3 µs per loop 

In [16]: # Inputs 
    ...: M,N = 1000,1000 
    ...: Orginal = np.random.rand(M,N,3) 
    ...: Mutated = np.random.rand(M,N,3) 
    ...: 

In [17]: %timeit org_app(Orginal,Mutated) 
    ...: %timeit ((Orginal - Mutated)**2).sum() 
    ...: %timeit einsum_based(Orginal,Mutated) 
    ...: %timeit dot_based(Orginal,Mutated) 
    ...: %timeit vdot_based(Orginal,Mutated) 
    ...: 
1 loops, best of 3: 5.49 s per loop 
10 loops, best of 3: 63 ms per loop 
10 loops, best of 3: 23.9 ms per loop 
10 loops, best of 3: 24.9 ms per loop 
10 loops, best of 3: 24.9 ms per loop 
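
To rerun the comparison outside IPython, something along these lines should work with the standard timeit module. This is only a sketch: sum_based is a name introduced here for the plain .sum() variant, the loop counts are arbitrary, and the numbers will depend on your machine and NumPy build:

```python
import timeit
import numpy as np

def sum_based(a, b):
    return ((a - b) ** 2).sum()

def einsum_based(a, b):
    sub = a - b
    return np.einsum('ijk,ijk->', sub, sub)

def dot_based(a, b):
    sub = a - b
    return np.dot(sub.ravel(), sub.ravel())

M, N = 100, 100
a = np.random.rand(M, N, 3)
b = np.random.rand(M, N, 3)

results = {}
for fn in (sum_based, einsum_based, dot_based):
    # Best of 3 repeats of 100 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(a, b), number=100, repeat=3)) / 100
    results[fn.__name__] = best
    print(f"{fn.__name__}: {best * 1e6:.1f} us per call")
```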

...or just 'np.dot(sub.ravel(), sub.ravel())' –

@ali_m Nice! That works too! Would you like to add it as a solution? Either way, I would like to add it to the benchmarks. – Divakar

How fast is np.dot(sub.ravel(), sub.ravel()) compared to einsum_based? – Funktiona