使用Parakeet優化Python函數

我需要此功能進行優化，因爲我試圖讓我的OpenGL模擬運行速度更快。我想使用Parakeet，但我無法完全理解我需要如何修改下面的代碼才能這樣做。你能看到我應該做什麼嗎？使用Parakeet優化Python函數

def distanceMatrix(self,x,y,z): 
    " ""Computes distances between all particles and places the result in a matrix such that the ij th matrix entry corresponds to the distance between particle i and j"" " 
    xtemp = tile(x,(self.N,1)) 
    dx = xtemp - xtemp.T 
    ytemp = tile(y,(self.N,1)) 
    dy = ytemp - ytemp.T 
    ztemp = tile(z,(self.N,1)) 
    dz = ztemp - ztemp.T 

    # Particles 'feel' each other across the periodic boundaries 
    if self.periodicX: 
     dx[dx>self.L/2]=dx[dx > self.L/2]-self.L 
     dx[dx<-self.L/2]=dx[dx < -self.L/2]+self.L 
    if self.periodicY: 
     dy[dy>self.L/2]=dy[dy>self.L/2]-self.L 
     dy[dy<-self.L/2]=dy[dy<-self.L/2]+self.L 
    if self.periodicZ: 
     dz[dz>self.L/2]=dz[dz>self.L/2]-self.L 
     dz[dz<-self.L/2]=dz[dz<-self.L/2]+self.L 

    # Total Distances 
    d = sqrt(dx**2+dy**2+dz**2) 

    # Mark zero entries with negative 1 to avoid divergences 
    d[d==0] = -1 

    return d, dx, dy, dz

從我可以告訴，鸚鵡應該能夠使用上面的功能沒有改變 - 它僅使用NumPy的和數學。但是，我總是從鸚鵡JIT包裝調用函數時出現以下錯誤：

AssertionError: Unsupported function: <bound method Particles.distanceMatrix of <particles.Particles instance at 0x04CD8E90>>

來源

2013-11-23 bogdan

你也可以做的一種可能的優化方法是用一個局部變量替換'self.L/2'的12個用法。這可能會做點什麼。 :-)（雖然這對鸚鵡問題沒有幫助......抱歉） –

我猜布爾索引不支持，但我不確定。我知道分配像xtemp和dx L/2這樣的中間數組是浪費的。儘管你的向量化代碼可能是常規numpy中的快速代碼，但使用JIT編譯器，最好使用鈍的for-loops！ – 2013-11-24 14:49:55

鸚鵡還很年輕，其NumPy的支持是不完整的，並且你的代碼倒是幾個特點，還不能正常工作。

1）你正在包裝一個方法，而Parakeet到目前爲止只知道如何處理函數。常用的解決方法是創建一個@jit包裝的幫助函數，並將您的方法調用到所有必需的成員數據中。方法不起作用的原因是將有意義的類型分配給「自我」是不平凡的。這不是不可能的，但棘手的是，方法不會進入長尾鸚鵡，直到下垂果實被採摘爲止。說起低掛的水果......

2）布爾索引。尚未實施，但將在下一個版本中發佈。

3）np.tile：也行不通，也可能會在下一個版本。如果您想查看哪些內置函數和NumPy庫函數可以工作，請查看Parakeet的mappings模塊。

我重寫你的代碼是一個小友好給鸚鵡：

@jit 
def parakeet_dist(x, y, z, L, periodicX, periodicY, periodicZ): 
    # perform all-pairs computations more explicitly 
    # instead of tile + broadcasting 
    def periodic_diff(x1, x2, periodic): 
    diff = x1 - x2 
    if periodic: 
     if diff > (L/2): diff -= L 
     if diff < (-L/2): diff += L 
    return diff 
    dx = np.array([[periodic_diff(x1, x2, periodicX) for x1 in x] for x2 in x]) 
    dy = np.array([[periodic_diff(y1, y2, periodicY) for y1 in y] for y2 in y]) 
    dz = np.array([[periodic_diff(z1, z2, periodicZ) for z1 in z] for z2 in z]) 
    d= np.sqrt(dx**2 + dy**2 + dz**2) 

    # since we can't yet use boolean indexing for masking out zero distances 
    # have to fall back on explicit loops instead 
    for i in xrange(len(x)): 
    for j in xrange(len(x)): 
     if d[i,j] == 0: d[i,j] = -1 
    return d, dx, dy, dz

在我的機器本只運行〜3倍的速度比NumPy的爲N = 2000（0.39s爲NumPy的對比0.14s的鸚鵡）。如果我重寫遍歷數組更明確地使用循環則表現上升到6倍〜比NumPy的更快（鸚鵡運行在〜0.06S）：

@jit 
def loopy_dist(x, y, z, L, periodicX, periodicY, periodicZ): 
    N = len(x) 
    dx = np.zeros((N,N)) 
    dy = np.zeros((N,N)) 
    dz = np.zeros((N,N)) 
    d = np.zeros((N,N)) 

    def periodic_diff(x1, x2, periodic): 
    diff = x1 - x2 
    if periodic: 
     if diff > (L/2): diff -= L 
     if diff < (-L/2): diff += L 
    return diff 

    for i in xrange(N): 
    for j in xrange(N): 
     dx[i,j] = periodic_diff(x[j], x[i], periodicX) 
     dy[i,j] = periodic_diff(y[j], y[i], periodicY) 
     dz[i,j] = periodic_diff(z[j], z[i], periodicZ) 
     d[i,j] = dx[i,j] ** 2 + dy[i,j] ** 2 + dz[i,j] ** 2 
     if d[i,j] == 0: d[i,j] = -1 
     else: d[i,j] = np.sqrt(d[i,j]) 
    return d, dx, dy, dz

隨着一點點的創意重寫你也可以得到上面的代碼在Numba運行，但它只比NumPy（0.25秒）快1.5倍。編譯時間是Parakeet w/comprehensions：1秒，Parakeet w /循環：0.5秒，Numba w /循環：0.9秒。

希望接下來的幾個版本能夠更多地使用NumPy庫函數，但現在理解或循環往往是要走的路。

來源

2013-11-24 19:34:47

謝謝！儘管你也在reddit上回應了這個問題，但我恢復了這個帳戶以提高你的答案。遲到總比不到好。 – bogdan

使用Parakeet優化Python函數

回答

相關問題