Alternative to numpy roll without copying the array

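I am doing something like the code below, and I am not happy with the performance of the np.roll() function. I am summing baseArray and otherArray, where baseArray is rolled by one element in each iteration. But I do not need a copy of baseArray when I roll it. I would rather have a view, so that, for example, when I sum baseArray with the other array, the 2nd element of baseArray is summed with the 0th element of otherArray, the 3rd element of baseArray is summed with the 1st element of otherArray, and so on.

I.e. to get the same result as with np.roll(), but without copying the array.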

import numpy as np
from numpy import random
import cProfile

def profile():
    baseArray = np.zeros(1000000)
    for i in range(1000):
        baseArray = np.roll(baseArray, 1)
        otherArray = np.random.rand(1000000)
        baseArray = baseArray + otherArray

cProfile.run('profile()')

Output (note the 3rd line, the roll function):

         9005 function calls in 26.741 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    5.123    5.123   26.740   26.740 <ipython-input-101-9006a6c0d2e3>:5(profile)
        1    0.001    0.001   26.741   26.741 <string>:1(<module>)
     1000    0.237    0.000    8.966    0.009 numeric.py:1327(roll)
     1000    0.004    0.000    0.005    0.000 numeric.py:476(asanyarray)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1000   12.650    0.013   12.650    0.013 {method 'rand' of 'mtrand.RandomState' objects}
     1000    0.005    0.000    0.005    0.000 {method 'reshape' of 'numpy.ndarray' objects}
     1000    6.390    0.006    6.390    0.006 {method 'take' of 'numpy.ndarray' objects}
     2000    1.345    0.001    1.345    0.001 {numpy.core.multiarray.arange}
     1000    0.001    0.000    0.001    0.000 {numpy.core.multiarray.array}
     1000    0.985    0.001    0.985    0.001 {numpy.core.multiarray.concatenate}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    0.000    0.000 {range}

2016-03-10 Marcel

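I'm pretty sure it is impossible to avoid a copy, due to the way in which numpy arrays are represented internally. An array consists of a contiguous block of memory addresses plus some metadata, which includes the array dimensions, the item size, and the separation between elements along each dimension (the "stride"). "Rolling" each element forwards or backwards would require strides of different lengths along the same dimension, which is not possible.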
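To make the point about strides concrete, here is a small sketch (the array `a` and the other names are just for illustration). It shows that a plain slice shares memory with the original array, while the result of np.roll() does not:

```python
import numpy as np

a = np.arange(6, dtype=np.float64)

# A slice is a view: same buffer, shifted start, same stride.
view = a[1:]
print(a.strides, view.strides)      # one stride per dimension, in bytes
print(np.shares_memory(a, view))    # the slice reuses a's memory

# A roll has to wrap around the end of the buffer, which a single
# (offset, shape, stride) description cannot express, so numpy
# allocates a new array for the result.
rolled = np.roll(a, 1)
print(np.shares_memory(a, rolled))  # the rolled array is a separate copy

# Writing to the rolled array therefore leaves the original untouched.
rolled[0] = -1.0
print(a[-1])
```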

That said, you can avoid copying all but one element of baseArray by using slice indexing:

import numpy as np

def profile1(seed=0):
    gen = np.random.RandomState(seed)
    baseArray = np.zeros(1000000)
    for i in range(1000):
        baseArray = np.roll(baseArray, 1)
        otherArray = gen.rand(1000000)
        baseArray = baseArray + otherArray
    return baseArray

def profile2(seed=0):
    gen = np.random.RandomState(seed)
    baseArray = np.zeros(1000000)
    for i in range(1000):
        otherArray = gen.rand(1000000)
        tmp1 = baseArray[:-1]                 # view of the first n-1 elements
        tmp2 = baseArray[-1]                  # copy of the last element
        baseArray[1:] = tmp1 + otherArray[1:] # write the last n-1 elements
        baseArray[0] = tmp2 + otherArray[0]   # write the first element
    return baseArray

This gives the same result:

In [1]: x1 = profile1() 

In [2]: x2 = profile2() 

In [3]: np.allclose(x1, x2) 
Out[3]: True

In practice there isn't much difference in performance:

In [4]: %timeit profile1() 
1 loop, best of 3: 23.4 s per loop 

In [5]: %timeit profile2() 
1 loop, best of 3: 17.3 s per loop
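A further variation, offered only as a sketch and not benchmarked here: since the array is rolled by one position on every step, the roll can be tracked as an integer offset and applied once at the end. Each step then becomes two in-place slice additions on views. The function names and the idea of passing all the other arrays as rows of one 2-D array `others` are assumptions made for this example.

```python
import numpy as np

def roll_add_reference(others):
    """Straightforward version: np.roll copies the array on every step."""
    base = np.zeros(others.shape[1])
    for other in others:
        base = np.roll(base, 1) + other
    return base

def roll_add_offset(others):
    """Same result, but the roll is recorded as an offset, not performed."""
    n = others.shape[1]
    store = np.zeros(n)  # the logical base[j] lives at store[(j - k) % n]
    k = 0                # total shift accumulated so far
    for other in others:
        k = (k + 1) % n
        # Adding `other` to the rolled array means
        # store[i] += other[(i + k) % n], split at the wrap-around point.
        store[:n - k] += other[k:]
        store[n - k:] += other[:k]
    return np.roll(store, k)  # a single real roll at the very end

others = np.random.RandomState(0).rand(5, 8)
print(np.allclose(roll_add_reference(others), roll_add_offset(others)))
```

The slices on both sides are views, so no rolled copy of the base array is created inside the loop.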

2016-03-10 14:31:38

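Thanks. Just one comment: there actually is a difference in performance, because the 23.4 and 17.3 seconds you measured include generating the random numbers (which I do not really do in my real-world algorithm). If you compare only the np.roll() performance, e.g. by creating otherArray before the for loop, then for me the timings are 14 vs. 4 seconds. – Marcel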
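A comparison along the lines of this comment could be set up roughly as follows. This is only a sketch: the sizes `n` and `steps` are much smaller than in the question so that it runs quickly, the function names are made up for the example, and the timings it prints will depend on the machine.

```python
import timeit
import numpy as np

n, steps = 100000, 20
# Create the other array once, outside the timed code, so that random
# number generation is not included in the measurement.
other = np.random.RandomState(0).rand(n)

def with_roll():
    base = np.zeros(n)
    for _ in range(steps):
        base = np.roll(base, 1) + other
    return base

def with_slices():
    base = np.zeros(n)
    for _ in range(steps):
        last = base[-1]                    # scalar copy of the last element
        base[1:] = base[:-1] + other[1:]   # right-hand side is evaluated first
        base[0] = last + other[0]
    return base

# Both variants should produce the same array.
print(np.allclose(with_roll(), with_slices()))
print("np.roll:", timeit.timeit(with_roll, number=3))
print("slices: ", timeit.timeit(with_slices, number=3))
```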
