如何加快矩陣代碼

我有以下簡單的代碼，它可以估計h乘n二進制矩陣具有某個特性的概率。它在指數時間（這是不好下手）運行，但我很驚訝它甚至對於N = 12和h這麼慢= 9如何加快矩陣代碼

#!/usr/bin/python 

import numpy as np 
import itertools 

n = 12 
h = 9 

F = np.matrix(list(itertools.product([0,1],repeat = n))).transpose() 

count = 0 
iters = 100 
for i in xrange(iters): 
    M = np.random.randint(2, size=(h,n)) 
    product = np.dot(M,F) 
    setofcols = set() 
    for column in product.T: 
     setofcols.add(repr(column)) 
    if (len(setofcols)==2**n): 
     count = count + 1 
print count*1.0/iters

我一直在使用它異型N = 10和h = 7 。產量相當長，但這是需要更多時間的線路。

 23447867 function calls (23038179 primitive calls) in 35.785 seconds 

    Ordered by: standard name 

    ncalls tottime percall cumtime percall filename:lineno(function) 
     2 0.002 0.001 0.019 0.010 __init__.py:1(<module>) 
     1 0.001 0.001 0.054 0.054 __init__.py:106(<module>) 
     1 0.001 0.001 0.022 0.022 __init__.py:15(<module>) 
     2 0.003 0.002 0.013 0.006 __init__.py:2(<module>) 
     1 0.001 0.001 0.003 0.003 __init__.py:38(<module>) 
     1 0.001 0.001 0.001 0.001 __init__.py:4(<module>) 
     1 0.001 0.001 0.004 0.004 __init__.py:45(<module>) 
     1 0.001 0.001 0.002 0.002 __init__.py:88(<module>) 
    307200 0.306 0.000 1.584 0.000 _methods.py:24(_any) 
    102400 0.026 0.000 0.026 0.000 arrayprint.py:22(product) 
    102400 1.345 0.000 32.795 0.000 arrayprint.py:225(_array2string) 
307200/102400 1.166 0.000 33.350 0.000 arrayprint.py:335(array2string) 
    716800 0.820 0.000 1.162 0.000 arrayprint.py:448(_extendLine) 
204800/102400 1.699 0.000 5.090 0.000 arrayprint.py:456(_formatArray) 
    307200 0.651 0.000 22.510 0.000 arrayprint.py:524(__init__) 
    307200 11.783 0.000 21.859 0.000 arrayprint.py:538(fillFormat) 
    1353748 1.920 0.000 2.537 0.000 arrayprint.py:627(_digits) 
    102400 0.576 0.000 2.523 0.000 arrayprint.py:636(__init__) 
    716800 2.159 0.000 2.159 0.000 arrayprint.py:649(__call__) 
    307200 0.099 0.000 0.099 0.000 arrayprint.py:658(__init__) 
    102400 0.163 0.000 0.225 0.000 arrayprint.py:686(__init__) 
    102400 0.307 0.000 13.784 0.000 arrayprint.py:697(__init__) 
    102400 0.110 0.000 0.110 0.000 arrayprint.py:713(__init__) 
    102400 0.043 0.000 0.043 0.000 arrayprint.py:741(__init__) 
     1 0.003 0.003 0.003 0.003 chebyshev.py:87(<module>) 
     2 0.001 0.000 0.001 0.000 collections.py:284(namedtuple) 
     1 0.277 0.277 35.786 35.786 counterfeit.py:3(<module>) 
    205002 0.222 0.000 0.247 0.000 defmatrix.py:279(__array_finalize__) 
    102500 0.747 0.000 1.077 0.000 defmatrix.py:301(__getitem__) 
    102400 0.322 0.000 34.236 0.000 defmatrix.py:352(__repr__) 
    102400 0.100 0.000 0.508 0.000 fromnumeric.py:1087(ravel) 
    307200 0.382 0.000 2.829 0.000 fromnumeric.py:1563(any) 
     271 0.004 0.000 0.005 0.000 function_base.py:3220(add_newdoc) 
     1 0.003 0.003 0.003 0.003 hermite.py:59(<module>) 
     1 0.003 0.003 0.003 0.003 hermite_e.py:59(<module>) 
     1 0.001 0.001 0.002 0.002 index_tricks.py:1(<module>) 
     1 0.003 0.003 0.003 0.003 laguerre.py:59(<module>) 
     1 0.003 0.003 0.003 0.003 legendre.py:83(<module>) 
     1 0.001 0.001 0.001 0.001 linalg.py:10(<module>) 
     1 0.001 0.001 0.001 0.001 numeric.py:1(<module>) 
    102400 0.247 0.000 33.598 0.000 numeric.py:1365(array_repr) 
    204800 0.321 0.000 1.143 0.000 numeric.py:1437(array_str) 
    614400 1.199 0.000 2.627 0.000 numeric.py:2178(seterr) 
    614400 0.837 0.000 0.918 0.000 numeric.py:2274(geterr) 
    102400 0.081 0.000 0.186 0.000 numeric.py:252(asarray) 
    307200 0.259 0.000 0.622 0.000 numeric.py:322(asanyarray) 
     1 0.003 0.003 0.004 0.004 polynomial.py:54(<module>) 
    513130 0.134 0.000 0.134 0.000 {isinstance} 
    307229 0.075 0.000 0.075 0.000 {issubclass} 
5985327/5985305 0.595 0.000 0.595 0.000 {len} 
306988 0.120 0.000 0.120 0.000 {max} 
    102400 0.061 0.000 0.061 0.000 {method '__array__' of 'numpy.ndarray' objects} 
    102406 0.027 0.000 0.027 0.000 {method 'add' of 'set' objects} 
    307200 0.241 0.000 1.824 0.000 {method 'any' of 'numpy.ndarray' objects} 
    307200 0.482 0.000 0.482 0.000 {method 'compress' of 'numpy.ndarray' objects} 
    204800 0.035 0.000 0.035 0.000 {method 'item' of 'numpy.ndarray' objects} 
    102451 0.014 0.000 0.014 0.000 {method 'join' of 'str' objects} 
    102400 0.222 0.000 0.222 0.000 {method 'ravel' of 'numpy.ndarray' objects} 
    921176 3.330 0.000 3.330 0.000 {method 'reduce' of 'numpy.ufunc' objects} 
    102405 0.057 0.000 0.057 0.000 {method 'replace' of 'str' objects} 
    2992167 0.660 0.000 0.660 0.000 {method 'rstrip' of 'str' objects} 
    102400 0.041 0.000 0.041 0.000 {method 'splitlines' of 'str' objects} 
     6 0.003 0.000 0.003 0.001 {method 'sub' of '_sre.SRE_Pattern' objects} 
    307276 0.090 0.000 0.090 0.000 {min} 
     100 0.013 0.000 0.013 0.000 {numpy.core._dotblas.dot} 
    409639 0.473 0.000 0.473 0.000 {numpy.core.multiarray.array} 
    1228800 0.239 0.000 0.239 0.000 {numpy.core.umath.geterrobj} 
    614401 0.352 0.000 0.352 0.000 {numpy.core.umath.seterrobj} 
    102475 0.031 0.000 0.031 0.000 {range} 
    102400 0.076 0.000 0.102 0.000 {reduce} 
204845/102445 0.198 0.000 34.333 0.000 {repr}

矩陣的乘法似乎只佔很小的一部分時間。可以加快其餘的速度嗎？

結果

現在有三個答案，但一個似乎有一個bug當前。我測試了其餘的兩個，分別是n = 18，h = 11和iters = 10。

氣泡 - 21秒，185MB的RAM。「排序」16秒。
hpaulj - 7.5秒，130MB的RAM。「tolist」3秒。「numpy.core.multiarray.array」1.5秒，「genexpr」（'set'線）1.5秒。

有趣的是，矩陣乘法的時間仍然是總體時間的一小部分。

來源

2013-12-12 marshall

顯然是可以懷疑的加快速度。如果你看到arrayprint，你認爲哪部分是緩慢的部分;）。 – seberg

你應該閱讀我在以前的問題中提到的鏈接http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array爲更有效的唯一身份的方法計算。 'arrapyprint.fillFormat'就是'repr（column）'代碼中的一部分，所以它是最慢的部分。 – alko

設定的方法可能非常好。但是如果你知道dtype/row是安全的（包括連續性和字節順序），你可以使用'arr.tostring（）'。 – seberg

嘗試用

setofcols.add(tuple(column.A1.tolist()))

set替換repr(col)接受tuple。 column.A1是矩陣轉換爲1d陣列。元組是事遂所願(0, 1, 0)，這set可以輕鬆地比較。

就免去了昂貴的repr格式的LOP掉大量的時間（25X加速）。

編輯

通過創建和填充set在一個聲明中我得到了10倍的進一步加快。在我的測試中，它比矢量化的bubble's快兩倍。

count = 0 
for i in xrange(iters): 
    M = np.random.randint(2, size=(h,n)) 
    product = np.dot(M,F) 
    setofcols = set(tuple(x) for x in product.T.tolist()) 
    # or {tuple(x) for x in product.T.tolist()} if new enough Python 
    if (len(setofcols)==2**n): 
     count += 1 
     # print M # to see the unique M 
print count*1.0/iters

編輯

這裏的東西甚至更快 - 變換的9個整數每列進1，使用dot([1,10,100,...],column)。然後應用np.unique（或set）爲整數的列表。這是一個2-3倍的進一步加速。

count = 0 
X = 10**np.arange(h) 
for i in xrange(iters): 
    M = np.random.randint(2, size=(h,n)) 
    product = np.dot(M,F) 
    setofcols = np.unique(np.dot(X,product).A1) 
    if (setofcols.size==2**n): 
     count += 1 
print count*1.0/iters

有了這個頂部電話是

200 0.201 0.001 0.204 0.001 {numpy.core._dotblas.dot} 
    100 0.026 0.000 0.026 0.000 {method 'sort' of 'numpy.ndarray' objects} 
    100 0.007 0.000 0.035 0.000 arraysetops.py:93(unique)

來源

2013-12-12 22:18:29 hpaulj

謝謝。它仍然花費一小部分時間乘以矩陣。我得到了不同的參數n和h和iters = 2，「39 6.465 0.166 6.465 0.166 {numpy.core.multiarray.array}」，「2 3.803 1.902 3.803 1.902 {'numpy.ndarray'objects} 「，」2 1.072 0.536 1.072 0.536 {numpy.core._dotblas.dot}「 – marshall

'np.dot' usng BLAS是'numpy'中最高效的（時間明智的）操作之一。所以我並不感到驚訝，'tolist'需要更長的時間。即使你刪除了'set'操作，我懷疑仍有大量'array'調用。 – hpaulj

IPython中的分析顯示'multiarray.array'在生成'F'時被調用，而不是在'iters'循環中。 – hpaulj

爲了加速上面的代碼，你應該避免循環。

import numpy as np 
import itertools 

def unique_rows(a): 
    a = np.ascontiguousarray(a) 
    unique_a = np.unique(a.view([('', a.dtype)]*a.shape[1])) 
    return unique_a.view(a.dtype).reshape((unique_a.shape[0], a.shape[1])) 


n = 12 
h = 9 
iters=100 
F = np.matrix(list(itertools.product([0,1],repeat = n))).transpose() 
M = np.random.randint(2, size=(h*iters,n)) 
product = np.dot(M,F) 
counts = map(lambda x: len(unique_rows(x.T))==2**n, np.split(product,iters,axis=0)) 
prob=float(sum(counts))/iters 

#All unique submatrices M (hxn) with the sophisticated property... 
[np.split(M,iters,axis=0)[j] for j in range(len(counts)) if counts[j]==True]

來源

2013-12-12 12:45:36 bubble

代碼**中的瓶頸是**緩慢的unique_rows實現。 – alko

這些解決方案很慢？ http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array – bubble

不，這些鏈接我指的是太多，但你的回答不包含任何方式的，所以它不會增加價值OP問題 – alko

由於ALKO和seberg指出，你正在失去大量的時間轉換您的陣列到大串將它們存儲在您的一組列。

如果我正確理解了您的代碼，您試圖查找product矩陣中不同列的數量是否等於此矩陣的長度。您可以通過排序，並尋找差異從一列到下做到這一點很容易：

D = (np.diff(np.sort(product.T, axis=0), axis=0) == 0)

這會給你一個布爾值D的矩陣。然後，您可以看到無論是在至少一種元素的變化從一個列到下一個：

C = (1 - np.prod(D, axis=1)) # i.e. 'not all(D[i,:]) for all i'

然後，您只需把看到的值是否all是不同的：

hasproperty = np.all(C)

，讓你完整代碼：

def f(n, h, iters): 
    F = np.array(list(itertools.product([0,1], repeat=n))).T 
    counts = [] 
    for _ in xrange(iters): 
     M = np.random.randint(2, size=(h,n)) 
     product = M.dot(F) 
     D = (np.diff(np.sort(product.T, axis=1), axis=0) == 0) 
     C = (1 - np.prod(D, axis=1)) 
     hasproperty = np.all(C) 
     counts.append(1. if hasproperty else 0.) 
    return np.mean(counts)

大約需要8s的f(12, 9, 100)。

如果你喜歡滑稽的表情緊湊：

def g(n, h, iters): 
    F = np.array(list(itertools.product([0,1], repeat=n))).T 
    return np.mean([np.all(1 - np.prod(np.diff(np.sort(np.random.randint(2,size=(h,n)).dot(F).T, axis=1), axis=0)==0, axis=1)) for _ in xrange(iters)])

時序它給：

>>> setup = """import numpy as np 
def g(n, h, iters): 
    F = np.array(list(itertools.product([0,1], repeat=n))).T 
    return np.mean([np.all(1 - np.prod(np.diff(np.sort(np.random.randint(2,size=(h,n)).dot(F).T, axis=1), axis=0)==0, axis=1)) for _ in xrange(iters)]) 
""" 
>>> timeit.timeit('g(10, 7, 100)', setup=setup, number=10) 
17.358669997900734 
>>> timeit.timeit('g(10, 7, 100)', setup=setup, number=50) 
83.06966196163967

或者近似1。每通電話至g(10,7,100)。

來源

2013-12-12 14:02:07 val

謝謝，但我認爲f至少有一些問題。當你做f（12,8,100）時，你得到0.0。但是使用@ bubble的代碼你沒有。 – marshall

的確，有一個bug：'product.T'必須沿'axis = 1'分類。這是糾正，感謝您的反饋:) – val

恐怕還是有一個錯誤。嘗試n = 5，h = 4和iters = 100。輸出應該大致爲0.07。 – marshall

如何加快矩陣代碼

回答

相關問題