Faster matrix operation to replace a sequential sum of products in 3D

The bottleneck in my current theano script is the following code:

import time

import numpy as np

axis = 0
prob = np.random.random((1, 1000, 50))
cases = np.random.random((1000, 1000, 50))

# Repeat the elementwise product + sum 1000 times and time the whole loop.
start = time.time()
for i in range(1000):
    result = (cases * prob).sum(axis=1-axis, keepdims=True)
print('3D naive method took {} seconds'.format(time.time() - start))
print(result.shape)

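In the 2D case, I have seen that replacing the elementwise multiplication + sum with a dot product gave me a 5x speedup. Is there any matrix operation that could help me in this case?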
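
For reference, here is a minimal sketch of the 2D replacement mentioned above. The shapes and the names `prob_2d` and `cases_2d` are made up for illustration and are not taken from the actual script; the point is only that an elementwise product followed by a sum over one axis is the same computation as a matrix-vector dot product.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2D analogue of the problem: one weight per column.
prob_2d = rng.random(1000)            # shape (1000,)
cases_2d = rng.random((500, 1000))    # shape (500, 1000)

# Elementwise product, then sum over the column axis.
slow = (cases_2d * prob_2d).sum(axis=1)

# The same reduction expressed as a single matrix-vector product.
fast = cases_2d.dot(prob_2d)

print(np.allclose(slow, fast))
```

The dot product avoids materialising the full `(500, 1000)` intermediate array, which is presumably where the speedup comes from.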

Edit

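Divakar gave me a version based on einsum. However, my intention is to port this to theano, and einsum is not supported in theano. So solutions that work in theano are welcome.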

Answer

We can use np.einsum -

result = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:] 

Another one with np.matmul -

result = np.matmul(prob.transpose(2,0,1), cases.T).T 
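
To see why a matmul can stand in for the product + sum, it may help to view the operation as one matrix-vector product per slice along the last axis. The sketch below uses small made-up shapes rather than the question's, and the batched form is written with explicit transposes for readability; it is an illustration of the idea, not a tuned implementation.

```python
import numpy as np

rng = np.random.default_rng(0)      # small illustrative shapes, not the question's
prob = rng.random((1, 6, 4))        # (1, J, K)
cases = rng.random((5, 6, 4))       # (I, J, K)

# Reference: elementwise product, then sum over the middle axis -> (I, 1, K)
naive = (cases * prob).sum(axis=1, keepdims=True)

# Same result, computed as one matrix-vector product per slice k
per_slice = np.stack(
    [cases[:, :, k].dot(prob[0, :, k]) for k in range(cases.shape[2])],
    axis=-1,
)[:, None, :]

# np.matmul treats leading axes as a batch: (K, I, J) @ (K, J, 1) -> (K, I, 1)
batched = np.matmul(cases.transpose(2, 0, 1), prob.transpose(2, 1, 0))
batched = batched.transpose(1, 2, 0)  # back to (I, 1, K)

assert np.allclose(naive, per_slice)
assert np.allclose(naive, batched)
```

Any framework that offers a batched matrix product should be able to express the same computation in this form.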

Runtime test -

In [70]: axis = 0 
    ...: prob = np.random.random((1, 1000, 50)) 
    ...: cases = np.random.random((1000, 1000, 50)) 
    ...: 

In [71]: out1 = (cases * prob).sum(axis=1-axis, keepdims=True) 

In [72]: out2 = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:] 

In [73]: out3 = np.matmul(prob.transpose(2,0,1), cases.T).T 

In [74]: np.allclose(out1, out2) 
Out[74]: True 

In [75]: np.allclose(out1, out3) 
Out[75]: True 

In [76]: %timeit (cases * prob).sum(axis=1-axis, keepdims=True) 
10 loops, best of 3: 101 ms per loop 

In [77]: %timeit np.einsum('ijk,ijk->ik', prob, cases)[:,None,:] 
10 loops, best of 3: 44.1 ms per loop 

In [78]: %timeit np.matmul(prob.transpose(2,0,1), cases.T).T 
10 loops, best of 3: 44 ms per loop 
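
Outside IPython, a comparable check can be run as a plain script with the standard `timeit` module. This is only a sketch: the array sizes are reduced so that it finishes quickly, the function names are made up, and the einsum variant drops the length-1 axis of `prob` explicitly (`prob[0]`) instead of relying on it being broadcast. Absolute timings will differ by machine and NumPy build.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
prob = rng.random((1, 200, 20))       # reduced sizes, chosen for a quick run
cases = rng.random((200, 200, 20))


def naive():
    return (cases * prob).sum(axis=1, keepdims=True)


def with_einsum():
    # prob[0] has shape (J, K); the sum runs over j for every (i, k) pair.
    return np.einsum('jk,ijk->ik', prob[0], cases)[:, None, :]


def with_matmul():
    return np.matmul(prob.transpose(2, 0, 1), cases.T).T


candidates = {'naive': naive, 'einsum': with_einsum, 'matmul': with_matmul}
reference = naive()
timings = {}

for name, fn in candidates.items():
    # Check correctness before timing each candidate.
    assert np.allclose(reference, fn())
    # Best of 3 repeats, 10 calls each, reported per call.
    timings[name] = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print('{}: {:.3f} ms per call'.format(name, timings[name] * 1e3))
```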

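Divakar, thanks for your answer. I was not aware of _einsum_; on my machine the speedup is even higher. However, I intend to port this operation to _theano_, and it seems _einsum_ is one of the few matrix operations that is not implemented there. Are there any other options? – Sharapolas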

@Sharapolas [`numpy.matmul` in Theano](http://stackoverflow.com/questions/42169776/numpy-matmul-in-theano) with the second approach? – Divakar

The solution in the PR seems to be CPU-only? – Sharapolas