Scipy sparse matrix special subtraction

I am working on a project in which I do a lot of matrix computation.

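I am looking for a smart way to speed up my code. In my project I am dealing with a sparse matrix of size 100Mx1M with around 10M non-zero values. The example below is only meant to illustrate my point.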

Let's say I have:

a vector v of size (2)
a vector c of size (3)
a sparse matrix X of size (2,3)

import numpy as np
from scipy.sparse import coo_matrix

v = np.asarray([10, 20])
c = np.asarray([ 2, 3, 4])
data = np.array([1, 1, 1, 1])
row = np.array([0, 0, 1, 1])
col = np.array([1, 2, 0, 2])
X = coo_matrix((data,(row,col)), shape=(2,3))
X.todense()
# matrix([[0, 1, 1],
#   [1, 0, 1]])

Currently I am doing:

import scipy.sparse

# float accumulator: tmp holds floats, and adding floats in place
# to an integer array fails in current NumPy
result = np.zeros_like(v, dtype=float)
d = scipy.sparse.lil_matrix((v.shape[0], v.shape[0]))
d.setdiag(v)
tmp = d * X  # scales row i of X by v[i]

print(tmp.todense())
#matrix([[ 0., 10., 10.],
#  [ 20., 0., 20.]])
# At this point tmp is csr sparse matrix

for i in range(tmp.shape[0]):
    x_i = tmp.getrow(i)
    result += x_i.data * (c[x_i.indices] - x_i.data)
    # I only want to do the subtraction on non-zero elements

print(result)
# [-430. -380.]

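My problem is the for loop, and especially the subtraction. I would like to find a way to vectorize this operation by subtracting only on the non-zero elements.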

Something that directly yields the sparse matrix of the subtraction:

matrix([[ 0., -7., -6.], 
     [ -18., 0., -16.]])

Is there a smart way to do this?

Source

2013-09-26 ThiS

Subtraction? – denis

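You don't need to loop over the rows to do what you are already doing. You can also use a similar trick to multiply the rows by the first vector: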

import numpy as np
import scipy.sparse as sps

# indptr and indices only exist on CSR; this is a no-op if X is already CSR
X = X.tocsr()

# number of nonzero entries per row of X
nnz_per_row = np.diff(X.indptr)
# multiply every row by the corresponding entry of v
# You could do this in-place as:
# X.data *= np.repeat(v, nnz_per_row)
Y = sps.csr_matrix((X.data * np.repeat(v, nnz_per_row), X.indices, X.indptr),
        shape=X.shape)

# subtract from the non-zero entries the corresponding column value in c...
Y.data -= np.take(c, Y.indices)
# ...and multiply by -1 to get the value you are after
Y.data *= -1
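As a rough illustration, here are the same steps applied to the small example from the question, with the intermediate arrays spelled out. The name `row_scale` is introduced here only for readability. The COO matrix is converted to CSR first, since COO has no `indptr`.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix

# data from the question
v = np.asarray([10, 20])
c = np.asarray([2, 3, 4])
data = np.array([1, 1, 1, 1])
row = np.array([0, 0, 1, 1])
col = np.array([1, 2, 0, 2])
X = coo_matrix((data, (row, col)), shape=(2, 3)).tocsr()

# indptr marks where each row starts in data/indices,
# so its differences are the stored entries per row
nnz_per_row = np.diff(X.indptr)

# one scale factor per stored entry: v[i] repeated for every entry of row i
row_scale = np.repeat(v, nnz_per_row)

# same sparsity pattern as X, with each row i multiplied by v[i]
Y = csr_matrix((X.data * row_scale, X.indices, X.indptr), shape=X.shape)

# c[j] - value for every stored entry in column j; zeros are never touched
Y.data -= np.take(c, Y.indices)
Y.data *= -1

print(Y.toarray())
```

The result matches the subtraction matrix asked for in the question, with rows [0, -7, -6] and [-18, 0, -16].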

To see that it works, set up some dummy data

rows, cols = 3, 5 
v = np.random.rand(rows) 
c = np.random.rand(cols) 
X = sps.rand(rows, cols, density=0.5, format='csr')

and after running the code above:

>>> x = X.toarray() 
>>> mask = x == 0 
>>> x *= v[:, np.newaxis] 
>>> x = c - x 
>>> x[mask] = 0 
>>> x 
array([[ 0.79935123, 0.  , 0.  , -0.0097763 , 0.59901243], 
     [ 0.7522559 , 0.  , 0.67510109, 0.  , 0.36240006], 
     [ 0.  , 0.  , 0.72370725, 0.  , 0.  ]]) 
>>> Y.toarray() 
array([[ 0.79935123, 0.  , 0.  , -0.0097763 , 0.59901243], 
     [ 0.7522559 , 0.  , 0.67510109, 0.  , 0.36240006], 
     [ 0.  , 0.  , 0.72370725, 0.  , 0.  ]])

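The way you accumulate your result requires every row to have the same number of non-zero entries, which seems a pretty weird thing to do. Are you sure that is what you are after? If it really is what you want, you could get that value with something like: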

result = np.sum(Y.data.reshape(Y.shape[0], -1), axis=0)

but I have a hard time believing that this is really what you are after...
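For comparison, the loop in the question accumulates the product `x * (c - x)` for each stored entry, not just the difference. A minimal sketch that reproduces that number without iterating over rows could look like the following. It assumes, as the loop does, that every row stores the same number of non-zeros. The names `scaled`, `terms` and `row_totals` are made up for illustration, and the per-row total at the end is only a guess at what might be more useful.

```python
import numpy as np
from scipy.sparse import csr_matrix

# data from the question, built directly in CSR format
v = np.asarray([10, 20])
c = np.asarray([2, 3, 4])
X = csr_matrix(np.array([[0, 1, 1],
                         [1, 0, 1]]))

nnz_per_row = np.diff(X.indptr)

# stored values of diag(v) * X, in CSR order
scaled = X.data * np.repeat(v, nnz_per_row)

# x * (c[j] - x) for every stored entry x in column j
terms = scaled * (np.take(c, X.indices) - scaled)

# the reshape only makes sense if all rows store the same number of entries
assert np.all(nnz_per_row == nnz_per_row[0])
result = terms.reshape(X.shape[0], -1).sum(axis=0)
print(result)  # [-430 -380]

# assumption: if one total per row is wanted instead, this also
# handles rows with differing (or zero) numbers of stored entries
row_ids = np.repeat(np.arange(X.shape[0]), nnz_per_row)
row_totals = np.bincount(row_ids, weights=terms, minlength=X.shape[0])
```

`result` equals the value printed by the loop in the question. `row_totals` is a different quantity and is shown only as a possible alternative.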

Source

2013-09-26 02:15:42 Jaime

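What a beautiful solution for column-wise subtraction or column-wise scaling on sparse matrices. A very professional and knowledgeable response! You saved my day! –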
