2016-07-22 70 views
3

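Scipy sparse csr matrix returns nan on 0.0/1.0

I found unexpected behavior in scipy.sparse.csr_matrix that looks like a bug to me. Can anyone confirm that this is not normal? I am not an expert on sparse structures, so I may be misunderstanding the proper usage.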

>>> import scipy.sparse 
>>> a=scipy.sparse.csr_matrix((1,1)) 
>>> b=scipy.sparse.csr_matrix((1,1)) 
>>> b[0,0]=1 
/home/marco/anaconda3/envs/py35/lib/python3.5/site-packages/scipy/sparse/compressed.py:730: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient. 
    SparseEfficiencyWarning) 
>>> a/b 
matrix([[ nan]]) 

NumPy, on the other hand, handles this properly:

>>> import numpy as np 
>>> a=np.zeros((1,1)) 
>>> b=np.ones((1,1)) 
>>> a/b 
array([[ 0.]]) 

Thanks

+1

Have you tried '(a/b).toarray()'? –

+0

Looks like a bug to me. –

+0

'(a/b).tolist()' returns '[[nan]]'. 'a/b' is of type matrix, so it has no 'toarray' or 'todense'. – marcotama

Answers

1

For sparse matrix / sparse matrix division, the relevant code is in

scipy/sparse/compressed.py

if np.issubdtype(r.dtype, np.inexact): 
     # Eldiv leaves entries outside the combined sparsity 
     # pattern empty, so they must be filled manually. They are 
     # always nan, so that the matrix is completely full. 
     out = np.empty(self.shape, dtype=self.dtype) 
     out.fill(np.nan) 
     r = r.tocoo() 
     out[r.row, r.col] = r.data 
     out = np.matrix(out) 

The comment in this section of the code explains the behavior.
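
The fill step in the excerpt above can be reproduced with plain NumPy. This is only a sketch: `shape`, `row`, `col` and `data` are hypothetical stand-ins for `self.shape`, `r.row`, `r.col` and `r.data`, not values taken from SciPy itself.

```python
import numpy as np

# Hypothetical stand-ins for the COO view of the sparse result `r`:
# two stored entries, at (0, 0) and (1, 1), both equal to 1.0.
shape = (2, 2)
row = np.array([0, 1])
col = np.array([0, 1])
data = np.array([1.0, 1.0])

# Start from a dense array that is nan everywhere ...
out = np.full(shape, np.nan)
# ... then overwrite only the positions that were actually stored.
out[row, col] = data

print(out)
```

Every position that has no stored entry in `r` keeps its nan, regardless of what the numerator or denominator held there.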

With a slightly larger matrix:

In [69]: a=sparse.csr_matrix([[1.,0],[0,1]]) 
In [70]: b=sparse.csr_matrix([[1.,1],[0,1]]) 
In [72]: (a/b) 
Out[72]: 
matrix([[ 1., nan], 
     [ nan, 1.]]) 

So wherever a has 0s (no stored values), the division gives nan. It returns a dense matrix and fills it in with nan.

Without this code, the sparse element-by-element division would produce a sparse matrix with those off-diagonal slots left 'empty'.

In [73]: a._binopt(b,'_eldiv_') 
Out[73]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>' 
    with 2 stored elements in Compressed Sparse Row format> 
In [74]: a._binopt(b,'_eldiv_').A 
Out[74]: 
array([[ 1., 0.], 
     [ 0., 1.]]) 
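
`_binopt` is a private method, so it may change between SciPy versions. A sparse result of the same kind can be sketched with public calls only, by inverting the stored values of the denominator and multiplying element-wise. This is an assumption-laden workaround, not what SciPy does internally, and `b_inv` and `quotient` are names made up for the example.

```python
import numpy as np
from scipy import sparse

a = sparse.csr_matrix([[1., 0], [0, 1]])
b = sparse.csr_matrix([[1., 1], [0, 1]])

# Invert only the stored values of b; positions that b does not store
# stay empty. (Assumes b holds no explicitly stored zeros.)
b_inv = b.copy()
b_inv.data = 1.0 / b_inv.data

# The element-wise product of two sparse matrices stays sparse.
quotient = a.multiply(b_inv)
dense_quotient = quotient.toarray()
print(dense_quotient)
```

For these inputs the dense view matches the array printed above. The approach treats every missing entry as 0, so it can never produce inf or nan for a division by an unstored zero.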

The inverse might be instructive:

In [76]: b/a 
Out[76]: 
matrix([[ 1., inf], 
     [ nan, 1.]]) 
In [77]: b._binopt(a,'_eldiv_').A 
Out[77]: 
array([[ 1., inf], 
     [ 0., 1.]]) 
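
For comparison, here is a sketch of what plain NumPy division gives for the same two matrices once they are converted to dense arrays. The `np.errstate` context only silences the divide-by-zero warnings, and `a_over_b` and `b_over_a` are names chosen for the example.

```python
import numpy as np
from scipy import sparse

a = sparse.csr_matrix([[1., 0], [0, 1]])
b = sparse.csr_matrix([[1., 1], [0, 1]])

# Dense element-wise division follows IEEE 754 rules:
# 0/1 -> 0, 1/0 -> inf, 0/0 -> nan.
with np.errstate(divide="ignore", invalid="ignore"):
    a_over_b = a.toarray() / b.toarray()
    b_over_a = b.toarray() / a.toarray()

print(a_over_b)
print(b_over_a)
```

The dense `b_over_a` agrees with the sparse `b/a` output shown above. The dense `a_over_b` has 0 at position (0, 1), where the sparse `a/b` shown earlier reported nan, which is the discrepancy the question is about.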

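It looks like the 'combined sparsity pattern' is determined by the numerator. Further testing suggests that it is the numerator's pattern after eliminate_zeros has been applied: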

In [138]: a1=sparse.csr_matrix(np.ones((2,2))) 
In [139]: a1 
Out[139]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>' 
    with 4 stored elements in Compressed Sparse Row format> 
In [140]: a1[0,1]=0 
In [141]: a1 
Out[141]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>' 
    with 4 stored elements in Compressed Sparse Row format> 
In [142]: a1/b 
Out[142]: 
matrix([[ 1., nan], 
     [ inf, 1.]]) 
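
The difference between a stored zero and an empty slot can be checked directly. This is a small sketch using the public `nnz` attribute and `eliminate_zeros` method; `nnz_before` and `nnz_after` are names chosen for the example.

```python
import numpy as np
from scipy import sparse

a1 = sparse.csr_matrix(np.ones((2, 2)))

# Overwriting an existing entry with 0 keeps it as an explicitly stored zero.
a1[0, 1] = 0
nnz_before = a1.nnz

# eliminate_zeros() drops stored zeros from the sparsity structure in place.
a1.eliminate_zeros()
nnz_after = a1.nnz

print(nnz_before, nnz_after)
print(a1.toarray())
```

The dense values are the same before and after; only the number of stored entries changes, which is consistent with position (0, 1) coming out as nan in `a1/b` above.
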
+0

Yes, this is the cause of the bug. I've submitted a fix here: https://github.com/scipy/scipy/pull/6405 – perimosocordiae