Avoiding implicit conversion to matrix in numpy operations

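Is there a way to keep matrix out of every result of a numpy computation? For example, if x is a numpy.ndarray and y is a scipy.sparse.csc_matrix, then after x += y, x becomes a matrix. Is there a way to prevent this, i.e. to keep x an ndarray, and more generally to keep using ndarray everywhere a matrix would otherwise be produced?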

Source

2016-02-22 shaoyl85

I added the scipy tag; this is a scipy.sparse issue, not an np.matrix one.

In [250]: y=sparse.csr_matrix([[0,1],[1,0]])
In [251]: x=np.arange(2)
In [252]: y+x
Out[252]:
matrix([[0, 2],
        [1, 1]])

sparse + array => matrix

(As a side note, np.matrix is a subclass of np.ndarray. sparse.csr_matrix is not a subclass. It has many numpy-like operations, but it implements them in its own code.)
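The subclass relationship is easy to check directly. The sketch below also shows one way to get a plain ndarray back from a mixed sum, assuming a dense copy of the result is acceptable; the variable name total is only for illustration.

```python
import numpy as np
from scipy import sparse

# np.matrix derives from ndarray; the sparse class does not.
print(issubclass(np.matrix, np.ndarray))
print(issubclass(sparse.csr_matrix, np.ndarray))

y = sparse.csr_matrix([[0, 1], [1, 0]])
x = np.eye(2)

# Whatever class the mixed sum comes back as, np.asarray
# returns a base-class ndarray holding the same values.
total = np.asarray(x + y)
print(type(total).__name__)
```

np.asarray is used rather than np.asanyarray because the latter passes ndarray subclasses such as np.matrix through unchanged.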

In [255]: x += y
In [256]: x
Out[256]:
matrix([[0, 2],
        [1, 1]])

Technically this should not happen; in effect it is doing x = x+y, assigning a new value to x rather than just modifying x.

If I first convert y to a regular dense matrix, I get an error. Allowing the operation would change a 1d array into a 2d one.

In [258]: x += y.todense() 
... 
ValueError: non-broadcastable output operand with shape (2,) doesn't match the broadcast shape (2,2)

Changing x to 2d allows the addition to proceed, without changing the array into a matrix:

In [259]: x=np.eye(2)
In [260]: x
Out[260]:
array([[ 1., 0.],
       [ 0., 1.]])
In [261]: x += y.todense()
In [262]: x
Out[262]:
array([[ 1., 1.],
       [ 1., 1.]])

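In general, addition and subtraction with sparse matrices are tricky. They were designed for matrix multiplication. Multiplication doesn't change sparsity as much as addition does; y+1, for example, makes it dense. Without digging into the details of how sparse addition is coded, I'd say: don't try this x+=... operation without first converting y to a dense version. I can't think of a good reason not to.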
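A minimal sketch of the "densify first" advice, assuming y is small enough to hold in memory as a dense array. It uses toarray(), which returns a plain ndarray, rather than todense(), which returns np.matrix; x_before is just a name used here to check object identity.

```python
import numpy as np
from scipy import sparse

y = sparse.csr_matrix([[0, 1], [1, 0]])
x = np.eye(2)
x_before = x  # second reference, to check identity afterwards

# Both operands are ordinary ndarrays of the same shape,
# so this is a regular in-place ndarray addition.
x += y.toarray()

print(type(x).__name__, x is x_before)
```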

(I should check the scipy github for a bug issue on this.)

scipy/sparse/compressed.py has the csr addition code. x+y uses x.__add__(y), but sometimes that gets flipped to y.__add__(x). x+=y uses x.__iadd__(y). So I may need to examine the __iadd__ of ndarray as well.

But the basic addition for a sparse matrix is:

def __add__(self,other):
    # First check if argument is a scalar
    if isscalarlike(other):
        if other == 0:
            return self.copy()
        else:  # Now we would add this scalar to every element.
            raise NotImplementedError('adding a nonzero scalar to a '
                                      'sparse matrix is not supported')
    elif isspmatrix(other):
        if (other.shape != self.shape):
            raise ValueError("inconsistent shapes")

        return self._binopt(other,'_plus_')
    elif isdense(other):
        # Convert this matrix to a dense matrix and add them
        return self.todense() + other
    else:
        return NotImplemented

So y+x becomes y.todense() + x. And x+y uses the same thing.

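Regardless of the details of +=, it is clear that adding a sparse to a dense (array or np.matrix) involves converting the sparse to dense. There is no code that iterates through the sparse values and adds them selectively to the dense array.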
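For illustration, a selective addition of that kind can be written by hand. This is only a sketch, not scipy's code: add_sparse_inplace is a hypothetical helper, and it assumes dense is a 2d ndarray of the same shape as the sparse operand, with a dtype that can hold the result.

```python
import numpy as np
from scipy import sparse

def add_sparse_inplace(dense, sp, c=1.0):
    """Hypothetical helper: dense += c * sp, touching only stored entries."""
    if dense.shape != sp.shape:
        raise ValueError("inconsistent shapes")
    coo = sp.tocoo()  # explicit (row, col, data) triplets
    # np.add.at accumulates correctly even if an index pair repeats,
    # which plain fancy-index += would not.
    np.add.at(dense, (coo.row, coo.col), c * coo.data)
    return dense

x = np.eye(2)
y = sparse.csr_matrix([[0, 1], [1, 0]])
result = add_sparse_inplace(x, y)
print(type(result).__name__, result is x)
```

The cost is proportional to the number of stored entries in the sparse operand rather than to the size of the dense array, and x never changes class.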

The special sparse addition is performed only when both operands are sparse. y+y works, returning a sparse. y+=y fails with a NotImplementedError from sparse.base.__iadd__.

This is the best diagnostic sequence I've come up with, trying different ways of adding y to a (2,2) array.

In [348]: x=np.eye(2)
In [349]: x+y
Out[349]:
matrix([[ 1., 1.],
        [ 1., 1.]])
In [350]: x+y.todense()
Out[350]:
matrix([[ 1., 1.],
        [ 1., 1.]])

The addition produces a matrix, but the values can be written into x without changing the class (or shape) of x:

In [351]: x[:] = x+y
In [352]: x
Out[352]:
array([[ 1., 1.],
       [ 1., 1.]])

+= with a dense matrix does the same:

In [353]: x += y.todense()
In [354]: x
Out[354]:
array([[ 1., 2.],
       [ 2., 1.]])

But something in += with a sparse changes the class of x:

In [355]: x += y
In [356]: x
Out[356]:
matrix([[ 1., 3.],
        [ 3., 1.]])

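Further testing, looking at id(x) and x.__array_interface__, makes it clear that x += y replaces x. This is true even if x starts as an np.matrix. So the sparse += is not an in-place operation. x += y.todense() is an in-place operation.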
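The identity test can be wrapped in a small function. This is a sketch: iadd_keeps_object is a hypothetical name, and what it reports for the sparse operand may depend on the installed numpy and scipy versions, so no output is claimed for that case.

```python
import numpy as np
from scipy import sparse

def iadd_keeps_object(x, y):
    """Hypothetical check: True if `x += y` leaves x bound to the same object."""
    before = x
    x += y  # rebinds the local name x if the operation is not in place
    return x is before

y = sparse.csr_matrix([[0, 1], [1, 0]])

print(iadd_keeps_object(np.eye(2), y.toarray()))  # dense operand
print(iadd_keeps_object(np.eye(2), y))            # sparse operand
```

Comparing with `is` asks the same question as comparing id(x) before and after the operation.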

Source

2016-02-22 07:45:16 hpaulj

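Thank you! When the sparse matrix is very sparse, making it dense may lead to much more computation. In my case, I repeatedly add a large number of sparse matrices onto a dense matrix to accumulate a final dense matrix. If each addition is more expensive, the total cost becomes very time consuming. – shaoyl85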

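See my edits; any addition with a dense will involve the todense conversion. To avoid that, add all the sparse ones together first, and add the result to the dense at the end. – hpaulj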
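The suggestion in this comment can be sketched as follows, assuming all terms share the shape of the dense accumulator; accumulate is a hypothetical helper name, not part of scipy.

```python
import numpy as np
from scipy import sparse

def accumulate(dense, sparse_terms):
    """Hypothetical helper: add many sparse matrices to `dense`, densifying once."""
    total = None
    for term in sparse_terms:
        # sparse + sparse stays sparse, so no dense copy is made here
        total = term if total is None else total + term
    if total is not None:
        dense += total.toarray()  # single in-place ndarray addition
    return dense

terms = [sparse.csr_matrix([[0, 1], [1, 0]]) for _ in range(3)]
x = np.eye(2)
out = accumulate(x, terms)
print(type(out).__name__, out is x)
```

Only one dense conversion happens regardless of how many sparse terms there are, and x stays an ndarray throughout.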

Yes, it's a bug; but https://github.com/scipy/scipy/issues/7826 says

I really don't see a way to change this.

An X += c * Y without todense follows.
Some inc(various array/matrix, various sparse) cases have been tested, but certainly not all.

def inc(X, Y, c=1.):
    """ X += c * Y, X Y sparse or dense """
    if (not hasattr(X, "indices")  # dense += sparse
            and hasattr(Y, "indices")):
        # inc an ndarray view, because ndarray += sparse -> matrix --
        # note: X is addressed by Y.indices alone, so this branch suits
        # a vector-like Y (one CSR row or one CSC column)
        X = getattr(X, "A", X).squeeze()
        X[Y.indices] += c * Y.data
    else:
        X += c * Y  # sparse + different sparse: SparseEfficiencyWarning
    return X

Source

2017-09-13 10:05:39 denis
