2017-08-07 79 views
3

Python: creating an empty sparse matrix

I want to parse some real data into a .mat object that my script can load.

I get this error:

TypeError: 'coo_matrix' object does not support item assignment

I found coo_matrix, but I cannot assign values to it.

data.txt

10 45 
11 12 
4 1 

I want to get a sparse matrix of size 100×100 and set the following entries to 1:

Mat(10, 45) = 1 
Mat(11, 12) = 1 
Mat(4, 1) = 1 

CODE

import numpy as np 
from scipy.sparse import coo_matrix 

def pdata(pathToFile):
    M = coo_matrix(100, 100)
    with open(pathToFile) as f:
        for line in f:
            s = line.split()
            x, y = [int(v) for v in s]
            M[x, y] = 1
    return M

if __name__ == "__main__": 
    M = pdata('small.txt') 

Any suggestions?

+0

'coo_matrix' takes data arguments. Check its documentation. – hpaulj

Answers

2

Build this matrix with coo_matrix, using the (data, (rows, cols)) argument format:

In [1]: import numpy as np
In [2]: from scipy import sparse
In [3]: from scipy import io 
In [4]: data=np.array([[10,45],[11,12],[4,1]]) 
In [5]: data 
Out[5]: 
array([[10, 45],
       [11, 12],
       [ 4,  1]])
In [6]: rows = data[:,0] 
In [7]: cols = data[:,1] 
In [8]: data = np.ones(rows.shape, dtype=int) 
In [9]: M = sparse.coo_matrix((data, (rows, cols)), shape=(100,100)) 
In [10]: M 
Out[10]: 
<100x100 sparse matrix of type '<class 'numpy.int32'>' 
    with 3 stored elements in COOrdinate format> 
In [11]: print(M) 
    (10, 45) 1 
    (11, 12) 1 
    (4, 1) 1 
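Applied to the text file from the question, the same construction could look roughly like the sketch below. The helper name pairs_to_coo and the in-memory sample text are assumptions made for illustration; they are not part of the original answer. np.loadtxt also accepts a file path, so a real file could be passed instead of the StringIO object.

```python
import io

import numpy as np
from scipy import sparse


def pairs_to_coo(source, shape=(100, 100)):
    """Build a COO matrix with a 1 at every (row, col) pair read from source.

    source is a path or file-like object with two integers per line.
    This helper is hypothetical, written only to illustrate the answer.
    """
    # ndmin=2 keeps a 2-D array even if the input has a single line.
    pairs = np.loadtxt(source, dtype=int, ndmin=2)
    rows = pairs[:, 0]
    cols = pairs[:, 1]
    values = np.ones(rows.shape, dtype=int)
    return sparse.coo_matrix((values, (rows, cols)), shape=shape)


# Sample data mirroring data.txt from the question.
sample = io.StringIO("10 45\n11 12\n4 1\n")
M = pairs_to_coo(sample)
```

One thing to keep in mind: if the input repeats a pair, the coo format stores both entries, and they are summed when the matrix is converted to csr/csc or to a dense array.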

If you save it to a .mat file for use in MATLAB, it is stored in csc format (converted from coo):

In [13]: io.savemat('test.mat',{'M':M}) 
In [14]: d = io.loadmat('test.mat') 
In [15]: d 
Out[15]: 
{'M': <100x100 sparse matrix of type '<class 'numpy.int32'>' 
    with 3 stored elements in Compressed Sparse Column format>, 
'__globals__': [], 
'__header__': b'MATLAB 5.0 MAT-file Platform: posix, Created on: Mon Aug 7 08:45:12 2017', 
'__version__': '1.0'} 

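The coo format does not implement item assignment. csr and csc do implement it, but will complain (they emit a SparseEfficiencyWarning). They are, however, the normal formats for calculations. lil and dok are the best formats for iterative assignment.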
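As a rough illustration of the format notes above, one possible workflow is to assign entries one at a time in lil format and convert to csr once the matrix is complete. This is only a sketch; the pairs list reuses the sample values from the question, and the variable names are chosen here for illustration.

```python
import numpy as np
from scipy.sparse import lil_matrix

# (row, col) pairs taken from the question's sample data.
pairs = [(10, 45), (11, 12), (4, 1)]

# lil supports element-wise assignment without efficiency warnings.
L = lil_matrix((100, 100), dtype=int)
for r, c in pairs:
    L[r, c] = 1

# Convert once, after construction, to a format suited to calculations.
C = L.tocsr()
col_sums = np.asarray(C.sum(axis=0)).ravel()
```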

+0

Thanks for all the information. It helped a lot :). –

4

Use a sparse format that supports efficient indexing, such as dok_matrix:

This is an efficient structure for constructing sparse matrices incrementally.

...

Allows for efficient O(1) access of individual elements. Duplicates are not allowed. Can be efficiently converted to a coo_matrix once constructed.

The last sentence can be generalized: if needed, it can be converted efficiently to all the other common formats.

from scipy.sparse import dok_matrix 

M = dok_matrix((100, 100)) # extra brackets needed as mentioned in comments 
          # thanks Daniel! 
M[0,3] = 5 
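
For the question's use case, the parsing loop could be adapted to dok_matrix along the following lines. This is a minimal sketch: the function name pdata_dok and the in-memory sample lines are assumptions for illustration, and an open file object could be passed in place of the list.

```python
from scipy.sparse import dok_matrix


def pdata_dok(lines, shape=(100, 100)):
    """Set M[x, y] = 1 for each 'x y' line (hypothetical helper)."""
    M = dok_matrix(shape, dtype=int)
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        x, y = (int(v) for v in parts)
        M[x, y] = 1
    return M


# Sample lines mirroring the question's data file.
sample = ["10 45", "11 12", "4 1"]
M = pdata_dok(sample)

# Convert at the end if a coo representation is required.
M_coo = M.tocoo()
```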
+1

You want 'M = dok_matrix((100,100))'. That is also an error. –

+0

@DanielF Of course! Thanks! – sascha

+1

That renders my answer obsolete. :/ Also, if you absolutely need the 'coo' representation, you can at the end – Uvar