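Getting NumPy to create a matrix with strings and floats

OK, I've done a fair amount of research on this topic, and I know that NumPy only supports homogeneous matrices.

I'm using the NLTK package in Python to process some corpus linguistics data, and I simply want a matrix whose first row holds strings as 'column names' and whose remaining rows hold the actual data values (floats).

So far I have made two matrices, one of strings and one of floats, and put them together with vstack. That worked until I tried to pass the new stacked 'matrix' to NumPy's savetxt() function: it won't write the .csv file because the matrix isn't 'matrix-like', since it isn't homogeneous. FML.

I'd really like to keep using all of NumPy's awesome methods on the actual data points, but I can't get one pesky array of strings to sit on top of the matrix and end up in a .csv. Any ideas? I'd really rather not redo all of this with Python's list-of-lists approach to multidimensional arrays.

Here's the code: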

import os.path 
import sys 
import nltk 
from numpy import * 
from nltk.corpus.reader import CHILDESCorpusReader 
from nltk.probability import ConditionalFreqDist, FreqDist 

n_rows = 12 
n_cols = 19 
init_row = 0 
init_col = 0 
neg_words = ["Age", "MLU", "All Tokens","no","not","don't","can't","won't","isn't","wasn't","wouldn't","shouldn't","couldn't","didn't","haven't","aren't","haven't","hasn't","doesn't"] 

Matrix_headers = array(range(len(neg_words)), dtype='a12') 
Matrix_values = zeros(n_rows*n_cols).reshape((n_rows, n_cols)) #the matrix with the data points (floats) 

for entry in range(len(neg_words)): 
    Matrix_headers[entry] = neg_words[entry] 

p = neg_words 
q = Matrix_values 
Matrix = vstack([p,q]) 


out_name = "/Users/nicholasmoores/Documents/Research/neg_table.csv" 
savetxt(out_name, Matrix, fmt='%.3e',delimiter = "\t") 

raw_input("\n\nPress the enter key to exit.")
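
For readers following along, here is a minimal sketch of why the stacked array stops being numeric. It is not the code from the question: it builds the array with np.array on a nested list as a simpler stand-in for the vstack call, and the short headers and values lists are made-up placeholders for neg_words and Matrix_values.

```python
import numpy as np

# Shortened, hypothetical stand-ins for neg_words and Matrix_values.
headers = ["Age", "MLU", "no"]
values = [[1.5, 2.0, 3.0],
          [4.0, 5.0, 6.0]]

# One array can hold only one dtype, so a row of strings on top of
# rows of floats makes NumPy store every cell as a string.
mixed = np.array([headers] + values)

print(mixed.dtype.kind)   # 'U' means a unicode string dtype
print(mixed[1, 0])        # the float 1.5 is now the string '1.5'
```

Once every cell is a string, a float format such as '%.3e' no longer applies to the data, which is consistent with the savetxt failure described above.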

2013-06-21 Nick Moores

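How about a pandas DataFrame?

Yes, you should use pandas for this.

I was finally able to download and install pandas, so I'll give the pandas DataFrame a try. My whole point was that I didn't want to have to load this into a DataFrame in R, so I'm very excited that pandas exists.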
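
A rough sketch of what the pandas suggestion in these comments might look like. The commenters gave no code, so everything here is an assumption: the shortened neg_words list, the zero-filled data, and the in-memory buffer used in place of the real output path.

```python
import io

import numpy as np
import pandas as pd

neg_words = ["Age", "MLU", "All Tokens", "no", "not"]  # shortened list
n_rows = 12

# Float data only; the strings live in the column labels instead.
values = np.zeros((n_rows, len(neg_words)))
df = pd.DataFrame(values, columns=neg_words)

# The numeric block is still available as a plain float array.
data = df.to_numpy()

# io.StringIO stands in for a real file path in this sketch.
buf = io.StringIO()
df.to_csv(buf, sep="\t", index=False, float_format="%.3e")
text = buf.getvalue()
print(text.splitlines()[0])
```

Keeping the names as column labels means the data itself stays homogeneous, so the usual NumPy operations still work on df.to_numpy().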

You can use a structured array.

For example:

>>> import numpy as np
>>> ym = np.zeros(len(neg_words), dtype=[('heads','a14'),('vals','f4',(n_rows,))]) 
>>> ym
array([('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), 
     ('', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])], 
     dtype=[('heads', 'S14'), ('vals', '<f4', (12,))])

To set the header values:

ym['heads'] = neg_words

To access the headers:

>>> ym['heads'] 
array(['Age', 'MLU', 'All Tokens', 'no', 'not', "don't", "can't", 
    "won't", "isn't", "wasn't", "wouldn't", "shouldn't", "couldn't", 
    "didn't", "haven't", "aren't", "haven't", "hasn't", "doesn't"], 
    dtype='|S14')

Similarly, to access the values:

ym['vals']
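
To connect this back to the goal of writing a .csv, here is a minimal sketch that puts the steps together. It goes beyond what the answer states, so treat it as one possible approach: the shortened neg_words list, the made-up numbers, the 'U14' field type (used in place of 'a14' so the headers are str rather than bytes on Python 3), and the in-memory buffer are all assumptions. It relies on the header and comments parameters of np.savetxt.

```python
import io

import numpy as np

neg_words = ["Age", "MLU", "All Tokens", "no", "not"]  # shortened list
n_rows = 12

# One record per column name: a header string plus n_rows floats.
ym = np.zeros(len(neg_words),
              dtype=[("heads", "U14"), ("vals", "f4", (n_rows,))])
ym["heads"] = neg_words
ym["vals"][0] = np.arange(n_rows)   # made-up numbers for the "Age" column

# ym["vals"] has shape (n_columns, n_rows); transpose it so that each
# record becomes one column of the output table.
table = ym["vals"].T

# io.StringIO stands in for a real file path in this sketch.
buf = io.StringIO()
np.savetxt(buf, table, fmt="%.3e", delimiter="\t",
           header="\t".join(ym["heads"]), comments="")
print(buf.getvalue().splitlines()[0])
```

Only the float block is passed to savetxt, so the format string applies cleanly; the strings go in through header, and comments="" keeps savetxt from prefixing that line with '# '.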

2013-06-21 09:16:00 atomh33ls
