2014-04-23 78 views

Parsing a CSV file into a 2-dimensional array

I have a samples.csv file with this structure:

354,174,27c,20,5c,287,382,a3 
59,359,152,115,19d,15a,143,113 
8f,1e6,291,55,b1,3f9,39b,ba 
3cf,77,20c,316,164,e2,2cb,3c9 
72,171,167,a9,3e5,2dc,34f,191 
2ad,8c,1f1,1bd,175,3fd,28,2f5 
3b1,11f,ab,8b,282,284,192,1c8 
310,24b,240,1fe,20e,251,1d5,305 
3f1,14b,381,210,1b4,25f,116,228 
ba,175,1c2,342,259,de,359,369 

It holds 8×1000 samples, and I want to convert this CSV into a 2-dimensional array so that I can extract the data in this form:

[i,j] 

354 = [0,0] 
174 = [0,1] 

How can this be done without numpy?

Thanks

+0

Store the file as a string... split that string on newline characters... split each resulting string on commas

+0

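Do you want to override Python's normal matrix access of matrix[0][0], i.e. use matrix[0,0]? That is not the standard way to access a 2-dimensional list in Python. It can be done, but perhaps get used to the Python way first? – dawg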
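
As a rough illustration of the "it can be done" remark above: a small wrapper class can accept matrix[i, j] by unpacking the tuple that Python passes to `__getitem__`. The `Matrix` class and its `rows` attribute are hypothetical names made up for this sketch, not something from the thread.

```python
class Matrix:
    """Hypothetical wrapper that allows m[i, j] on a list of lists."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        # m[i, j] passes the tuple (i, j) as the key
        i, j = key
        return self.rows[i][j]


m = Matrix([['354', '174'], ['59', '359']])
print(m[0, 1])  # 174
```

The plain list-of-lists form, m.rows[0][1], stays available, so this only adds an alternative access syntax.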

Answers

0

Python has a built-in module for parsing CSV files, called csv. Here is an example from the official docs:

>>> import csv
>>> with open('eggs.csv', newline='') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
...     for row in spamreader:
...         print(', '.join(row))
Spam, Spam, Spam, Spam, Spam, Baked Beans 
Spam, Lovely Spam, Wonderful Spam 

You can store the rows as a list of lists:

rows = []
for row in spamreader:  # run this inside the with block, while the file is still open
    rows.append(row)
print(rows[5][3])
0
with open('samples.csv') as f:
    data = f.read()
# splitlines() avoids an empty last row caused by the trailing newline
table = [row.split(',') for row in data.splitlines()]

print(table[0][0])
0

Convert each row into a list, then add them to another list to get your matrix:

import csv 

matrix = []

with open('somefile.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    # extend, not append: append would nest all rows inside one extra list
    matrix.extend(reader)

print(matrix[0][1]) # Output: 174
0
array = []
with open("file.txt", "r") as f:
    for line in f:
        # strip the trailing newline so the last field stays clean
        array.append(line.strip().split(','))
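
A variation on the loop above, as a minimal sketch: the sample lines in the question appear to end with a space (possibly just a copy-paste artifact), so stripping every field keeps stray whitespace out of the values. The `sample` string is an in-memory stand-in for the file, used here only so the snippet runs on its own.

```python
sample = "354,174,27c \n59,359,152 \n"  # stand-in for the file contents

array = []
for line in sample.splitlines():
    # strip each field so stray spaces do not end up in the values
    array.append([field.strip() for field in line.split(',')])

print(array[0][2])  # 27c
```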
0

You can use csv and a list comprehension:

import csv 

with open(ur_file) as f: 
    reader=csv.reader(f) 
    data=[row for row in reader] 

print(data)
# [['354', '174', '27c', '20', '5c', '287', '382', 'a3'], ['59', '359', '152', '115', '19d', '15a', '143', '113'], ['8f', '1e6', '291', '55', 'b1', '3f9', '39b', 'ba'], ['3cf', '77', '20c', '316', '164', 'e2', '2cb', '3c9'], ['72', '171', '167', 'a9', '3e5', '2dc', '34f', '191'], ['2ad', '8c', '1f1', '1bd', '175', '3fd', '28', '2f5'], ['3b1', '11f', 'ab', '8b', '282', '284', '192', '1c8'], ['310', '24b', '240', '1fe', '20e', '251', '1d5', '305'], ['3f1', '14b', '381', '210', '1b4', '25f', '116', '228'], ['ba', '175', '1c2', '342', '259', 'de', '359', '369']]
print(data[0][0])
# 354 
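
Note that every field in the result is a string. The values in the question look like hexadecimal numbers, although the question does not say so. Assuming they are, a sketch like the following would turn them into integers; the `numbers` name and the shortened `data` sample are only for illustration.

```python
data = [['354', '174', '27c'], ['59', '359', '152']]  # first fields of the sample rows

# int(text, 16) parses a hexadecimal string; this assumes every field is valid hex
numbers = [[int(field, 16) for field in row] for row in data]

print(numbers[0][0])  # 852
```

If a field is not valid hex, int() raises ValueError, so real data might need a check first.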

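Your question implies that you want to access the data with a tuple, data[0,1], rather than with the data[0][1] syntax you would use for a multidimensional list.

If that is not a typo, you can use a dictionary with two-element tuples as keys: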

>>> data={} 
>>> data[0,0]='354' 
>>> data 
{(0, 0): '354'} 
>>> data[0,0] 
'354' 

Your loop for reading the CSV then becomes:

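data={}
with open(ur_file) as f:
    reader=csv.reader(f)
    for i, line in enumerate(reader):
        for j, v in enumerate(line):
            # key each value by its (row, column) position
            data[(i, j)]=v

print(data)  # the order of the keys in the printout depends on the Python version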
# {(7, 3): '1fe', (4, 7): '191', (1, 3): '115', (9, 1): '175', (6, 4): '282', (3, 0): '3cf', (8, 0): '3f1', (5, 4): '175', (0, 7): 'a3', (5, 6): '28', (2, 6): '39b', (1, 6): '143', (9, 4): '259', (5, 1): '8c', (3, 7): '3c9', (2, 5): '3f9', (8, 5): '25f', (0, 3): '20', (7, 2): '240', (4, 0): '72', (1, 2): '152', (9, 0): 'ba', (9, 5): 'de', (6, 7): '1c8', (3, 3): '316', (2, 0): '8f', (8, 1): '14b', (7, 6): '1d5', (4, 4): '3e5', (6, 3): '8b', (1, 5): '15a', (3, 6): '2cb', (2, 2): '291', (7, 7): '305', (5, 7): '2f5', (5, 3): '1bd', (4, 1): '171', (1, 1): '359', (9, 7): '369', (2, 7): 'ba', (3, 2): '20c', (0, 0): '354', (6, 6): '192', (5, 0): '2ad', (7, 1): '24b', (4, 5): '2dc', (0, 4): '5c', (5, 5): '3fd', (1, 4): '19d', (6, 0): '3b1', (7, 5): '251', (2, 3): '55', (2, 1): '1e6', (8, 7): '228', (8, 6): '116', (9, 3): '342', (4, 2): '167', (1, 0): '59', (9, 6): '359', (6, 5): '284', (3, 5): 'e2', (0, 1): '174', (8, 3): '210', (7, 0): '310', (4, 6): '34f', (9, 2): '1c2', (5, 2): '1f1', (6, 1): '11f', (3, 1): '77', (8, 2): '381', (0, 2): '27c', (7, 4): '20e', (0, 6): '382', (6, 2): 'ab', (4, 3): 'a9', (1, 7): '113', (0, 5): '287', (3, 4): '164', (2, 4): 'b1', (8, 4): '1b4'} 
print(data[0,1])
# 174
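
The nested loop above can also be written as a single dict comprehension. This is only a sketch: it reads from an in-memory `io.StringIO` stand-in holding the first few fields of the sample instead of the real file, so that it runs on its own.

```python
import csv
import io

# in-memory stand-in for the CSV file (first fields of the sample rows)
f = io.StringIO("354,174,27c\n59,359,152\n")

# map each (row, column) position to the field found there
data = {(i, j): v
        for i, row in enumerate(csv.reader(f))
        for j, v in enumerate(row)}

print(data[0, 1])  # 174
```

Whether this reads better than the explicit loops is a matter of taste; both build the same dictionary.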