2015-07-21 216 views
1

Reading blocks of columns from a CSV file in Python

A B C D 
1 x y z 
2 x y z 
3 x y z 
4 x y z 
5 i j k 
6 i j k 
7 .......etc. 

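I want to skip the headings, and then the first element of each row.

The really juicy data are the x, y, z, i, j, k values.

These are ADC values, and they need to be arranged into lists.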

my_list = [0] [x,x,x,x] 
      [1] [y,y,y,y] 
      [2] [z,z,z,z] 
      [3] [i,i,i,i] etc. 

I can easily iterate out a whole column, but the tricky part is iterating over only certain rows of each column.

What I have tried so far:

def readin(myfile):

    import csv
    with open(myfile, 'r') as f:  # Open Results File

        next(f)  # skip headings

        data = csv.reader(f, delimiter="\t")
        temp = []
        temp2 = []
        my_list = []

        for i in range(13):  # my_list will be 12 lists long
            print i
            for x in range(1, 4):
                for row in data:
                    temp.append(row[x])
        return my_list

I only get a single column iterated out. I don't know how to slice the columns easily (the x's on their own, the i's, etc.).

+0

What is your expected output? –

+0

@omri_saadon `my_list` (as edited) – cc6g11

+0

@omri_saadon ...ignoring the 1-7 etc. in the file, so elements [1:3] of each row – cc6g11
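One likely reason the attempt yields a single column is that a `csv.reader` can only be traversed once, so the inner `for row in data` loop finds nothing left after the first value of `x`. The sketch below illustrates this under Python 3; the `SAMPLE` string is a hypothetical two-row stand-in for the tab-delimited file in the question.

```python
import csv
import io

# Hypothetical stand-in for the tab-delimited results file.
SAMPLE = "A\tB\tC\tD\n1\tx\ty\tz\n2\tx\ty\tz\n"

reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
next(reader)  # skip headings

first_pass = [row[1] for row in reader]   # consumes every remaining row
second_pass = [row[2] for row in reader]  # the reader is already exhausted

print(first_pass)
print(second_pass)

# Reading the rows into a list once allows repeated passes over them.
rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t"))[1:]
columns = [[row[x] for row in rows] for x in range(1, 4)]
print(columns)
```

Storing the rows in a list trades memory for the ability to loop over them several times, which is usually acceptable for small result files.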

Answers

2

Transpose the data and slice:

import csv
from itertools import izip

data = csv.reader(f, delimiter="\t")  # f is the open file from the question
trans = izip(*data)  # transpose rows into columns
A = next(trans)  # skip first col
+0

This works well, but how do I ignore the first row in the transposed data? – cc6g11

+0

@cc6g11, use itertools.izip and call next on the izip object to skip the first column –
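The transpose-and-skip idea above can be sketched for Python 3, where `itertools.izip` no longer exists and the built-in `zip` is already lazy. The function name `columns_without_index` and the `SAMPLE` data are assumptions made for illustration; they do not come from the answer.

```python
import csv
import io

# Hypothetical sample mirroring the question's tab-delimited layout.
SAMPLE = (
    "A\tB\tC\tD\n"
    "1\tx\ty\tz\n2\tx\ty\tz\n3\tx\ty\tz\n4\tx\ty\tz\n"
    "5\ti\tj\tk\n6\ti\tj\tk\n7\ti\tj\tk\n8\ti\tj\tk\n"
)

def columns_without_index(f, delimiter="\t"):
    """Return the data columns as lists, without headings or index column."""
    reader = csv.reader(f, delimiter=delimiter)
    next(reader)            # skip the heading row before transposing
    columns = zip(*reader)  # transpose rows into columns
    next(columns)           # skip the first (index) column
    return [list(col) for col in columns]

cols = columns_without_index(io.StringIO(SAMPLE))
print(cols[0][:4])  # first block of the first data column
print(cols[0][4:])  # second block of the first data column
```

Calling `next` once on the reader drops the heading row and once on the transposed iterator drops the index column; slicing each column in steps of four then separates the blocks.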

1

Here is the code; as you can see, I use pandas to manipulate the data.

import pandas as pd 

df = pd.read_csv("te.txt") 
df.drop(df.columns[[0]], axis=1, inplace=True) # delete the first column as you wished 
li = [] 
for col in df.columns: 
    li.append(list(df[col])) 
print li 

Output:

[['x', 'x', 'x', 'x', 'i', 'i'], 
['y', 'y', 'y', 'y', 'j', 'j'], 
['z', 'z', 'z', 'z', 'k', 'k']] 

This is the CSV file "te.txt":

A,B,C,D 
1,x,y,z 
2,x,y,z 
3,x,y,z 
4,x,y,z 
5,i,j,k 
6,i,j,k 
+0

Quick question: how does deleting with the del function work? I don't understand the argument you pass to it. – cc6g11

+0

@cc6g11, I changed the column deletion to a more 'pandas' way. –

0

An approach that needs no external modules, only csv:

import csv

with open('blocks.csv') as infile:
    reader = csv.reader(infile)
    out_list = []

    # skip first line
    next(reader)

    while True:
        block = []
        try:
            # read four lines
            for i in range(4):
                block.append(next(reader))
        except StopIteration:
            break

        # transpose the block and skip the index column
        transposed_block = zip(*block)[1:]
        out_list += transposed_block

This produces the following out_list:

>>> out_list 
[('x', 'x', 'x', 'x'), 
('y', 'y', 'y', 'y'), 
('z', 'z', 'z', 'z'), 
('i', 'i', 'i', 'i'), 
('j', 'j', 'j', 'j'), 
('k', 'k', 'k', 'k')] 
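The answer above relies on Python 2, where `zip` returns a list that can be sliced directly. A minimal Python 3 sketch of the same block-wise idea follows, using `itertools.islice` to pull four rows at a time. The names `read_blocks`, `block_size` and `drop_partial` are hypothetical, and `SAMPLE` copies the six-row "te.txt" shown earlier in the thread.

```python
import csv
import io
from itertools import islice

# Same content as the "te.txt" file shown in the pandas answer.
SAMPLE = "A,B,C,D\n1,x,y,z\n2,x,y,z\n3,x,y,z\n4,x,y,z\n5,i,j,k\n6,i,j,k\n"

def read_blocks(f, block_size=4, drop_partial=True):
    """Transpose the file block by block, skipping headings and index column."""
    reader = csv.reader(f)
    next(reader)  # skip the heading row
    out_list = []
    while True:
        block = list(islice(reader, block_size))  # up to block_size rows
        if not block or (drop_partial and len(block) < block_size):
            break
        # zip is lazy in Python 3, so build a list before slicing
        out_list.extend(list(zip(*block))[1:])
    return out_list

full_only = read_blocks(io.StringIO(SAMPLE))
with_tail = read_blocks(io.StringIO(SAMPLE), drop_partial=False)
print(full_only)
print(with_tail)
```

With `drop_partial=True` an incomplete trailing block is discarded, which matches the behaviour of the `try`/`except StopIteration` loop in the answer; passing `False` keeps the shorter tuples instead.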
0

A basic example using pandas:

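import pandas as pd

d = pd.read_csv("text.txt")  # read_csv is a pandas function, not a DataFrame method

d.drop(d.columns[[0]], axis=1, inplace=True)  # drop the index column
# .loc slicing is label-inclusive, so :3 selects the first four rows
k_list = [d.loc[:3, k].tolist() for k in d.columns]

print k_list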

Output:

[['x', 'x', 'x', 'x'], 
['y', 'y', 'y', 'y'], 
['z', 'z', 'z', 'z']] 
0

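The following will give you the result you asked for. It uses a slightly different approach that reads four rows at a time, and it also removes the first column: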

import csv

def readin(myfile):
    my_list = []

    with open(myfile, 'r') as f:  # Open Results File
        csv_input = csv.reader(f, delimiter=" ", skipinitialspace=True)
        headings = next(csv_input)  # Skip headings

        try:
            while True:
                my_list.extend(zip(next(csv_input), next(csv_input), next(csv_input), next(csv_input))[1:])
        except StopIteration:
            pass

    return my_list

result = readin("results_file.csv")

print result[0]
print result

The output is:

('x', 'x', 'x', 'x') 

[('x', 'x', 'x', 'x'), ('y', 'y', 'y', 'y'), ('z', 'z', 'z', 'z'), ('i', 'i', 'i', 'i'), ('j', 'j', 'j', 'j'), ('k', 'k', 'k', 'k')]
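The four chained `next(csv_input)` calls can also be expressed with the "same iterator repeated" `zip` idiom, which makes the block size a parameter. This is a Python 3 sketch rather than the answerer's code: `block_size` and the in-memory `SAMPLE` are assumptions, and the result holds lists rather than tuples to match the `my_list` layout in the question.

```python
import csv
import io

# Hypothetical space-delimited sample: eight full rows plus one leftover row.
SAMPLE = (
    "A B C D\n"
    "1 x y z\n2 x y z\n3 x y z\n4 x y z\n"
    "5 i j k\n6 i j k\n7 i j k\n8 i j k\n"
    "9 p q r\n"
)

def readin(f, block_size=4):
    """Collect each block's data columns as lists, skipping the index column."""
    csv_input = csv.reader(f, delimiter=" ", skipinitialspace=True)
    next(csv_input)  # skip headings
    my_list = []
    # Repeating the same reader makes zip take block_size consecutive rows
    # per step; a trailing incomplete block is silently dropped.
    for block in zip(*[csv_input] * block_size):
        my_list.extend(list(col) for col in list(zip(*block))[1:])
    return my_list

result = readin(io.StringIO(SAMPLE))
print(result[0])
print(len(result))
```

Because `zip` stops as soon as the reader runs out, the leftover ninth row never reaches `my_list`, which mirrors how the `StopIteration` handler ends the loop in the answer.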