Select specific columns from a CSV file

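My code reads the first 28 columns of a text file and formats or removes some of the data. How can I select specific columns instead? The columns I want are 0 to 25 and column 28. What is the best way to do this?

Thanks in advance!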

import csv
import os

my_file_name = os.path.abspath('NVG.txt')
cleaned_file = "cleanNVG.csv"
remove_words = ['INAC-EIM', '-INAC', 'TO-INAC', 'TO_INAC', 'SHIP_TO-inac', 'SHIP_TOINAC']

with open(my_file_name, 'r', newline='') as infile, open(cleaned_file, 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    cr = csv.reader(infile, delimiter='|')
    # Copy the header row, keeping only the first 28 columns
    writer.writerow(next(cr)[:28])
    for line in (r[0:28] for r in cr):
        # Skip any row in which a field contains one of the unwanted words
        if not any(remove_word in element for element in line for remove_word in remove_words):
            line[11] = line[11][:5]  # truncate column 11 to its first five characters
            writer.writerow(line)
# The with block closes both files, so no explicit close() calls are needed

Source

2017-02-28 Cesar

Take a look at pandas.

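import pandas as pd

# Zero-based positions of the wanted columns: 0 through 25, plus 28
usecols = list(range(26)) + [28]
# The input is pipe-delimited, so pass sep='|' (read_csv assumes commas by default)
data = pd.read_csv(my_file_name, sep='|', usecols=usecols)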

You can also conveniently filter the data and then write it back to a new file:

with open(cleaned_file, 'w', newline='') as f:
    data.to_csv(f, index=False)  # index=False omits the extra row-index column
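
To tie this back to the question, here is a minimal sketch of how the column selection and the word filter might be combined in pandas. The function name `clean_frame` and its parameters are made up for illustration, and the tiny in-memory sample stands in for the real file; for the layout in the question, `keep_cols` would be `list(range(26)) + [28]`.

```python
import io

import pandas as pd

REMOVE_WORDS = ['INAC-EIM', '-INAC', 'TO-INAC', 'TO_INAC', 'SHIP_TO-inac', 'SHIP_TOINAC']


def clean_frame(source, keep_cols, remove_words=REMOVE_WORDS):
    """Hypothetical helper: read a pipe-delimited file, keep only the
    zero-based columns in keep_cols, and drop rows containing a remove word."""
    # dtype=str and keep_default_na=False keep every cell as plain text,
    # so the substring checks below work on all columns
    data = pd.read_csv(source, sep='|', usecols=keep_cols,
                       dtype=str, keep_default_na=False)
    # True for each row in which at least one cell contains one of the words
    has_word = data.apply(
        lambda row: any(word in cell for cell in row for word in remove_words),
        axis=1,
    )
    return data[~has_word]


# Small in-memory sample (an assumption, not the asker's data)
sample = io.StringIO("a|b|c\n1|x|ok\n2|y|TO-INAC\n")
cleaned = clean_frame(sample, keep_cols=[0, 2])
print(cleaned.to_csv(index=False))
```

With a real file you would pass the path instead of the `StringIO` object and write the result with `cleaned.to_csv(cleaned_file, index=False)`.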

Source

2017-02-28 20:24:30 Ohjeah

pandas makes data manipulation so simple and practical. +1 from me.

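Exclude columns 26 and 27 (zero-based, as in the question) from each row:

for row in cr:
    # enumerate() gives each cell's position directly, so duplicate values are handled correctly
    content = [x for i, x in enumerate(row) if i not in (26, 27)]
    # work with the selected columns in content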
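
If you would rather list the columns you want than the ones you don't, something along these lines might work with just the csv module. This is only a sketch: `select_columns` and `keep` are names invented for the example, and the five-column sample is made up. For the file in the question, `keep` would be `list(range(26)) + [28]`.

```python
import csv
import io


def select_columns(rows, keep):
    """Hypothetical helper: yield each row reduced to the zero-based indices in keep."""
    for row in rows:
        # Index into the row directly; raises IndexError if a row is too short
        yield [row[i] for i in keep]


# Small in-memory sample standing in for the pipe-delimited input file
sample = io.StringIO("c0|c1|c2|c3|c4\nA|B|C|D|E\n")
reader = csv.reader(sample, delimiter='|')
selected = list(select_columns(reader, keep=[0, 1, 4]))
print(selected)
```

Because the header row goes through the same function, the column names stay aligned with the data.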

Source

2017-02-28 20:26:02 haifzhan

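If you have to call list anyway, why not use a list comprehension here: 'content = [x for x in cr if cr.index(x) not in [25,26]]' – Ohjeah

You probably meant to filter the row, not the reader. As written, you exhaust the reader in the first iteration of the for loop. Searching with index() is also wasteful; why not 'enumerate()'?

@IljaEverilä Yes, 'row'; fixed the typo. Thanks! – haifzhan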
