Using the filter function in Python

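I'm trying to use Python's built-in filter function to extract data from certain columns in a CSV. Is this a good use of the filter function? Do I have to define the data in those columns first, or will Python somehow know which columns contain which data?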

2011-11-28 mantissa45

Can you provide an example of your input data and the output you want? – six8

Can you explain in more detail what you want to do? Maybe with an example? It's not clear to me... –

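Sure. Suppose my CSV has columns 1, 2 and 3. I want to ignore all the data in column 2 and extract only what is in columns 1 and 3. Can the filter function be used to do that? – mantissa45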

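The filter function is meant to select from a list (or, in general, any iterable) those elements that satisfy a certain condition. It is not really meant for index-based selection. So although you could use it to pick out specified columns of a CSV file, I would not recommend it. Instead, you should use something like this: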

import csv

# Python 3: open in text mode with newline='' (Python 2 used mode 'rb')
with open(filename, newline='') as f:
    for record in csv.reader(f):
        do_something_with(record[0], record[2])
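To try this without a file on disk, here is a minimal sketch that feeds csv.reader an in-memory string instead. The sample data and the helper name select_columns are made up for illustration, and Python 3 is assumed.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV file.
SAMPLE = "a,b,c\n1,2,3\n4,5,6\n"

def select_columns(lines, indices):
    """Yield a tuple of the chosen fields for every CSV record in `lines`."""
    for record in csv.reader(lines):
        yield tuple(record[i] for i in indices)

# Columns 1 and 3 of the question are indices 0 and 2.
rows = list(select_columns(io.StringIO(SAMPLE), (0, 2)))
print(rows)
```

Selection by position is plain indexing here; no condition is tested on the elements, which is why filter is not a natural fit.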

Depending on what you do with the records, it may be better to create an iterator over the columns of interest:

with open(filename, newline='') as f:
    the_iterator = ((record[0], record[2]) for record in csv.reader(f))
    # do something with the iterator
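One thing to watch with the iterator version: the generator expression is lazy, so it has to be consumed while the file is still open. A rough sketch of the difference, using a throwaway temporary file and made-up data (Python 3 assumed):

```python
import csv
import os
import tempfile

# Write a small throwaway CSV file (made-up data) to read back.
with tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False) as tmp:
    tmp.write("1,2,3\n4,5,6\n")
    path = tmp.name

# Consumed inside the with block: the file is still open, so this works.
with open(path, newline="") as f:
    the_iterator = ((record[0], record[2]) for record in csv.reader(f))
    inside = list(the_iterator)

# Consumed after the with block: the file has already been closed.
with open(path, newline="") as f:
    the_iterator = ((record[0], record[2]) for record in csv.reader(f))
try:
    outside = list(the_iterator)
except ValueError:  # reading from a closed file
    outside = None

os.remove(path)
print(inside)
```

If the data is needed after the file is closed, building a list inside the with block avoids the problem.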

Or, if you need non-sequential processing, perhaps a list:

with open(filename, newline='') as f:
    the_list = [(record[0], record[2]) for record in csv.reader(f)]
    # do something with the list

I'm not sure what you mean by defining the data in the columns. The data is defined by the CSV file.

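By contrast, here is a case where you would want to use filter: suppose your CSV file contains numeric data, and you need to build a list of the records whose numbers are in strictly ascending order within the row. You could write a function that determines whether a list of numbers is strictly ascending: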

def strictly_increasing(fields):
    # pairwise: itertools.pairwise (Python 3.10+) or the recipe in the itertools docs
    return all(int(i) < int(j) for i, j in pairwise(fields))

(See the itertools documentation for a definition of pairwise.) You can then use this as the condition in filter:

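with open(filename, newline='') as f:
    # In Python 3, filter returns a lazy iterator, so build the list while the file is open
    the_list = list(filter(strictly_increasing, csv.reader(f)))
    # do something with the list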

Of course, the same thing can be, and usually would be, implemented as a list comprehension:

with open(filename, newline='') as f:
    the_list = [record for record in csv.reader(f) if strictly_increasing(record)]
    # do something with the list

So there is little reason to use filter in practice.
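For completeness, a self-contained sketch of the strictly-increasing example, with pairwise written out in the style of the itertools recipe and invented in-memory data in place of a file. Python 3 is assumed, where filter returns a lazy iterator, hence the list() call.

```python
import csv
import io
from itertools import tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), ... in the style of the itertools recipe."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def strictly_increasing(fields):
    """True if each field, read as an int, is smaller than the next one."""
    return all(int(i) < int(j) for i, j in pairwise(fields))

# Made-up sample rows; only the first and last are strictly increasing.
SAMPLE = "1,2,3\n3,2,1\n1,1,2\n5,8,13\n"

via_filter = list(filter(strictly_increasing, csv.reader(io.StringIO(SAMPLE))))
via_comprehension = [
    record for record in csv.reader(io.StringIO(SAMPLE))
    if strictly_increasing(record)
]
print(via_filter)
```

Both forms select the same records; the comprehension simply spells the condition inline.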

2011-11-28 10:05:09

Since Python boasts "batteries included", for most everyday situations someone has probably already provided a solution. CSV is one of them: there is a built-in csv module.

tablib is also a very good third-party module, especially if you are dealing with non-ASCII data.

For the behavior described in the comments, this will do:

import csv

with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        row.pop(1)  # drop the second column
        print(", ".join(row))
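A variation on the same idea, as a sketch only: write the remaining fields back out with csv.writer rather than ", ".join, so that fields containing commas or quotes stay valid CSV. The sample data and the helper name drop_column are invented here, and Python 3 is assumed.

```python
import csv
import io
import sys

# Made-up sample data; the third field of the first row contains a comma.
SAMPLE = 'x,skip me,"hello, world"\n1,2,3\n'

def drop_column(lines, index):
    """Yield each record without the field at position `index`."""
    for row in csv.reader(lines):
        yield row[:index] + row[index + 1:]

rows = list(drop_column(io.StringIO(SAMPLE), 1))

# csv.writer re-quotes fields where needed.
csv.writer(sys.stdout).writerows(rows)
```

Slicing builds a new list for each record instead of mutating the row in place with pop.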

2011-11-28 05:02:36 number5
