2013-11-21

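Python: search a CSV file using input from a text file

I am new to Python and I am struggling with this code. I have two files: the first is a text file containing email addresses (one per line), and the second is a CSV file with 5-6 columns. The script should take each entry from file 1 and search for it in file 2, and the matching rows should be stored in another CSV file (first 3 columns only); see the example below. I have also copied the script I am working on. If there is a better or more efficient way to do this, please let me know. Thank you, I appreciate your help.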

File1 (output.txt) 
[email protected] 
[email protected] 
[email protected] 

File2 (final.csv) 
Sam,Smith,[email protected],admin 
Eric,Smith,[email protected],finance 
Joe,Doe,[email protected],telcom 
Chase,Li,[email protected],IT 

output (out_name_email.csv) 
Eric,Smith,[email protected] 
Chase,Li,[email protected] 

Here is the script:

import csv 
outputfile = 'C:\\Python27\\scripts\\out_name_email.csv' 
inputfile = 'C:\\Python27\\scripts\\output.txt' 
datafile = 'C:\\Python27\\scripts\\final.csv' 

names=[] 

with open(inputfile) as f:
    for line in f:
        names.append(line)

with open(datafile, 'rb') as fd, open(outputfile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fd, delimiter=",")
    headers = next(reader)
    for row in fd:
        for name in names:
            if name in line:
                writer.writerow(row)

Answer

Load the emails into a set for O(1) lookup:

with open(inputfile) as fin: 
    emails = set(line.strip() for line in fin) 
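
The strip() call matters because lines read from a file keep their trailing newline, so an unstripped line never equals the bare address in the CSV. A small sketch of the difference, written for Python 3 and using an in-memory file with made-up example.com addresses (the addresses in the question are obfuscated, so these are purely illustrative):

```python
import io

# Hypothetical addresses standing in for the contents of output.txt.
fake_file = io.StringIO("eric@example.com\nchase@example.com\n")

raw_lines = list(fake_file)  # each element still ends with "\n"
fake_file.seek(0)
emails = set(line.strip() for line in fake_file)

print("eric@example.com" in raw_lines)  # False: the stored value is "eric@example.com\n"
print("eric@example.com" in emails)     # True: surrounding whitespace was stripped
```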

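Then loop over the rows once and check whether each row's email exists in emails; there is no need to loop over every possible match for every row: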

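# ... 
for row in reader: 
    # the email address is the third column, so its index is 2
    if row[2] in emails: 
        # keep only the first three columns, as the question asks
        writer.writerow(row[:3]) 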
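
If you are not doing anything else in the loop, you can reduce it to:

writer.writerows(row[:3] for row in reader if row[2] in emails)

A couple of notes on your original code: you are not using the csv.reader object reader, you are looping over fd instead, and you appear to have some naming issues with names, line and row ...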
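
Putting the steps of this answer together, a minimal end-to-end sketch might look like the following. It is written for Python 3, which is an assumption: the question's script uses the Python 2 'rb'/'wb' modes, whereas Python 3 opens CSV files in text mode with newline=''. The function name filter_rows and its parameters (email_col, keep_cols, skip_header) are hypothetical and introduced only for illustration; the email column index of 2 is taken from the sample data in the question.

```python
import csv


def filter_rows(email_path, data_path, out_path,
                email_col=2, keep_cols=3, skip_header=False):
    """Copy rows whose email column appears in email_path to out_path."""
    # Build the lookup set once; blank lines are ignored.
    with open(email_path) as fin:
        emails = set(line.strip() for line in fin if line.strip())

    # newline='' lets the csv module handle line endings itself (Python 3).
    with open(data_path, newline='') as fd, \
            open(out_path, 'w', newline='') as fout:
        reader = csv.reader(fd)
        writer = csv.writer(fout)
        if skip_header:
            next(reader, None)  # drop the first row if it is a header
        # Iterate over the reader (not the raw file) and keep matching rows,
        # truncated to the first keep_cols columns.
        writer.writerows(
            row[:keep_cols]
            for row in reader
            if len(row) > email_col and row[email_col].strip() in emails
        )
```

Called as filter_rows(inputfile, datafile, outputfile) with the paths from the question, this reads each file once. Set skip_header=True if final.csv starts with a header row, which is what the next(reader) call in the original script suggests.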