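Reading CSV from Zip files

I have a directory of zip files (about 10,000 small files), each containing a single CSV file, which I am trying to read and split into several different CSV files.
I managed to write code that splits a CSV file taken from a directory of CSVs, shown below. It reads the first attribute of each row and, depending on its value, writes the row to the corresponding output CSV.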
import csv
import os
import sys
import re
import glob
# Input file and one writer per record type
reader = csv.reader(open("C:/Projects/test.csv", "rb"), delimiter=',', quotechar='"')
write10 = csv.writer(open('ouput10.csv', 'w'), delimiter=',', lineterminator='\n', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
write15 = csv.writer(open('ouput15.csv', 'w'), delimiter=',', lineterminator='\n', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
# Header row for each output file
headings10=["RECORD_IDENTIFIER","CUSTODIAN_NAME","LOCAL_CUSTODIAN_NAME","PROCESS_DATE","VOLUME_NUMBER","ENTRY_DATE","TIME_STAMP","VERSION","FILE_TYPE"]
write10.writerow(headings10)
headings15=["RECORD_IDENTIFIER","CHANGE_TYPE","PRO_ORDER","USRN","STREET_DESCRIPTION","LOCALITY_NAME","TOWN_NAME","ADMINSTRATIVE_AREA","LANGUAGE"]
write15.writerow(headings15)
# Route each row by its first field (the record identifier)
for row in reader:
    record_type = row[0]
    if "10" in record_type:
        write10.writerow(row)
    elif "15" in record_type:
        write15.writerow(row)
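For comparison, the same routing can be sketched with a dictionary of writers, which avoids one `if`/`elif` branch per record type. This is a minimal sketch under stated assumptions, not the code from the question: it targets Python 3, the function name `split_by_record_type` is hypothetical, and it matches the first field exactly, whereas the listing above uses a substring test (`"10" in ...`).

```python
import csv
import io


def split_by_record_type(rows, writers):
    """Send each row to the writer keyed by the row's first field.

    Rows whose first field has no matching writer are skipped.
    Returns a dict counting the rows written per record type.
    """
    counts = {key: 0 for key in writers}
    for row in rows:
        if not row:
            continue  # skip blank lines
        record_type = row[0].strip()
        writer = writers.get(record_type)
        if writer is not None:
            writer.writerow(row)
            counts[record_type] += 1
    return counts


# In-memory buffers stand in for the real input and output files.
sample = io.StringIO('10,"Custodian A"\n15,"I","1"\n99,"ignored"\n15,"U","2"\n')
out10, out15 = io.StringIO(), io.StringIO()
writers = {
    "10": csv.writer(out10, lineterminator="\n"),
    "15": csv.writer(out15, lineterminator="\n"),
}
counts = split_by_record_type(csv.reader(sample), writers)
```

Adding another record type then only needs one more entry in `writers`.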
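So now I am trying to read the Zip files directly, rather than wasting time extracting them first.
This is what I have so far, after following as many tutorials as I could find: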
import glob
import os
import csv
import zipfile
import StringIO
for name in glob.glob('C:/Projects/abase/*.zip'):
    base = os.path.basename(name)
    filename = os.path.splitext(base)[0]
datadirectory = 'C:/Projects/abase/'
dataFile = filename
archive = '.'.join([dataFile, 'zip'])
fullpath = ''.join([datadirectory, archive])
csv = '.'.join([dataFile, 'csv'])
filehandle = open(fullpath, 'rb')
zfile = zipfile.ZipFile(filehandle)
data = StringIO.StringIO(zfile.read(csv))
reader = csv.reader(data)
for row in reader:
    print row
But an error is thrown:
AttributeError: 'str' object has no attribute 'reader'
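This message is what Python reports when the name `csv` refers to a string instead of the module. In the listing above, `csv = '.'.join([dataFile, 'csv'])` rebinds the name after `import csv`, so the later `csv.reader(data)` is looked up on a string. The following is a minimal reproduction, assuming Python 3; `csv_module`, `csv_name` and the file name `example.csv` are hypothetical names used only for illustration.

```python
import csv
import io

csv_module = csv                    # keep a reference to the real module
csv = ".".join(["example", "csv"])  # `csv` is now the string "example.csv"

try:
    csv.reader(io.StringIO("a,b\n"))  # attribute lookup on a str fails
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Giving the file name its own variable leaves the module name intact.
csv = csv_module
csv_name = ".".join(["example", "csv"])
rows = list(csv.reader(io.StringIO("a,b\n")))
```

Renaming the file-name variable (for example to `csv_name`) should be enough to make `csv.reader` resolve to the module again.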
Hopefully someone can tell me how to change my CSV reading code so that it reads from Zip files.
Many thanks,
Tim
Maybe it is just how you pasted the code, but almost nothing is inside your `for name` loop. Which line does the error refer to? – 2012-02-18 18:27:47
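One way to read a CSV member straight out of an archive, with the per-archive work kept inside a single function, might look like the sketch below. It assumes Python 3 (the question uses Python 2's `StringIO`), UTF-8 encoded data, and a known member name. The helper name `iter_zip_csv_rows` is hypothetical, and a small in-memory archive stands in for the real files under `C:/Projects/abase/`.

```python
import csv
import io
import zipfile


def iter_zip_csv_rows(zip_source, member_name, encoding="utf-8"):
    """Yield rows from one CSV member of a zip archive without extracting it.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # ZipFile.open returns a binary stream; wrap it so csv gets text.
        with archive.open(member_name) as raw, \
                io.TextIOWrapper(raw, encoding=encoding, newline="") as text:
            yield from csv.reader(text)


# Build a tiny archive in memory to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.csv", '10,"Custodian A"\n15,"I",1\n')
buffer.seek(0)

rows = list(iter_zip_csv_rows(buffer, "example.csv"))
```

To process a whole directory, the function would be called once per path returned by `glob.glob`, inside the loop, with the member name derived from each archive's base name as in the question.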