2017-08-09 57 views

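Cannot convert CSV from UTF-8 to ANSI using csv writer in Python 2.6

I am trying to load a .csv file stored as UTF-8 text and write it out in cp1252 (ANSI) format with a pipe delimiter. The code below works in Python 3.6, but I need it to work in Python 2.6. However, the 'open' function does not accept the encoding keyword in Python 2.6.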

import datetime 
import csv 

# Define what filenames to read 
filenames = ["FILE1","FILE2"] 
infilenames = [filename+".csv" for filename in filenames] 
outfilenames = [filename+"_out_.csv" for filename in filenames] 

# Read filenames in utf-8 and write them in cp1252 
for infilename,outfilename in zip(infilenames,outfilenames): 
    infile = open(infilename, "rt",encoding="utf8") 
    reader = csv.reader(infile,delimiter=',',quotechar='"',quoting=csv.QUOTE_MINIMAL) 

    outfile = open(outfilename, "wt",encoding="cp1252") 
    writer = csv.writer(outfile, delimiter='|', quotechar='"', quoting=csv.QUOTE_NONE,escapechar='\\') 
    for row in reader: 
        writer.writerow(row) 

    # Close both files before moving on to the next pair 
    infile.close() 
    outfile.close() 

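I have tried several solutions:

  • Not specifying an encoding. This results in errors on some Unicode characters.
  • Using the io library (io.open instead of open). This results in "TypeError: can't write str to text stream".

Does anyone know the correct solution for Python 2.x?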


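The 'csv' module in Python 2 does not like 'unicode' strings, so there is no simple fix within the standard library. There are third-party solutions, though. For example, see the answers to this question: https://stackoverflow.com/questions/904041/reading-a-utf8-csv-file-with-python – lenz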

Answer


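There may be some redundant code here, but I got it working by doing the following:

  • First, I converted the text to cp1252 using the .decode and .encode functions.
  • Then I read the CSV from the cp1252-encoded file and wrote it to a new CSV.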
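The transcoding step from the first bullet can be tried in isolation. This is a minimal sketch: the sample text and the variable names are illustrative assumptions, not taken from the original files. It shows what the error handler argument does when a character has no cp1252 equivalent, and it should run unchanged on Python 2.6 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sample bytes (assumed): "café €5" followed by one CJK character,
# which cp1252 cannot represent.
utf8_bytes = u"caf\u00e9 \u20ac5 \u4e2d".encode("utf-8")

# Step 1: bytes -> unicode text
text = utf8_bytes.decode("utf-8")

# Step 2: unicode text -> cp1252 bytes
dropped = text.encode("cp1252", "ignore")   # unmappable characters vanish
marked = text.encode("cp1252", "replace")   # unmappable characters become "?"

# Without an error handler, the same call raises instead of losing data.
try:
    text.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

With "ignore", characters are lost silently, so "replace" (or the default strict mode) may be preferable when data loss needs to be visible.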

...

import datetime 
import csv 

# Define what filenames to read 
filenames = ["FILE1","FILE2"] 


infilenames = [filename+".csv" for filename in filenames] 
outfilenames = [filename+"_out_.csv" for filename in filenames] 
midfilenames = [filename+"_mid_.csv" for filename in filenames] 

# Iterate over each file 
for infilename,outfilename,midfilename in zip(infilenames,outfilenames,midfilenames): 

    # Read the raw bytes, decode them as utf-8, then encode in cp1252. 
    # "ignore" silently drops characters that cp1252 cannot represent. 
    infile = open(infilename, "rb") 
    infilet = infile.read() 
    infile.close() 
    infilet = infilet.decode("utf-8") 
    infilet = infilet.encode("cp1252","ignore") 

    # Write the cp1252-encoded intermediate file 
    midfile = open(midfilename,"wb") 
    midfile.write(infilet) 
    midfile.close() 

    # Read the csv from the cp1252-encoded file 
    # (binary mode, as the Python 2 csv module expects) 
    midfile = open(midfilename,"rb") 
    reader = csv.reader(midfile,delimiter=',', quotechar='"',quoting=csv.QUOTE_MINIMAL) 

    # Define output 
    outfile = open(outfilename, "wb") 
    writer = csv.writer(outfile, delimiter='|', quotechar='"',quoting=csv.QUOTE_NONE,escapechar='\\') 

    # Write output to the new csv file 
    for row in reader: 
        writer.writerow(row) 

    # Concatenate so that Python 2 prints a string rather than a tuple 
    print("written file " + outfilename) 
    midfile.close() 
    outfile.close()
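
The intermediate file is probably not required. Below is a hedged sketch of a variant that transcodes each cell while copying rows, assuming the same comma-in, pipe-out settings as the code above. The function name convert_csv and the PY2 flag are hypothetical names introduced here for illustration. On Python 2 the csv module works on byte strings, so the files are opened in binary mode and each cell is re-encoded by hand; on Python 3 the file objects do the transcoding.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def convert_csv(src_path, dst_path):
    """Copy a comma-separated UTF-8 file to a pipe-separated cp1252 file."""
    if PY2:
        # Python 2: csv reads and writes byte strings, so use binary files.
        fin = open(src_path, "rb")
        fout = open(dst_path, "wb")
    else:
        # Python 3: csv reads and writes text, so the files transcode.
        fin = io.open(src_path, "r", encoding="utf-8", newline="")
        fout = io.open(dst_path, "w", encoding="cp1252",
                       errors="replace", newline="")
    try:
        reader = csv.reader(fin, delimiter=",", quotechar='"')
        writer = csv.writer(fout, delimiter="|", quoting=csv.QUOTE_NONE,
                            escapechar="\\")
        for row in reader:
            if PY2:
                # Re-encode every cell; "replace" turns unmappable
                # characters into "?" instead of dropping them.
                row = [cell.decode("utf-8").encode("cp1252", "replace")
                       for cell in row]
            writer.writerow(row)
    finally:
        fin.close()
        fout.close()
```

It could be called once per file pair, for example convert_csv("FILE1.csv", "FILE1_out_.csv"). Unlike the code above, this sketch uses "replace" rather than "ignore", which is a design choice made here so that lost characters stay visible in the output.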