2013-12-12 32 views
0

Convert all CSV files from ANSI encoding to UTF-8 using Python

I have the following Python code:

import os 
from os import listdir 

def find_csv_filenames(path_to_dir, suffix=".csv"): 
    filenames = listdir(path_to_dir) 
    return [ filename for filename in filenames if filename.endswith(suffix) ] 
    # the line below always raises the error
filenames = find_csv_filenames('C:\casperjs\project\teleservices\csv') 
for name in filenames: 
    print name 

I get this error:

filenames = find_csv_filenames('C:\casperjs\project\teleservices\csv') 
Error message: `TabError: inconsistent use of tabs and spaces in indentation` 

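What I need: I want to read all the CSV files and convert them from ANSI encoding to UTF-8, but the code above only lists the name of each CSV file. What is wrong with it?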

+0

Please format your code and post the full error message. – graphite

+0

OK, I have now added the error message. – user3024562

+2

You should fix the [indentation](https://en.wikipedia.org/wiki/Python_syntax_and_semantics#Indentation) first. – graphite
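To illustrate the comment above: a `TabError` means that some lines of a block are indented with tabs and others with spaces. Separately, in an ordinary string literal the `\t` in `\teleservices` is read as a tab character, so a Windows path like the one in the question is safer as a raw string. A minimal sketch, assuming Python 3 and four-space indentation throughout; the `plain` and `raw` names are only there for illustration:

```python
import os


def find_csv_filenames(path_to_dir, suffix=".csv"):
    """Return the names of the entries in path_to_dir that end with suffix."""
    # Every line of this body is indented with exactly four spaces, no tabs.
    filenames = os.listdir(path_to_dir)
    return [name for name in filenames if name.endswith(suffix)]


# In a normal string literal "\t" is a single tab character; a raw string
# keeps the backslash and the letter "t" as two separate characters.
plain = "\teleservices"
raw = r"\teleservices"

if __name__ == "__main__":
    print(repr(plain))
    print(repr(raw))
    # Hypothetical usage with the directory from the question:
    # for name in find_csv_filenames(r"C:\casperjs\project\teleservices\csv"):
    #     print(name)
```

Configuring the editor to insert spaces when Tab is pressed usually prevents the mixed indentation from coming back.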

Answers

1

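The following reads each file in its legacy single-byte encoding (what the question calls ANSI) and prints every line re-encoded as UTF-8: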

import os

def find_csv_filenames(path_to_dir, suffix=".csv"):
    # Return the full paths of the non-directory entries ending with suffix.
    path_to_dir = os.path.normpath(path_to_dir)
    filenames = os.listdir(path_to_dir)
    # Skip directories whose names happen to end in ".csv".
    is_csv = lambda f: not os.path.isdir(os.path.join(path_to_dir, f)) and f.endswith(suffix)
    return [os.path.join(path_to_dir, fname) for fname in filenames if is_csv(fname)]

def convert_files(files, source_encoding, to="utf-8"):
    # Python 2: lines are byte strings, so decode them, then re-encode.
    for name in files:
        print "Convert {0} from {1} to {2}".format(name, source_encoding, to)
        with open(name) as f:
            for line in f:
                # The converted line is only printed, not written back to disk.
                print unicode(line, source_encoding).encode(to)

csv_files = find_csv_filenames('/path/to/csv/dir', ".csv")
convert_files(csv_files, "cp866")  # cp866 is my legacy encoding. Replace it with yours.
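The code above only prints the converted lines. To actually save UTF-8 copies, something along these lines might work. This is a sketch under stated assumptions, not part of the original answer: it assumes Python 3, assumes "ANSI" means Windows-1252 (`cp1252`; the real code page depends on the Windows locale), and writes into a separate output directory so the originals are left untouched. `convert_file` and `convert_dir` are hypothetical helper names.

```python
import os


def convert_file(src_path, dst_path, src_encoding="cp1252", dst_encoding="utf-8"):
    """Re-encode one text file from src_encoding to dst_encoding."""
    # newline="" disables newline translation in both directions, so the
    # file's original line terminators are preserved.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)


def convert_dir(src_dir, dst_dir, src_encoding="cp1252", suffix=".csv"):
    """Convert every matching file in src_dir and return the converted names."""
    os.makedirs(dst_dir, exist_ok=True)
    converted = []
    for name in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, name)
        # Only regular files are converted; sub-directories are skipped.
        if os.path.isfile(src_path) and name.endswith(suffix):
            convert_file(src_path, os.path.join(dst_dir, name), src_encoding)
            converted.append(name)
    return converted
```

A call such as `convert_dir("/path/to/csv/dir", "/path/to/utf8/dir")` would then process the whole directory. If the guessed source encoding is wrong, the output can contain wrong characters without any error being raised, so it is worth checking one converted file by eye.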

0

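Your code only lists the CSV files; it does not do anything with them. If you need to read them, you can use the csv module. If you also need to handle the encoding, you can do something like this: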

import csv, codecs

def safe_csv_reader(the_file, encoding, dialect=csv.excel, **kwargs):
    # Python 2: csv.reader yields byte strings, so decode each cell to unicode.
    csv_reader = csv.reader(the_file, dialect=dialect, **kwargs)
    for row in csv_reader:
        yield [codecs.decode(cell, encoding) for cell in row]

# csv_file must be an already-open file object, e.g. open(path, "rb").
reader = safe_csv_reader(csv_file, "utf-8", delimiter=',')
for row in reader:
    print row
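On Python 3 (an assumption; the snippet above targets Python 2) `csv.reader` works on text rather than bytes, so the decoding can be left to `open()`. A minimal sketch; `read_csv_rows` is a hypothetical helper and the `cp1252` default is only a guess at what "ANSI" means on the asker's machine:

```python
import csv


def read_csv_rows(path, encoding="cp1252", **kwargs):
    """Yield each row of the CSV file at path as a list of str."""
    # The csv module documentation recommends newline="" so that the
    # reader itself handles newlines embedded in quoted fields.
    with open(path, "r", encoding=encoding, newline="") as f:
        for row in csv.reader(f, **kwargs):
            yield row
```

Extra keyword arguments such as `delimiter=";"` are passed straight through to `csv.reader`, mirroring the `**kwargs` in `safe_csv_reader`.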