2014-08-28 52 views

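Encoding error - xlsxwriter - Python

I am trying to split lines from a file and put them into an Excel file (xlsx). According to PSPad, the file's encoding is 'cp1250'. So, to get the proper characters in the xlsx file, I decode each line from cp1250: line = line.decode("cp1250")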

The problem is that about 3,000 of the 12,000 lines return this error:

'charmap' codec can't decode byte 0x81 in position 25: character maps to <undefined> 

So the next thing I tried was decode("utf-8") and, I don't know why, it works better. Only 330 lines return an error:

'utf8' codec can't decode byte 0x8e in position 0: invalid start byte 

Do you have any idea what I am doing wrong?

Edit: the errors mostly occur on lines that contain "Z" or "S".
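One way to narrow this down is to decode each raw line with several candidate codecs and look at the bytes that fail. For reference, 0x81 is one of the few byte values with no character assigned in cp1250, while 0x8e is "Ž" in cp1250 but can never start a character in UTF-8. The sketch below assumes Python 3; the helper names and the list of candidate encodings are illustrative and not part of the original code.

```python
# Candidate encodings to try. This list is an assumption; extend it as needed.
CANDIDATES = ("cp1250", "utf-8", "iso8859_2", "cp852")


def failing_lines(raw_lines, encoding):
    """Return (line_number, error) pairs for byte lines that do not decode."""
    failures = []
    for number, raw in enumerate(raw_lines, start=1):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as exc:
            failures.append((number, exc))
    return failures


def report(path):
    """Print, per candidate encoding, how many lines fail and a few bad bytes."""
    # Read in binary mode so nothing is decoded implicitly.
    with open(path, "rb") as f:
        raw_lines = f.read().splitlines()
    for encoding in CANDIDATES:
        failures = failing_lines(raw_lines, encoding)
        print("%s: %d of %d lines fail" % (encoding, len(failures), len(raw_lines)))
        for number, exc in failures[:5]:
            bad = exc.object[exc.start:exc.end]
            print("  line %d: bytes %r at position %d" % (number, bad, exc.start))
```

Calling report("filtrovane.txt") would show whether any single codec decodes every line. If none does, the file may mix encodings or contain stray bytes, and a codec that decodes without errors can still produce the wrong characters, so the output is worth checking by eye.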

Here is the code (at the top of the .py file I have put "# -*- coding: utf-8 -*-"):

import xlsxwriter

def toXls(file):
    workbook = xlsxwriter.Workbook(file)
    worksheet = workbook.add_worksheet()
    a = 0  # index of the last row written
    x = 0  # number of lines that raised an error
    # Python 2: lines read from the file are byte strings
    with open("filtrovane.txt") as f:
        for line in f:
            try:
                # It should be "cp1250" according to the PSPad editor
                line = line[:-1].decode("utf-8")
                # line = line.encode("ISO 8859-2")
                splitted = line.split("::")

                if len(splitted) == 7:
                    try:
                        a = a + 1
                        for col in range(7):
                            worksheet.write(a, col, splitted[col])
                    except Exception as e:
                        print("!! %r %d %s" % (line, a, e))
            except Exception as e:
                print(e)
                x = x + 1
    print(x)
    workbook.close()

What happens when you try to save it to a text file? Does the same problem occur? – diek 2014-08-29 02:16:02

Answer


The XlsxWriter docs/repo contain two examples that show how to read UTF-8 and Shift JIS files and convert them into xlsx files.

It should work for cp1250 as well.
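As a rough illustration of that approach (not a copy of the XlsxWriter examples), the sketch below lets the file object do the decoding and passes the resulting text to the worksheet. It assumes the "::" separator and seven fields from the question; the function names, the parameters, and the choice of errors="replace" in the usage note are illustrative assumptions.

```python
import io


def read_rows(path, encoding="cp1250", sep="::", width=7, errors="strict"):
    """Yield lists of `width` text fields from a `sep`-delimited file."""
    # io.open decodes while reading, so each line is already text.
    with io.open(path, "r", encoding=encoding, errors=errors) as f:
        for line in f:
            fields = line.rstrip("\r\n").split(sep)
            if len(fields) == width:
                yield fields


def write_xlsx(rows, out_path):
    """Write each row of text fields to the first worksheet of a new workbook."""
    # Third-party package; imported here so read_rows can be used without it.
    import xlsxwriter

    workbook = xlsxwriter.Workbook(out_path)
    worksheet = workbook.add_worksheet()
    for row_index, fields in enumerate(rows):
        worksheet.write_row(row_index, 0, fields)
    workbook.close()
```

A call such as write_xlsx(read_rows("filtrovane.txt", errors="replace"), "out.xlsx") would keep going past undecodable bytes and put U+FFFD in their place, which makes the affected cells easy to find in the sheet. With the default errors="strict", the first bad byte raises UnicodeDecodeError instead.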