Python: converting a corrupt .xls file

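I have downloaded several sales datasets from an SAP application. SAP automatically converted the data to .XLS files. Whenever I open one of them with the Pandas library, I get the following error: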

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\xff\xfe\r\x00\n\x00\r\x00'

When I open the .xls file in MS Excel, a popup says "file is corrupt or unsupported extension do you want to continue". When I click "Yes", it shows the correct data. Once I save the file again as .xls from MS Excel, I can read it with Pandas.

So I tried renaming the file with os.rename(), but that did not work. I also tried opening the file and removing \xff\xfe\r\x00\n\x00\r\x00, but that did not work either.
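A note on the bytes in the error message: `\xff\xfe` is the UTF-16 little-endian byte order mark, and `\r\x00\n\x00\r\x00` is CR, LF, CR encoded as UTF-16. That suggests the export is UTF-16 text rather than a binary workbook, which would explain why renaming does not help. A minimal sketch for checking this; `sniff_export` is a hypothetical helper name, not part of any library:

```python
import codecs


def sniff_export(path, nbytes=64):
    """Guess whether a file is UTF-16 text or a binary .xls workbook.

    Hypothetical helper, for illustration only.
    """
    with open(path, "rb") as fh:
        head = fh.read(nbytes)
    # A UTF-16 byte order mark means the file is encoded text.
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16 text"
    # Binary .xls files are OLE2 compound documents with this signature.
    if head.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "ole2 workbook"
    return "unknown"
```

If this returns "utf-16 text" for the SAP export, the file can probably be read as text with an explicit encoding instead of being parsed as a workbook.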

The workaround is to open MS Excel and manually save the file as .xls again. Is there any way to automate this? Please help.

Source

2017-05-15 Jeril

I checked that question; mine is about converting the file to another format. – Jeril

@downshift No, the file is not open in MS Excel. – Jeril

I just want to do something like MS Excel's 'Save As', but not manually. Is there any way? – Jeril

Finally, I converted the corrupt .xls file into a proper .xls file. Here is the code:

# Make all string literals in this module unicode (only matters on Python 2)
from __future__ import unicode_literals
# Used to save the file as an Excel workbook
# Need to install this library (pip install xlwt)
from xlwt import Workbook
# Used to open the corrupt Excel file
import io

filename = r'SALEJAN17.xls'
# Opening the file using 'utf-16' encoding; 'with' closes it afterwards
with io.open(filename, "r", encoding="utf-16") as file1:
    data = file1.readlines()

# Creating a workbook object 
xldoc = Workbook() 
# Adding a sheet to the workbook object 
sheet = xldoc.add_sheet("Sheet1", cell_overwrite_ok=True) 
# Iterating and saving the data to sheet
for i, row in enumerate(data):
    # Two things are done here
    # Removing the '\n' which comes while reading the file using io.open
    # Getting the values after splitting using '\t'
    for j, val in enumerate(row.replace('\n', '').split('\t')):
        sheet.write(i, j, val)

# Saving the file as an excel file 
xldoc.save('myexcel.xls') 

import pandas as pd 
df = pd.ExcelFile('myexcel.xls').parse('Sheet1')

No errors.
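Since the file appears to be tab-separated UTF-16 text, the rows can probably also be read without writing an intermediate workbook. A minimal sketch using only the standard library (Python 3), assuming every export has the same layout as the one in the question; `read_sap_export` is a hypothetical helper name:

```python
import csv
import io


def read_sap_export(path, encoding="utf-16"):
    """Return the non-empty rows of a tab-separated text export.

    Hypothetical helper; assumes the file is UTF-16 text with tab
    delimiters, as the error message in the question suggests.
    """
    # newline="" lets the csv module handle line endings itself.
    with io.open(path, "r", encoding=encoding, newline="") as fh:
        reader = csv.reader(fh, delimiter="\t")
        # Skip blank lines (the export appears to start with one).
        return [row for row in reader if any(cell.strip() for cell in row)]
```

Under the same assumption, `pd.read_csv(filename, sep='\t', encoding='utf-16')` should give a DataFrame directly if pandas is available.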

Source

2017-05-16 07:30:25 Jeril
