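Python 3 UnicodeDecodeError
I am currently trying to run some simple regular expressions over a very large .txt file (a few million lines of text). The simplest code that triggers the problem: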
file = open("exampleFileName", "r")
for line in file:
    pass
The error message:
Traceback (most recent call last):
  File "example.py", line 34, in <module>
    example()
  File "example.py", line 16, in example
    for line in file:
  File "/usr/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 7332: invalid continuation byte
How can I solve this? Is utf-8 the wrong encoding? If so, how can I find out which one is the right one?
Thanks and best regards!
Possibly related: http://stackoverflow.com/questions/5552555/unicodedecodeerror-invalid-continuation-byte – Jeff
Post the output of 'file -bi [your_filename]'. It will report an encoding. Then pass that encoding as the 'encoding' argument to 'open()'. – light2yellow
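A minimal sketch of that suggestion, assuming the file is really in a single-byte encoding such as cp1252 or ISO-8859-1 (in both, byte 0xed is 'í'). The helper names `first_working_encoding` and `count_lines`, the candidate list, and the throwaway sample file are illustrative and not from the thread.

```python
import os
import tempfile


def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file, or None."""
    for encoding in candidates:
        try:
            with open(path, "r", encoding=encoding) as handle:
                for _ in handle:  # stream line by line so a huge file is never fully in memory
                    pass
            return encoding
        except UnicodeDecodeError:
            continue  # this candidate hit an undecodable byte; try the next one
    return None


def count_lines(path, encoding):
    """Iterate over the file with an explicit encoding, as the comment above suggests."""
    with open(path, "r", encoding=encoding) as handle:
        return sum(1 for _ in handle)


# Throwaway sample containing the byte from the traceback (0xed) followed by a newline,
# which is not valid UTF-8.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"caf\xed\nplain ascii\n")
sample_path = tmp.name

detected = first_working_encoding(sample_path)
line_count = count_lines(sample_path, detected)
os.remove(sample_path)
```

A successful decode only shows that an encoding is possible, not that it is the right one: latin-1 maps every byte to some character and therefore never fails, so it is worth checking a few decoded lines by eye.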
What does the file -bi command do? –
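For context: 'file' is the standard Unix utility that guesses a file's type from its content. With '-b' (brief) it omits the filename from the output, and with '-i' it prints a MIME type and charset, for example 'text/plain; charset=iso-8859-1', instead of a human-readable description. The sketch below shows one way the result could be fed into 'open()'. The helper names are hypothetical, the sample string is illustrative rather than output from the asker's file, and it assumes a Linux-style 'file' is installed (other platforms may spell the option differently).

```python
import subprocess


def charset_from_mime(mime_line):
    """Extract the charset from a line such as 'text/plain; charset=iso-8859-1'."""
    for part in mime_line.split(";"):
        key, _, value = part.strip().partition("=")
        if key.lower() == "charset" and value:
            return value.strip()
    return None


def guess_charset(path):
    """Run `file -bi` on path and return the charset it reports (needs the `file` utility)."""
    output = subprocess.check_output(["file", "-bi", path], universal_newlines=True)
    return charset_from_mime(output)


# Illustrative output string, not taken from the asker's file.
sample = "text/plain; charset=iso-8859-1"
print(charset_from_mime(sample))
```

The reported charset is a heuristic guess, and 'file' can report values that are not Python codec names, so treat the result as a starting point for 'open(path, encoding=...)' rather than a guarantee.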