Can a Python string be changed by a string lookup?

doc = open("1.html").read().strip() 
doc = doc.decode("utf-8","ignore")

This example works: I get the correct unicode string in doc. The following version, which checks for a substring before decoding, does not:

doc = open("1.html").read().strip() 
if u"charset=utf" in doc or u"charset=\"utf" in doc: 
    doc = doc.decode("utf-8","ignore")

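It fails with "UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 289: ordinal not in range(128)". Can anyone explain this? Can the string doc be changed by a string lookup? I forgot to mention that 1.html contains Chinese words.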

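The problem is that you are comparing the byte string read from the file with your Unicode strings u"charset=utf" and u"charset=\"utf". To compare them, Python 2 has to convert the byte string to unicode right there, before your manual decode call, and it does that with the default ASCII codec. The Chinese text in 1.html contains bytes outside the ASCII range, so that implicit conversion raises UnicodeDecodeError.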
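To make the failure concrete, here is a minimal sketch of what the implicit conversion amounts to: an explicit decode("ascii") on the raw bytes. The sample bytes are made up for illustration (they are not the contents of 1.html); the character U+7DB2 is chosen only because its UTF-8 encoding starts with the same 0xe7 byte that appears in the error message.

```python
# Hypothetical sample: bytes as they might come from a UTF-8 encoded file.
# U+7DB2 encodes to the three bytes 0xe7 0xb6 0xb2 in UTF-8.
raw = u"charset=utf-8 \u7db2".encode("utf-8")

# Python 2 does the equivalent of this behind the scenes when a byte
# string is compared with a unicode string.
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The exception records which codec failed and at which byte offset.
print(error)
```

The position in the message is the offset of the first non-ASCII byte in the data, which is why the original error points at position 289 of the file.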

The solution is to always compare byte strings with byte strings:

if "charset=utf" in doc or "charset=\"utf" in doc:
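Put together, the fix might look like the sketch below. The helper name decode_if_utf8 and the sample input are assumptions made for illustration, not part of the original code; the b"..." literals make the byte-string comparison explicit and behave the same as plain "..." literals in Python 2.

```python
def decode_if_utf8(raw):
    """Decode raw bytes as UTF-8 only if they declare a UTF charset.

    Hypothetical helper: the substring test is done bytes-against-bytes,
    so no implicit ASCII decoding takes place.
    """
    raw = raw.strip()
    if b"charset=utf" in raw or b'charset="utf' in raw:
        # "ignore" drops any bytes that are not valid UTF-8.
        return raw.decode("utf-8", "ignore")
    # No UTF charset declared: hand the bytes back unchanged.
    return raw


# Made-up sample standing in for the contents of 1.html.
sample = u'<meta charset="utf-8">\u7db2'.encode("utf-8")
doc = decode_if_utf8(sample)
```

For a real file, the bytes could be obtained with open("1.html", "rb").read(); opening in binary mode keeps the data as a byte string until the explicit decode.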

2016-04-05 08:40:07
