Python 3: bytes to str conversion fails
The code is self-explanatory:
$ python3
Python 3.4.0 (default, Apr 11 2014, 13:05:18)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request as req
>>> url = 'http://bangladeshbrands.com/342560550782-44083.html'
>>> res = req.urlopen(url)
>>> html = res.read()
>>> type(html)
<class 'bytes'>
>>> html = html.decode('utf-8') # bytes -> str
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 66081: invalid start byte
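Why not use a module that knows how to handle HTML over HTTP properly? – 2014-10-30 05:08:57
@IgnacioVazquez-Abrams, can you explain? The read() method works for most URLs. – Dewsworld 2014-10-30 05:10:20
The `read()` method doesn't tell you anything about the charset the server declared for the HTML. – 2014-10-30 05:10:59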
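The point in the comments can be sketched as follows: instead of hard-coding 'utf-8', try the charset declared in the Content-Type header first and only then fall back to guesses. Byte 0x92 is never a valid UTF-8 start byte, while in Windows-1252 it is the right single quotation mark, so this page is plausibly in that encoding, but that is a guess. The helper names `charset_from_content_type` and `decode_body` and the fallback list are assumptions made for illustration, not part of the original question.

```python
from email.message import Message


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_content_charset()


def decode_body(raw, content_type=None, fallbacks=('utf-8', 'cp1252')):
    """Decode bytes using the declared charset first, then the fallbacks.

    The fallback order is an assumption; adjust it for the sites you fetch.
    """
    candidates = []
    if content_type:
        declared = charset_from_content_type(content_type)
        if declared:
            candidates.append(declared)
    candidates.extend(fallbacks)

    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue

    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return raw.decode('utf-8', errors='replace')


# With urllib the header is available on the response object, e.g.:
#   res = urllib.request.urlopen(url)
#   html = decode_body(res.read(), res.headers.get('Content-Type'))
```

If the server sends no charset at all, the declared value is simply skipped and the fallbacks are tried in order. A library that inspects the headers and the HTML `<meta>` tags for you does the same job with less guesswork.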