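I'm trying to fetch the HTML of a page whose URL contains diacritics (í, č, ...). The problem is that urllib2.quote
doesn't seem to work the way I expect.
As far as I understand, quote should convert a URL containing diacritics into a valid URL.
Here is an example: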
url = 'http://www.example.com/vydavatelství/'
print urllib2.quote(url)
>> http%3A//www.example.com/vydavatelstv%C3%AD/
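The problem is that, for some reason, it also changes the http:// part of the string
to http%3A//. After that, urllib2.urlopen(req)
fails with this error: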
    response = urllib2.urlopen(req)
  File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 437, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request
Have you tried putting # -*- coding: utf-8 -*- at the top of your script? – thefragileomen
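For what it's worth, the http%3A in the output above is consistent with how quote is documented: its safe parameter defaults to '/', so every other reserved character, including the ':' after the scheme, gets percent-encoded. One possible way around that is to quote only the path component instead of the whole URL. The snippet below is a minimal sketch of that idea, not a definitive fix. The helper name quote_url_path is made up for illustration, the try/except import is only there so it runs on both Python 2 and Python 3, and it assumes the host name itself is plain ASCII (a non-ASCII host would need IDNA encoding, which this does not handle).

```python
# -*- coding: utf-8 -*-
# Sketch: percent-encode only the path of a URL, leaving scheme and host alone.
try:
    from urllib.parse import quote, urlsplit, urlunsplit  # Python 3
except ImportError:
    from urllib import quote  # Python 2
    from urlparse import urlsplit, urlunsplit


def quote_url_path(url):
    """Hypothetical helper: return url with only its path percent-encoded."""
    parts = urlsplit(url)
    path = parts.path
    # quote() in Python 2 cannot handle non-ASCII unicode, so work on UTF-8 bytes.
    if not isinstance(path, bytes):
        path = path.encode('utf-8')
    quoted_path = quote(path)  # default safe='/' keeps the path separators
    return urlunsplit((parts.scheme, parts.netloc, quoted_path,
                       parts.query, parts.fragment))


if __name__ == '__main__':
    print(quote_url_path(u'http://www.example.com/vydavatelství/'))
    # prints http://www.example.com/vydavatelstv%C3%AD/
```

The result can then be passed to urllib2.urlopen as usual. A shorter alternative, if the URL has no query string to worry about, would be to call quote(url, safe=':/') on the UTF-8 encoded URL so that the scheme separator is left untouched.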