2015-02-08 31 views
17

Python 3.4 urllib.request error (HTTP 403)

I'm trying to open and parse an HTML page. In Python 2.7.8 I have no problem:

import urllib 
url = "https://ipdb.at/ip/66.196.116.112" 
html = urllib.urlopen(url).read() 

and everything works fine. However, I want to move to Python 3.4, where I get HTTP Error 403 (Forbidden). My code:

import urllib.request 
html = urllib.request.urlopen(url) # same URL as before 

File "C:\Python34\lib\urllib\request.py", line 153, in urlopen 
return opener.open(url, data, timeout) 
File "C:\Python34\lib\urllib\request.py", line 461, in open 
response = meth(req, response) 
File "C:\Python34\lib\urllib\request.py", line 574, in http_response 
'http', request, response, code, msg, hdrs) 
File "C:\Python34\lib\urllib\request.py", line 499, in error 
return self._call_chain(*args) 
File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain 
result = func(*args) 
File "C:\Python34\lib\urllib\request.py", line 582, in http_error_default 
raise HTTPError(req.full_url, code, msg, hdrs, fp) 
urllib.error.HTTPError: HTTP Error 403: Forbidden 

It works with other URLs that don't use https.

url = 'http://www.stopforumspam.com/ipcheck/212.91.188.166' 

No problem.

+0

See also https://stackoverflow.com/questions/3336549/pythons-urllib2-why-do-i-get-error-403-when-i-urlopen-a-wikipedia- – Trilarion 2017-12-10 20:48:55

Answers

27

It seems the site doesn't like the default user agent of Python 3.x.

Specifying a User-Agent will solve your problem:

import urllib.request 
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'}) 
html = urllib.request.urlopen(req).read() 
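You can verify which User-Agent header the Request will actually send without hitting the network at all. A small sketch, using the URL from the question:

```python
import urllib.request

url = "https://ipdb.at/ip/66.196.116.112"
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# Request capitalizes header names internally, hence "User-agent"
print(req.get_header("User-agent"))  # Mozilla/5.0
print(req.full_url)                  # https://ipdb.at/ip/66.196.116.112
```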

Note that Python 2.x's urllib also receives a 403 status, but unlike Python 2.x's urllib2 and Python 3.x's urllib, it does not raise an exception.

You can confirm that with the following code:

print(urllib.urlopen(url).getcode()) # => 403 
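In Python 3 the equivalent check has to catch the exception, since urlopen raises HTTPError for 4xx/5xx responses. A minimal sketch; `status_of` is a hypothetical helper (the injectable `opener` parameter is just there to make it testable offline):

```python
import urllib.request
import urllib.error

def status_of(url, opener=urllib.request.urlopen):
    """Return the HTTP status code, whether or not urlopen raises."""
    try:
        return opener(url).getcode()
    except urllib.error.HTTPError as e:
        return e.code  # e.g. 403
```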
+0

Thanks. It works! – Belial 2015-02-08 16:49:12

+0

Thanks! Worked for me – DenisFLASH 2015-08-27 10:20:55

+0

Not working.. still forbidden – Martian2049 2017-05-11 15:39:45

0

Here are some notes about urllib that I collected while I was learning Python 3.
I'm leaving them here in case they come in handy or help someone else out.

How to import urllib.request and urllib.parse:

import urllib.request as urlRequest 
import urllib.parse as urlParse 

How to make a GET request:

url = "http://www.example.net" 
# open the url 
x = urlRequest.urlopen(url) 
# get the source code 
sourceCode = x.read() 

How to make a POST request:

url = "https://www.example.com" 
values = {"q": "python if"} 
# encode values for the url 
values = urlParse.urlencode(values) 
# encode the values in UTF-8 format 
values = values.encode("UTF-8") 
# create the url 
targetUrl = urlRequest.Request(url, values) 
# open the url 
x = urlRequest.urlopen(targetUrl) 
# get the source code 
sourceCode = x.read() 
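The urlencode step above turns the dict into a query string, which is then byte-encoded before being passed as the request body. For example:

```python
import urllib.parse as urlParse

values = urlParse.urlencode({"q": "python if"})
print(values)  # q=python+if

data = values.encode("UTF-8")
print(data)    # b'q=python+if'
```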

How to make a POST request (when you get a 403 Forbidden response):

url = "https://www.example.com" 
values = {"q": "python urllib"} 
# pretend to be a chrome 47 browser on a windows 10 machine 
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"} 
# encode values for the url 
values = urlParse.urlencode(values) 
# encode the values in UTF-8 format 
values = values.encode("UTF-8") 
# create the url 
targetUrl = urlRequest.Request(url = url, data = values, headers = headers) 
# open the url 
x = urlRequest.urlopen(targetUrl) 
# get the source code 
sourceCode = x.read() 

How to make a GET request (when you get a 403 Forbidden response):

url = "https://www.example.com" 
# pretend to be a chrome 47 browser on a windows 10 machine 
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"} 
req = urlRequest.Request(url, headers = headers) 
# open the url 
x = urlRequest.urlopen(req) 
# get the source code 
sourceCode = x.read() 
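In all of the snippets above, read() returns bytes in Python 3; to get text you still need to decode, ideally using the charset declared in the response headers. A minimal sketch, where `to_text` is a hypothetical helper (not part of urllib) that falls back to UTF-8 when no charset is declared:

```python
def to_text(raw, response=None, fallback="utf-8"):
    """Decode response bytes to str; hypothetical helper, not part of urllib."""
    charset = None
    if response is not None:
        # HTTPResponse.headers is an email.message.Message
        charset = response.headers.get_content_charset()
    return raw.decode(charset or fallback)
```

Usage would be `to_text(x.read(), response=x)` with the `x` from the snippets above.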
+0

I apologize for the mistakes in my old answer; I did some investigating and fixed them. Those mistakes motivated me to go back and check that my notes were correct :D – 2016-04-10 17:19:19

+0

Why does it work for the second link without any problem? – Sudheer1990 2016-10-17 11:03:13