Why isn't BeautifulSoup working? (Python 2.7.10)

from bs4 import BeautifulSoup 
import urllib.request 
r = urllib.request.urlopen('http://www.aflcio.org/Legislation-and-Politics/Legislative-Alerts').read() 
soup = BeautifulSoup(r) 
print(type(soup))

I get the message "urllib.error.HTTPError: HTTP Error 403: Forbidden".

I'm a beginner when it comes to modules, so I don't really know what I'm doing. Sorry.


2017-03-04 frightenedeyes

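Evidently, the site doesn't like programmatic access.

So you're saying it's because of the site I'm trying to view? I get the error no matter which URL I choose. – frightenedeyes

How do those other errors relate to the one you get here? It is because of the site: the important part of the error is **Forbidden**.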
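To make the point about **Forbidden** concrete: a 403 is the server's reply, not a parsing failure, and `urllib` exposes it on the exception it raises. Here is a minimal sketch, assuming Python 3 (where `urllib.request` and `urllib.error` live). The helper names `describe_http_error` and `fetch` are made up for illustration.

```python
import urllib.error
import urllib.request


def describe_http_error(err):
    """Summarise an HTTPError (hypothetical helper, not part of urllib)."""
    # err.code is the numeric status; err.reason is the server's short text.
    return "HTTP {}: {}".format(err.code, err.reason)


def fetch(url):
    """Return the response body, or None if the server answers with an HTTP error."""
    try:
        return urllib.request.urlopen(url).read()
    except urllib.error.HTTPError as err:
        # Raised by urlopen itself, so BeautifulSoup is never reached.
        print(describe_http_error(err))
        return None
```

If `fetch(...)` prints a 403 for every URL you try, it is worth checking whether those sites all reject the default Python client.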

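You might want to specify a User-Agent header. As for why BeautifulSoup doesn't work: it isn't the culprit, because the error is raised before it is ever called, by `urlopen ... read`: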

import requests 
from bs4 import BeautifulSoup 

ret = requests.request(
    'GET', 
    'http://www.aflcio.org/Legislation-and-Politics/Legislative-Alerts', 
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8'} 
) 

soup = BeautifulSoup(ret.text, "html.parser") 
print(type(soup))
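If you would rather stay with `urllib.request` from the question instead of installing `requests`, the same header can probably be attached through a `Request` object. This is a sketch assuming Python 3. `build_request` and `fetch_html` are illustrative names, and the User-Agent value is only a shortened placeholder. Whether a given site accepts it is up to the site.

```python
import urllib.request

# Placeholder browser-like value; an assumption, not something the site documents.
DEFAULT_USER_AGENT = "Mozilla/5.0"


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Build a GET request carrying an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=DEFAULT_USER_AGENT):
    """Download the page body as bytes; this performs a network call."""
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        return response.read()
```

The bytes returned by `fetch_html(...)` can be handed to `BeautifulSoup(html, "html.parser")` in the same way as `ret.text` above.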


2017-03-04 23:22:32 ewcz
