from bs4 import BeautifulSoup
import requests
import urllib.request
from requests import session
import http.cookiejar

mainLink = "http://infoweb.newsbank.com.proxy.lib.uiowa.edu/iw-search/we/InfoWeb?p_product=AWNB&p_theme=aggregated5&p_action=doc&p_docid=14D12E120CD13C18&p_docnum=2&p_queryname=4"

def articleCrawler(mainUrl):
    # Fetch the page and parse it with the built-in HTML parser.
    response = urllib.request.urlopen(mainUrl)
    soup = BeautifulSoup(response, "html.parser")
    linkList = []
    for link in soup.find_all('a'):
        ...  # the loop body is cut off in the original post
<title>Cookie Required</title>
This is cookie.htm from the doc subdirectory.
Licensing agreements for these databases require that access be extended
only to authorized users. Once you have been validated by this system,
a "cookie" is sent to your browser as an ongoing indication of your authorization to
access these databases. It will only need to be set once during login.
As you access databases, they may also use cookies. Your ability to use those databases
may depend on whether or not you allow those cookies to be set.
To login again, click <a href="login">here</a>.
I tried using http.cookiejar, but I'm not familiar with the library. I'm using Python 3. Does anyone know how to accept the cookie and access the article? Thanks.
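One possible direction, as a minimal sketch rather than a tested fix for this particular proxy: `urllib.request` only keeps cookies between requests if the opener is built with an `HTTPCookieProcessor` wrapping a `CookieJar`. The "Cookie Required" page says the cookie is set during login, so the sketch assumes you first have to submit the login form through the same opener. `login_url` and `form_fields` are hypothetical placeholders; the real login address and field names are not given in the question and would have to be read from the proxy's login page.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via the jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_with_login(opener, login_url, form_fields, article_url):
    """POST the login form, then fetch the article with the same opener.

    login_url and form_fields are assumptions: their real values depend on
    the proxy's login page, which the question does not show.
    """
    data = urllib.parse.urlencode(form_fields).encode("utf-8")
    # Any Set-Cookie headers in the login reply are stored in the jar.
    opener.open(login_url, data=data).close()
    # The jar's cookies are attached to this request automatically.
    with opener.open(article_url) as response:
        return response.read()
```

You would create the opener once with `opener, jar = make_cookie_opener()`, call `fetch_with_login(opener, ...)` with your own login details, and pass the returned bytes to `BeautifulSoup(html, "html.parser")` as in `articleCrawler`. Printing `list(jar)` after the login request is a quick way to check whether the proxy actually set a cookie.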
OK, I'll check that out and reply later. Thanks.