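How do I catch a 404 error in urllib? (Python 3)
I have read dozens of examples from similar questions, but I can't get any of the solutions I've seen, or variations of them, to run. I'm screen scraping, and I just want to ignore 404 errors (skip the page). I get
AttributeError: 'module' object has no attribute 'HTTPError'.
I also tried 'URLError'. I have seen nearly identical syntax in answers that were accepted as working. Any ideas? Here is what I've got: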
import urllib
import datetime
from bs4 import BeautifulSoup

class EarningsAnnouncement:
    def __init__(self, Company, Ticker, EPSEst, AnnouncementDate, AnnouncementTime):
        self.Company = Company
        self.Ticker = Ticker
        self.EPSEst = EPSEst
        self.AnnouncementDate = AnnouncementDate
        self.AnnouncementTime = AnnouncementTime

webBaseStr = 'http://biz.yahoo.com/research/earncal/'
earningsAnnouncements = []
dayVar = datetime.date.today()
for dte in range(1, 30):
    currDay = str(dayVar.day)
    currMonth = str(dayVar.month)
    currYear = str(dayVar.year)
    if (len(currDay)==1): currDay = '0' + currDay
    if (len(currMonth)==1): currMonth = '0' + currMonth
    dateStr = currYear + currMonth + currDay
    webString = webBaseStr + dateStr + '.html'
    try:
        #with urllib.request.urlopen(webString) as url: page = url.read()
        page = urllib.request.urlopen(webString).read()
        soup = BeautifulSoup(page)
        tbls = soup.findAll('table')
        tbl6 = tbls[6]
        rows = tbl6.findAll('tr')
        rows = rows[2:len(rows)-1]
        for earn in rows:
            earningsAnnouncements.append(EarningsAnnouncement(earn.contents[0], earn.contents[1],
                                                              earn.contents[3], dateStr, earn.contents[3]))
    except urllib.HTTPError as err:
        if err.code == 404:
            continue
        else:
            raise
    dayVar += datetime.timedelta(days=1)
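Thanks Kyle... however, this produces the error 'module' object has no attribute 'error' – StatsViaCsh
Can you try importing the 'urllib.error' module explicitly? – Kyle
I'm in business, Kyle. In Bucks County, by the way. :) Thanks – StatsViaCsh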