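I'm using the Beautiful Soup module to scrape the titles of a list of web pages saved in a CSV file. The script seems to work fine, but once it reaches the 82nd domain it produces the following Beautiful Soup error: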
Traceback (most recent call last):
  File "soup.py", line 31, in <module>
    print soup.title.renderContents() # 'Google'
AttributeError: 'NoneType' object has no attribute 'renderContents'
I'm fairly new to Python, so I'm not sure I understand the error. Could someone clarify what is going wrong?
My code is:
import csv
import socket
from urllib2 import Request, urlopen, URLError, HTTPError
from BeautifulSoup import BeautifulSoup
debuglevel = 0
timeout = 5
socket.setdefaulttimeout(timeout)
domains = csv.reader(open('domainlist.csv'))
f = open ('souput.txt', 'w')
for row in domains:
    domain = row[0]
    req = Request(domain)
    try:
        html = urlopen(req).read()
        print domain
    except HTTPError, e:
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
    except URLError, e:
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
    else:
        # everything is fine
        soup = BeautifulSoup(html)
        print soup.title # '<title>Google</title>'
        print soup.title.renderContents() # 'Google'
        f.writelines(domain)
        f.writelines(" ")
        f.writelines(soup.title.renderContents())
        f.writelines("\n")
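The traceback suggests that `soup.title` was `None` for the 82nd page. BeautifulSoup returns `None` there when it finds no `<title>` tag in the parsed HTML, which can happen with an error page, a redirect stub, or non-HTML content. Below is a minimal sketch of a guard, assuming the same `renderContents()` call used in the code above. The helper name `title_text` and its `default` parameter are hypothetical, introduced only for illustration.

```python
def title_text(soup, default=""):
    """Return the page title's contents, or `default` if there is no <title>."""
    # soup.title is None when the parsed document has no <title> tag,
    # so check it before calling any method on it.
    title = soup.title
    if title is None:
        return default
    return title.renderContents()
```

Inside the loop, something like `f.write(domain + " " + title_text(soup, "(no title)") + "\n")` would then record a placeholder for pages without a title instead of raising `AttributeError`.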
Thanks! That seems to do the job. – 2011-12-20 13:29:29