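Below is a small piece of code that gets a div attribute value. All the divs have the same class name and the same attribute name. I want to get each div's attribute value and its text body.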
import urllib2
from bs4 import BeautifulSoup

redditFile = urllib2.urlopen("http://www.bing.com/videos?q=owl")
redditHtml = redditFile.read()
redditFile.close()
soup = BeautifulSoup(redditHtml)
productDivs = soup.findAll('div', attrs={'class' : 'dg_u'})
for div in productDivs:
    print div.find('div', {"class":"vthumb"})['smturl']
    # print div.find("div", {"class":"tl text-body"})  # This prints None rather than the div text
At first it prints some URLs (sometimes 4, sometimes 6, 8, etc.), and then it fails with:
KeyError Traceback (most recent call last)
<ipython-input-34-cc950a8a84f7> in <module>()
26 productDivs = soup.findAll('div', attrs={'class' : 'dg_u'})
27 for div in productDivs:
---> 28 print div.find('div', {"class":"vthumb"})['smturl']
29 print div.find("div", {"class":"tl text-body"})
/usr/local/lib/python2.7/dist-packages/bs4/element.pyc in __getitem__(self, key)
903 """tag[key] returns the value of the 'key' attribute for the tag,
904 and throws an exception if it's not there."""
--> 905 return self.attrs[key]
906
907 def __iter__(self):
KeyError: 'smturl'
All the divs have the same name and the same smturl attribute name, so why does it raise a KeyError?
Any help?
Not every 'div' has the 'smturl' attribute. One way to find the ones that lack it: for div in productDivs: if 'smturl' not in div.find('div', {"class": "vthumb"}).attrs: print(div) – styvane
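
Building on that comment, here is a minimal sketch of how the loop could skip missing attributes instead of crashing. It assumes the cause is what the comment suggests (some "vthumb" divs have no smturl, or a result has no "vthumb" div at all). SAMPLE_HTML and extract_results are made-up names for illustration, and the markup is a hypothetical stand-in for the real page, which may be structured differently. Tag.get returns None for a missing attribute where tag[key] raises KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: the second "vthumb"
# div has no smturl attribute, and the third result has no "vthumb" div.
SAMPLE_HTML = """
<div class="dg_u"><div class="vthumb" smturl="http://example.com/a.jpg"></div><div class="tl text-body">Owl one</div></div>
<div class="dg_u"><div class="vthumb"></div><div class="tl text-body">Owl two</div></div>
<div class="dg_u"><div class="tl text-body">Owl three</div></div>
"""


def extract_results(html):
    """Return a list of (smturl, text) pairs, using None for missing parts."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for div in soup.find_all("div", attrs={"class": "dg_u"}):
        thumb = div.find("div", {"class": "vthumb"})
        # find() returns None when nothing matches, and Tag.get() returns
        # None for a missing attribute, so neither case raises here.
        url = thumb.get("smturl") if thumb is not None else None

        title_div = div.find("div", {"class": "tl text-body"})
        title = title_div.get_text(strip=True) if title_div is not None else None

        results.append((url, title))
    return results


if __name__ == "__main__":
    for url, title in extract_results(SAMPLE_HTML):
        print(url, title)
```

If the "tl text-body" lookup still comes back as None on the live page, the text div is probably not nested inside the "dg_u" div in the HTML that was actually downloaded, so it is worth printing one of the productDivs to check.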