2014-04-01 43 views

I want to grab specific text, the winner in every category, from this web page: http://www.chicagoreader.com/chicago/BestOf?category=4053660&year=2013 using BeautifulSoup.

I wrote this in Sublime Text:

import urllib2 
from bs4 import BeautifulSoup 
url = "http://www.chicagoreader.com/chicago/BestOf?category=4053660&year=2013" 
page = urllib2.urlopen(url) 
soup_package = BeautifulSoup(page) 
page.close() 

# find every div with class="bestOfItem". This works.
all_categories = soup_package.findAll("div",class_="bestOfItem") 
# print(all_categories) 

#this part breaks it: 
soup = BeautifulSoup(all_categories) 
winner = soup.a.string 
print(winner) 

When I run this in the terminal, I get the following error:

Traceback (most recent call last): 
    File "winners.py", line 12, in <module> 
    soup = BeautifulSoup(all_categories) 
    File "build/bdist.macosx-10.9-intel/egg/bs4/__init__.py", line 193, in __init__ 
    File "build/bdist.macosx-10.9-intel/egg/bs4/builder/_lxml.py", line 99, in prepare_markup 
    File "build/bdist.macosx-10.9-intel/egg/bs4/dammit.py", line 249, in encodings 
    File "build/bdist.macosx-10.9-intel/egg/bs4/dammit.py", line 304, in find_declared_encoding 
TypeError: expected string or buffer 

Does anyone know what is going on there?

Answers


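You are trying to create a new BeautifulSoup object from a list of elements. The BeautifulSoup constructor expects markup (a string or a file-like object), which is why it raises "TypeError: expected string or buffer" when handed the result of findAll: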

soup = BeautifulSoup(all_categories) 
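
To make the cause concrete, here is a minimal sketch showing what findAll / find_all hands back. The HTML string is a hypothetical stand-in for the real page (its actual markup is not shown in the question), and the "html.parser" argument is my choice so the snippet does not depend on lxml.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical markup standing in for the real page; the actual structure may differ.
html = '<div class="bestOfItem"><a href="#">Example Winner</a></div>'

soup = BeautifulSoup(html, "html.parser")
all_categories = soup.find_all("div", class_="bestOfItem")

# find_all returns a list-like ResultSet of already-parsed Tag objects,
# not a string of markup, so there is nothing left for BeautifulSoup() to parse.
is_list = isinstance(all_categories, list)
all_tags = all(isinstance(item, Tag) for item in all_categories)
print(type(all_categories).__name__, is_list, all_tags)
```

Each item in the result is already a parsed element, so it can be navigated directly.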

There is absolutely no need to build a second BeautifulSoup object here; just loop over each match instead:

for match in all_categories: 
    winner = match.a.string 
    print(winner) 
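
A self-contained variant of the loop, as a sketch only: SAMPLE_HTML and extract_winners are hypothetical names of mine, and the sample markup is an assumption about how the page might look. It also guards against two cases the short loop does not cover: a div with no link (match.a is None) and a link with nested tags (where .string is None).

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's structure may differ.
SAMPLE_HTML = """
<div class="bestOfItem"><a href="/w1">Winner One</a></div>
<div class="bestOfItem"><a href="/w2"><b>Winner</b> Two</a></div>
<div class="bestOfItem">No link here</div>
"""

def extract_winners(html):
    """Return the link text of every div.bestOfItem that contains a link."""
    soup = BeautifulSoup(html, "html.parser")
    winners = []
    for match in soup.find_all("div", class_="bestOfItem"):
        link = match.a  # first <a> inside this div, or None if there is none
        if link is None:
            continue
        # get_text joins all nested strings, unlike .string,
        # which is None when the tag has more than one child.
        winners.append(link.get_text(" ", strip=True))
    return winners

winners = extract_winners(SAMPLE_HTML)
print(winners)
```

If every div on the real page holds a single plain-text link, the shorter match.a.string loop is enough.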

@user1922698: Sorry, I misread the failure; corrected. –


Works great! Thanks. – user1922698