BeautifulSoup contents returns "list index out of range"

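I'm following these tutorials: http://importpython.blogspot.com/2009/12/how-to-get-beautifulsoup-to-filter.html and http://importpython.blogspot.com/2009/12/how-to-screen-scrape-craigslist-using.html. Even with the code copied and pasted, I can't get the link titles to print, because I get a "list index out of range" error on line 11 and line 8 respectively. If I copied the code verbatim, what am I doing wrong? I've tried other variations, such as returning only the links, and those work perfectly fine, so I don't think it's a local problem.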

Edit

The problem is with the following code (from http://importpython.blogspot.com/2009/12/how-to-screen-scrape-craigslist-using.html):

from BeautifulSoup import BeautifulSoup #1 
from urllib2 import urlopen    #2 

site = "http://sfbay.craigslist.org/rea/" #3 
html = urlopen(site)      #4 
soup = BeautifulSoup(html)    #5 
postings = soup('p')      #6 

for post in postings:      #7 
    print post('a')[0].contents[0]  #8 
    print post('a')[0]['href']   #9

It gives the error:

Traceback (most recent call last): 
    File "<stdin>", line 2, in <module> 
IndexError: list index out of range

Source

2014-06-10 omriki

Please include a [minimal example](http://sscce.org) of code that demonstrates the problem in the question itself, rather than just off-site links. – jonrsharpe

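This code relies on the HTML structure of the Craigslist site, which has changed since the tutorial was written. You will get the "correct" result from the second 'a' tag: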

print post('a')[1].contents[0] 
print post('a')[1]['href']
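
To illustrate why index 0 can fail, here is a minimal sketch that runs against inline HTML instead of the live site. It assumes Python 3 and BeautifulSoup 4 (`bs4`), not the BeautifulSoup 3 import from the question. The `SAMPLE_HTML` markup and the `extract_links` helper are made up for illustration and are not Craigslist's real structure. The idea is that `post('a')` can be an empty list, or the first 'a' can have empty `contents`; either case raises IndexError when indexed with `[0]`, so the sketch skips those instead of hard-coding a position.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4, not the BS3 import in the question

# Hypothetical markup, invented for illustration; not Craigslist's actual HTML.
SAMPLE_HTML = """
<p>No links in this paragraph</p>
<p><a href="/rea/1.html">First listing</a></p>
<p><a href="/rea/2.html"></a><a href="/rea/2.html">Second listing</a></p>
"""

def extract_links(html):
    """Return (title, href) pairs, skipping <p> tags without a usable <a>."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for post in soup("p"):        # soup('p') is shorthand for soup.find_all('p')
        for link in post("a"):    # empty list when the <p> contains no <a>
            title = link.get_text(strip=True)
            href = link.get("href")
            if title and href:
                results.append((title, href))
                break             # keep only the first usable link per <p>
    return results

links = extract_links(SAMPLE_HTML)
for title, href in links:
    print(title, href)
```

This way the loop keeps working whether the title ends up in the first or the second 'a' tag.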

Source

2014-06-10 08:33:16 cchristelis

BeautifulSoup is very powerful, so don't be lazy: use all of its power.

soup = BeautifulSoup(html) 
postings = soup.find_all('p', {'class': 'row'}) 

for post in postings: 
    info_container = post.find('span', {'class':'pl'}).find('a') 
    print info_container.text 
    print info_container['href']

I always try to avoid hard-coding array indices in my code. Using the find functions instead is the most intuitive approach.
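
One caveat with chained lookups: `find` returns None when nothing matches, so `post.find(...).find('a')` raises AttributeError on a row without the expected span. Below is a minimal sketch of a guarded version, assuming Python 3 and BeautifulSoup 4 (`bs4`). The `SAMPLE_HTML` markup and the `extract_titles` helper are hypothetical, invented for illustration; the `row` and `pl` class names are taken from the snippet above and may not match the current site.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 on Python 3

# Hypothetical markup, invented for illustration; class names follow the snippet above.
SAMPLE_HTML = """
<p class="row"><a href="/rea/1.html" class="i"></a><span class="pl"><a href="/rea/1.html">Sunny studio</a></span></p>
<p class="row"><span class="other">no title here</span></p>
<p class="footer"><a href="/about">about</a></p>
"""

def extract_titles(html):
    """Return (title, href) pairs for rows that contain span.pl > a."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for post in soup.find_all("p", class_="row"):
        container = post.find("span", class_="pl")
        # find() returns None on a miss, so guard before chaining
        link = container.find("a") if container is not None else None
        if link is None:
            continue  # row without the expected structure: skip it
        results.append((link.get_text(strip=True), link.get("href")))
    return results

titles = extract_titles(SAMPLE_HTML)
for title, href in titles:
    print(title, href)
```

Rows that lack the expected structure are skipped rather than crashing the whole loop.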

Source

2014-06-10 16:31:48 Curro
