Beautiful Soup error: list index out of range

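I am a very new Python programmer, working on a web crawler with urllib and BeautifulSoup. Please ignore the while loop at the top and my increment of i; I am only running this test version for a single page, but it will eventually cover a whole set of pages. My problem is that this gets the soup but then raises an error. I am not sure whether I am collecting the table data correctly, but I want this code to ignore the links and write the text to a .csv file. For now I am focused on getting the text to print to the screen correctly.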

line 17, in <module> 
    uspc = col[0].string 
IndexError: list index out of range

Here is the code:

import urllib 
from bs4 import BeautifulSoup 

i=125 
while i==125: 
    url = "http://www.uspto.gov/web/patents/classification/cpc/html/us" + str(i) + "tocpc.html" 
    print url + '\n' 
    i += 1 
    data = urllib.urlopen(url).read() 
    print data 
    #get the table data from dump 
    #append to csv file 
    soup = BeautifulSoup(data) 
    table = soup.find("table", width='80%') 
    for row in table.findAll('tr')[1:]: 
     col = row.findAll('td') 
     uspc = col[0].string 
     cpc1 = col[1].string 
     cpc2 = col[2].string 
     cpc3 = col[3].string 
     record = (uspc, cpc1, cpc2, cpc3) 
     print "|".join(record)

Source

2013-04-09 Super-cluser

[Beautifulsoup for row loop only runs once?](http://stackoverflow.com/questions/15908604/beautifulsoup-for-row-loop-only-runs-once) – gauden 2013-04-09 18:10:16

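In the end I solved this by changing the line `for row in table.findAll('tr')[1:]:` to `for row in table.findAll('tr')[2:]:`. The error occurred because the first row of the table has split columns, so that row does not contain the four cells the loop body indexes.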

Source

2013-04-12 17:39:59
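
The fix above depends on knowing how many leading rows to skip. As an alternative, here is a minimal sketch (not the asker's code) that checks the number of cells in each row before indexing, so that header rows or rows with split columns are skipped wherever they appear. It assumes Python 3 and that `table` is the tag returned by `soup.find("table", width='80%')`. The names `iter_records`, `write_records` and `width` are hypothetical and chosen only for illustration.

```python
import csv


def iter_records(table, width=4):
    """Yield one tuple of cell texts per data row of a BeautifulSoup <table> tag.

    Rows with fewer than `width` <td> cells (for example header rows or
    rows with split columns) are skipped instead of being indexed.
    """
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) < width:
            continue  # indexing cells[0] here is what raised the IndexError
        # get_text() gives "" for an empty cell, whereas .string may be None,
        # which would break "|".join(...) later on
        yield tuple(cell.get_text(strip=True) for cell in cells[:width])


def write_records(records, fileobj):
    """Write the records as CSV rows to an already opened text file object."""
    writer = csv.writer(fileobj)
    writer.writerows(records)
```

With the table from the question, the rows could then be printed with `for record in iter_records(table): print("|".join(record))`, or saved with `write_records(iter_records(table), f)` on a file opened with `open("out.csv", "w", newline="")`. The file name is only an example.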
