2017-06-19 25 views
0

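Python list population with loops

I can't populate the `county` list below from the results of my loop. When I print each iteration's result together with the item's index in the list, I get an index of 0 every time, which suggests that the data is not retained in the list from one pass to the next. So when I try to index `county` after the loop has finished, there is no data in it, and I get a 'list index out of range' error.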

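I have researched the 'list index out of range' error that I keep getting, and I understand that it occurs because the `county` list is empty, but why is it empty?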

The HTML source that makes up one entry in the `target_divs` list looks like this:

<div class="school-type-list-text"> 
<div class="table_cell_county"><a href='/alabama/autauga-county'>Autauga County</a></div> 
<div class="change_div"></div> 
<div class="table_cell_other">7<span> Schools</span></div> 
<div class="table_cell_other">1,587<span> Students</span></div> 
<div class="table_cell_other">8%<span> Minority</span></div> 
<div class="break"></div> 

Here is my script:

import urllib2 
from bs4 import BeautifulSoup 
import pandas 
import csv 

page1 = 'https://www.privateschoolreview.com/alabama' 
alabama = urllib2.urlopen(page1) 
soup = BeautifulSoup(alabama, "lxml") 
target_divs = soup.find_all("div", class_= "school-type-list-text") 

for i in target_divs: 
    county = i.find_all("div", class_= "table_cell_county") 
    for i in county: 
     print i.text 
     print county.index(i) 

print county 
print county[0] 

Updated after @Software2 advised changing the loop variable names, but I still get the same error:

import urllib2 
from bs4 import BeautifulSoup 
import pandas 
import csv 

page1 = 'https://www.privateschoolreview.com/alabama' 

alabama = urllib2.urlopen(page1) 

soup = BeautifulSoup(alabama, "lxml") 

target_divs = soup.find_all("div", class_= "school-type-list-text") 

for div in target_divs: 
    counties = div.find_all("div", class_= "table_cell_county") 
    for county in counties: 
     print county.text 
     print counties.index(county) 

print counties 
+0

You have two `for` loops that both refer to `i` – depperm

+1

The OP has pasted the output from the code. Please don't edit it. –

Answers

0

You could try this. It looks like you are using the same `i` in both of the nested loops.

for i in target_divs: 
    county = i.find_all("div", class_= "table_cell_county") 
    for j in county: 
     print j.text 
     print county.index(j) 
0

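You are using the same variable `i` in the nested loops for two different things, so the first one gets overwritten. Change the name of the second variable.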

More generally, variable names like `i` are not very descriptive, which makes mistakes like this easy to make. Try something like this:

for div in target_divs: 
    counties = div.find_all("div", class_= "table_cell_county") 
    for county in counties: 
     print county.text 
     print counties.index(county) 
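
To see what reusing the name actually does, here is a minimal sketch in Python 3 syntax. The nested lists are made-up stand-ins for the parsed divs, not the real page data. It shows that the inner loop rebinds the name, so the outer value is no longer reachable after the inner loop, although the outer loop itself keeps iterating normally:

```python
# Python 3 sketch; `rows` is a hypothetical stand-in for target_divs.
rows = [["a", "b"], ["c"]]

# Same name for both loops: after the inner loop, `i` is the last inner
# item rather than the row the outer loop handed out.
seen_after_inner = []
for i in rows:
    for i in i:
        pass
    seen_after_inner.append(i)

print(seen_after_inner)  # ['b', 'c']

# Distinct names: the outer value is still available after the inner loop.
rows_after_inner = []
for row in rows:
    for item in row:
        pass
    rows_after_inner.append(row)

print(rows_after_inner)  # [['a', 'b'], ['c']]
```

Since the outer iteration is unaffected, renaming the variables makes the code easier to follow but may not, on its own, change what ends up in the list after the loop.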
+0

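I made the change, but 'county' still isn't populated. Any other ideas? I have updated my code in the post above so you can check that I followed your suggestion. – SFarkas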

0

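I assume you want `counties` to be a list of the county names. As I see it, the problem is the return value of `div.find_all()`: each call returns a list containing at most one county, and the loop rebinds the variable to that list on every pass instead of accumulating the results. To populate `counties`, try the following approach: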

counties = [] 
for div in target_divs: 
    county = div.find_all('div', class_= 'table_cell_county') 
    for c in county: 
     counties.append(c.text.encode('utf-8')) 

print counties # Returns: ['Autauga County', 'Baldwin County', 'Barbour County', 'Bibb County', 'Blount County', 'Bullock County', 'Butler County', 'Calhoun County', 'Chambers County', 'Chilton County', 'Choctaw County', 'Clarke County', 'Clay County', 'Coffee County', 'Colbert County', 'Conecuh County', 'Covington County', 'Crenshaw County', 'Cullman County', 'Dale County', 'Dallas County', 'Dekalb County', 'Elmore County', 'Escambia County', 'Etowah County', 'Greene County', 'Hale County', 'Henry County', 'Houston County', 'Jackson County', 'Jefferson County', 'Lauderdale County', 'Lee County', 'Limestone County', 'Lowndes County', 'Macon County', 'Madison County', 'Marengo County', 'Marion County', 'Marshall County', 'Mobile County', 'Monroe County', 'Montgomery County', 'Morgan County', 'Perry County', 'Pickens County', 'Pike County', 'Randolph County', 'Russell County', 'Saint Clair County', 'Shelby County', 'Sumter County', 'Talladega County', 'Tallapoosa County', 'Tuscaloosa County', 'Walker County', 'Wilcox County', 'Winston County'] 
print counties[0] # Returns: 'Autauga County' 
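
The difference between the two patterns can be reproduced without the network or BeautifulSoup. This is only a sketch in Python 3 syntax: `per_div_results` is a made-up stand-in for what `div.find_all(...)` might return for three divs, and the last entry is deliberately empty to mimic a div that contains no county cell.

```python
# Python 3 sketch; `per_div_results` is hypothetical sample data.
per_div_results = [["Autauga County"], ["Baldwin County"], []]

# Pattern from the question: the name is rebound on every pass, so only
# the result of the last pass survives the loop.
for result in per_div_results:
    county = result

print(county)  # [] (so county[0] would raise IndexError)

# Pattern from this answer: create one list before the loop and append
# to it inside the loop.
counties = []
for result in per_div_results:
    for name in result:
        counties.append(name)

print(counties)     # ['Autauga County', 'Baldwin County']
print(counties[0])  # Autauga County
```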
+0

That's it! Thank you @root!!! – SFarkas

+0

@SFarkas No problem! Also, if you could upvote it or mark it as the answer, that will help others too :) – root