2017-08-02

Python web scraping: class problem

I am trying to scrape real estate agents' names from this website.

My code:

containers = page_soup.findAll("div",{"class":"team-details"}) 

for container in containers: 
    agent_name = container.findAll("a", {"class":"team-name_link"}) 
    name = agent_name[0].text 


    print("name: " + name) 

However, when I run the script, I get only the first two names, followed by an error message:

name: Michael Stavrianos 
name: Kristalla Stavrianos 
Traceback (most recent call last): 
    File "C:\Users\Toby\Desktop\Webscrape\LjHooker - mark1.py", line 16, in <module> 
    name = agent_name[0].text 
IndexError: list index out of range 

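I found that the first two agents' names are under the class "team-name_link", but the rest are under the class "team-name". I am not sure how to scrape the names from both classes at the same time.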
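A minimal sketch of why the traceback appears, using made-up markup rather than the real page (the HTML string and the agent names below are assumptions for illustration). When a container has no "a" tag with class "team-name_link", findAll (spelled find_all in current Beautiful Soup) returns an empty list, so indexing it with [0] raises IndexError:

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real page's HTML may differ.
html = """
<div class="team-details">
  <div class="team-name"><a class="team-name_link" href="#">Agent One</a></div>
</div>
<div class="team-details">
  <div class="team-name">Agent Two</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
containers = soup.find_all("div", {"class": "team-details"})

# Count the matching links in each container; a count of 0 is what makes [0] fail.
link_counts = [len(c.find_all("a", {"class": "team-name_link"})) for c in containers]
print(link_counts)

# Guarded version of the loop: skip containers that have no matching link.
found = []
for container in containers:
    links = container.find_all("a", {"class": "team-name_link"})
    if not links:
        continue
    found.append(links[0].text)
print(found)
```

The guard only avoids the crash; it still misses every agent whose name is not wrapped in a link, which is the actual problem being asked about.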

Answer


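I think you have this slightly wrong: not all of the names are inside the tag you are searching for. Instead, you need to look for the div with class "team-name":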

from bs4 import BeautifulSoup
import requests

# Download the team page and parse it
html = requests.get("https://woollahra.ljhooker.com.au/our-team").text
soup = BeautifulSoup(html, 'html.parser')
containers = soup.findAll("div",{"class":"team-details"})

for container in containers:
    # Look up the "team-name" div instead of the link, which only some agents have
    agent_name = container.find("div", {"class":"team-name"})
    name = agent_name.text
    print(name)

The code above outputs:

Michael Stavrianos 
       Licensee 



Kristalla Stavrianos 
       Principal 



Jade Marshall 
       Property Management Associate 


Emma Phelan 
       Property Management Associate 


Isabella Marechal - Ross 
       Property Management Associate 


Victoria Empson 
       Property Investment Manager