2017-08-08 39 views

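Scraping names and addresses into a dictionary (Python, BeautifulSoup4)

I want to scrape the name and address data from this company's member directory page: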

http://mfda.ca/members/directory-of-members/

I want to store the output in a dictionary, with each member's name (e.g. 3i Financial Investment Services Inc.) as the key and its address as the value.

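I am able to add the names to the dictionary as keys, but for some reason I cannot attach the addresses to those keys. Can anyone guide me on how to do this?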

import requests
from bs4 import BeautifulSoup

url = "http://mfda.ca/members/directory-of-members/"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")

# Names: create one empty entry per member, keyed by name
letters = soup.find_all("div", class_="col-sm-6 col-md-6")

lobbying = {}
for element in letters:
    lobbying[element.b.get_text()] = {}
print(lobbying)

# Addresses: this is the part that fails
Addr = soup.find_all("div", class_="col-sm-6 col-md-6 p-marg")
for element in Addr:
    address = element.p.get_text()
    # The lookup below uses the address text as the key, but the keys are member names
    lobbying[element.p.get_text()]["addr"] = address

The number of tags in letters and the number of address tags do not match.
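To make the two problems concrete without hitting the network, here is a minimal sketch using made-up data. The names and addresses below are hypothetical placeholders, not values from the real page. It shows that looking up an entry by its address text raises a KeyError, and that pairing two separately collected lists of different lengths silently drops entries.

```python
# Hypothetical sample data standing in for the scraped text.
names = ["Member A Inc.", "Member B Ltd.", "Member C Corp."]
addresses = ["1 First St", "2 Second Ave"]  # one fewer entry: the counts do not match

# Same shape as the question's dictionary: one empty entry per member name.
lobbying = {name: {} for name in names}

# Looking up by address text fails, because the keys are member names.
missing_key = None
try:
    lobbying[addresses[0]]["addr"] = addresses[0]
except KeyError as exc:
    missing_key = exc.args[0]

# zip() stops at the shorter list, so the unmatched name is dropped without any error.
paired = dict(zip(names, addresses))

print(missing_key)
print(paired)
```

If the missing address belonged to a member in the middle of the list rather than at the end, every later name would also be paired with the wrong address, which is why matching name and address within the same row is safer.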

Answer

I would suggest scraping the names and addresses together, row by row, and building the dictionary at the same time:

lobbying = {}
rows = soup.find_all('div', {'class': 'row member-name'})

for row in rows:
    try:
        name = row.find('div', {'class': 'col-sm-6 col-md-6'})
        addr = row.find('div', {'class': 'col-sm-6 col-md-6 p-marg'})
        lobbying[name.a.b.text] = {'addr': addr.p.text}
    except AttributeError:
        # Skip rows that lack a name or an address tag
        pass

print(lobbying)

Output:

{
    '3i Financial Investment Services Inc.': {
        'addr': 'Suite #221, 9040 Leslie Street\nRichmond Hill, ON L4B 3M4\nPhone: (905) 597-5000\nFax: (905) 597-8366'
    },
    'ARTECH Asset Advisory Services Inc.': {
        'addr': '209 - 3993 Henning Drive\nBurnaby, BC\xa0V5C 6P7\nPhone: (604) 434-3863\nFax: (604) 434-3873'
    }
...
}
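
For anyone who wants to try the row-by-row approach offline, below is a minimal sketch that runs against a small inline HTML sample instead of the live site. The markup in SAMPLE_HTML is made up to mimic the class names used above, and the real page may be structured differently. The function name parse_members is hypothetical, and replacing the non-breaking space (\xa0) with a regular space is an optional extra step, not part of the approach above.

```python
from bs4 import BeautifulSoup

# Made-up markup that mimics the class names used in the answer.
# Assumption: the real page's structure may differ from this sample.
SAMPLE_HTML = """
<div class="row member-name">
  <div class="col-sm-6 col-md-6"><a href="#"><b>Example Member Inc.</b></a></div>
  <div class="col-sm-6 col-md-6 p-marg"><p>1 Example Street<br>Toronto, ON&nbsp;A1A 1A1</p></div>
</div>
<div class="row member-name">
  <div class="col-sm-6 col-md-6"><a href="#"><b>No Address Ltd.</b></a></div>
</div>
"""


def parse_members(html):
    """Return {member name: {'addr': address}} for rows that have both parts."""
    soup = BeautifulSoup(html, "html.parser")
    members = {}
    for row in soup.find_all("div", class_="row member-name"):
        name_tag = row.find("b")
        addr_div = row.find("div", class_="col-sm-6 col-md-6 p-marg")
        # Skip incomplete rows explicitly instead of catching AttributeError.
        if name_tag is None or addr_div is None or addr_div.p is None:
            continue
        # Join the address lines with newlines and replace non-breaking spaces.
        address = addr_div.p.get_text(separator="\n", strip=True).replace("\xa0", " ")
        members[name_tag.get_text(strip=True)] = {"addr": address}
    return members


members = parse_members(SAMPLE_HTML)
print(members)
```

Because the name and address are read from the same row, a row with a missing address is simply skipped and cannot shift the pairing for the rows that follow.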