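BeautifulSoup - misspelled class

I am trying to scrape location data from https://www.wellstar.org/locations/pages/default.aspx. Looking at the page source, I noticed that the class for the hospital address is sometimes spelled with an extra 'd': both 'adddress' and 'address' appear. Is there a way to handle this discrepancy in the code below? I tried adding an if statement that tests the length of the address object, but I still only get the addresses associated with the 'adddress' class. I feel like I am close but I am out of ideas.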
import urllib
import urllib.request
from bs4 import BeautifulSoup
import re

def make_soup(url):
    thepage = urllib.request.urlopen(url)
    soupdata = BeautifulSoup(thepage, "html.parser")
    return soupdata

soup = make_soup("https://www.wellstar.org/locations/pages/default.aspx")

for table in soup.findAll("table", class_="s4-wpTopTable"):
    for type in table.findAll("h3"):
        type = type.get_text()
    for name in table.findAll("div", class_="PurpleBackgroundHeading"):
        name = name.get_text()
    address = ""
    for address in table.findAll("div", class_="WS_Location_Adddress"):
        address = address.get_text(separator=" ")
    if len(address) == 0:
        for address in table.findAll("div", class_="WS_Location_Address"):
            address = address.get_text(separator=" ")
    print(type, name, address)
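
For reference, here is a minimal sketch of one way both spellings could be matched in a single lookup, rather than testing the length afterwards. Beautiful Soup's `class_` argument accepts a compiled regular expression or a list of class names, so either can cover 'Adddress' and 'Address' together. The `SAMPLE_HTML` string and the `extract_locations` helper are made up for illustration; the real page is not fetched here and its markup may differ from this stand-in.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page markup (assumption: one table per location).
SAMPLE_HTML = """
<table class="s4-wpTopTable"><tr><td>
  <h3>Hospital</h3>
  <div class="PurpleBackgroundHeading">Example Hospital A</div>
  <div class="WS_Location_Adddress">1 First St<br/>Town A</div>
</td></tr></table>
<table class="s4-wpTopTable"><tr><td>
  <h3>Clinic</h3>
  <div class="PurpleBackgroundHeading">Example Clinic B</div>
  <div class="WS_Location_Address">2 Second St<br/>Town B</div>
</td></tr></table>
"""

# Matches the class name with either two or three d's.
ADDRESS_CLASS = re.compile(r"^WS_Location_Ad{2,3}ress$")


def extract_locations(html):
    """Return a list of (type, name, address) tuples, one per location table."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for table in soup.find_all("table", class_="s4-wpTopTable"):
        kind = table.find("h3")
        name = table.find("div", class_="PurpleBackgroundHeading")
        # Option 1: a regular expression covering both spellings.
        address = table.find("div", class_=ADDRESS_CLASS)
        # Option 2: a list of accepted class names, no regex needed:
        # address = table.find(
        #     "div", class_=["WS_Location_Adddress", "WS_Location_Address"])
        rows.append(tuple(
            # A missing tag becomes an empty string instead of raising.
            tag.get_text(separator=" ", strip=True) if tag else ""
            for tag in (kind, name, address)
        ))
    return rows


for row in extract_locations(SAMPLE_HTML):
    print(*row)
```

Using `find` instead of looping over `findAll` also avoids reusing the loop variable as the result. To try this against the live page, the response from `urllib.request.urlopen(url)` could be passed in place of `SAMPLE_HTML`.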
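Two great options. To be honest, I am both intrigued and intimidated by regular expressions. This may be a reason to take some time to learn the operators. – Daniel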