Web scraping - find_all returns no values
I am using Beautiful Soup 4 for web scraping, but find_all('tables') returns nothing.
Below is my code:
#import the library used to query a website
import urllib.request
#specify the url
wiki="https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"
#Query the website and return the html to the variable 'page'
page = urllib.request.urlopen(wiki)
#import the Beautiful soup functions to parse the data returned from the website
from bs4 import BeautifulSoup
#Parse the html in the 'page' variable, and store it in Beautiful Soup format
soup = BeautifulSoup(page, "html.parser")
print (soup.prettify())
soup.title
soup.title.string
soup.a
soup.find_all("a")
all_links = soup.find_all("a")
for link in all_links:
    print(link.get("href"))
all_tables = soup.find_all('tables')
Log: all_tables = soup.find_all('tables')
Please advise.
There is no <tables> tag in the page; perhaps you meant 'soup.find_all('table')'. – metatoaster
Try with 'all_tables = soup.find_all('table')' – PRMoureu
tables is not an HTML tag. What do you want all_tables to contain? – Nenad
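To illustrate what the comments point out, here is a minimal sketch that runs offline. The HTML string is a made-up stand-in for the Wikipedia page (the table contents and the variable names html, no_tables and first_rows are assumptions, not from the question), and it assumes the built-in "html.parser" backend. It shows that find_all matches on the exact tag name, so 'tables' finds nothing while 'table' finds every <table> element.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch needs no network.
html = """
<html><body>
  <table class="wikitable">
    <tr><th>State</th><th>Capital</th></tr>
    <tr><td>Example State</td><td>Example City</td></tr>
  </table>
  <table><tr><td>other</td></tr></table>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# 'tables' is not an HTML tag name, so nothing matches.
no_tables = soup.find_all("tables")
# 'table' is the real tag name, so both tables are returned.
all_tables = soup.find_all("table")

print(len(no_tables))
print(len(all_tables))

# Pull the cell text out of the first table, row by row.
first_rows = [
    [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
    for row in all_tables[0].find_all("tr")
]
print(first_rows)
```

With the real page, the same find_all('table') call should work on the soup built from urllib.request.urlopen(wiki); which of the returned tables holds the data you want depends on the page layout at the time you fetch it.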