findAll("a") in BeautifulSoup (Python)

I'm new to Python. Can someone explain what findAll("a") means in the code below? Can I put any other letter there, like g, h, or m? Does 'a' mean finding the letter "a" in the article?

And does href=re.compile("^(/wiki/)((?!:).)*$") mean finding the links that have wiki in their name?

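from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

# Download the page and parse it; naming the parser avoids a bs4 warning.
html = urlopen("http://en.wikipedia.org/wiki/Kevin_Bacon")
bsObj = BeautifulSoup(html, "html.parser")

# Within the bodyContent div, find <a> tags whose href starts with /wiki/ and has no colon.
for link in bsObj.find("div", {"id": "bodyContent"}).findAll(
        "a", href=re.compile("^(/wiki/)((?!:).)*$")):
    if 'href' in link.attrs:
        print(link.attrs['href'])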

Can anyone suggest some good books for learning web scraping in Python 3.6 that a beginner can follow easily?

Source

2017-08-08 Prince Bhatia

To start with, check the documentation (https://www.crummy.com/software/BeautifulSoup/bs3/documentation.html); it finds all links. – Mangohero1

findAll("a") searches for all <a> (anchor) tags.

And yes, you can use 'h1', 'b', 'strong', or any other valid HTML tag name in place of 'a'.
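
As a minimal sketch of this, assuming a small made-up HTML snippet (the markup and variable names below are illustrative, not from the question), the tag name passed to find_all (the newer spelling of findAll in bs4) selects which elements come back. A name that matches no tag in the document simply yields an empty list:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = """
<div id="bodyContent">
  <h1>Title</h1>
  <p>Some <b>bold</b> and <strong>strong</strong> text.</p>
  <a href="/wiki/Kevin_Bacon">Kevin Bacon</a>
  <a href="/wiki/Category:Actors">Category</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

anchors = soup.find_all("a")    # every <a> tag
bold = soup.find_all("b")       # every <b> tag (does not include <strong>)
unknown = soup.find_all("g")    # no <g> tag in this snippet, so an empty list

anchor_hrefs = [a["href"] for a in anchors]
bold_texts = [tag.get_text() for tag in bold]

print(anchor_hrefs)
print(bold_texts)
print(unknown)
```

So the argument is a tag name, not a letter to search for in the page text.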

You can learn more about BeautifulSoup here

And re.compile("^(/wiki/)((?!:).)*$") will match the links whose href starts with /wiki/ and contains no colon.
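
To see what the pattern accepts, here is a small sketch using only the re module; the sample href values are made up for illustration. The pattern requires the string to start with /wiki/ and, through the negative lookahead (?!:), rejects any value containing a colon, which on Wikipedia filters out namespace pages such as Category: or File:.

```python
import re

# Same pattern as in the question's code.
pattern = re.compile("^(/wiki/)((?!:).)*$")

# Hypothetical href values for illustration.
hrefs = [
    "/wiki/Kevin_Bacon",
    "/wiki/Category:American_actors",
    "/wiki/File:Kevin_Bacon.jpg",
    "//en.wikipedia.org/wiki/Main_Page",
    "#cite_note-1",
]

# Keep only the values the pattern accepts.
article_links = [h for h in hrefs if pattern.search(h)]
print(article_links)
```

Only the plain article link passes the filter; the others either contain a colon or do not begin with /wiki/.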

Source

2017-08-08 12:51:53

images = bsObj.findAll("img", {"src": re.compile("\.\.\/img\/gifts/img.*\.jpg")}) for image in images: print(image["src"]) But in this code, we use ("img") after findAll? –
