How to get the a tags inside a div tag in Python?

I am scraping the website http://i.cantonfair.org.cn/en/expexhibitorlist.aspx?categoryno=411 with Python. I want to get the links inside the given div tag, which contains a tags like these:

<div id="main_category"> 
    <div class="tit1"><a href="#" onclick="ExpandStage(1);"><strong>Phase 1</strong><br />April 15 - 19</a></div> 
    <ul id="phase1"> 
    <li><a href="expexhibitorlist.aspx?categoryno=411">Consumer Electronics and Information Products</a></li> 
    <li><a href="expexhibitorlist.aspx?categoryno=412">Electronic and Electrical Products</a></li>

I only want all the a tags, like:

<a href="expexhibitorlist.aspx?categoryno=411">Consumer Electronics and Information Products</a>

Also, how can I find those URLs using a regular expression?

I tried this:

from bs4 import BeautifulSoup
import re
import urllib.request

r = urllib.request.urlopen('http://i.cantonfair.org.cn/en/expexhibitorlist.aspx?categoryno=410').read()
soup = BeautifulSoup(r, "html.parser")
letters = soup.find_all("div", {"id": "main_category"})
for element in letters:
    # element.a is only the first <a> inside the div, so a single link text is printed
    categories = element.a.get_text()
    print(categories)


2016-02-24 Aman Kumar

I am using Python 2.7, and the code below works for me. The same approach can be used in Python 3. Hope it helps:

# Python 2.7
from bs4 import BeautifulSoup as bs
from urllib2 import urlopen

r = urlopen('http://i.cantonfair.org.cn/en/expexhibitorlist.aspx?categoryno=410').read()
soup = bs(r, "lxml")
# Collect every <li> on the page, then read the href of the first <a> in each one
lis = soup.find_all("li")
hrefs = [c.a['href'] for c in lis]
print hrefs
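
For Python 3, the same idea can be narrowed to the div from the question, so that list items elsewhere on the page are skipped. The following is only a sketch: it parses a copy of the markup from the question (closed off with the missing end tags, plus one extra link outside the div, both of which are assumptions added for illustration) rather than the live page, and the helper name category_links is hypothetical.

```python
from bs4 import BeautifulSoup

# Sample markup based on the question. The closing </ul> and </div> and the
# final "outside" list are assumptions added so the example is self-contained.
html = """
<div id="main_category">
    <div class="tit1"><a href="#" onclick="ExpandStage(1);"><strong>Phase 1</strong><br />April 15 - 19</a></div>
    <ul id="phase1">
    <li><a href="expexhibitorlist.aspx?categoryno=411">Consumer Electronics and Information Products</a></li>
    <li><a href="expexhibitorlist.aspx?categoryno=412">Electronic and Electrical Products</a></li>
    </ul>
</div>
<ul><li><a href="other.aspx">Outside the div</a></li></ul>
"""


def category_links(markup):
    """Return (href, text) pairs for the <li> links inside div#main_category."""
    soup = BeautifulSoup(markup, "html.parser")
    # CSS selector: <a> tags that have an href and sit in an <li> under the div.
    anchors = soup.select("div#main_category li a[href]")
    return [(a["href"], a.get_text(strip=True)) for a in anchors]


links = category_links(html)
for href, text in links:
    print(href, text)
```

To run this against the real site, the bytes returned by urllib.request.urlopen(...).read(), as in the question, could be passed to category_links in place of the sample string. The hrefs are relative, so urllib.parse.urljoin with the page URL would turn them into absolute links.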


2016-02-24 14:38:48 Quinn
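
The question also asked about regular expressions, which the answer above does not cover. A rough sketch with the standard re module is shown here. It assumes the hrefs are double-quoted and always carry a categoryno parameter, as in the snippet from the question; the names HREF_RE and category_urls are made up for this example. Regular expressions are fragile on HTML in general, so this is probably best kept for quick one-off extraction.

```python
import re

# Two list items copied from the question, plus the "#" link that should be ignored.
html = """
<div class="tit1"><a href="#" onclick="ExpandStage(1);"><strong>Phase 1</strong></a></div>
<li><a href="expexhibitorlist.aspx?categoryno=411">Consumer Electronics and Information Products</a></li>
<li><a href="expexhibitorlist.aspx?categoryno=412">Electronic and Electrical Products</a></li>
"""

# Capture the value of any double-quoted href that ends in categoryno=<digits>.
HREF_RE = re.compile(r'href="([^"]*categoryno=\d+)"')


def category_urls(markup):
    """Return every href value in the markup that matches HREF_RE."""
    return HREF_RE.findall(markup)


urls = category_urls(html)
print(urls)
```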
