BeautifulSoup - getting tags separated by commas

I'm trying to get all the tags on a website that are separated by commas. I have this piece of code:

results = []

all_links = soup.find_all('article')
for link in all_links:
    print(link.find('div', class_="cb-category cb-byline-element"))

This way, the scraped data is displayed as follows (with the <a> tags separated by ','):

<div class="cb-category cb-byline-element"><i class="fa fa-folder-o"></i> <a href="http://ridethetempo.com/category/canadian/" title="View all posts in Canadian">Canadian</a>, <a href="http://ridethetempo.com/category/music/garage-rock/" title="View all posts in Garage">Garage</a>, <a href="http://ridethetempo.com/category/listen-2/" title="View all posts in Listen">Listen</a>, <a href="http://ridethetempo.com/category/music/" title="View all posts in Music">Music</a>, <a href="http://ridethetempo.com/category/music/psychedelic/" title="View all posts in Psychedelic">Psychedelic</a>, <a href="http://ridethetempo.com/category/under-2000/" title="View all posts in Under 2000">Under 2000</a></div>

However, if I do the following:

results.append(link.find('div', class_="cb-category cb-byline-element"))  # inside the loop above
for link in results:
    print(link.find('a', href=True)['href'])

I only get the first <a> of each <div> block, like this:

http://ridethetempo.com/category/canadian/

How can I recursively retrieve all the <a> tags, so that I end up with a result like this?

http://ridethetempo.com/category/canadian/ 
http://ridethetempo.com/category/music/garage-rock/ 
http://ridethetempo.com/category/listen-2/ 
http://ridethetempo.com/category/music/ 
http://ridethetempo.com/category/music/psychedelic/ 
http://ridethetempo.com/category/under-2000/

Source

2017-04-20 outkast

Try findAll('a', href=True) – datawrestler

Yes, that works, thank you – outkast
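
For reference, here is a minimal sketch of how that suggestion might be applied to the loop from the question. The HTML is a trimmed-down sample of the markup shown above (only two of the six links), and the variable names `article`, `div` and `a` are illustrative, not taken from the original code. The only real change is calling find_all() on each <div> instead of find(), since find() returns just the first match.

```python
from bs4 import BeautifulSoup

# Trimmed-down sample of the markup from the question (two of the six links).
html = """
<article>
  <div class="cb-category cb-byline-element"><i class="fa fa-folder-o"></i>
    <a href="http://ridethetempo.com/category/canadian/">Canadian</a>,
    <a href="http://ridethetempo.com/category/music/garage-rock/">Garage</a>
  </div>
</article>
"""

soup = BeautifulSoup(html, "html.parser")

results = []
for article in soup.find_all("article"):
    div = article.find("div", class_="cb-category cb-byline-element")
    if div is None:  # skip articles that have no category block
        continue
    # find() stops at the first match; find_all() returns every matching <a>.
    for a in div.find_all("a", href=True):
        results.append(a["href"])

for href in results:
    print(href)
```

The `if div is None` check is an added safeguard for articles without a category block; it can be dropped if every article is known to have one.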

for link in soup.find_all('a'): 
    print(link.get('href'))

This will print the href of every <a> element in the document.
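
Note that soup.find_all('a') matches every link on the page, not only the category links. If only the links inside the category <div> are wanted, one possible alternative is a CSS selector via select(). This is a sketch under assumptions: the sample HTML is shortened from the question, and the extra example.com link is hypothetical, added only to show what gets filtered out.

```python
from bs4 import BeautifulSoup

# Shortened sample markup; the example.com link is a hypothetical unrelated link.
html = """
<a href="http://example.com/unrelated/">Unrelated</a>
<article>
  <div class="cb-category cb-byline-element">
    <a href="http://ridethetempo.com/category/canadian/">Canadian</a>,
    <a href="http://ridethetempo.com/category/music/">Music</a>
  </div>
</article>
"""

soup = BeautifulSoup(html, "html.parser")

# Select only <a> tags that have an href and sit inside a div.cb-category.
hrefs = [a["href"] for a in soup.select("div.cb-category a[href]")]

for href in hrefs:
    print(href)
```

Here the unrelated link is skipped, whereas soup.find_all('a') would return all three links.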

Source

2017-04-20 04:37:50
