2017-04-18
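Python: How do I select adjacent elements?

I want to scrape https://en.wikiquote.org/wiki/Remember_the_Titans#Coach_Boone and collect the quotes from every section except Dialogue, Taglines, and External links. I can select ul > li, but that returns every list item on the page. How can I get only the ul > li elements that follow the HTML below?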

<h2><span class="mw-headline" id="Coach_Boone">Coach Boone</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Remember_the_Titans&amp;action=edit&amp;section=1" title="Edit section: Coach Boone">edit</a><span class="mw-editsection-bracket">]</span></span></h2> 

Answer

Once you have located the h2 element, use the .find_next_siblings() method to get the ul sibling elements that follow it:

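# soup is assumed to be a BeautifulSoup object built from the page HTML.
# Locate the section's headline span, then climb to its enclosing <h2>.
h2 = soup.find("span", id="Coach_Boone").find_parent("h2")
# Iterate over every <ul> that follows the <h2> at the same nesting level.
for ul in h2.find_next_siblings("ul"):
    for li in ul.find_all("li"):
        print(li)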

The only problem I found is that it also picks up the 'ul' elements from the following sections. – Volatil3
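
One possible way to handle the issue raised in this comment is to walk the siblings in order and stop at the next h2, so that only the lists belonging to the requested section are collected. The following is a minimal sketch, not a tested solution for the live page: SAMPLE_HTML is a made-up stand-in for the Wikiquote markup, quotes_in_section is a hypothetical helper name, and it assumes the h2 and ul elements are direct siblings, as in the HTML shown in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for the Wikiquote page layout.
SAMPLE_HTML = """
<div>
<h2><span class="mw-headline" id="Coach_Boone">Coach Boone</span></h2>
<ul><li>Quote A</li></ul>
<ul><li>Quote B</li></ul>
<h2><span class="mw-headline" id="Dialogue">Dialogue</span></h2>
<ul><li>Not wanted</li></ul>
</div>
"""


def quotes_in_section(soup, section_id):
    """Return the <li> texts between the section's <h2> and the next <h2>."""
    headline = soup.find("span", id=section_id)
    if headline is None:
        return []
    h2 = headline.find_parent("h2")
    quotes = []
    # Passing True matches every following sibling tag, in document order.
    for sibling in h2.find_next_siblings(True):
        if sibling.name == "h2":
            break  # the next section starts here
        if sibling.name == "ul":
            quotes.extend(li.get_text(strip=True) for li in sibling.find_all("li"))
    return quotes


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
coach_boone_quotes = quotes_in_section(soup, "Coach_Boone")
print(coach_boone_quotes)  # ['Quote A', 'Quote B']
```

To skip Dialogue, Taglines, and External links, the same helper could be called once per wanted section id. If the real page wraps its headings in extra elements, the sibling walk would need to start from that wrapper instead.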