2017-05-25 33 views
0

Finding only the table that directly follows an h1 heading with BeautifulSoup

<a name="playerlist"></a> 
 
<div class="navbuttons"> 
 
<a href="#toc" class="linkbutton">up</a><a class="linkbutton" href="#players">next</a> 
 
</div> 
 
<h1>Participants</h1> 
 
<table class="main"> 
 
<thead> 
 
<tr> 
 
<th>Name </th><th>Major</th><th>Class of</th><th>Ranking</th></tr> 
 
</thead> 
 
<tbody> 
 
<tr> 
 
<td>Mike Finge</td><td>Applied Maths</td><td>2015</td><td>155</td> 
 
</tr> 
 
</tbody> 
 
</table>

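In the example above, an HTML table sits directly beneath an h1 heading. I want to find the table just below that h1. How can I do this with BeautifulSoup? Thanks in advance.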

+0

Use margin-top: -30px; on the table, or any other suitable negative value.

+0

Just use the 'h1 + table' selector.

Answers

2

I think you should use the h1+table selector in BeautifulSoup, since the table comes immediately after the h1.
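As a minimal sketch of that suggestion, the snippet below passes 'h1 + table' to select_one and reads the body rows of the matched table. It assumes a shortened copy of the question's markup wrapped in a body tag, and it uses the standard-library 'html.parser' backend so no extra parser is needed; the variable names html, table, and rows are illustrative only.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (assumption: wrapped in <body>).
html = """<body>
<h1>Participants</h1>
<table class="main">
<thead><tr><th>Name</th><th>Major</th><th>Class of</th><th>Ranking</th></tr></thead>
<tbody><tr><td>Mike Finge</td><td>Applied Maths</td><td>2015</td><td>155</td></tr></tbody>
</table>
</body>"""

soup = BeautifulSoup(html, "html.parser")

# 'h1 + table' matches a table only when it is the element
# immediately following an h1 under the same parent.
table = soup.select_one("h1 + table")

rows = []
if table is not None:  # select_one returns None when nothing matches
    for tr in table.select("tbody tr"):
        rows.append([td.get_text(strip=True) for td in tr.find_all("td")])

print(rows)
```

select_one returns a single element or None, so the None check guards against pages where no table follows an h1.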

0

Since the table element is a sibling of the h1, you can do this with the ~ combinator, which is supported by the select method.

>>> HTML = '''\ 
... <a name="playerlist"></a> 
... <div class="navbuttons"> 
... <a href="#toc" class="linkbutton">up</a><a class="linkbutton" href="#players">next</a> 
... </div> 
... <h1>Participants</h1> 
... <table class="main"> 
... <thead> 
... <tr> 
... <th>Name </th><th>Major</th><th>Class of</th><th>Ranking</th></tr> 
... </thead> 
... <tbody> 
... <tr> 
... <td>Mike Finge</td><td>Applied Maths</td><td>2015</td><td>155</td> 
... </tr> 
... </tbody> 
... </table> 
... ''' 
>>> from bs4 import BeautifulSoup 
>>> soup = BeautifulSoup(HTML, 'lxml') 
>>> soup.select('h1 ~ table') 
[<table class="main"> 
<thead> 
<tr> 
<th>Name </th><th>Major</th><th>Class of</th><th>Ranking</th></tr> 
</thead> 
<tbody> 
<tr> 
<td>Mike Finge</td><td>Applied Maths</td><td>2015</td><td>155</td> 
</tr> 
</tbody> 
</table>]
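
One caveat worth illustrating: 'h1 ~ table' matches every table that follows the h1 under the same parent, while 'h1 + table' matches only a table that comes immediately after it. The sketch below shows the difference on hypothetical markup in which a second, unrelated table (class "extra") appears later; that second table and the variable names are assumptions for illustration and are not part of the question's page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a second table follows later under the same parent.
HTML = """<body>
<h1>Participants</h1>
<table class="main"><tr><td>Mike Finge</td></tr></table>
<p>Some text</p>
<table class="extra"><tr><td>unrelated</td></tr></table>
</body>"""

soup = BeautifulSoup(HTML, "html.parser")

# General sibling combinator: every later sibling table of an h1.
general = [t["class"][0] for t in soup.select("h1 ~ table")]

# Adjacent sibling combinator: only the table right after an h1.
adjacent = [t["class"][0] for t in soup.select("h1 + table")]

print(general)   # ['main', 'extra']
print(adjacent)  # ['main']
```

On the page from the question both selectors return the same single table, so the choice matters only when more tables share the h1's parent.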