BeautifulSoup: extracting the content within a tag

I want to extract the content "Hello world". Note that the page also contains multiple <table> and <td colspan="2"> elements.

I tried the following:

hello = soup.find(text='Name: ') 
hello.findPreviousSiblings

But it returns nothing.

Here is a snippet of the HTML:

<table border="0" cellspacing="2" width="800"> 
<tr> 
<td colspan="2"><b>Name: </b>Hello world</td> 
</tr> 
<tr>

I am also having trouble extracting "My home address" from the following:

<td><b>Address:</b></td> 

<td>My home address</td>

I am using the same approach of searching for text="Address:", but how do I navigate down to the next line and extract the content of the <td>?

2011-05-14 ready

Use next instead:

>>> from BeautifulSoup import BeautifulSoup  # Beautiful Soup 3 on Python 2
>>> s = '<table border="0" cellspacing="2" width="800"><tr><td colspan="2"><b>Name: </b>Hello world</td></tr><tr>'
>>> soup = BeautifulSoup(s)
>>> hello = soup.find(text='Name: ')
>>> hello.next
u'Hello world'

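next and previous let you move through the document elements in the order the parser processed them, while the sibling methods work with the parse tree.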
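
The difference between document order and the parse tree can be sketched with Beautiful Soup 4, which is an assumption here since the answer above uses version 3. In version 4 the same navigation is spelled next_element, and the string= argument (available from 4.4 on) plays the role of text=. The variable names are illustrative only.

```python
from bs4 import BeautifulSoup

html = (
    '<table border="0" cellspacing="2" width="800">'
    '<tr><td colspan="2"><b>Name: </b>Hello world</td></tr>'
    '</table>'
)
soup = BeautifulSoup(html, "html.parser")

# The text node that lives inside <b>.
label = soup.find(string="Name: ")

# Document order: the node parsed right after "Name: " is "Hello world".
value = label.next_element

# Parse tree: "Name: " is the only child of <b>, so it has no siblings.
siblings_of_label = list(label.previous_siblings) + list(label.next_siblings)

# The enclosing <b> tag, however, has "Hello world" as its next sibling.
value_via_parent = label.parent.next_sibling

print(value)
print(siblings_of_label)
print(value_via_parent)
```

This also suggests why the sibling lookup in the question came back empty: the matched string sits alone inside <b>, so there is nothing next to it at that level of the tree.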

2011-05-14 02:26:53

It returned nothing. hello = soup.find(text='Name: ') hello.next – ready 2011-05-14 02:35:07

Does 'Name: ' appear anywhere else in the document? – 2011-05-14 02:45:27

Sorry, that was my mistake. It works now. – ready 2011-05-14 03:04:27

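The contents attribute works well for extracting text from <tag>text</tag>.

For <td>My home address</td>, for example:

from BeautifulSoup import BeautifulSoup  # Beautiful Soup 3
s = '<td>My home address</td>'
soup = BeautifulSoup(s)
td = soup.find('td')  # <td>My home address</td>
td.contents  # a list holding the string 'My home address'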
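
Because contents is a list of child nodes rather than a plain string, it may help to see the alternatives side by side. The following is a minimal sketch assuming Beautiful Soup 4 and the built-in html.parser, neither of which comes from the answer itself.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<td>My home address</td>", "html.parser")
td = soup.find("td")

children = td.contents        # list of child nodes, here a single string
first_child = td.contents[0]  # the string node itself
only_string = td.string       # shortcut that works when the tag has exactly one child
text = td.get_text()          # all text inside the tag, joined into one str

print(len(children))
print(text)
```

get_text() is usually the most forgiving choice, since it still returns a string when the cell contains nested tags.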

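For <td><b>Address:</b></td>, for example:

from BeautifulSoup import BeautifulSoup  # Beautiful Soup 3
s = '<td><b>Address:</b></td>'
soup = BeautifulSoup(s)
b = soup.find('td').find('b')  # <b>Address:</b>
b.contents  # a list holding the string 'Address:'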
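
The examples above read the text of a tag that is already in hand. For the second part of the question, getting from the "Address:" label to the cell that follows it, one possible approach is find_next, which searches forward in document order. This is a sketch assuming Beautiful Soup 4; the surrounding <table> and <tr> are made up to give the two cells from the question a home.

```python
from bs4 import BeautifulSoup

# Hypothetical wrapper markup around the two cells quoted in the question.
html = (
    "<table><tr>"
    "<td><b>Address:</b></td>\n\n"
    "<td>My home address</td>"
    "</tr></table>"
)
soup = BeautifulSoup(html, "html.parser")

# Exact match on the label text inside <b>.
label = soup.find(string="Address:")

# First <td> that starts after the label in document order.
value_cell = label.find_next("td")

address = value_cell.get_text(strip=True)
print(address)
```

find_next does not care whether the value cell is in the same row or the next one, which seems safer here because the question does not show the full table layout.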

2013-01-09 18:21:05 solvingPuzzles
