Data extraction between <br /> tags using BeautifulSoup

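I have this HTML data that I need to parse in order to extract data from it. It contains so many tags mixed in with the data that I am finding it hard to work through. From the HTML below I need to create a list of Python dictionaries that looks like this: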

[{"School": "Childs play"}, {"Place": "newyork"}, {"Level": "four"}, {"Country": "USA"}, {"Level Of Course": "Easy"}]

<div class="quick"> 
<strong>School</strong><br /> Childs play <br /><br /> 
<strong>Place</strong><br /> 
<a href="Search.aspx?Menu=new&amp;Me=">newyork</a><br /><br /> 
<strong>Level</strong><br />four<br /><br /> 
<strong>Country</strong><br />USA<br /><br /> 
<strong>Level Of Course</strong><br />Easy<br /><br /> 
</div>

I tried using BeautifulSoup but had no success. Please help.

Asked 2012-04-18 by Anshul

Unfortunately, this HTML is not ideal for parsing, but it is possible to extract the data into a meaningful Python dictionary.

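# BeautifulSoup 4 import; the answer as posted used the older BeautifulSoup 3 form
from bs4 import BeautifulSoup

# htmlString is assumed to hold the HTML shown in the question
soup = BeautifulSoup(htmlString, "html.parser")

# Direct children of <div class="quick">: both tags and text nodes
raw_data = soup.find("div", class_="quick").contents
# Keep text nodes and every tag except <br>
data = [x for x in raw_data if not hasattr(x, "name") or not x.name == "br"]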

The condition if not hasattr(x, "name") or not x.name == "br" first checks whether the item is a NavigableString instance (a plain text node), and then checks that the element is not a <br> tag.
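To see what that filter keeps, here is a minimal sketch run on a one-field fragment. The fragment and the names html, children and kept are made up for illustration; only BeautifulSoup 4 (the bs4 package) is assumed.

```python
from bs4 import BeautifulSoup

# Made-up fragment with the same shape as one field of the question's HTML
html = '<div class="quick"><strong>Level</strong><br />four<br /><br /></div>'
soup = BeautifulSoup(html, "html.parser")

children = soup.find("div", class_="quick").contents
# Same condition as in the answer: drop <br> tags, keep everything else
kept = [x for x in children if not hasattr(x, "name") or not x.name == "br"]

for node in kept:
    # Show the node type next to its representation
    print(type(node).__name__, repr(node))
```

The label survives as a Tag and the value as a NavigableString. In real markup with line breaks between tags, whitespace-only strings would survive the filter as well.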

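data will then be in the format [<KEY>, <VALUE>, <KEY>, <VALUE>], apart from any whitespace-only strings left between the tags, and extracting the data from it should be fairly trivial.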
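One possible way to finish the job is sketched below. It is not part of the answer as posted: the function name extract_pairs and the variable names are hypothetical, and it assumes BeautifulSoup 4 with the built-in html.parser. It walks the children of the div, treats each <strong> as a key, and takes the next non-empty piece of text (a plain string or a tag such as <a>) as its value.

```python
from bs4 import BeautifulSoup, Tag

html_string = """<div class="quick">
<strong>School</strong><br /> Childs play <br /><br />
<strong>Place</strong><br />
<a href="Search.aspx?Menu=new&amp;Me=">newyork</a><br /><br />
<strong>Level</strong><br />four<br /><br />
<strong>Country</strong><br />USA<br /><br />
<strong>Level Of Course</strong><br />Easy<br /><br />
</div>"""


def extract_pairs(html):
    """Return a list of one-item dicts, one per <strong> label in div.quick."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="quick")
    result = []
    key = None
    for node in container.contents:
        if isinstance(node, Tag) and node.name == "br":
            continue  # line breaks carry no data
        # Tags (<strong>, <a>) and plain strings both reduce to stripped text
        text = node.get_text() if isinstance(node, Tag) else str(node)
        text = text.strip()
        if not text:
            continue  # whitespace-only text between tags
        if isinstance(node, Tag) and node.name == "strong":
            key = text  # a label starts a new pair
        elif key is not None:
            result.append({key: text})  # first non-empty text after the label
            key = None
    return result


records = extract_pairs(html_string)
print(records)
```

Stripping each piece of text handles both the padded " Childs play " value and the newlines between tags, and reading the text of any non-<strong> tag covers the Place field, whose value sits inside a link.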

Answered 2012-04-18 07:59:38

Wow, thanks........ – Anshul 2012-04-18 09:04:28
