Python BeautifulSoup: finding the right tag

I'm having trouble figuring out how to grab the specific tag I need.

<div class="meaning"><span class="hinshi">［名］</span><span class="hinshi">(スル)</span></div>, <div class="meaning"><b>１</b> 今まで経験してきた仕事・身分・地位・學業などの事柄。履歴。「―を偽る」</div>,

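Right now my code finds every element with the "meaning" class, but I need to narrow that down further to get what I want. The markup above is an example. I only need to grab

"<div class="meaning"><b>".

and ignore all of the "hinshi" classes.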

Edit: It seems to return just the number, as far as I can tell, but I need the text next to it. Any ideas?


2015-03-19 Dominic4774

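You can pass keyword arguments to the find method to match specific attributes. In your case, you need to match on the class_ keyword. See the documentation for details on the class_ keyword.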
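As a rough illustration of the class_ keyword, the sketch below runs find and find_all against markup abridged from the question. The variable names and the choice of the built-in "html.parser" are assumptions made for this example.

```python
from bs4 import BeautifulSoup

# Markup abridged from the question: two divs sharing the "meaning" class.
html = (
    '<div class="meaning"><span class="hinshi">［名］</span></div>'
    '<div class="meaning"><b>１</b> 履歴。</div>'
)
soup = BeautifulSoup(html, "html.parser")

# "class" is a reserved word in Python, so the keyword is spelled class_.
first = soup.find(class_="meaning")                 # first match only
all_divs = soup.find_all("div", class_="meaning")   # every matching <div>

# Passing an attrs dictionary is an equivalent way to write the same filter.
same_divs = soup.find_all("div", {"class": "meaning"})

print(len(all_divs), all_divs == same_divs)
```

Both forms match on the class attribute; class_ is usually the more readable of the two.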

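Assuming you want to keep only the elements that do not contain any children with the "hinshi" class, you could try something like this: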

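from bs4 import BeautifulSoup

# find_meanings is a hypothetical wrapper name so that `return` is valid.
def find_meanings(data):
    soup = BeautifulSoup(data, "html.parser")
    potential_matches = soup.find_all(class_="meaning")

    matches = []
    for match in potential_matches:
        # Skip any element that contains a child with the "hinshi" class.
        bad_children = match.find_all(class_="hinshi")
        if not bad_children:
            matches.append(match)

    return matches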

If you like, you can make it a bit shorter, for example:

matches = soup.find_all(class_="meaning") 
return [x for x in matches if not x.find_all(class_="hinshi")]

Or, depending on your Python version (i.e. 2.x, where filter returns a list rather than an iterator):

matches = soup.find_all(class_="meaning")
return filter(lambda x: not x.find_all(class_="hinshi"), matches)
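To see the filter run end to end, here is a minimal sketch that applies the list-comprehension variant to markup abridged from the question. The sample string, the variable names, and the "html.parser" backend are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Abridged from the question: one div holds only "hinshi" spans,
# the other holds the numbered definition we actually want.
html = (
    '<div class="meaning"><span class="hinshi">［名］</span>'
    '<span class="hinshi">(スル)</span></div>'
    '<div class="meaning"><b>１</b> 履歴。「―を偽る」</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Keep only the "meaning" divs with no descendant of class "hinshi".
matches = [
    div for div in soup.find_all(class_="meaning")
    if not div.find_all(class_="hinshi")
]

for div in matches:
    print(div)
```

Only the second div should survive the filter, since the first one contains "hinshi" spans.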

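Edit: If you want to get the Japanese text next to the number in your example, you should first remove the b element and then use the get_text method. For example: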

# Assuming `element` is one of the matches from above 
element.find('b').extract() 
print(element.get_text())
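The same idea as a self-contained sketch, assuming a single div abridged from the question and the built-in "html.parser". Note that extract() modifies the parsed tree in place, so the b element is gone from the div afterwards.

```python
from bs4 import BeautifulSoup

# One "meaning" div, abridged from the question.
html = '<div class="meaning"><b>１</b> 履歴。「―を偽る」</div>'
soup = BeautifulSoup(html, "html.parser")

element = soup.find(class_="meaning")

# extract() removes the <b> tag from the tree and returns it,
# so the number no longer appears in the div's text.
number = element.find("b").extract()

# strip=True trims the leading space that followed </b>.
text = element.get_text(strip=True)

print(number.get_text())
print(text)
```

If the original tree needs to stay intact, it may be safer to work on a freshly parsed copy of the markup.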


2015-03-19 01:46:59

You can try using the .select function, which takes a CSS selector:

soup.select('.meaning b')
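Because this selector returns the b elements themselves (that is, the numbers), one possible way to reach the text beside each number is through next_sibling. This is only a sketch: it assumes the text node directly follows the closing b tag, as in the question's sample, and the `results` list is a name introduced for this example.

```python
from bs4 import BeautifulSoup

# Abridged from the question.
html = (
    '<div class="meaning"><span class="hinshi">［名］</span></div>'
    '<div class="meaning"><b>１</b> 履歴。「―を偽る」</div>'
)
soup = BeautifulSoup(html, "html.parser")

results = []
# ".meaning b" matches every <b> nested inside an element with class "meaning".
for b in soup.select(".meaning b"):
    number = b.get_text()
    # next_sibling is the node right after </b>; here it is a text node.
    following = b.next_sibling
    text = str(following).strip() if following is not None else ""
    results.append((number, text))

print(results)
```

Unlike the extract() approach, this leaves the parsed tree unchanged.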


2015-03-19 01:35:42 Xymostech

You can simply do it like this:

for s in soup.findAll("div", {"class": "meaning"}):
    for b in s.findAll("b"):
        pass  # b.getText("<b>")

Then, on the '#' line, adjust the call so that it produces the result you need.


2015-03-19 01:36:07 xiaohen
