How to find all comments with Beautiful Soup

This question was asked four years ago, but the answers are now out of date for BS4.

I want to delete all comments in my HTML file using Beautiful Soup. Since BS4 makes each comment a special type of navigable string, I thought this code would work:

for comments in soup.find_all('comment'): 
    comments.decompose()

That didn't work. How do I find all comments using BS4?


2015-10-15 Joseph

This [answer](http://stackoverflow.com/a/3507360/771848) should still work, I think. – alecxe

I get "global name 'Comment' is not defined". – Joseph

I realize this is old, but @Joseph, if you import Comment from bs4 it should work. – atarw

You can pass a function to find_all() so that it checks whether each string is a Comment.

For example, I have the following HTML:

<body> 
    <!-- Branding and main navigation --> 
    <div class="Branding">The Science &amp; Safety Behind Your Favorite Products</div> 
    <div class="l-branding"> 
     <p>Just a brand</p> 
    </div> 
     <!-- test comment here --> 
     <div class="block_content"> 
      <a href="https://www.google.com">Google</a> 
    </div> 
</body>

Code:

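from bs4 import BeautifulSoup as BS
from bs4 import Comment
# .... (html holds the markup shown above)
soup = BS(html, 'html.parser')
# Collect every text node that is a Comment
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
for c in comments:
    print(c)
    print("===========")
    c.extract()  # remove the comment from the tree; extract() is available on strings in every BS4 release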

The output will be:

Branding and main navigation 
============ 
test comment here 
============

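By the way, I think the reason find_all('Comment') doesn't work is this (from the BeautifulSoup documentation):

Pass in a value for name and you'll tell Beautiful Soup to only consider tags with certain names. Text strings will be ignored, as will tags whose names don't match.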
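
A small sketch of that difference, assuming Python 3 and the built-in html.parser. The sample markup is a made-up, shortened version of the HTML above, and the variable names by_name and by_type are only for illustration:

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical sample markup, shortened from the example above.
html = "<body><!-- test comment here --><div class='block_content'>text</div></body>"
soup = BeautifulSoup(html, "html.parser")

# A plain string argument is treated as a tag name, so this searches
# for <comment> tags and never looks at text nodes.
by_name = soup.find_all("comment")

# A function passed as `string` is called on each text node instead,
# which is where Comment objects live.
by_type = soup.find_all(string=lambda text: isinstance(text, Comment))

print(len(by_name), len(by_type))
```

With this markup the name search should come back empty, while the string search should return the single comment.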


2015-10-15 03:39:12 Flickerlight

I'm glad I found your answer, thanks! Any idea how we could write it without using lambda? – JinSnow
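
One possible way to do that, as a sketch: find_all() accepts any callable for its string argument, so a named function can stand in for the lambda. The function name is_comment and the sample markup are made up for illustration:

```python
from bs4 import BeautifulSoup, Comment

def is_comment(text):
    """Return True when a text node is an HTML comment."""
    return isinstance(text, Comment)

# Hypothetical sample markup.
html = "<p>kept<!-- dropped --></p>"
soup = BeautifulSoup(html, "html.parser")

# Pass the function itself (no parentheses); BS4 calls it on each string.
for comment in soup.find_all(string=is_comment):
    comment.extract()

print(soup)
```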

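Two things I needed to do:

First, import Beautiful Soup (Comment has to be imported from bs4 as well).

Second, here is the code to extract the comments: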

for comments in soup.findAll(text=lambda text:isinstance(text, Comment)): 
    comments.extract()
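
Putting the two steps together, a minimal self-contained sketch might look like the following. The helper name strip_comments and the sample markup are hypothetical. It uses find_all(string=...), the current spelling of the older findAll(text=...) form shown above, and assumes Python 3 with the built-in html.parser:

```python
from bs4 import BeautifulSoup, Comment

def strip_comments(html):
    """Return the markup with every comment node removed (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns a finished list, so removing nodes inside the loop is safe.
    for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
        comment.extract()
    return str(soup)

# Hypothetical sample markup.
cleaned = strip_comments("<div><!-- note --><a href='#'>link</a></div>")
print(cleaned)
```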


2015-10-15 03:26:53 Joseph
