How to remove the content inside nested tags with BeautifulSoup?
These posts show the opposite, retrieving the content of nested tags: How to get contents of nested tag using BeautifulSoup, and BeautifulSoup: How do I extract all the <li>s from a list of <ul>s that contains some nested <ul>s? How can I remove the content inside nested tags with BeautifulSoup?
I tried .text, but it only strips the tags and keeps the nested text:
>>> from bs4 import BeautifulSoup as bs
>>> html = "<foo>Something something <bar> blah blah</bar> something</foo>"
>>> bs(html).find_all('foo')[0]
<foo>Something something <bar> blah blah</bar> something else</foo>
>>> bs(html).find_all('foo')[0].text
u'Something something blah blah something else'
Desired output:
Something something something else
So... in this example, you want to remove the content of 'bar'?
Should there be an "else" in the second line of code?
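One possible approach, offered as a minimal sketch rather than a definitive answer: keep only the strings that are direct children of `foo`, or remove the nested tags with `decompose()` before reading the text. The sketch assumes the input string ends in "something else" (as the printed outputs in the question do), passes "html.parser" explicitly, and uses a hypothetical helper name, `direct_text`, that is not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup, NavigableString

# Assumption: the input includes "else", matching the outputs shown in the question.
html = "<foo>Something something <bar> blah blah</bar> something else</foo>"
soup = BeautifulSoup(html, "html.parser")
foo = soup.find("foo")


def direct_text(tag):
    """Hypothetical helper: join only the strings that are direct children of tag."""
    pieces = [child for child in tag.children if isinstance(child, NavigableString)]
    # Collapse the double space left where the nested tag used to be.
    return " ".join("".join(pieces).split())


# Non-destructive: the tree is left untouched.
result = direct_text(foo)
print(result)

# Destructive alternative: delete each direct child tag (and everything
# inside it) from the tree, then read the remaining text.
for nested in foo.find_all(True, recursive=False):
    nested.decompose()
stripped = " ".join(foo.get_text().split())
print(stripped)
```

The first variant leaves the parsed tree unchanged, which may matter if `bar` is needed later. The second modifies the tree in place. Note that HTML comments are also `NavigableString` instances in bs4, so the first variant would include them if any were present.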