BeautifulSoup: find tags by attribute value, regardless of the attribute name
Say I have something like this:
<div class="cake">1</div>
<h2 id="cake">1</div>
<sometag someattribute="cake">1</div>
I want to search for the keyword "cake" and get all of them.
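Use a lambda with find_all to examine every tag and match on any attribute whose value equals the target, or on a class list that contains it.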
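from bs4 import BeautifulSoup

example = """<div class="cake">1</div>
<h2 id="cake">1</div>
<sometag someattribute="cake">1</div>"""
soup = BeautifulSoup(example, "html.parser")
# Match any tag with an attribute equal to "cake". Multi-valued attributes
# such as class are parsed as lists, so test list membership as well.
# Checking each value directly avoids a TypeError on tags that have no class.
print(soup.find_all(lambda tag: any(
    a == "cake" or (isinstance(a, list) and "cake" in a)
    for a in tag.attrs.values())))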
Output:
[<div class="cake">1</div>, <h2 id="cake">1</h2>, <sometag someattribute="cake">1</sometag>]
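The lambda approach above can also be wrapped in a small reusable function. This is only a sketch: the name find_by_attr_value and the sample markup are made up for illustration, and it assumes multi-valued attributes such as class arrive as lists, which is how BeautifulSoup parses them by default.

```python
from bs4 import BeautifulSoup


def find_by_attr_value(soup, value):
    """Return every tag with an attribute equal to, or containing, value."""
    def matches(tag):
        for attr_value in tag.attrs.values():
            # Multi-valued attributes such as class are parsed as lists.
            if isinstance(attr_value, list):
                if value in attr_value:
                    return True
            elif attr_value == value:
                return True
        return False
    return soup.find_all(matches)


# Well-formed sample markup, invented for this sketch.
html = """<div class="cake big">1</div>
<h2 id="cake">2</h2>
<p title="pie">3</p>
<sometag someattribute="cake">4</sometag>"""

soup = BeautifulSoup(html, "html.parser")
found = find_by_attr_value(soup, "cake")
print([tag.name for tag in found])  # ['div', 'h2', 'sometag']
```

Because it inspects the parsed attribute values rather than the serialized markup, this also matches class="cake big", where "cake" is only one of several class names.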
You can use regular expressions together with BeautifulSoup. Here is my terrible script:
import re
from bs4 import BeautifulSoup

r = '''<div class="cake">1</div>
<h2 id="cake">1</div>
<sometag someattribute="cake">1</div>'''
soup = BeautifulSoup(r, 'lxml')
pattern = r'(\w+)="cake"'
# Collect every attribute name whose value is "cake", then search by each one.
for attr in re.findall(pattern, str(soup)):
    print(soup.find_all(re.compile(r'\w+'), {attr: 'cake'}))
Output:
[<div class="cake">1</div>]
[<h2 id="cake">1
<sometag someattribute="cake">1</sometag></h2>]
[<sometag someattribute="cake">1</sometag>]
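To see what the regular expression in the script above contributes on its own, here is a small sketch that runs the same pattern on plain strings. The sample strings are made up for illustration. Note that the pattern only finds values that are exactly "cake" inside double quotes, so treat it as a rough filter rather than a general solution.

```python
import re

# Same pattern as in the script: captures the attribute name before ="cake".
pattern = r'(\w+)="cake"'

markup = '<div class="cake">1</div><h2 id="cake">1</h2>'
names = re.findall(pattern, markup)
print(names)  # ['class', 'id']

# Hypothetical edge cases the pattern misses: single-quoted values and
# multi-valued attributes such as class="cake big".
single_quoted = re.findall(pattern, "<p class='cake'>1</p>")
multi_valued = re.findall(pattern, '<p class="cake big">1</p>')
print(single_quoted)  # []
print(multi_valued)   # []
```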