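I did something like this, where the variable `html` holds your markup, `<html><body>word-one word-two word-one</body></html>`. I separated the text from the markup and then put the two back together.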
from bs4 import BeautifulSoup

html = '<html><body>word-one word-two word-one</body></html>'
soup = BeautifulSoup(html, 'html.parser')
text = soup.text  # Only the text from the soup
soup.body.clear()  # Remove everything between the body tags
new_text = text.split()  # Split on whitespace to get the individual words
for i in new_text:
    new_tag = soup.new_tag('span')  # Create a new tag
    new_tag.append(i)  # Append the current word to it
    # e.g. appending 'word' to soup.new_tag('a') gives <a>word</a>
    soup.body.append(new_tag)  # Append the whole tag, e.g. <span>word-one</span>
We can also use a regular expression to match a particular word.
import re

soup = BeautifulSoup(html, 'html.parser')
text = soup.text  # Only the text from the soup
soup.body.clear()  # Remove everything between the body tags
theword = re.search(r'\w+', text)  # Match the first run of word characters in text
beginning, end = theword.start(), theword.end()
soup.body.append(text[:beginning])  # Add the text before the match
new_tag = soup.new_tag('span')  # Create a new tag
new_tag.append(text[beginning:end])  # Put the matched word inside the new tag
soup.body.append(new_tag)  # Append the tag that wraps the match
soup.body.append(text[end:])  # Append everything that's left
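If you need to wrap every occurrence of one specific word rather than just the first match, the same idea can be extended with `re.finditer`. This is only a rough sketch: `wrap_word` and its parameters are names I made up for illustration, and like the snippets above it assumes the body contains plain text only (any nested tags would be flattened).

```python
import re
from bs4 import BeautifulSoup


def wrap_word(html, word, tag_name="span"):
    """Wrap every whole-word occurrence of `word` in the body text in a new tag.

    Hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    text = soup.body.get_text()  # Only the text inside <body>
    soup.body.clear()            # Remove everything between the body tags
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    pos = 0  # End of the previous match
    for match in pattern.finditer(text):
        if match.start() > pos:
            soup.body.append(text[pos:match.start()])  # Text before this match
        new_tag = soup.new_tag(tag_name)
        new_tag.string = match.group()  # The matched word goes inside the tag
        soup.body.append(new_tag)
        pos = match.end()
    if pos < len(text):
        soup.body.append(text[pos:])  # Whatever is left after the last match
    return soup


result = wrap_word("<html><body>word-one word-two word-one</body></html>", "word-one")
print(result)
# <html><body><span>word-one</span> word-two <span>word-one</span></body></html>
```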
I'm sure `.insert` could be used in a similar way.
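For what it's worth, here is roughly how I would expect that to look with `Tag.insert(position, new_child)`. Treat it as an untuned sketch under the same assumption as before (the body holds plain text only); the `position` counter is just my own bookkeeping. Unlike the first snippet, it also puts a space back between the spans.

```python
from bs4 import BeautifulSoup

html = "<html><body>word-one word-two word-one</body></html>"
soup = BeautifulSoup(html, "html.parser")
text = soup.body.get_text()  # Only the text inside <body>
soup.body.clear()            # Remove everything between the body tags

position = 0  # Index in soup.body.contents where the next child goes
for word in text.split():
    if position > 0:
        soup.body.insert(position, " ")  # Restore the space between words
        position += 1
    new_tag = soup.new_tag("span")
    new_tag.string = word
    soup.body.insert(position, new_tag)  # Insert the tag at an explicit index
    position += 1

print(soup)
# <html><body><span>word-one</span> <span>word-two</span> <span>word-one</span></body></html>
```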
Are you still using [`lxml`](http://lxml.de/)? See [python lxml append element after another element](https://stackoverflow.com/questions/7474972/python-lxml-append-element-after-another-element)
No, I'm just trying BS because I find it easier – Nishant