I am trying to split up a piece of HTML using BeautifulSoup. Here is the input HTML snippet:
<p>both before<img src="it is free"style="width:590px;height:228px;"> and after img tag</p>
Here is the desired output snippet:
<p> both before </p><img src="it is free"style="width:590px;height:228px;"><p>after img tag</p>
Here is my code snippet:
from bs4 import BeautifulSoup
from bs4 import NavigableString
while len(p_tag.contents) != 0:
    item = p_tag.contents.pop(0)
    if isinstance(item, NavigableString):
        new_tag = doc.new_tag('p')
        new_tag.string = item
        current_tag.insert_after(new_tag)
        current_tag = current_tag.next_sibling
    else:
        new_tag = item
        current_tag.insert_after(new_tag)
        current_tag = current_tag.next_sibling
But I get the following error, even though I am quite sure the tag has contents:
raise ValueError("Tag.index: element not in tag")
ValueError: Tag.index: element not in tag
I am using 'html5lib' as the parser for BeautifulSoup:
doc = BeautifulSoup(open(input_doc), 'html5lib')
Is there any way to get rid of this error? Thanks in advance.
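
A possible fix, offered as a minimal sketch rather than a definitive answer: the error appears to come from `p_tag.contents.pop(0)`. That removes the node from the list but leaves its `.parent` still pointing at `p_tag`, so when `insert_after` tries to detach the node first, it looks it up in a parent that no longer contains it. Calling `extract()` on the node detaches it cleanly. The sample below makes a few assumptions that are not in the post: it uses the built-in `'html.parser'` instead of `'html5lib'` to avoid an extra dependency, it starts `current_tag` at `p_tag`, it strips surrounding whitespace from the text, and it removes the emptied original `<p>` at the end.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample input modelled on the snippet above (a space is added before style=).
html = ('<p>both before<img src="it is free" '
        'style="width:590px;height:228px;"> and after img tag</p>')

# Assumption: html.parser is used here; the post uses 'html5lib'.
doc = BeautifulSoup(html, 'html.parser')
p_tag = doc.p
current_tag = p_tag  # assumption: new elements go right after the original <p>

while p_tag.contents:
    # extract() removes the node from p_tag AND clears its parent,
    # unlike contents.pop(0), which leaves a stale parent reference.
    item = p_tag.contents[0].extract()
    if isinstance(item, NavigableString):
        # Wrap each piece of text in its own <p>.
        new_tag = doc.new_tag('p')
        new_tag.string = item.strip()
    else:
        # Non-text children (here, the <img>) are moved as they are.
        new_tag = item
    current_tag.insert_after(new_tag)
    current_tag = new_tag

p_tag.decompose()  # drop the now-empty original <p>
print(doc)
```

Note that this keeps the word "and" in the last paragraph, since it is part of the original text node; trimming it would need extra handling.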