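How do I iterate over all tags that have a specific attribute set to a specific value? For example, suppose we only need data1, data2, and so on.
<html>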
<body>
<invalid html here/>
<dont care> ... </dont care>
<invalid html here too/>
<interesting attrib1="naah, it is not
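
One possible way to approach this, sketched with Python's standard-library `html.parser` rather than any particular third-party library: it is event-based and does not reject malformed markup, so the invalid tags are simply skipped. The sample below is only loosely modeled on the snippet above; the attribute name `attrib2`, the value `"wanted"`, and the helper names `AttrValueCollector` and `texts_with_attr` are assumptions made up for illustration, not part of the original question.

```python
from html.parser import HTMLParser


class AttrValueCollector(HTMLParser):
    """Collect the text inside every <tag> whose attribute equals a given value."""

    def __init__(self, tag, attr, value):
        super().__init__()
        self.tag, self.attr, self.value = tag, attr, value
        self.depth = 0      # > 0 while we are inside a matching element
        self.buffer = []    # text pieces of the element being collected
        self.results = []   # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        if self.depth:
            # Same tag nested inside a match: track it so we close correctly.
            self.depth += 1
        elif dict(attrs).get(self.attr) == self.value:
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.buffer).strip())

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def texts_with_attr(html, tag, attr, value):
    """Return the text of each <tag attr="value"> element found in html."""
    parser = AttrValueCollector(tag, attr, value)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical input: attrib2 and its values are invented for this sketch.
SAMPLE = """<html>
<body>
<invalid html here/>
<dont care> ... </dont care>
<interesting attrib1="no" attrib2="wanted">data1</interesting>
<interesting attrib1="no" attrib2="other">skip me</interesting>
<invalid html here too/>
<interesting attrib1="no" attrib2="wanted">data2</interesting>
</body>
</html>"""

print(texts_with_attr(SAMPLE, "interesting", "attrib2", "wanted"))
```

The parser lowercases tag and attribute names, so pass them in lowercase. If the real document is better handled by lxml or BeautifulSoup, the same filter (tag name plus attribute value) should carry over, but that is not shown here.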
In [1]: from lxml import etree
I have an HTML document whose doctype gets lost:
In [2]: root = etree.fromstring(u'''<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">\n<HTML></HTML>''', etree.HTMLParser())
Its DOCTYPE is parsed correctly:
In [3]: