Python feedparser


How would you parse the following XML data with Python feedparser?

<Book_API> 
<Contributor_List> 
<Display_Name>Jason</Display_Name> 
</Contributor_List> 
<Contributor_List> 
<Display_Name>John Smith</Display_Name> 
</Contributor_List> 
</Book_API>


2011-02-17 bob

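As Lennart Regebro said, this does not appear to be an RSS/Atom feed, just a plain XML document. The Python standard library ships several XML parsing tools (both SAX and DOM). I would recommend ElementTree. lxml is also the best of the third-party libraries, and it is a drop-in replacement for ElementTree.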

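try: 
    from lxml import etree 
except ImportError: 
    try: 
        # C-accelerated module; only exists on older Pythons (removed in 3.9)
        import xml.etree.cElementTree as etree 
    except ImportError: 
        # Pure standard-library fallback
        import xml.etree.ElementTree as etree 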

doc = """<Book_API> 
<Contributor_List> 
<Display_Name>Jason</Display_Name> 
</Contributor_List> 
<Contributor_List> 
<Display_Name>John Smith</Display_Name> 
</Contributor_List> 
</Book_API>""" 
xml_doc = etree.fromstring(doc)
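Once the document is parsed, pulling out the contributor names takes one more step. The following is a minimal sketch that uses the standard library's ElementTree so it runs without lxml installed; the variable name `names` is my own and is not part of the question.

```python
import xml.etree.ElementTree as etree

doc = """<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>"""

xml_doc = etree.fromstring(doc)

# iter() walks the whole tree and yields every element with the given tag.
names = [elem.text for elem in xml_doc.iter("Display_Name")]
print(names)
```

`iter()` is convenient here because it finds `Display_Name` elements at any depth, so it keeps working if the nesting changes.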


2011-02-17 11:30:21 minhee

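This does not look like any kind of RSS/Atom feed. I would not use feedparser for this at all; I would use lxml. In fact, feedparser cannot make sense of it and drops the "Jason" contributor in your example.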

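from lxml import etree 

# `data` is a filename, URL, or file-like object holding the XML.
# The path below is only a placeholder; fetch the data however suits you.
data = "book_api.xml" 
tree = etree.parse(data)  # parse() returns an ElementTree 
root = tree.getroot()     # the top-level element of the document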

Now you have a tree of XML objects. How to go further with it in lxml is impossible to say more specifically until you actually provide valid XML data. ;)
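For what it's worth, here is a rough sketch of what the next step could look like with the sample from the question. It uses the standard library's ElementTree rather than lxml so it runs anywhere; `lxml.etree` offers `parse`, `findall`, and `findtext` as well. The in-memory `BytesIO` object is just a stand-in for a real file or network response, and the names `sample` and `contributors` are made up for illustration.

```python
import io
import xml.etree.ElementTree as etree

sample = """<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>"""

# Stand-in for a file or HTTP response: parse() accepts any file-like object.
data = io.BytesIO(sample.encode("utf-8"))
root = etree.parse(data).getroot()

# Visit each <Contributor_List> child and read its <Display_Name> text.
contributors = []
for contributor in root.findall("Contributor_List"):
    contributors.append(contributor.findtext("Display_Name"))
print(contributors)
```

Looping over `Contributor_List` first keeps each name tied to its own contributor record, which helps if those elements later carry more fields.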


2011-02-17 10:41:51
