feedparser: missing item values

I am trying to parse the following RSS feed from NOAA with feedparser: http://www.nhc.noaa.gov/rss_examples/gis-ep-20130530.xml

It works great except for this part:

<item> 
    <title>Summary - Remnants of BARBARA (EP2/EP022013)</title> 
    <guid isPermaLink="false">summary-ep022013-201305302032</guid> 
    <pubDate>Thu, 30 May 2013 20:32:00 GMT</pubDate> 
    <author>[email protected] (NHC Webmaster)</author> 
    <link> 
    http://www.nhc.noaa.gov/text/refresh/MIATCPEP2+shtml/302031.shtml 
    </link> 
    <description> 
    ...BARBARA DISSIPATES... ...THIS IS THE LAST ADVISORY... As of 2:00 PM PDT Thu May   30 the center of BARBARA was located at 18.5, -94.5 with movement NNW at 3 mph. The minimum   central pressure was 1005 mb with maximum sustained winds of about 25 mph. 
    </description> 
    <gml:Point> 
    <gml:pos>18.5 -94.5</gml:pos> 
    </gml:Point> 
    **<nhc:Cyclone> 
      <nhc:center>18.5, -94.5</nhc:center> 
      <nhc:type>REMNANTS OF</nhc:type> 
      <nhc:name>BARBARA</nhc:name> 
      <nhc:wallet>EP2</nhc:wallet> 
      <nhc:atcf>EP022013</nhc:atcf> 
      <nhc:datetime>2:00 PM PDT Thu May 30</nhc:datetime> 
      <nhc:movement>NNW at 3 mph</nhc:movement> 
      <nhc:pressure>1005 mb</nhc:pressure> 
      <nhc:wind>25 mph</nhc:wind> 
      <nhc:headline> 
      ...BARBARA DISSIPATES... ...THIS IS THE LAST ADVISORY... 
      </nhc:headline> 
    </nhc:Cyclone>** 
    </item>

The part marked in bold is not parsed by feedparser. Is there a way to make sure custom tags are included in the parsed result?

To verify:

>>> import feedparser 
>>> f = feedparser.parse('http://www.nhc.noaa.gov/rss_examples/gis-ep-20130530.xml') 
>>> f.entries[1]['description'] 
u'Shapefile last updated Thu, 30 May 2013 15:03:01 GMT' 
>>> f.entries[1]['nhc_cyclone'] 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "feedparser.py", line 375, in __getitem__ 
    return dict.__getitem__(self, key) 
KeyError: 'nhc_cyclone'

Output of f: https://gist.github.com/mustafa0x/6199452

Source

2013-06-03 code base 5000

I ran into the same problem. I wonder whether the OP ever solved it.

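If you look at the current feed XML, you will see that the custom tags are actually in entry 3, not entry 1. Also, while feedparser does handle custom tags, it renames them: a namespaced element such as nhc:center is exposed as nhc_center. This is described at http://pythonhosted.org/feedparser/namespace-handling.html.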

Try this (I am using feedparser version 5.1.2):

>>> f.entries[3].title 
u'Summary - Remnants of BARBARA (EP2/EP022013)' 
>>> f.entries[3].nhc_center 
u'18.5, -94.5' 
>>> f.entries[3].nhc_type 
u'REMNANTS OF' 
>>> f.entries[3].nhc_name 
u'BARBARA'

...and likewise for the other children of nhc:Cyclone.
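Because the position of the cyclone item within the feed can differ from one fetch to another, it may be safer to look for entries that carry nhc_ keys than to hard-code an index. The following is only a sketch: the helper name entries_with_prefix and the stand-in data are hypothetical, and it assumes each entry behaves like a dict, as feedparser entries do.

```python
def entries_with_prefix(entries, prefix):
    """Yield (index, fields) for each entry that has keys starting with prefix + '_'."""
    marker = prefix + "_"
    for index, entry in enumerate(entries):
        # Collect only the renamed namespaced keys, e.g. nhc_center, nhc_name.
        fields = {key: value for key, value in entry.items() if key.startswith(marker)}
        if fields:
            yield index, fields


# Stand-in data shaped like feedparser entries (plain dicts, for illustration only).
sample_entries = [
    {"title": "Shapefile",
     "description": "Shapefile last updated Thu, 30 May 2013 15:03:01 GMT"},
    {"title": "Summary - Remnants of BARBARA (EP2/EP022013)",
     "nhc_center": "18.5, -94.5",
     "nhc_type": "REMNANTS OF",
     "nhc_name": "BARBARA"},
]

matches = list(entries_with_prefix(sample_entries, "nhc"))
for index, fields in matches:
    print(index, fields["nhc_name"])
```

With a parsed feed, the equivalent call would be entries_with_prefix(f.entries, "nhc"), which reports whichever entry actually holds the cyclone data.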

Source

2013-08-10 11:38:00 Glenn

Thanks for your answer. It is actually entry 4, not 3.

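Very strange. Even though this is an example file with an old date, the feed seems to have changed, since entry 3 worked before. Either way, glad the answer worked for you. – Glenn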

Your code works; arrays are 0-indexed. Yes, it works and I appreciate it, but I later found that the real problem I was running into is an unfixed bug: https://code.google.com/p/feedparser/issues/detail?id=256
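If feedparser does not expose the nested values you need, one possible fallback is to read the nhc:Cyclone children directly with the standard library's xml.etree.ElementTree. This is a minimal sketch and not feedparser's own mechanism. The namespace URI below is an assumption (check the xmlns:nhc declaration on the feed's rss element), the inline sample is a trimmed copy of the item from the question, and cyclone_fields is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; verify it against the xmlns:nhc attribute in the real feed.
NHC_NS = "http://www.nhc.noaa.gov"

# Trimmed copy of the item from the question, wrapped in a minimal RSS document.
SAMPLE = """<rss version="2.0" xmlns:nhc="http://www.nhc.noaa.gov">
  <channel>
    <item>
      <title>Summary - Remnants of BARBARA (EP2/EP022013)</title>
      <nhc:Cyclone>
        <nhc:center>18.5, -94.5</nhc:center>
        <nhc:type>REMNANTS OF</nhc:type>
        <nhc:name>BARBARA</nhc:name>
      </nhc:Cyclone>
    </item>
  </channel>
</rss>"""


def cyclone_fields(xml_text, ns_uri=NHC_NS):
    """Return one dict per nhc:Cyclone element, keyed by the child's local tag name."""
    root = ET.fromstring(xml_text)
    results = []
    for cyclone in root.iter("{%s}Cyclone" % ns_uri):
        fields = {}
        for child in cyclone:
            # ElementTree tags look like "{namespace-uri}center"; keep the local part.
            local_name = child.tag.split("}", 1)[-1]
            fields[local_name] = (child.text or "").strip()
        results.append(fields)
    return results


cyclones = cyclone_fields(SAMPLE)
print(cyclones[0]["name"])
```

For the live feed, the XML text would first have to be downloaded (for example with urllib) and then passed to cyclone_fields in place of SAMPLE.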
