How do I parse YouTube XML with Python?

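I want to parse the XML from YouTube that the code below downloads. I am trying to display all of the titles. However, when I try to print the title I run into trouble: only blank lines appear. Any suggestions?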

#import library to do http requests: 
import urllib2 

#import easy to use xml parser called minidom: 
from xml.dom.minidom import parseString 
#all these imports are standard on most modern python implementations 

#download the file: 
file = urllib2.urlopen('http://gdata.youtube.com/feeds/api/users/buzzfeed/uploads?v=2&max-results=50') 
#convert to string: 
data = file.read() 
#close file because we don't need it anymore: 
file.close() 

#parse the xml you downloaded 
dom = parseString(data) 
entry=dom.getElementsByTagName('entry') 
for node in entry: 
    video_title=node.getAttribute('title') 
    print video_title


2012-10-08 sharataka

Please add an excerpt of the XML you want to parse. – 2012-10-08 14:27:41

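Please don't use 'minidom'. The documentation tells you to use the ['ElementTree' API](http://docs.python.org/library/xml.etree.elementtree.html) instead. You can use the version included in the standard library, or the excellent external ['lxml' library](http://lxml.de/), which extends that API. –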

http://lxml.de/ – sean
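As a rough illustration of the ElementTree route suggested in the comment above, here is a minimal sketch in Python 3 syntax. The inline `SAMPLE` feed is made up for the example, and the Atom namespace URI and the `entry_titles` helper are assumptions, not something taken from the question's actual download.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the downloaded feed (not real YouTube output).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Uploads</title>
  <entry><title>First video</title></entry>
  <entry><title>Second video</title></entry>
</feed>"""

# Assumption: the feed uses the standard Atom namespace.
NS = {"atom": "http://www.w3.org/2005/Atom"}


def entry_titles(xml_text):
    """Return the text of the <title> child of every <entry>."""
    root = ET.fromstring(xml_text)
    return [
        entry.findtext("atom:title", default="", namespaces=NS)
        for entry in root.findall("atom:entry", NS)
    ]


print(entry_titles(SAMPLE))
```

Because the lookup is limited to direct `entry` children of the root, the feed-level `title` is not included in the result.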

The title is not an attribute; it is a child element of the entry.

Here is an example of how to extract it:

for node in entry: 
    video_title = node.getElementsByTagName('title')[0].firstChild.nodeValue 
    print video_title
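To see why the original loop printed only blank lines, the two lookups can be compared side by side. This is a small sketch in Python 3 syntax on a made-up one-entry sample, not the real feed.

```python
from xml.dom.minidom import parseString

# Hypothetical one-entry sample; the real feed carries many more fields.
SAMPLE = "<feed><entry><title>First video</title></entry></feed>"

entry = parseString(SAMPLE).getElementsByTagName("entry")[0]

# <entry> has no title="..." attribute, so getAttribute returns an empty string.
as_attribute = entry.getAttribute("title")

# The title is a child element; its text lives in the element's first child node.
as_child = entry.getElementsByTagName("title")[0].firstChild.nodeValue

print(repr(as_attribute))
print(repr(as_child))
```

`getAttribute` does not raise for a missing attribute, which is why the question's code ran without errors but printed nothing useful.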


2012-10-08 16:47:58 asciimoo

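lxml can be a bit hard to figure out, so here is a really simple Beautiful Soup solution (there is a reason it is called BeautifulSoup). You can also set Beautiful Soup to use the lxml parser, so the speed is roughly the same.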

from bs4 import BeautifulSoup 
soup = BeautifulSoup(data) # data as is seen in your code 
soup.findAll('title')

This returns a list of title elements. In this case, you can also use soup.findAll('media:title') to return only the media:title elements (the actual video names).
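The difference between title and media:title can also be seen with the standard library alone. The sketch below uses minidom rather than Beautiful Soup, in Python 3 syntax. The sample feed, its namespace URIs, and the `texts` helper are assumptions made for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical sample; element layout and namespace URIs are assumed.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:media="http://search.yahoo.com/mrss/">
  <title>Uploads</title>
  <entry>
    <title>Entry title</title>
    <media:group><media:title>Media title</media:title></media:group>
  </entry>
</feed>"""

dom = parseString(SAMPLE)


def texts(tag_name):
    """Collect the text of every element whose qualified tag name matches."""
    return [node.firstChild.nodeValue for node in dom.getElementsByTagName(tag_name)]


all_titles = texts("title")          # feed-level title plus entry titles
media_titles = texts("media:title")  # only the prefixed media:title elements

print(all_titles)
print(media_titles)
```

Searching the whole document for `title` also picks up the feed-level title, so the prefixed name is the narrower query here.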


2012-10-08 16:23:25 kreativitea

There is a small bug in your code. You access title as an attribute, but it is a child element of entry. Your code can be fixed as follows:

dom = parseString(data) 
for node in dom.getElementsByTagName('entry'): 
    print node.getElementsByTagName('title')[0].firstChild.data
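The fixed loop assumes every entry has a non-empty title. If an entry had no title element, `[0]` would raise `IndexError`, and an empty title element would make `firstChild` be `None`. A defensive variant might look like the sketch below, in Python 3 syntax. The `title_of` helper and the sample data are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample: a normal entry, one with an empty title, one with no title.
SAMPLE = (
    "<feed>"
    "<entry><title>First video</title></entry>"
    "<entry><title></title></entry>"
    "<entry></entry>"
    "</feed>"
)


def title_of(entry, default=""):
    """Return the entry's title text, or `default` if it is missing or empty."""
    found = entry.getElementsByTagName("title")
    if not found or found[0].firstChild is None:
        return default
    return found[0].firstChild.data


titles = [title_of(e) for e in parseString(SAMPLE).getElementsByTagName("entry")]
print(titles)
```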


2012-10-08 17:05:20 Skovhus
