2009-04-30 73 views

Answers

10

ElementTree is provided as part of the standard Python library. ElementTree is pure Python, and cElementTree is the faster C implementation:

# Try to use the C implementation first, falling back to pure Python
try:
    from xml.etree import cElementTree as ElementTree
except ImportError:
    from xml.etree import ElementTree

Here is an example of its use, where I consume XML from a RESTful web service:

import urllib
import urllib2

# api_key, api_url and is_valid_collection() are not shown in this snippet
def find(*args, **kwargs):
    """Find a book in the collection specified"""

    search_args = [('access_key', api_key),]
    if not is_valid_collection(kwargs['collection']):
        return None
    kwargs.pop('collection')
    for key in kwargs:
        # Only the first keyword is honored
        if kwargs[key]:
            search_args.append(('index1', key))
            search_args.append(('value1', kwargs[key]))
            break

    url = urllib.basejoin(api_url, '%s.xml' % 'books')
    data = urllib.urlencode(search_args)
    req = urllib2.urlopen(url, data)
    # Read the whole response body before parsing it
    rdata = []
    chunk = 'xx'
    while chunk:
        chunk = req.read()
        if chunk:
            rdata.append(chunk)
    tree = ElementTree.fromstring(''.join(rdata))
    results = []
    for i, elem in enumerate(tree.getiterator('BookData')):
        results.append(
            {'isbn': elem.get('isbn'),
             'isbn13': elem.get('isbn13'),
             'title': elem.find('Title').text,
             'author': elem.find('AuthorsText').text,
             'publisher': elem.find('PublisherText').text,}
        )
    return results
+0

vezult, how come you sometimes use elem.get() and sometimes elem.find().text? – rick 2009-05-07 00:52:18

+0

@rick: elem.get() retrieves the value of an attribute of the element, while elem.find() searches for elements contained within elem. – vezult 2009-05-08 02:49:56
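
To make the distinction concrete, here is a minimal sketch on a made-up document. The XML below and its placeholder values are assumptions modeled on the element names in the answer, not actual output from the web service. `get()` reads an attribute of the element itself, while `find()` returns the first matching child element, whose `.text` holds its content.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only the element and attribute names come from the answer.
SAMPLE = """
<BookList>
  <BookData isbn="0000000000" isbn13="0000000000000">
    <Title>Example Title</Title>
    <AuthorsText>Example Author</AuthorsText>
  </BookData>
</BookList>
"""

root = ET.fromstring(SAMPLE)
book = root.find('BookData')

isbn = book.get('isbn')                # attribute on <BookData>
title = book.find('Title').text        # text of the <Title> child element
missing = book.get('publisher')        # absent attribute: get() returns None
no_child = book.find('PublisherText')  # absent child: find() returns None
```

Because `find()` returns `None` for a missing child, calling `.text` on the result raises `AttributeError`; `findtext()` is one way to sidestep that when an element may be absent.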

0

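There is also BeautifulSoup, whose API some people may prefer. Here is an example of how you can extract all tweets that have been favorited from Twitter's public timeline: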

from BeautifulSoup import BeautifulStoneSoup 
import urllib 

url = urllib.urlopen('http://twitter.com/statuses/public_timeline.xml').read() 
favorited = [] 

soup = BeautifulStoneSoup(url) 
statuses = soup.findAll('status') 

for status in statuses:
    if status.find('favorited').contents != [u'false']:
        favorited.append(status)
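
For comparison, the same filter can be sketched with the standard library's ElementTree instead of BeautifulSoup, run against an inline sample so that no network access is needed. The sample document and the `<id>` element are assumptions; only the `status` and `favorited` element names come from the code above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the timeline response.
SAMPLE = """
<statuses>
  <status><id>1</id><favorited>false</favorited></status>
  <status><id>2</id><favorited>true</favorited></status>
  <status><id>3</id><favorited>false</favorited></status>
</statuses>
"""

root = ET.fromstring(SAMPLE)

# Keep every <status> whose <favorited> child is anything other than "false".
favorited = [
    status for status in root.iter('status')
    if status.findtext('favorited') != 'false'
]
favorited_ids = [status.findtext('id') for status in favorited]
```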