Python XMLParser: when is the data() method called?

I am learning Python and am having some difficulty understanding the behavior of the XML parser (ElementTree's XMLParser).

I modified the example from the documentation:

class MaxDepth:                        # The target object of the parser
    path = ""
    def start(self, tag, attrib):      # Called for each opening tag.
        self.path += "/" + tag
        print '>>> Entering - ' + self.path
    def end(self, tag):                # Called for each closing tag.
        print '<<< Leaving - ' + self.path
        if self.path.endswith('/' + tag):
            self.path = self.path[:-(len(tag) + 1)]
    def data(self, data):
        if data:
            print '... data called ...'
            print data, 'length -', len(data)
    def close(self):                   # Called when all data has been parsed.
        return self

It produces the following output:

>>> Entering - /a 
... data called ... 

length - 1 
... data called ... 
    length - 2 
>>> Entering - /a/b 
... data called ... 

length - 1 
... data called ... 
    length - 2 
<<< Leaving - /a/b 
... data called ... 

length - 1 
... data called ... 
    length - 2 
>>> Entering - /a/b 
... data called ... 

length - 1 
... data called ... 
    length - 4 
>>> Entering - /a/b/c 
... data called ... 

length - 1 
... data called ... 
     length - 6 
>>> Entering - /a/b/c/d 
... data called ... 

length - 1 
... data called ... 
     length - 6 
<<< Leaving - /a/b/c/d 
... data called ... 

length - 1 
... data called ... 
    length - 4 
<<< Leaving - /a/b/c 
... data called ... 

length - 1 
... data called ... 
    length - 2 
<<< Leaving - /a/b 
... data called ... 

length - 1 
<<< Leaving - /a 
<__main__.MaxDepth instance at 0x10e7dd5a8>

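My questions are:

When is the data() method called?
Why is it called twice before each start tag?
I could not find API documentation with more details about the data method. Where can I find a javadoc-style API reference for classes like XMLParser?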

Source

2012-06-11 bsr

If your use case does not require event-based parsing, '.parse()' is easier to use: http://www.doughellmann.com/PyMOTW/xml/etree/ElementTree/parse.html. Otherwise, his events example may help: http://www.doughellmann.com/PyMOTW/xml/etree/ElementTree/parse.html#watching-events-while-parsing – ninMonkey
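To illustrate the non-event route mentioned in the comment above, here is a minimal sketch for Python 3. It uses `ET.fromstring()`, the string counterpart of `.parse()`, so that it runs without a file. The `SAMPLE` document and the `max_depth` helper are assumptions made for illustration; the document is only reconstructed from the tag paths shown in the question's output.

```python
import xml.etree.ElementTree as ET

# Assumed sample document, reconstructed from the paths in the question's output.
SAMPLE = """<a>
  <b>
  </b>
  <b>
    <c>
      <d>
      </d>
    </c>
  </b>
</a>"""


def max_depth(element, depth=1):
    """Return the depth of the deepest element at or below `element`."""
    # Iterating over an Element yields its direct children.
    return max([depth] + [max_depth(child, depth + 1) for child in element])


root = ET.fromstring(SAMPLE)   # builds the whole tree in memory
depth = max_depth(root)
print(depth)
```

Because the tree is built in full before it is inspected, no data() callbacks are involved, and whitespace between tags simply ends up in each element's `.text` and `.tail` attributes.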

If you modify the data method like this:

def data(self, data):
    if data:
        print '... data called ...'
        print repr(data), 'length -', len(data)

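you will see why the data method is called multiple times: it is called separately for each line of text between the tags, here first the newline and then the indentation: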

>>> Entering - /a
... data called ...
'\n' length - 1
... data called ...
'  ' length - 2
>>> Entering - /a/b
... data called ...
'\n' length - 1
... data called ...
'  ' length - 2
<<< Leaving - /a/b
... data called ...
'\n' length - 1
... data called ...
'  ' length - 2
>>> Entering - /a/b
... data called ...
'\n' length - 1
... data called ...
'    ' length - 4
# ... etc ...

The XMLParser methods are based on the Expat parser.

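In my experience, any streaming XML parser delivers text data as a series of chunks, and you have to concatenate all of the data events yourself until you reach the next start-tag or end-tag event. Parsers often split chunks at whitespace boundaries, but that is not guaranteed.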
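The concatenation described above can be sketched as a parser target that buffers every data() chunk and only joins them when the next start or end event arrives. This is a minimal sketch for Python 3, not the documentation's example; the `TextCollector` class, the `collect` helper, and the `SAMPLE` string are hypothetical names introduced for illustration.

```python
from xml.etree.ElementTree import XMLParser


class TextCollector:
    """Parser target that joins consecutive data() chunks into one string."""

    def __init__(self):
        self._chunks = []   # text pieces seen since the last tag event
        self.events = []    # recorded (kind, value) tuples

    def _flush(self):
        # Join the buffered chunks and record them as a single text event.
        if self._chunks:
            self.events.append(("text", "".join(self._chunks)))
            self._chunks = []

    def start(self, tag, attrib):
        self._flush()
        self.events.append(("start", tag))

    def end(self, tag):
        self._flush()
        self.events.append(("end", tag))

    def data(self, data):
        # May be called several times for one run of text, so only buffer here.
        self._chunks.append(data)

    def close(self):
        self._flush()
        return self.events


def collect(xml_text):
    """Feed xml_text through XMLParser and return the recorded events."""
    parser = XMLParser(target=TextCollector())
    parser.feed(xml_text)
    return parser.close()   # returns whatever the target's close() returns


SAMPLE = "<a>\n  <b>hello\nworld</b>\n</a>"   # hypothetical input
events = collect(SAMPLE)
for kind, value in events:
    print(kind, repr(value))
```

With this target, the number of data() calls no longer matters: each run of text between two tag events shows up exactly once, however the parser happened to split it.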

Source

2012-06-11 16:05:08
