Script cannot access the content of inner tags

I have an XML file with the following structure.

<merchandiser>
    <header></header>
    <product>
        <name></name>
        <URL>
            <info>
            </info>
            <product>
            </product>
        </URL>
    </product>

    ............

    <product>
        <name></name>
        <URL>
            <info>
            </info>
            <product>
            </product>
        </URL>
    </product>
</merchandiser>

I am using iterparse() from the python-lxml library.

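from lxml import etree

# xmlfile is the path (or file object) of the XML document shown above.
for event, element in etree.iterparse(xmlfile, tag='product'):
    if element.tag == "product" and event == "end":
        # findall("..") returns a list holding the parent element
        if element.findall("..")[0].tag == 'merchandiser':
            print(element.xpath('./URL/product/text()'))
            print(element.xpath('./URL/info/text()'))
    # runs for every product element, the inner ones included
    element.clear()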

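The script prints the text inside the info tag, but it cannot print the text inside the inner product tag.

I think this is because the two tags share the same name.

Can you tell me what I am doing wrong?

Source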

2013-07-17 R Simon

"The script prints the text inside the tag, but cannot print the text inside the tag."? Can you edit your question? –

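The for loop iterates over every product element, inner and outer, and calls clear() on each one, which removes all of its text and child elements. The end event of an inner product fires before the end event of the outer product that contains it. Because you print on the end event of the outer product, the inner product's text has already been removed by the time you print.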
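One way to apply this is to skip the inner product elements entirely and call clear() only after the outer element has been read. The following is a minimal sketch, not necessarily what the asker ended up with: the sample data and the read_products name are made up for illustration, and the parent check uses getparent() in place of the question's findall("..").

```python
import io
from lxml import etree

# Hypothetical sample data that mirrors the structure in the question.
SAMPLE = b"""<merchandiser>
  <header></header>
  <product>
    <name>first</name>
    <URL>
      <info>info-1</info>
      <product>inner-1</product>
    </URL>
  </product>
  <product>
    <name>second</name>
    <URL>
      <info>info-2</info>
      <product>inner-2</product>
    </URL>
  </product>
</merchandiser>"""


def read_products(source):
    """Collect (inner product text, info text) for each top-level product."""
    rows = []
    for event, element in etree.iterparse(source, events=("end",), tag="product"):
        parent = element.getparent()
        if parent is None or parent.tag != "merchandiser":
            # Inner <product>: leave it intact so the outer one can still read it.
            continue
        rows.append((element.xpath("./URL/product/text()"),
                     element.xpath("./URL/info/text()")))
        # Clear only after the outer element has been fully processed.
        element.clear()
    return rows


rows = read_products(io.BytesIO(SAMPLE))
print(rows)
```

Clearing the outer element also discards its children, so memory use should stay bounded in the same way as in the original loop.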

Source

2013-07-17 13:52:05

Thanks! It works. –

@RaviSimon: If you like it, why don't you accept this answer? – refi64

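This XPath expression, ./URL/product/text(), finds the text of a product tag that sits directly inside a URL tag, but not the text of a product tag that is inside another product tag inside a URL tag.

Consider using ./URL/product/product/text() or //product/text() as well.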
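To see which of these expressions matches what, they can be compared on a small fragment. This is only a sketch: the fragment below is a hypothetical stand-in for one outer product from the question, with no clear() involved. It uses .//product/text() for the descendant search, because in lxml an expression starting with // is evaluated from the document root, while .// is relative to the element it is called on.

```python
from lxml import etree

# Hypothetical fragment shaped like one outer <product> from the question.
outer = etree.fromstring(
    "<product><name>n</name><URL><info>i</info>"
    "<product>inner</product></URL></product>"
)

# product directly under URL
direct = outer.xpath("./URL/product/text()")
# product nested one level deeper (absent in this fragment)
nested = outer.xpath("./URL/product/product/text()")
# any product below the current element
anywhere = outer.xpath(".//product/text()")

print(direct, nested, anywhere)
```

With the structure shown in the question, where the inner product is a direct child of URL, the first expression is the one that matches, provided the inner element's text has not been cleared beforehand.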

Source

2013-07-17 13:54:05
