Using Python/ElementTree

I need to parse XML that declares a default namespace but uses no namespace prefix, using Python/ElementTree:

<WRMHEADER xmlns="http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader" version="4.0.0.0"> 
    <DATA> 
     <PROTECTINFO> 
      <KEYLEN>16</KEYLEN> 
      <ALGID>AESCTR</ALGID> 
     </PROTECTINFO> 

     <LA_URL>http://192.168.8.33/license/rightsmanager.asmx</LA_URL> 
     <LUI_URL>http://192.168.8.33/license/rightsmanager.asmx</LUI_URL> 

     <DS_ID></DS_ID> 
     <KID></KID> 
     <CHECKSUM></CHECKSUM> 

    </DATA> 
</WRMHEADER>

I want to read individual fields, for example the value of DATA/PROTECTINFO/KEYLEN.

from xml.etree import ElementTree as ET

# sMyXml holds the XML document shown above
root = ET.fromstring(sMyXml)
keylen = root.findall('./DATA/PROTECTINFO/KEYLEN')

print(root)
print(keylen)

This code prints the following:

<Element {http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader}WRMHEADER at 0x7f2a7c35be60> 
[]

root.find and root.findall return None or [] for this query. I have not been able to specify the default namespace. Is there a way to query these values? Thanks.

Source

2016-06-21 stack user

Create a namespace dictionary:

x = """<WRMHEADER xmlns="http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader" version="4.0.0.0"> 
    <DATA> 
     <PROTECTINFO> 
      <KEYLEN>16</KEYLEN> 
      <ALGID>AESCTR</ALGID> 
     </PROTECTINFO> 

     <LA_URL>http://192.168.8.33/license/rightsmanager.asmx</LA_URL> 
     <LUI_URL>http://192.168.8.33/license/rightsmanager.asmx</LUI_URL> 

     <DS_ID></DS_ID> 
     <KID></KID> 
     <CHECKSUM></CHECKSUM> 

    </DATA> 
</WRMHEADER>""" 
from xml.etree import ElementTree as ET 

root = ET.fromstring(x) 
ns = {"wrm":"http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader"} 
keylen = root.findall('wrm:DATA', ns) 

print(root)
print(keylen)

Now you should get something like:

<Element '{http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader}WRMHEADER' at 0x7fd0a30d45d0> 
[<Element '{http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader}DATA' at 0x7fd0a30d4610>]

To get /DATA/PROTECTINFO/KEYLEN:

In [17]: root = ET.fromstring(x) 

In [18]: ns = {"wrm":"http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader"} 
In [19]: root.find('wrm:DATA/wrm:PROTECTINFO/wrm:KEYLEN', ns).text 
Out[19]: '16'
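
As a supplementary sketch (not part of the original answer): ElementTree stores every tag as `{namespace-uri}localname`, which is why the unprefixed path in the question matches nothing. The same lookup can therefore also be written by spelling out the URI in braces, and on Python 3.8 or later the `{*}` wildcard should match a tag in any namespace. The helper `qualified` and the variable names below are hypothetical, introduced only for illustration, and the XML is a shortened copy of the document above.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the document from the question.
XML = """<WRMHEADER xmlns="http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader" version="4.0.0.0">
  <DATA>
    <PROTECTINFO>
      <KEYLEN>16</KEYLEN>
      <ALGID>AESCTR</ALGID>
    </PROTECTINFO>
  </DATA>
</WRMHEADER>"""

NS_URI = "http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader"


def qualified(path, uri=NS_URI):
    """Hypothetical helper: prefix each step of a simple a/b/c path with {uri}."""
    return "/".join("{%s}%s" % (uri, step) for step in path.split("/"))


root = ET.fromstring(XML)

# Unqualified path: no match, because the stored tags include the namespace.
unqualified_result = root.find("DATA/PROTECTINFO/KEYLEN")

# Fully qualified path: every step carries the namespace URI in braces.
keylen = root.find(qualified("DATA/PROTECTINFO/KEYLEN")).text

# Python 3.8+ only: {*} matches the tag in any namespace.
algid = root.find("{*}DATA/{*}PROTECTINFO/{*}ALGID").text

print(unqualified_result, keylen, algid)
```

The namespace dictionary shown in the answer is usually easier to read; the brace form is mainly useful when the URI is only known at run time, for example when it is taken from `root.tag`.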

Source

2016-06-21 11:48:41

No worries. If you do a lot of XML work in Python, you may find lxml useful: http://lxml.de/

I wonder whether this would work as well. Please comment on the pros and cons of this approach.

import xml.dom.minidom

# Open the XML document using the minidom parser
DOMTree = xml.dom.minidom.parse("xmlquestion.xml")
tn = DOMTree.documentElement
print(tn.namespaceURI)
# print(tn.childNodes)

# getElementsByTagName matches the tag name as written in the document
data = tn.getElementsByTagName('DATA')[0]
protectinfo = data.getElementsByTagName('PROTECTINFO')[0]
keylen = protectinfo.getElementsByTagName('KEYLEN')[0]
print(keylen.childNodes[0].data)

http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader 
16

Source

2016-06-21 12:50:58

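This is great. Because my data comes from a web request, I had to modify it slightly to use parseString instead. I was only looking for a quick way to validate the XML content. I wanted to go with ET because it seems to be more widely used, although I found this problem frustrating: the documentation seems inadequate for what looks like such a basic task.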
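
A minimal sketch of the parseString variant mentioned in this comment, assuming the XML arrives as a string (for example, the body of a web response). It uses `getElementsByTagNameNS`, which matches on namespace URI plus local name. The helper `first_text` is hypothetical, and the XML is a shortened copy of the document in the question.

```python
from xml.dom.minidom import parseString

# Shortened copy of the document from the question; in practice this string
# would be the response body (an assumption for this sketch).
XML = """<WRMHEADER xmlns="http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader" version="4.0.0.0">
  <DATA>
    <PROTECTINFO>
      <KEYLEN>16</KEYLEN>
      <ALGID>AESCTR</ALGID>
    </PROTECTINFO>
    <KID></KID>
  </DATA>
</WRMHEADER>"""

NS_URI = "http://schemas.microsoft.com/DRM/2007/03/PlayReadyHeader"


def first_text(parent, local_name, uri=NS_URI):
    """Hypothetical helper: text of the first matching descendant, or None."""
    matches = parent.getElementsByTagNameNS(uri, local_name)
    if not matches or matches[0].firstChild is None:
        # Element is missing, or present but empty such as <KID></KID>.
        return None
    return matches[0].firstChild.data


dom = parseString(XML)
root = dom.documentElement

keylen = first_text(root, "KEYLEN")
kid = first_text(root, "KID")

print(root.namespaceURI, keylen, kid)
```

One trade-off to keep in mind: `getElementsByTagName` and `getElementsByTagNameNS` search all descendants rather than a specific path, so they are convenient for a quick check but would also match a same-named element elsewhere in the tree.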
