How do I parse this XML string in Python?

My XML string:

xmlData = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"> 
      <Cancelled>false</Cancelled> 
      <MessageID>00000000-0000-0000-0000-000000000000</MessageID> 
      <Queued>false</Queued> 
      <SMSError>NoError</SMSError> 
      <SMSIncomingMessages i:nil="true"/> 
      <Sent>false</Sent> 
      <SentDateTime>0001-01-01T00:00:00</SentDateTime> 
      </SMSResponse>"""

I am trying to parse it and get the values of the tags: Cancelled, MessageID, SMSError, etc. I am using Python's ElementTree library. So far I have tried things like -

import xml.etree.ElementTree as ET

root = ET.fromstring(xmlData)
print(root.find('Sent'))  # gives None
for child in root:
    print(child.find('MessageId'))  # also gives None

However, I am able to print the tags with -

for child in root:
    print(child.tag)
    # child.tag for the Cancelled tag is {http://example.com}Cancelled

and their respective values with -

for child in root:
    print(child.text)

How can I get something like -

print(child.Queued)  # would print false

Like in PHP, where we can access them through the root -

$xml = simplexml_load_string($data); 
$status = $xml->SMSError;

2013-01-04 Hussain

Your document has a namespace on it, so you need to include the namespace when searching:

root = ET.fromstring(xmlData)
print(root.find('{http://example.com}Sent'))
print(root.find('{http://example.com}MessageID'))

Output:

<Element '{http://example.com}Sent' at 0x1043e0690> 
<Element '{http://example.com}MessageID' at 0x1043e0350>

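The find() and findall() methods also take a namespace map; you can search with an arbitrary prefix, and the prefix will be looked up in that map, which saves typing: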

nsmap = {'n': 'http://example.com'}
print(root.find('n:Sent', namespaces=nsmap))
print(root.find('n:MessageID', namespaces=nsmap))
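
If only the text is needed rather than the Element, the same prefix map can be passed to findtext(). A minimal sketch, assuming Python 3 and a shortened copy of the XML from the question; the names xml_data, sms_error and missing, and the 'n' prefix, are only illustrative:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (assumption: same structure).
xml_data = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Cancelled>false</Cancelled>
  <MessageID>00000000-0000-0000-0000-000000000000</MessageID>
  <SMSError>NoError</SMSError>
  <SMSIncomingMessages i:nil="true"/>
</SMSResponse>"""

root = ET.fromstring(xml_data)
nsmap = {'n': 'http://example.com'}  # 'n' is an arbitrary prefix chosen here

# findtext() returns the text of the first matching child,
# or the given default when nothing matches.
sms_error = root.findtext('n:SMSError', namespaces=nsmap)
missing = root.findtext('n:NoSuchTag', default='', namespaces=nsmap)

print(sms_error)  # NoError
```

The default argument avoids a None check when a tag may be absent.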

2013-01-04 09:14:13

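So basically I have to specify "{http://example.com}" every time I want to access a tag's text? – Hussain

@HussainTamboli: 'find' and 'findall' also take a 'namespaces=mapping' parameter, but that does not seem to be of much use when there is a default namespace. 'lxml' handles all of this much better. –

See @eclaird's answer. I think you did the same thing. +1 – Hussain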

You can create a dictionary and get the values directly out of it...

tree = ET.fromstring(xmlData)

root = {}

for child in tree:
    # '{http://example.com}Queued' -> 'Queued'
    root[child.tag.split("}")[1]] = child.text

print(root["Queued"])
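
The same idea can be wrapped in a small function. This is only a sketch, assuming Python 3; children_to_dict is a hypothetical helper name, and rpartition is swapped in for split so that a tag without a namespace does not raise an IndexError:

```python
import xml.etree.ElementTree as ET

def children_to_dict(element):
    """Map each child's local tag name (namespace stripped) to its text."""
    result = {}
    for child in element:
        # '{http://example.com}Queued' -> 'Queued';
        # a tag with no '}' in it is returned unchanged.
        local_name = child.tag.rpartition('}')[2]
        result[local_name] = child.text
    return result

# Shortened copy of the XML from the question (assumption: same structure).
xml_data = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
  <SMSIncomingMessages i:nil="true"/>
</SMSResponse>"""

values = children_to_dict(ET.fromstring(xml_data))
print(values['Queued'])  # false
```

Note that an empty element such as SMSIncomingMessages ends up with the value None, and repeated tags would overwrite each other, so this only suits flat documents like the one in the question.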

2013-01-04 09:05:53 ATOzTOA

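Hi, see my edit. "child.tag for the tag Cancelled is - {http://example.com}Cancelled", so it is hard to match it against "Cancelled". Is there a better way? – Hussain

Updated the answer, try it now... – ATOzTOA

Hey. It works, but this is just a workaround. How can I access a tag's text in such a way that the tag is the key and the text is the value? – Hussain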

If you are set on the Python standard XML library, you could use something like this:

root = ET.fromstring(xmlData)
namespace = 'http://example.com'

def query(tree, nodename):
    # Builds '{http://example.com}nodename' and looks it up among the direct children.
    return tree.find('{{{ex}}}{nodename}'.format(ex=namespace, nodename=nodename))

queued = query(root, 'Queued')
print(queued.text)
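
For something closer to the `$xml->SMSError` style asked about in the question, the lookup can be hidden behind attribute access. A rough sketch, assuming Python 3; NamespacedNode is a hypothetical wrapper class written for this example, not part of ElementTree:

```python
import xml.etree.ElementTree as ET

class NamespacedNode:
    """Hypothetical wrapper: exposes the text of child elements as attributes."""

    def __init__(self, element, namespace):
        self._element = element
        self._namespace = namespace

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith('_'):
            raise AttributeError(name)
        child = self._element.find('{%s}%s' % (self._namespace, name))
        if child is None:
            raise AttributeError(name)
        return child.text

# Shortened copy of the XML from the question (assumption: same structure).
xml_data = """<SMSResponse xmlns="http://example.com">
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
</SMSResponse>"""

response = NamespacedNode(ET.fromstring(xml_data), 'http://example.com')
print(response.Queued)    # false
print(response.SMSError)  # NoError
```

This only covers direct children and returns plain text, so it is a convenience for flat responses rather than a general replacement for find().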

2013-01-04 09:22:38 tuomur

This looks good. – Hussain

With lxml.etree:

In [8]: import lxml.etree as et 

In [9]: doc=et.fromstring(xmlData) 

In [10]: ns={'n':'http://example.com'} 

In [11]: doc.xpath('n:Queued/text()',namespaces=ns) 
Out[11]: ['false']

With ElementTree you can do this:

import xml.etree.ElementTree as ET  
root=ET.fromstring(xmlData)  
ns={'n':'http://example.com'} 
root.find('n:Queued',namespaces=ns).text 
Out[13]: 'false'
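
On Python 3.8 or later, ElementTree also accepts a `{*}` wildcard in place of the namespace, which avoids the prefix map altogether. A short sketch under that version assumption, again with a shortened copy of the question's XML (the name xml_data is illustrative):

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (assumption: same structure).
xml_data = """<SMSResponse xmlns="http://example.com">
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
</SMSResponse>"""

root = ET.fromstring(xml_data)

# '{*}Queued' matches a Queued child in any namespace, or in none.
# Requires Python 3.8+.
queued = root.find('{*}Queued')
print(queued.text)  # false
```

The wildcard matches every namespace, so it is best kept for documents where tag names do not clash across namespaces.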

2013-01-04 09:35:52 root

Thanks. I was looking for something similar in ElementTree. +1 – Hussain
