How do I access the link data in an XML node in Python?

The data I get back is XML in the following format (many rooms are returned; this is one example of what I receive):

<?xml version="1.0" encoding="UTF-8"?> 
<rooms> 
    <total-results>1</total-results> 
    <items-per-page>1</items-per-page> 
    <start-index>0</start-index> 
    <room> 
     <id>xxxxxxxx</id> 
     <etag>5</etag> 
     <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/> 
     <name>1.306</name> 
     <status>active</status> 
     <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"> 
    </room> 
</rooms>

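If the node type == node.TEXT_NODE, I seem to be able to access the data (so I can see that I have room 1.306). I can also access the nodeName link, but I really need to know whether that room is in one of my acceptable buildings, so I need to get at the rest of that line to see yyyyyyyyy. Can someone please advise?

OK, @vezult, as you suggested, here is what I finally came up with (working code!) using ElementTree. It may not be the most Pythonic (or ElementTree-ic?) way to do this, but it seems to work. I'm happy to be able to access the .tag, .attrib and .text of every element in my XML. I welcome any suggestions on how to improve it.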

# We start out knowing our room name and our building id. However, the same room can exist in many buildings. 
# Examine the rooms we've received and get the id of the one with our name that is also in our building. 

# Query the API for a list of rooms, getting u back. 

request = build_request(resourceUrl) 
u = urllib2.urlopen(request.to_url()) 
mydata = u.read() 

root = ElementTree.fromstring(mydata) 
print 'tree root', root.tag, root.attrib, root.text 
for child in root:
    if child.tag == 'room':
        for child2 in child:
            # the id tag comes before the name tag, so hold on to it
            if child2.tag == "id":
                hold_id = child2.text
            # the building link comes before the room name, so hold on to it
            if child2.tag == 'link':  # if this is a link
                if "building" in child2.attrib['href']:  # and it's a building link
                    hold_link_data = child2.attrib['href']
            if child2.tag == 'name':
                if (out_bldg in hold_link_data and  # the building link we're looking at has our building in it
                        in_rm == child2.text):  # and this room name is our room name
                    out_rm = hold_id
                    break  # leave the inner loop over this room's children
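
The same lookup can be written without relying on the order of the child elements. The following is only a sketch, written for Python 3 rather than the Python 2 used in this thread; the function name find_room_id and the trimmed SAMPLE document are made up for illustration, and it assumes the building id appears somewhere in the building link's href.

```python
from xml.etree import ElementTree

# Trimmed copy of the response from the question (hypothetical sample data).
SAMPLE = """<rooms>
    <room>
        <id>xxxxxxxx</id>
        <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
        <name>1.306</name>
        <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
    </room>
</rooms>"""


def find_room_id(xml_text, room_name, building_id):
    """Return the id of the room with this name in this building, or None."""
    root = ElementTree.fromstring(xml_text)
    for room in root.findall('room'):
        # Skip rooms whose <name> text does not match.
        if room.findtext('name') != room_name:
            continue
        # Check every <link> child, regardless of where it sits in the room.
        for link in room.findall('link'):
            if link.get('title') == 'building' and building_id in link.get('href', ''):
                return room.findtext('id')
    return None


print(find_room_id(SAMPLE, '1.306', 'yyyyyyyyy'))
```

Returning from a function also sidesteps the question of which loop a break statement leaves.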

2012-10-04 HelenM

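Post your current code. – Blender

You're trying to get the node's attributes, not its text. The code you use to do that will depend on the module you're using to read the XML. So, yes, post your current code. – kreativitea

Your XML is invalid. The second link element has no closing tag. – vezult

You haven't indicated which library you're using, so I'll assume it's the standard Python ElementTree module. In that case, do the following: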
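
To illustrate the difference between attributes and text, here is a minimal sketch that assumes the XML is being read with xml.dom.minidom (the mention of node.TEXT_NODE in the question suggests a DOM-style API, but that is a guess). It is written for Python 3, and the XML string is a trimmed copy of the sample with the second link element closed.

```python
from xml.dom.minidom import parseString

# Trimmed, well-formed copy of the sample data (hypothetical).
XML = """<rooms>
    <room>
        <id>xxxxxxxx</id>
        <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
        <name>1.306</name>
        <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
    </room>
</rooms>"""

doc = parseString(XML)

# Attribute values live on the element node itself, not in a child text node.
building_hrefs = [
    link.getAttribute('href')
    for link in doc.getElementsByTagName('link')
    if link.getAttribute('title') == 'building'
]

# Element text, by contrast, is held in a TEXT_NODE child of the element.
name_element = doc.getElementsByTagName('name')[0]
room_name = name_element.firstChild.data

print(building_hrefs)
print(room_name)
```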

from xml.etree import ElementTree 

tree = ElementTree.fromstring("""<?xml version="1.0" encoding="UTF-8"?> 
<rooms> 
    <total-results>1</total-results> 
    <items-per-page>1</items-per-page> 
    <start-index>0</start-index> 
    <room> 
     <id>xxxxxxxx</id> 
     <etag>5</etag> 
     <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy" /> 
     <name>1.306</name> 
     <status>active</status> 
     <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa" /> 
    </room> 
</rooms> 
""") 

# Select every link element whose title attribute is "building"
for node in tree.findall('./room/link[@title="building"]'):
    # the 'attrib' attribute is a dictionary containing the node attributes
    print(node.attrib['href'])
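
Once the href is available, the building id still has to be pulled out of it. A small sketch, assuming the id is always the last path segment of the URL (the sample data suggests this, but the API may not guarantee it); the helper name building_id_from_href is hypothetical.

```python
from xml.etree import ElementTree


def building_id_from_href(href):
    """Return the last path segment of the URL, ignoring a trailing slash."""
    return href.rstrip('/').rsplit('/', 1)[-1]


# A single link element copied from the sample data.
link = ElementTree.fromstring(
    '<link rel="http://schemas.com.mysite.building" title="building" '
    'href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>'
)

building_id = building_id_from_href(link.get('href'))
print(building_id)
```

Comparing this value to the known building id with == is stricter than a substring test against the whole URL.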

2012-10-04 19:52:48 vezult

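I think it should be print node.attrib['href'], based on "so I need to be able to get to the rest of that line to see yyyyyyyyy", but +1. –

I'm a bit further along, thanks @vezult! I was using from xml.dom.minidom import parse, parseString, but I don't mind trying ElementTree (in fact, now that I look at the docs, I probably should). However, I now get an error at my print statement. Here is my code: u = urllib2.urlopen(request.to_url()) data = u.read() tree = ElementTree.fromstring(data) for treenode in tree.findall('./room/link[@title="building"]'): print treenode.attrib['href'] endfor And I get the following error: – HelenM

@vezult (continued) And I get the following error: C:\Python26\Lib\xml\etree\ElementPath.py in __init__ "expected path separator (%s)" % (op or tag) – HelenM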
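
The traceback points at Python 2.6, whose bundled ElementTree (version 1.2) does not understand attribute predicates such as [@title="building"]; those arrived with ElementTree 1.3 in Python 2.7. That is the likely cause of the error, though it is an inference from the path in the traceback. A sketch of a workaround that avoids predicates and filters in Python instead (the XML string is a trimmed, well-formed copy of the sample):

```python
from xml.etree import ElementTree

# Trimmed, well-formed copy of the sample data (hypothetical).
XML = """<rooms>
    <room>
        <id>xxxxxxxx</id>
        <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
        <name>1.306</name>
        <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
    </room>
</rooms>"""

root = ElementTree.fromstring(XML)

# Use a plain tag path with no [@attr="value"] predicate, then filter on the
# attribute with ordinary Python.
building_hrefs = []
for link in root.findall('room/link'):
    if link.get('title') == 'building':
        building_hrefs.append(link.get('href'))

print(building_hrefs)
```

The same loop runs unchanged on newer Python versions, so it can stay in place after an upgrade.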
