Non-recursive (single node level) getElementsByTagName in Python xml.dom

Is there any way to use getElementsByTagName at a single node level only, rather than recursively?

E.g. consider parsing a pom.xml file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> 

    <parent> 
     <groupId>com.parent</groupId> 
     <artifactId>parent</artifactId> 
     <version>1.0-SNAPSHOT</version> 
     <relativePath>../pom.xml</relativePath> 
    </parent> 

    <modelVersion>2.0.0</modelVersion> 
    <groupId>com.parent.somemodule</groupId> 
    <artifactId>some_module</artifactId> 
    <packaging>jar</packaging> 
    <version>1.0-SNAPSHOT</version> 
    <name>Some Module</name> 
    ...

If I want the groupId at the top level (specifically project->groupId, not project->parent->groupId), I use:

from xml.dom import minidom

xmldoc = minidom.parse('pom.xml')
groupId = xmldoc.getElementsByTagName("groupId")[0].childNodes[0].nodeValue

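Unfortunately, this finds the first physical occurrence of groupId in the file regardless of hierarchy level, i.e. project->parent->groupId. What I actually want is a single non-recursive lookup at one specific node level, without searching inside its child nodes. Is there a way to do this in xml.dom?

Update: I switched to BeautifulSoup but still have the same problem with implicit recursive traversal: Finding a nonrecursive DOM subnode in Python using BeautifulSoup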

2014-01-15 amphibient

You can iterate over the getElementsByTagName() results and take the first element whose parent is the root element:

# Take the first groupId whose parent is the document's root element
group_id_element = next(element for element in xmldoc.getElementsByTagName("groupId")
                        if element.parentNode == xmldoc.documentElement)

print(group_id_element.childNodes[0].nodeValue)
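An alternative that avoids the recursive search altogether is to walk only the direct children of the node in question. The following is a minimal sketch, not part of the original answer: the helper name `direct_children_by_tag` is hypothetical, and the inline `POM` string is a shortened stand-in for the pom.xml in the question.

```python
from xml.dom import minidom

# Shortened stand-in for the pom.xml from the question (assumption).
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
    <parent>
        <groupId>com.parent</groupId>
    </parent>
    <groupId>com.parent.somemodule</groupId>
</project>"""


def direct_children_by_tag(node, tag_name):
    """Hypothetical helper: element children of `node` named `tag_name`, no recursion."""
    return [
        child
        for child in node.childNodes
        # Skip whitespace text nodes and comments; compare the tag name only.
        if child.nodeType == child.ELEMENT_NODE and child.tagName == tag_name
    ]


xmldoc = minidom.parseString(POM)
group_id = direct_children_by_tag(xmldoc.documentElement, "groupId")[0]
top_level_group_id = group_id.firstChild.nodeValue
print(top_level_group_id)
```

Because only `childNodes` of the root is inspected, the groupId nested under parent is never visited, so there is no need to filter by `parentNode` afterwards.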

Note that this would be easier, shorter, and faster with ElementTree, which is also part of the standard library.

Hope that helps.
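As a rough illustration of the ElementTree route, here is a sketch under a few assumptions: the inline `POM` string is a shortened stand-in for the pom.xml in the question, and the `pom` prefix in `NS` is an arbitrary name chosen for this example. In ElementTree, `find()` with a plain tag path only matches direct children of the element it is called on, which is the non-recursive lookup asked for; `iter()` is the recursive counterpart.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the pom.xml from the question (assumption).
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
    <parent>
        <groupId>com.parent</groupId>
    </parent>
    <groupId>com.parent.somemodule</groupId>
</project>"""

# The POM uses a default namespace, so tags must be namespace-qualified.
# "pom" is just a local prefix picked for this sketch.
NS = {"pom": "http://maven.apache.org/POM/4.0.0"}

root = ET.fromstring(POM)

# find() looks only at direct children of root: project->groupId.
group_id_et = root.find("pom:groupId", NS).text
print(group_id_et)

# For contrast, iter() walks the whole subtree in document order.
all_group_ids = [
    element.text
    for element in root.iter("{http://maven.apache.org/POM/4.0.0}groupId")
]
print(all_group_ids)
```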

2014-01-15 17:40:46 alecxe

Are you saying 'ElementTree' is more refined and sophisticated? – amphibient

@amphibient Sure, that's my opinion. When I need to parse an XML file, I prefer to use 'ElementTree', 'lxml', or 'BeautifulSoup'. – alecxe

Could you also please take a look at http://stackoverflow.com/questions/21146417/simple-dom-traversing-in-python-using-xml-etree-elementtree? Thanks – amphibient