python
  • xml
  • lxml
  • 2013-04-12
    0

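    lxml: removing an element does not work

    I want to remove XML elements using lxml. The approach looks right to me, but it does not work. Here is my code: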

    import lxml.etree as le 
    f = open('Bird.rdf','r') 
    doc=le.parse(f) 
    for elem in doc.xpath("//*[local-name() = 'dc' and namespace-uri() = 'http://purl.org/dc/terms/']"): 
        parent=elem.getparent().remove(elem) 
    print(le.tostring(doc)) 
    

    Sample XML file:

    <rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/"> 
    
         <wo:Class rdf:about="/nature/life/Bird#class"> 
            <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a 
             covering of feathers, and their front limbs are modified into wings. Some birds, such as 
             penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds 
             are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or 
             they will perish</dc:description> 
         </wo:Class> 
    </rdf:RDF>     
    

    Answer

    4

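    Your problem is that the element's local name is 'description', not 'dc' ('dc' is only the namespace prefix, an alias for the namespace URI), so the `local-name() = 'dc'` test never matches anything. You can pass your namespace mapping to the `xpath()` function and write the XPath more directly as: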

    import lxml.etree as le 
    
    txt="""<rdf:RDF xmlns:rdf="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/" 
        xmlns:wo="http:/some/wo/namespace"> 
    
        <wo:Class rdf:about="/nature/life/Bird#class"> 
         <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a 
             covering of feathers, and their front limbs are modified into wings. Some birds, such as 
             penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds 
             are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or 
             they will perish</dc:description> 
        </wo:Class> 
    </rdf:RDF> 
    """ 
    
    namespaces = { 
        "rdf":"http://www.w3.org/2000/01/rdf-schema#", 
        "dc":"http://purl.org/dc/terms/", 
        "wo":"http:/some/wo/namespace" } 
    
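    doc = le.fromstring(txt) 
    # Select dc:description through the prefix map instead of local-name().
    for elem in doc.xpath("//dc:description", namespaces=namespaces): 
        # remove() returns None, so there is nothing useful to assign.
        elem.getparent().remove(elem) 
    print(le.tostring(doc).decode()) 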
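
If you would rather keep the `local-name()` / `namespace-uri()` style from the question, the same fix should apply: compare against 'description' instead of 'dc'. The following is only a minimal sketch on a trimmed-down, made-up document (the `root`, `item` and `name` elements are illustrative assumptions, not part of Bird.rdf):

```python
import lxml.etree as le

# Hypothetical, shortened document used only for illustration.
xml = b"""<root xmlns:dc="http://purl.org/dc/terms/">
  <item><dc:description>text</dc:description><name>Bird</name></item>
</root>"""

doc = le.fromstring(xml)
dc_uri = "http://purl.org/dc/terms/"

# The question's test: 'dc' is the prefix, not the local name, so no match.
wrong = doc.xpath(f"//*[local-name() = 'dc' and namespace-uri() = '{dc_uri}']")

# Corrected test: the local name of <dc:description> is 'description'.
right = doc.xpath(f"//*[local-name() = 'description' and namespace-uri() = '{dc_uri}']")
for elem in right:
    elem.getparent().remove(elem)

# Tags that are still in the tree after the removal.
remaining = [el.tag for el in doc.iter()]
print(len(wrong), len(right), remaining)
```

This form needs no prefix map, at the cost of a longer expression; the `//dc:description` version above is usually easier to read.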
    
    +3

    Or `xpath(..., namespaces=doc.getroot().nsmap)`, which saves typing – mata

    +0

    @mata I didn't know about nsmap, thanks for the tip! – tdelaney
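
The `nsmap` suggestion from the comments could look roughly like the sketch below. The sample document is a shortened, made-up stand-in for Bird.rdf. Note that `le.fromstring()` returns the root element directly, so `nsmap` is read from it without `getroot()`; with `le.parse()` you get an ElementTree and would use `doc.getroot().nsmap` as in the comment. Dropping a `None` key is a defensive assumption for documents that declare a default namespace, since XPath has no notion of a default prefix.

```python
import lxml.etree as le

# Hypothetical, shortened stand-in for the RDF file in the question.
xml = b"""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/terms/">
  <rdf:Description>
    <dc:description>Birds are a class of vertebrates.</dc:description>
  </rdf:Description>
</rdf:RDF>"""

root = le.fromstring(xml)  # already the root element, no getroot() needed

# nsmap holds the prefix -> URI declarations in scope at this element.
# A default namespace (if any) is stored under the key None; skip it.
namespaces = {prefix: uri for prefix, uri in root.nsmap.items() if prefix is not None}

for elem in root.xpath("//dc:description", namespaces=namespaces):
    elem.getparent().remove(elem)

# Re-run the query to confirm nothing is left.
left = root.xpath("//dc:description", namespaces=namespaces)
print(sorted(namespaces), len(left))
```

Keep in mind that `nsmap` on the root only covers prefixes declared on the root element itself; prefixes declared further down the tree would still have to be added by hand.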
