Uncommenting elements in Python 3 with lxml

I have an XML file in which I previously commented out some elements, and now I want to uncomment them.

I have this structure:

<parent parId="22" attr="Alpha"> 
<!--<reg regId="1"> 
    <cont>There is some content</cont><cont2 attr1="val">Another content</cont2> 
</reg> 
--></parent> 
<parent parId="23" attr="Alpha"> 
<reg regId="1"> 
    <cont>There is more content</cont><cont2 attr1="noval">Morecont</cont2> 
</reg> 
</parent> 
<parent parId="24" attr="Alpha"> 
<!--<reg regId="1"> 
    <cont>There is some content</cont><cont2 attr1="val">Another content</cont2> 
</reg> 
--></parent>

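I want to uncomment every comment in the file, so that the elements that are currently commented out become regular elements again.

I am able to find the commented elements using XPath. Here is my code snippet.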

def unhide_element():
    path = r'path_to_file\file.xml'
    xml_parser = et.parse(path)
    comments = root.xpath('//comment')
    for c in comments:
        print('Comment: ', c)
        parent_comment = c.getparent()
        parent_comment.replace(c,'')
        tree = et.ElementTree(root)
        tree.write(new_file)

However, the replace does not work, because it expects another element as its second argument.

How can I solve this?

2017-09-27 TMikonos

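Your code is missing the key bit that creates a new XML element from the comment text. There are also a few other errors: the XPath query is wrong ('//comment' matches elements named "comment", not comment nodes), and the output file is saved multiple times inside the loop.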
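
To illustrate the XPath point, here is a minimal sketch. The SAMPLE string is made up for the demonstration (it is not the asker's file) and deliberately contains both an element named "comment" and a real comment node, so the two queries return different things.

```python
import lxml.etree as ET

# Hypothetical sample: one element that happens to be named "comment",
# plus one real XML comment node.
SAMPLE = '<root><comment>an element</comment><!-- a comment node --></root>'

root = ET.fromstring(SAMPLE)

# '//comment' is a name test: it selects elements whose tag is "comment".
named_elements = root.xpath('//comment')

# '//comment()' is a node test: it selects comment nodes.
comment_nodes = root.xpath('//comment()')

print([e.text for e in named_elements])
print([c.text for c in comment_nodes])
```

On the asker's file, which has no element named "comment", the first query would simply return an empty list, so the loop body would never run.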

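Also, it looks like you are mixing xml.etree with lxml.etree. According to the documentation, the former drops comments by default when the XML file is parsed, so the best approach is to use lxml.

After fixing all of the above, we get something like this.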
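
The comment-dropping behaviour of the standard library can be checked without lxml. The sketch below assumes Python 3.8 or newer, where TreeBuilder accepts insert_comments; SAMPLE and count_comments are made-up names for the demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line version of the asker's structure.
SAMPLE = '<parent parId="22"><!--<reg regId="1"><cont>text</cont></reg>--></parent>'


def count_comments(root):
    """Count comment nodes; they carry ET.Comment as their tag."""
    return sum(1 for node in root.iter() if node.tag is ET.Comment)


# Default parser: comments are silently discarded.
default_root = ET.fromstring(SAMPLE)

# Opt-in: a TreeBuilder that keeps comments in the tree (Python 3.8+).
keeping_parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
kept_root = ET.fromstring(SAMPLE, parser=keeping_parser)

print(count_comments(default_root), count_comments(kept_root))
```

Even with comments kept, the standard library has no getparent() and only a limited XPath subset, which is one more reason the answer stays with lxml.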

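import lxml.etree as ET


def unhide_element():
    path = r'test.xml'
    tree = ET.parse(path)  # lxml keeps comment nodes in the parsed tree
    comments = tree.xpath('//comment()')  # comment() selects comment nodes
    for c in comments:
        print('Comment: ', c)
        parent_comment = c.getparent()
        tail = c.tail  # text that followed the comment, if any
        new_elem = ET.XML(c.text)  # this bit creates the new element from comment text
        # replace() puts the element where the comment was; addnext() on the
        # parent would attach it after the parent instead of inside it
        parent_comment.replace(c, new_elem)
        new_elem.tail = tail

    tree.write(r'new_file.xml')  # write once, after the loop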
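
For trying the idea without files, here is a hedged in-memory sketch of the same approach. SAMPLE, the wrapping <root> element and the function name uncomment_elements are assumptions made for the demonstration. It swaps each comment for the element parsed from its text and leaves comments that are not XML untouched.

```python
import lxml.etree as ET

# Hypothetical sample: one commented-out element and one plain-text comment.
SAMPLE = (
    '<root>'
    '<parent parId="22" attr="Alpha">'
    '<!--<reg regId="1"><cont>There is some content</cont></reg>-->'
    '</parent>'
    '<parent parId="23" attr="Alpha"><!-- just a note --><reg regId="1"/></parent>'
    '</root>'
)


def uncomment_elements(xml_text):
    """Return xml_text with every comment that holds one XML element restored."""
    root = ET.fromstring(xml_text)
    for comment in root.xpath('//comment()'):
        parent = comment.getparent()
        if parent is None:
            continue  # comment outside the root element, nothing to replace in
        try:
            new_elem = ET.XML(comment.text)
        except ET.XMLSyntaxError:
            continue  # not an element (e.g. a plain note): keep the comment
        tail = comment.tail  # replace() takes the old tail away with the comment
        parent.replace(comment, new_elem)
        new_elem.tail = tail
    return ET.tostring(root, encoding='unicode')


result = uncomment_elements(SAMPLE)
print(result)
```

A comment holding several sibling elements would also fail to parse as a single element and would be skipped by this sketch.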

2017-09-27 16:14:21

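Great, this works. However, I don't know why, with my lxml version, it does not work unless I call getroot() first. I was not able to work directly on the parsed ElementTree. – TMikonos

Well, since you want to uncomment everything, all you really need to do is remove every "<!--" and "-->":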

import re 

new_xml = ''.join(re.split('<!--|-->', xml))

Or:

new_xml = xml.replace('<!--', '').replace('-->', '')
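
A small sketch of what the string-based approach does, using only the standard library to check the result. SAMPLE and strip_comment_markers are made-up names. One thing to keep in mind is that every comment is affected, so a plain-text comment turns into ordinary text content.

```python
import xml.etree.ElementTree as ET


def strip_comment_markers(xml_text):
    """Remove every comment delimiter, leaving the comment bodies in place."""
    return xml_text.replace('<!--', '').replace('-->', '')


# Hypothetical sample: one commented-out element and one plain-text comment.
SAMPLE = (
    '<root>'
    '<parent parId="22"><!--<reg regId="1"><cont>There is some content</cont></reg>--></parent>'
    '<parent parId="23"><!-- just a note --></parent>'
    '</root>'
)

uncommented = strip_comment_markers(SAMPLE)
root = ET.fromstring(uncommented)

# The commented-out element is now a real child of its parent.
restored = root.find("./parent[@parId='22']/reg")

# The plain-text comment has become text inside the second parent.
note_text = root.find("./parent[@parId='23']").text

print(restored.get('regId'), repr(note_text))
```

If the file only contains commented-out elements, as in the question, this side effect does not matter.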

2017-09-27 15:55:57
