2014-01-16 62 views

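How to copy multiple XML nodes to another file in Python

Keep in mind that I'm very new to Python. I'm trying to copy a few XML nodes from sample1.xml into out.xml if they don't already exist in sample2.xml.

This is how far I got before I got stuck: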

import xml.etree.ElementTree as ET 

tree = ET.ElementTree(file='sample1.xml') 
addtree = ET.ElementTree(file='sample2.xml') 

root = tree.getroot() 
addroot = addtree.getroot() 

for adel in addroot.findall('.//cars/car'):
    for el in root.findall('cars/car'):
        with open('out.xml', 'w+') as f:
            f.write("BEFORE\n")
            f.write(el.tag)
            f.write("\n")
            f.write(adel.tag)
            f.write("\n")
            f.write("\n")

            f.write("AFTER\n")

            el = adel

            f.write(el.tag)
            f.write("\n")
            f.write(adel.tag)

I don't know what I'm missing, but it only copies the actual "tag" itself.

The output looks like this:

BEFORE 
car 
car 

AFTER 
car 
car 

So I'm missing the child nodes, and also the <></> tags. The expected result is shown below.

sample1.xml:

<cars> 
    <car> 
     <use-car>0</use-car> 
     <use-gas>0</use-gas> 
     <car-name /> 
     <car-key /> 
     <car-location>hawaii</car-location> 
     <car-port>5</car-port> 
    </car> 
</cars> 

sample2.xml:

<cars> 
    <old> 
     1 
    </old> 
    <new> 
     8 
    </new> 
    <car /> 
</cars> 

Expected result in out.xml (the final product):

<cars> 
    <old> 
     1 
    </old> 
    <new> 
     8 
    </new>
    <car> 
     <use-car>0</use-car> 
     <use-gas>0</use-gas> 
     <car-name /> 
     <car-key /> 
     <car-location>hawaii</car-location> 
     <car-port>5</car-port> 
    </car> 
</cars> 

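All other nodes, old and new, must remain unchanged. I'm only trying to replace the <car /> node with the full one, including all of its children and grandchildren (if any exist).

Answer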


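First, a couple of trivial issues with your XML:

  • sample1: the closing cars tag is missing its /
  • sample2: the closing new tag incorrectly reads old; it should read new

Second, a disclaimer: my solution below has its limitations. In particular, it won't handle substituting the car node from sample1 into multiple places in sample2. But it works fine for the sample files you provided.

Third: thanks to the top few answers to "access ElementTree node parent node"; they informed the implementation of get_node_parent_info below.

Finally, the code: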

import xml.etree.ElementTree as ET 

def find_child(node, with_name): 
    """Recursively find node with given name""" 
    for element in list(node):
        if element.tag == with_name:
            return element
        elif list(element):
            sub_result = find_child(element, with_name)
            if sub_result is not None:
                return sub_result
    return None 

def replace_node(from_tree, to_tree, node_name): 
    """ 
    Replace node with given node_name in to_tree with 
    the same-named node from the from_tree 
    """ 
    # Find nodes of given name ('car' in the example) in each tree 
    from_node = find_child(from_tree.getroot(), node_name) 
    to_node = find_child(to_tree.getroot(), node_name) 

    # Find where to substitute the from_node into the to_tree 
    to_parent, to_index = get_node_parent_info(to_tree, to_node) 

    # Replace to_node with from_node 
    to_parent.remove(to_node) 
    to_parent.insert(to_index, from_node) 

def get_node_parent_info(tree, node): 
    """ 
    Return tuple of (parent, index) where: 
     parent = node's parent within tree 
     index = index of node under parent 
    """ 
    parent_map = {c:p for p in tree.iter() for c in p} 
    parent = parent_map[node] 
    return parent, list(parent).index(node) 

from_tree = ET.ElementTree(file='sample1.xml') 
to_tree = ET.ElementTree(file='sample2.xml') 

replace_node(from_tree, to_tree, 'car') 

# ET.dump(to_tree) 
to_tree.write('output.xml') 
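
The disclaimer above says this solution does not cover substituting the car node into several places. As a rough sketch of one way that could be handled (not part of the original answer), the version below replaces every matching node with its own deep copy of the source node. The replace_all helper and the two-placeholder SAMPLE2 string are assumptions made up for illustration, and the XML is parsed from strings so the example runs without any files. It also shows that ET.tostring serialises an element together with its children, whereas el.tag is only the name string.

```python
import copy
import xml.etree.ElementTree as ET

SAMPLE1 = """<cars>
    <car>
        <use-car>0</use-car>
        <car-location>hawaii</car-location>
    </car>
</cars>"""

# Hypothetical variant of sample2 with two <car /> placeholders.
SAMPLE2 = """<cars>
    <old>1</old>
    <car />
    <garage><car /></garage>
</cars>"""


def replace_all(from_root, to_root, node_name):
    """Replace every node_name element under to_root with a deep copy of
    the first node_name element under from_root. Returns the count."""
    source = from_root.find('.//' + node_name)
    if source is None:
        return 0
    replaced = 0
    # Snapshot the elements first so the tree is not mutated mid-iteration.
    for parent in list(to_root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == node_name:
                clone = copy.deepcopy(source)  # an element can only sit in one place
                clone.tail = child.tail        # keep the placeholder's trailing whitespace
                parent[index] = clone
                replaced += 1
    return replaced


from_root = ET.fromstring(SAMPLE1)
to_root = ET.fromstring(SAMPLE2)
count = replace_all(from_root, to_root, 'car')

# tostring() emits the element with all of its descendants, not just the tag.
print(ET.tostring(to_root, encoding='unicode'))
```

The deep copy matters because inserting the same Element object at two positions would make both positions share one subtree. This sketch assumes the placeholders are not nested inside each other.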

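UPDATE: It was recently brought to my attention that the find_child() implementation in the solution I originally posted would fail if the "child" in question was not in the first branch of the XML tree to be traversed. I've updated the implementation above to correct this.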
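
For comparison, ElementTree's own path search can do the same depth-first lookup. The snippet below is only a sketch: find_first is a hypothetical name, and the small document is made up so that the target sits outside the first branch, which is the case the update describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the target sits in the second branch, not the first.
doc = ET.fromstring(
    "<root>"
    "<first><leaf /></first>"
    "<second><wrapper><car id='x' /></wrapper></second>"
    "</root>"
)


def find_first(node, with_name):
    """Return the first descendant of node named with_name, or None.

    './/name' searches all descendants in document order and, like
    find_child(), never matches the starting node itself.
    """
    return node.find('.//' + with_name)


found = find_first(doc, 'car')
print(found.get('id'))
```

This keeps the lookup to a single call, at the cost of relying on ElementTree's limited XPath syntax for the tag name.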
