2016-11-24 145 views

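I am new to XML parsing. I have read about the DOM and SAX parsers and tried a few sample implementations. However, I cannot work out how to parse the following XML and get the contents of the value attribute of each tag.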

<?xml version="1.0" ?> 
<collection> 
<action value="submit"/> 
<protocol_version value="1"/> 
<reponse value="Success"/> 
<batch> 
    <sample> 
     <count value="1"/> 
     <count2 value="2"/> 
     <count3 value="3"/> 
    </sample> 
    <sample_2> 
     <date value="10/10/2010"/> 
     <page value="SampleData"/> 
     <track value="123123123"/> 
     <same value="1.00"/> 
     <data> 
      <first_name value="Jeffrey"/> 
      <SSID value="1231231231"/> 
      <last_name value="Chuckle"/> 
      <field1 value="123123123"/> 
      <field2 value="Sam E. Bonzella"/> 
      <field3 value="SOME VALUE"/> 
      <field4 value="SOME VALUE 2"/> 
      <field5 value="TEXT"/> 
      <field6 value="12312"/> 
     </data> 
    </sample_2> 
</batch> 
</collection> 

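Below is the sample code I tried. It works, but it needs a lot of repeated code, and the resulting data is not organized. I also tried the JAXB parser but could not get the value attributes.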

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class test {
    public static void main(String[] args) {
        try {
            // Parse the file into a DOM tree
            File inputFile = new File("staff.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(inputFile);
            doc.getDocumentElement().normalize();
            System.out.println("Base :" + doc.getDocumentElement().getNodeName());

            // First lookup: every <action> element
            NodeList nList = doc.getElementsByTagName("action");
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Action : " + eElement.getAttribute("value"));
                }
            }

            // Same loop repeated for another tag name. The XML shown above has
            // no <transaction_count> element, so this loop prints nothing for it.
            nList = doc.getElementsByTagName("transaction_count");
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("transaction_count : " + eElement.getAttribute("value"));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Ideally, I would like to parse the data into an array or perhaps a Map.

Answer


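getElementsByTagName(String name) is not much help in this case, because you would have to supply every tag name yourself.

The XML above contains elements that fall into two categories: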

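  1. Elements with a value. If I understand the question correctly, the tag name and the value should be stored in a Map.

  2. Elements without a value. They only contain other elements, and their tag names should not be stored.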

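The elements can be processed recursively. If an element has a "value" attribute, store it in the Map. Otherwise, examine the element's child nodes.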

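import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// The class name is only a placeholder so that the snippet compiles.
public class XmlValueReader {

    public static void main(String[] argv) {
        // LinkedHashMap keeps the entries in document order
        Map<String, String> map = new LinkedHashMap<>();

        try {
            File fXmlFile = new File("staff.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);
            doc.getDocumentElement().normalize();

            // Start the recursion at the <collection> root element
            NodeList collectionNodeList = doc.getElementsByTagName("collection");
            Element collectionElement = (Element) collectionNodeList.item(0);
            findElementsWithValues(map, collectionElement);
        } catch (Exception e) {
            e.printStackTrace();
        }

        System.out.println("Found values: " + map.size());
        System.out.println(map);
    }

    private static void findElementsWithValues(Map<String, String> map, Element rootElement) {
        NodeList childNodes = rootElement.getChildNodes();
        for (int i = 0; i < childNodes.getLength(); i++) {
            Node node = childNodes.item(i);
            // Skip whitespace text nodes, comments, and so on
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element element = (Element) node;
                // getAttribute returns "" when the attribute is absent
                String value = element.getAttribute("value");
                if (!value.isEmpty()) {
                    // A repeated tag name overwrites the earlier entry
                    map.put(element.getTagName(), value);
                } else {
                    // No value here: descend into the children
                    findElementsWithValues(map, element);
                }
            }
        }
    }
}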

Output (after correcting the XML file above so that it can be parsed):

Found values: 19 
{action=submit, protocol_version=1, reponse=Success, count=1, count2=2, count3=3, date=10/10/2010, page=SampleData, track=123123123, same=1.00, first_name=Jeffrey, SSID=1231231231, last_name=Chuckle, field1=123123123, field2=Sam E. Bonzella, field3=SOME VALUE, field4=SOME VALUE 2, field5=TEXT, field6=12312}
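
One limitation of keying the Map by tag name alone is that a tag name appearing in more than one place overwrites the earlier entry. As a rough sketch of one way around that (not part of the answer above), the same recursion can carry the path of parent tag names and use it as the key. The class name PathKeyedValues, the helper names parse and collect, and the "/" separator are assumptions made for illustration. The sketch also uses hasAttribute, so an element with value="" is stored rather than descended into, which differs slightly from the code above.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name, chosen for this sketch only.
class PathKeyedValues {

    // Parses XML text and returns "parent/child/tag" -> value, in document order.
    static Map<String, String> parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Map<String, String> result = new LinkedHashMap<>();
            // The root element's own name is left out of the keys.
            collect(doc.getDocumentElement(), "", result);
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Walks the child elements of parent, extending the path at each level.
    static void collect(Element parent, String prefix, Map<String, String> out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes, comments, and so on
            }
            Element element = (Element) node;
            String path = prefix.isEmpty()
                    ? element.getTagName()
                    : prefix + "/" + element.getTagName();
            if (element.hasAttribute("value")) {
                out.put(path, element.getAttribute("value"));
            } else {
                collect(element, path, out);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<collection>"
                + "<action value=\"submit\"/>"
                + "<batch>"
                + "<sample><count value=\"1\"/></sample>"
                + "<sample_2><data><first_name value=\"Jeffrey\"/></data></sample_2>"
                + "</batch>"
                + "</collection>";
        parse(xml).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

With the small document in main, the keys come out as action, batch/sample/count and batch/sample_2/data/first_name. Sibling elements that share both a name and a parent would still collide, so this only helps when names repeat in different branches.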