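Android: Parsing XML with a DOM parser. Converting child nodes to a string

Another question from me. This time I'm parsing XML messages received from a server. Someone thought they were being clever and decided to put an HTML page inside an XML message. Now I've run into trouble, because I want to extract that HTML page from the XML message as a string.

OK, this is the XML message I'm parsing: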

<AmigoRequest> <From></From> <To></To> <MessageType>showMessage</MessageType> <Param0>general message</Param0> <Param1><html><head>test</head><body>Testhtml</body></html></Param1> </AmigoRequest>

As you can see, the HTML page is specified in Param1. I tried to extract the message in the following way:

 
public String getParam1(Document d) { 
     if (d.getDocumentElement().getTagName().equals("AmigoRequest")) { 
      NodeList results = d.getElementsByTagName("Param1"); 
      // Messagetype depends on what message we are reading.   
      if (results.getLength() > 0 && results != null) {     
       return results.item(0).getFirstChild().getNodeValue(); 
      } 
     } 
     return ""; 
    }

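where d is the XML message as a Document. It always returns an empty value, because getNodeValue() returns null. When I try results.item(0).getFirstChild().hasChildNodes(), it returns true, because it sees that there is a tag in the message.

How can I extract the HTML message <html><head>test</head><body>Testhtml</body></html> from Param1 as a string?

I'm using Android SDK 1.5 (so, almost Java) and a DOM parser.

Thanks for your time and replies.

Antek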

2010-01-12 Antek Drzewiecki

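Is XPath an option? If so, I may be able to help you. I have never used Android, which is why I ask. – ChadNC 2010-01-12 17:35:17

XPath isn't supported, but I managed to find a workaround for Android by using DOM4J and Jaxen. – 2010-01-13 09:20:38

You can get the content of Param1 like this: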

public String getParam1(Document d) { 
     if (d.getDocumentElement().getTagName().equals("AmigoRequest")) { 
      NodeList results = d.getElementsByTagName("Param1"); 
      // Messagetype depends on what message we are reading.   
      if (results != null && results.getLength() > 0) {     

       // String extractHTMLTags(String s) is a function that you have 
       // to implement in a way that will extract all the HTML tags inside a string. 
       return extractHTMLTags(results.item(0).getTextContent()); 
      } 
     } 
     return ""; 
    }

All you need to do is implement a function:

String extractHTMLTags(String s)

that removes all occurrences of HTML tags from the string. For that, you can take a look at this post: Remove HTML tags from a String
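For illustration only, here is a minimal sketch of what such a function could look like, assuming a naive regular expression is good enough. The class name TagStripper is made up, and the pattern is deliberately simple: it would be fooled by a '>' inside an attribute value, by comments, and by script content, so treat it as a starting point rather than a robust HTML stripper.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the original answer.
final class TagStripper {
    // Matches anything that looks like a tag: '<', a run of non-'>' characters, then '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Returns s with every tag-like substring removed; the text between tags is kept. */
    static String extractHTMLTags(String s) {
        if (s == null) {
            return "";
        }
        return TAG.matcher(s).replaceAll("");
    }
}
```

Called on the HTML from the question, TagStripper.extractHTMLTags("<html><head>test</head><body>Testhtml</body></html>") gives back "testTesthtml", that is, only the text and none of the markup.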

2010-01-12 17:19:40 Alex

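Too bad Android doesn't support the getTextContent function. Android uses an old DOM parser. But at least I know where to look now. I still haven't found a solution, but I have edited the title of my question. – 2010-01-12 17:36:16

If 'getTextContent' is available on the platform, simply calling it is enough; there is no need to wrap it in an 'extractHTMLTags' call. getTextContent strips any XML markup from the string it returns (more precisely, it builds its value by concatenating all the text strings inside the nested elements while leaving out the element tags). Of course, this does assume that the HTML content is well-formed XML. But if it weren't, you probably wouldn't even get this far in the XML parsing. – 2010-01-12 17:45:52

Oh, I've never used Android and I didn't know about its DOM parser! I assumed it used the latest version. Sorry about that! – Alex 2010-01-12 17:48:36

EDIT: I just saw the comment above saying that getTextContent() is not supported on the Android platform. I'll leave this answer here in case it is useful to someone on a different platform.

If your DOM API supports it, you can call getTextContent(), as follows: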

public String getParam1(Document d) { 
     if (d.getDocumentElement().getTagName().equals("AmigoRequest")) { 
      NodeList results = d.getElementsByTagName("Param1"); 
      // Messagetype depends on what message we are reading.   
      if (results != null && results.getLength() > 0) {     
       return results.item(0).getTextContent(); 
      } 
     } 
     return ""; 
    }

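However, getTextContent() is a DOM Level 3 API call; not all parsers are guaranteed to support it. Xerces-J does.

By the way, in your original example, your null check is in the wrong place; it should be:

 if (results != null && results.getLength() > 0) {

Otherwise, you'll get an NPE if results really does come back as null.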
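To make the behaviour of getTextContent() concrete, here is a small sketch for desktop Java (not Android 1.5), where the DOM Level 3 call is available. The class name TextContentDemo and the helper param1Text are invented for this example, and the XML is a shortened version of the message in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original answer.
final class TextContentDemo {
    /** Parses xml and returns the text content of the first Param1 element, or "" if there is none. */
    static String param1Text(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList results = d.getElementsByTagName("Param1");
            // Null check first, then the length check.
            if (results != null && results.getLength() > 0) {
                return results.item(0).getTextContent();
            }
            return "";
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<AmigoRequest><Param1><html><head>test</head>"
                + "<body>Testhtml</body></html></Param1></AmigoRequest>";
        System.out.println(param1Text(xml)); // prints testTesthtml
    }
}
```

Note that the result is only the concatenated text; the <html>, <head> and <body> tags themselves are gone. If the goal is to get the markup back as a string, getTextContent() alone will not do it.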

2010-01-12 17:38:42

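Since getTextContent() isn't available to you, another option is to write it yourself; it isn't hard. In fact, if you are writing this purely for your own use, or your employer doesn't have overly strict rules about open source, you could look at Apache's implementation as a starting point; lines 610-646 seem to contain most of what you need. (Please respect Apache's copyright and license.)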

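Otherwise, a rough sketch of the method would be:

// Assumes: import org.w3c.dom.Node;
String getTextContent(Node node) { 
    // Text and CDATA nodes carry their text directly. 
    if (node.getNodeType() == Node.TEXT_NODE 
      || node.getNodeType() == Node.CDATA_SECTION_NODE) 
     return node.getNodeValue(); 

    if (!node.hasChildNodes()) 
     return ""; 

    return getTextContent(node, new StringBuffer()).toString(); 
} 

StringBuffer getTextContent(Node node, StringBuffer sb) { 
    // Walk the children in document order, collecting text and recursing into everything else. 
    for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) { 
     if (child.getNodeType() == Node.TEXT_NODE 
       || child.getNodeType() == Node.CDATA_SECTION_NODE) 
      sb.append(child.getNodeValue()); 
     else 
      getTextContent(child, sb); 
    } 
    return sb; 
}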

2010-01-12 18:24:46

Well, I was almost there with the code...

public String getParam1(Document d) { 
    if (d.getDocumentElement().getTagName().equals("AmigoRequest")) { 
     NodeList results = d.getElementsByTagName("Param1"); 
     // Messagetype depends on what message we are reading.   
     if (results != null && results.getLength() > 0) {     
      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); 
      DocumentBuilder db; 
      Element node = (Element) results.item(0); // get the value of Param1 
      Document doc2 = null; 
      try { 

       db = dbf.newDocumentBuilder(); 
       doc2 = db.newDocument(); //create new document 
       doc2.appendChild(doc2.importNode(node, true)); //import the <html>...</html> result in doc2 

      } catch (ParserConfigurationException e) { 
       // TODO Auto-generated catch block 
       Log.d(TAG, " Exception ", e); 
      } catch (DOMException e) { 
       // TODO: handle exception 
       Log.d(TAG, " Exception ", e); 
      } catch (Exception e) { 
       // TODO: handle exception 
       e.printStackTrace(); 
      } 


      return doc2. .....// All I'm missing is something to convert a Document to a string. 
     } 
    } 
    return ""; 

}

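As explained in the comments in my code, all I'm missing is a way to create a string from the Document. You can't use the Transformer classes on Android, and doc2.toString() just gives you a serialization of the object.

But my next step is to write my own parser if this doesn't get solved ;)

Not the best code, but a temporary solution: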

public String getParam1(String b) { 
     return b 
       .substring(b.indexOf("<Param1>") + "<Param1>".length(), b.indexOf("</Param1>")); 
    }

where the string b is the XML document as a string.
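As an alternative to cutting the raw string, the missing "node to string" step can also be written by hand with nothing but DOM Level 1 calls, which avoids both getTextContent() and the Transformer classes. The following is only a sketch under some assumptions: the class NodeSerializer and its methods are invented names, comments and processing instructions are skipped, empty elements are written as an open and close tag pair, and the output is re-escaped XML rather than a byte-for-byte copy of the input. It has been written against standard Java; whether every call behaves the same on Android 1.5 is not verified here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the original thread.
final class NodeSerializer {

    /** Serializes the children of parent (not parent itself) back to markup. */
    static String innerXml(Node parent) {
        StringBuilder sb = new StringBuilder();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            append(c, sb);
        }
        return sb.toString();
    }

    private static void append(Node node, StringBuilder sb) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                sb.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node a = attrs.item(i);
                    sb.append(' ').append(a.getNodeName())
                      .append("=\"").append(escape(a.getNodeValue())).append('"');
                }
                sb.append('>');
                sb.append(innerXml(node)); // recurse into the element's children
                sb.append("</").append(node.getNodeName()).append('>');
                break;
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                sb.append(escape(node.getNodeValue()));
                break;
            default:
                // Comments, processing instructions, etc. are skipped in this sketch.
                break;
        }
    }

    /** Escapes the characters that must not appear raw in text or attribute values. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Parses xml and returns the markup inside the first Param1 element, or "" if absent. */
    static String getParam1(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList results = d.getElementsByTagName("Param1");
            if (results != null && results.getLength() > 0) {
                return innerXml(results.item(0));
            }
            return "";
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

For the message in the question, NodeSerializer.getParam1(xml) returns "<html><head>test</head><body>Testhtml</body></html>". Unlike the substring version, it does not depend on the exact spelling of the <Param1> tags in the raw text, and it returns an empty string instead of throwing when the element is missing.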

2010-01-13 13:29:07

After a lot of checking and scratching my head a thousand times, I came up with a simple change: you need to change your API level to 8.
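The answer above does not say which API the higher level unlocks. One plausible reading, stated here as an assumption, is that the javax.xml.transform classes (which, as far as I know, were added to Android in API level 8) become usable, so a node can be written out with an identity Transformer as on desktop Java. A minimal sketch, with the invented names TransformerSerializer, nodeToString and getParam1:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the original answer.
final class TransformerSerializer {

    /** Writes node and its subtree to a string using an identity transform. */
    static String nodeToString(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Force XML output; some implementations switch to HTML output for an <html> root.
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize node", e);
        }
    }

    /** Returns the first element inside Param1 as markup, or "" if there is none. */
    static String getParam1(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList results = d.getElementsByTagName("Param1");
            if (results != null && results.getLength() > 0) {
                for (Node c = results.item(0).getFirstChild(); c != null; c = c.getNextSibling()) {
                    if (c.getNodeType() == Node.ELEMENT_NODE) {
                        return nodeToString(c);
                    }
                }
            }
            return "";
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Calling TransformerSerializer.getParam1(xml) with the message from the question should give back the <html> element as markup. The exact formatting (for example how empty elements are written) depends on the Transformer implementation, so it is safer not to compare the result character by character with the input.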

2011-01-31 11:18:30 ruturaj
