2015-02-24 82 views

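How to read the contents of a child tag along with its parent tag from XML in Java

I am new to Java and am trying to write a program that fetches the meaning of a given word from the MW API. The output is XML, and I am currently using a DOM parser to print a list of all the definitions. Typically, the retrieved XML looks like this: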

<?xml version="1.0" encoding="utf-8" ?> 
<entry_list version="1.0"> 
    <entry id="dictionary"><ew>dictionary</ew><subj>PU-1#PU-2#PU-3#CP-4</subj><hw>dic*tio*nary</hw><sound><wav>dictio04.wav</wav></sound><pr>ˈdik-shə-ˌner-ē, -ˌne-rē</pr><fl>noun</fl><in><il>plural</il> <if>dic*tio*nar*ies</if></in><et>Medieval Latin <it>dictionarium,</it> from Late Latin <it>diction-, dictio</it> word, from Latin, speaking</et><def><date>1526</date> <sn>1</sn> <dt>:a reference source in print or electronic form containing words usually alphabetically arranged along with information about their forms, <d_link>pronunciations</d_link>, functions, <d_link>etymologies</d_link>, meanings, and <d_link>syntactical</d_link> and idiomatic uses</dt> <sn>2</sn> <dt>:a reference book listing alphabetically terms or names important to a particular subject or activity along with discussion of their meanings and <d_link>applications</d_link></dt> <sn>3</sn> <dt>:a reference book listing alphabetically the words of one language and showing their meanings or translations in another language</dt> <sn>4</sn> <dt>:a <d_link>computerized</d_link> list (as of items of data or words) used for reference (as for information retrieval or word processing)</dt></def></entry> 
</entry_list> 

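The list of definitions is enclosed in <dt> tags. The problem I am facing is that a <dt> tag can contain another child tag, <d_link>. Whenever the DOM parser runs into this child tag, getNodeValue() behaves as if it had reached the end of the <dt> tag.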

My code is as follows:

import org.w3c.dom.*; 
import javax.xml.parsers.*; 

public class Dictionary5 { 
    public static void main(String[] args) throws Exception { 
     String head = new String("http://www.dictionaryapi.com/api/v1/references/collegiate/xml/"); 
     String word = new String("banal"); 
     String apiKey = new String("?key=xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx"); //My API Key for Merriam webster 
     String finalURL = head.trim() + word.trim()+ apiKey.trim(); 
     try 
     { 
      DocumentBuilderFactory f = DocumentBuilderFactory.newInstance(); 
      DocumentBuilder b = f.newDocumentBuilder(); 
      Document doc = b.parse(finalURL); 

      doc.getDocumentElement().normalize(); 

      NodeList items = doc.getElementsByTagName("entry"); 
      for (int i = 0; i < items.getLength(); i++) 
      { 
       Node n = items.item(i); 

       if (n.getNodeType() != Node.ELEMENT_NODE) 
        continue; 

       Element e = (Element) n; 
       NodeList titleList = e.getElementsByTagName("dt"); 
       for (int j = 0; j < titleList.getLength(); j++){ 
        Node dt = titleList.item(j); 
        if (dt.getNodeType() != Node.ELEMENT_NODE) 
         continue;     
        Element titleElem = (Element) titleList.item(j); 
        Node titleNode = titleElem.getChildNodes().item(0); 
        System.out.println(titleNode.getNodeValue()); 
       } 
      } 
     } 
     catch (Exception e) 
     { 
      e.printStackTrace(); 
     } 

    } 
} 

The output is as follows:

:a reference source in print or electronic form containing words usually alphabetically arranged along with information about their forms, 
:a reference book listing alphabetically terms or names important to a particular subject or activity along with discussion of their meanings and 
:a reference book listing alphabetically the words of one language and showing their meanings or translations in another language 
:a 

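As you can see, the first, second and fourth definitions end abruptly because the parser encounters the child tag <d_link>.

My expected output is as follows: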

:a reference source in print or electronic form containing words usually alphabetically arranged along with information about their forms, pronunciations, functions, etymologies, meanings, and syntactical and idiomatic uses 
:a reference book listing alphabetically terms or names important to a particular subject or activity along with discussion of their meanings and applications 
:a reference book listing alphabetically the words of one language and showing their meanings or translations in another language 
:a computerized list (as of items of data or words) used for reference (as for information retrieval or word processing) 

Could someone please help me with this? Any help is highly appreciated. Thanks in advance.

Answer


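In the DOM model, the content of a <dt> element is a sequence of child nodes: a text node, a d_link element, a text node, a d_link element, and so on.

So you have to concatenate all the text nodes (and, it seems, the content of the d_link elements as well). You are reading only the first child, titleElem.getChildNodes().item(0), which is why the output ends "abruptly".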
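
One way to get that concatenation without writing the loop by hand is the standard DOM method Node.getTextContent(), which returns the text of an element together with the text of all its descendants. The following is a minimal sketch, not the asker's final program: the class name DtTextContent, the helper definitions(), and the shortened sample XML are made up for illustration, and the XML is parsed from a string instead of the MW API URL so the example runs offline.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration.
public class DtTextContent {

    // Returns the full text of every <dt> element, including the text
    // inside nested elements such as <d_link>.
    static List<String> definitions(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList dts = doc.getElementsByTagName("dt");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < dts.getLength(); i++) {
            // getTextContent() joins the text of all descendant nodes in document order.
            result.add(dts.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample in the same shape as the API response (an assumption).
        String xml = "<entry_list><entry id=\"dictionary\"><def>"
                + "<dt>:a <d_link>computerized</d_link> list used for reference</dt>"
                + "</def></entry></entry_list>";
        for (String definition : definitions(xml)) {
            System.out.println(definition);
        }
    }
}
```

In the original program the equivalent change would be to print titleElem.getTextContent() instead of titleNode.getNodeValue().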


Thanks for the reply. Any idea how to get the number of items and loop over them to concatenate all the text into a single string? – Naveen 2015-02-28 18:30:08
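
For the loop asked about in this comment, a possible sketch is shown here. NodeList.getLength() gives the number of child nodes, and each child is either a text node (append its value) or an element such as d_link (descend into it). The class name DtChildLoop and the helpers parse() and joinChildren() are hypothetical names chosen for the example, and the sample XML is a shortened stand-in for the API response.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration.
public class DtChildLoop {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Walks all child nodes of the element and joins their text into one string.
    static String joinChildren(Element parent) {
        StringBuilder sb = new StringBuilder();
        NodeList children = parent.getChildNodes();
        for (int k = 0; k < children.getLength(); k++) {
            Node child = children.item(k);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                // Plain text between the tags.
                sb.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                // Nested element such as <d_link>: collect its text recursively.
                sb.append(joinChildren((Element) child));
            }
            // Other node types (comments, processing instructions) are skipped.
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<def><dt>:a <d_link>computerized</d_link> list</dt></def>");
        Element dt = (Element) doc.getElementsByTagName("dt").item(0);
        System.out.println(joinChildren(dt));
    }
}
```

Compared with getTextContent(), the explicit loop is longer, but it leaves room to treat particular child tags differently, for example to skip some of them or to wrap the d_link text in markers.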