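How to read the content of child tags along with the parent tag of XML in Java
I'm new to Java and am trying to write a program that fetches the meaning of a given word from the Merriam-Webster (MW) API. The output is XML, and I'm currently using a DOM parser to print a list of all the definitions. Typically, the retrieved XML looks like this: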
<?xml version="1.0" encoding="utf-8" ?>
<entry_list version="1.0">
<entry id="dictionary"><ew>dictionary</ew><subj>PU-1#PU-2#PU-3#CP-4</subj><hw>dic*tio*nary</hw><sound><wav>dictio04.wav</wav></sound><pr>ˈdik-shə-ˌner-ē, -ˌne-rē</pr><fl>noun</fl><in><il>plural</il> <if>dic*tio*nar*ies</if></in><et>Medieval Latin <it>dictionarium,</it> from Late Latin <it>diction-, dictio</it> word, from Latin, speaking</et><def><date>1526</date> <sn>1</sn> <dt>:a reference source in print or electronic form containing words usually alphabetically arranged along with information about their forms, <d_link>pronunciations</d_link>, functions, <d_link>etymologies</d_link>, meanings, and <d_link>syntactical</d_link> and idiomatic uses</dt> <sn>2</sn> <dt>:a reference book listing alphabetically terms or names important to a particular subject or activity along with discussion of their meanings and <d_link>applications</d_link></dt> <sn>3</sn> <dt>:a reference book listing alphabetically the words of one language and showing their meanings or translations in another language</dt> <sn>4</sn> <dt>:a <d_link>computerized</d_link> list (as of items of data or words) used for reference (as for information retrieval or word processing)</dt></def></entry>
</entry_list>
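The definitions are enclosed in <dt> tags. The problem I'm facing is that a <dt> tag can contain another child tag, <d_link>. Whenever the DOM parser runs into this child tag, getNodeValue() behaves as if it had reached the end of the <dt> tag.
My code is as follows: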
import org.w3c.dom.*;
import javax.xml.parsers.*;

public class Dictionary5 {
    public static void main(String[] args) throws Exception {
        String head = new String("http://www.dictionaryapi.com/api/v1/references/collegiate/xml/");
        String word = new String("banal");
        String apiKey = new String("?key=xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx"); //My API Key for Merriam webster
        String finalURL = head.trim() + word.trim() + apiKey.trim();
        try
        {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            DocumentBuilder b = f.newDocumentBuilder();
            Document doc = b.parse(finalURL);
            doc.getDocumentElement().normalize();
            NodeList items = doc.getElementsByTagName("entry");
            for (int i = 0; i < items.getLength(); i++)
            {
                Node n = items.item(i);
                if (n.getNodeType() != Node.ELEMENT_NODE)
                    continue;
                Element e = (Element) n;
                NodeList titleList = e.getElementsByTagName("dt");
                for (int j = 0; j < titleList.getLength(); j++) {
                    Node dt = titleList.item(j);
                    if (dt.getNodeType() != Node.ELEMENT_NODE)
                        continue;
                    Element titleElem = (Element) titleList.item(j);
                    Node titleNode = titleElem.getChildNodes().item(0);
                    System.out.println(titleNode.getNodeValue());
                }
            }
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}
The output is as follows:
:a reference source in print or electronic form containing words usually alphabetically arranged along with information about their forms,
:a reference book listing alphabetically terms or names important to a particular subject or activity along with discussion of their meanings and
:a reference book listing alphabetically the words of one language and showing their meanings or translations in another language
:a
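As you can see, the first, second, and fourth definitions end abruptly, at the point where the parser encounters the child tag <d_link>.
My expected output is as follows: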
:a reference source in print or electronic form containing words usually alphabetically arranged along with information about their forms, pronunciations, functions, etymologies, meanings, and syntactical and idiomatic uses
:a reference book listing alphabetically terms or names important to a particular subject or activity along with discussion of their meanings and applications
:a reference book listing alphabetically the words of one language and showing their meanings or translations in another language
:a computerized list (as of items of data or words) used for reference (as for information retrieval or word processing)
Could someone please help me with this? Any help is highly appreciated. Thanks in advance.
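
For reference, here is a minimal sketch of one way the expected output might be produced. In the posted code, getChildNodes().item(0) returns only the first text node inside <dt>, which is why the text stops at the first <d_link>. The standard DOM method Node.getTextContent() returns the text of an element together with the text of all its descendants. The class name DtTextContent, the helper definitions(), and the shortened sample XML are hypothetical and only serve the illustration; the real program would parse the API URL instead of a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DtTextContent {
    // Hypothetical helper: returns the complete text of every <dt> element,
    // including text nested inside child tags such as <d_link>.
    static List<String> definitions(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList dts = doc.getElementsByTagName("dt");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < dts.getLength(); i++) {
                // getTextContent() concatenates the text of all descendant nodes.
                result.add(dts.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened stand-in for the API response shown in the question.
        String xml = "<entry><def><dt>:a <d_link>computerized</d_link> list</dt></def></entry>";
        for (String definition : definitions(xml)) {
            System.out.println(definition); // prints ":a computerized list"
        }
    }
}
```

In the original program this would amount to printing titleElem.getTextContent() instead of the value of the first child node.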
Thanks for the reply. How do I get the number of items and loop over them to concatenate all the text into a single string? – Naveen 2015-02-28 18:30:08
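
The loop asked about in this comment could look roughly like the sketch below. It assumes the goal is to walk the child nodes of a <dt> element by hand: getChildNodes().getLength() gives the number of items, text nodes contribute their getNodeValue(), and element nodes such as <d_link> are visited recursively. The names ChildNodeConcat, parseRoot(), and collectText() are made up for this example, and the sample XML is a shortened stand-in for the real response.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNodeConcat {
    // Hypothetical helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    // Concatenates the text of all child nodes into a single string.
    static String collectText(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(child.getNodeValue());   // plain text between tags
            } else if (type == Node.ELEMENT_NODE) {
                sb.append(collectText(child));     // e.g. the text inside <d_link>
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Element dt = parseRoot("<dt>:a <d_link>computerized</d_link> list</dt>");
        System.out.println(dt.getChildNodes().getLength()); // prints 3
        System.out.println(collectText(dt));                // prints ":a computerized list"
    }
}
```

The three children here are the text ":a ", the <d_link> element, and the text " list", which is why reading only item(0) yields a truncated definition.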