2012-04-12 72 views

Parsing XML into HTML with Java and org.w3c.dom

Through an API I receive an XML file, which I am trying to parse with org.w3c.dom and XPath. Part of the XML file describes HTML content:

<Para>Since 2001, state and local health departments in the US have accelerated efforts to prepare for bioterrorism and other high-impact public health emergencies. These activities have been spurred by federal funding and guidance from the US Centers for Disease Control and Prevention (CDC) and the Health Resources and Services Administration (HRSA) 
    <CitationRef CitationID="B1">1</CitationRef> 
    <CitationRef CitationID="B2">2</CitationRef> . Over time, the emphasis of this guidance has expanded from bioterrorism to include "terrorism and non-terrorism events, including infectious disease, environmental and occupational related emergencies" 
    <CitationRef CitationID="B4">4</CitationRef> as well as pandemic influenza. 
</Para> 

This should become:

<p>Since 2001, state and local health departments in the US have accelerated efforts to prepare for bioterrorism and other high-impact public health emergencies. These activities have been spurred by federal funding and guidance from the US Centers for Disease Control and Prevention (CDC) and the Health Resources and Services Administration (HRSA) 
    <a href="link/B1">1</a> 
    <a href="link/B2">2</a> . Over time, the emphasis of this guidance has expanded from bioterrorism to include "terrorism and non-terrorism events, including infectious disease, environmental and occupational related emergencies" 
    <a href="link/B4">4</a> as well as pandemic influenza. 
</p> 

Any suggestions on how I can do this? The main problem is retrieving the tags and replacing them while keeping their position in the text.

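This sounds like a perfect job for XSLT, since it is a language for transforming XML input into some other XML format or into HTML. If you need help with the XSLT code, add the XSLT tag to your question. – 2012-04-13 09:35:17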

Answers

Here is how you can do this with XSLT:

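<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    version="1.0"> 

<!-- Identity template: copy every node and attribute unchanged by default. -->
<xsl:template match="@* | node()"> 
    <xsl:copy> 
        <xsl:apply-templates select="@* | node()"/> 
    </xsl:copy> 
</xsl:template> 

<!-- Para becomes an HTML paragraph; its children are processed in document order. -->
<xsl:template match="Para"> 
    <p> 
        <xsl:apply-templates select="@* | node()"/> 
    </p> 
</xsl:template> 

<!-- CitationRef becomes a link whose target is built from the CitationID attribute. -->
<xsl:template match="CitationRef[@CitationID]"> 
    <a href="link/{@CitationID}"> 
        <xsl:apply-templates/> 
    </a> 
</xsl:template> 

</xsl:stylesheet> 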

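Thanks for your reply. I am looking into XSLT (http://www.rgagnon.com/javadetails/java-0407.html). Is there a way to supply the XSL you provided and the XML to be parsed, and to get the output, all as strings (so not as files)? – user485659 2012-04-13 10:53:39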

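I am pretty sure that having the input, the stylesheet and the result as strings is possible with JAXP. It is just a matter of using the right Source (http://docs.oracle.com/javase/6/docs/api/javax/xml/transform/stream/StreamSource.html) and Result types, for example a StreamSource over a StringReader. I will leave the details to those who are more familiar with the Java APIs than I am. – 2012-04-13 11:56:08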
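The string-to-string approach hinted at in the comment above could look roughly like the following. This is only a sketch, assuming the JAXP implementation bundled with the JDK; the class name `XsltStringDemo`, the method name `transform` and the `STYLESHEET` constant are made up for illustration, and the stylesheet is a compacted copy of the one in the answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltStringDemo {

    // Compacted copy of the stylesheet from the answer (hypothetical constant name).
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:template match='@* | node()'>"
      + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='Para'>"
      + "<p><xsl:apply-templates select='@* | node()'/></p>"
      + "</xsl:template>"
      + "<xsl:template match='CitationRef[@CitationID]'>"
      + "<a href='link/{@CitationID}'><xsl:apply-templates/></a>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result as a String.
    public static String transform(String xml, String xslt) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // The stylesheet is read from memory through a StringReader, not from a file.
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        // Leave out the <?xml ...?> declaration so the result is a plain HTML fragment.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<Para>See <CitationRef CitationID=\"B1\">1</CitationRef> for details.</Para>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

A StreamResult wrapping a StringWriter is what keeps the output in memory; the same Transformer could write to a file or an OutputStream by swapping the Result.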

Thanks for the hint, I got it working! For the input XML I use the following code: 'nl = (Node) xpath.evaluate("//expression/here", doc, XPathConstants.NODE); DOMSource source = new DOMSource(nl);' – user485659 2012-04-13 12:58:02
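Put together, the DOMSource variant from the last comment might be sketched as below. The class name `DomNodeToHtml`, the method name `nodeToHtml`, the sample XML and the `//Para` expression are assumptions for illustration only; the real XPath expression depends on the API's document structure, which the thread does not show.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomNodeToHtml {

    // Compacted copy of the stylesheet from the answer (hypothetical constant name).
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:template match='@* | node()'>"
      + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='Para'>"
      + "<p><xsl:apply-templates select='@* | node()'/></p>"
      + "</xsl:template>"
      + "<xsl:template match='CitationRef[@CitationID]'>"
      + "<a href='link/{@CitationID}'><xsl:apply-templates/></a>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Selects one node with XPath and transforms only that node into an HTML string.
    public static String nodeToHtml(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT processors expect a namespace-aware DOM
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
        if (node == null) {
            throw new IllegalArgumentException("No node matches " + expression);
        }

        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        // The selected DOM node, not the whole document, is the transformation input.
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Article><Body><Para>See "
            + "<CitationRef CitationID=\"B4\">4</CitationRef>.</Para></Body></Article>";
        System.out.println(nodeToHtml(xml, "//Para"));
    }
}
```

Passing the selected node to DOMSource avoids serializing the fragment back to text before transforming it, which is presumably why the approach in the comment worked without further changes.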