I am using an XML Transformer to transform one XML document into another. Some non-English characters are not converted correctly.
Original XML:
<?xml version="1.0" encoding="UTF-8"?>
<RR_KeyPersonExpanded_2_0:RR_KeyPersonExpanded_2_0 xmlns:RR_KeyPersonExpanded_2_0="http://apply.grants.gov/forms/RR_KeyPersonExpanded_2_0-V2.0" xmlns:att="http://apply.grants.gov/system/Attachments-V1.0" xmlns:glob="http://apply.grants.gov/system/Global-V1.0" xmlns:globLib="http://apply.grants.gov/system/GlobalLibrary-V2.0" RR_KeyPersonExpanded_2_0:FormVersion="2.0">
<RR_KeyPersonExpanded_2_0:KeyPerson>
<RR_KeyPersonExpanded_2_0:Profile>
<RR_KeyPersonExpanded_2_0:Name>
<globLib:PrefixName>候.</globLib:PrefixName>
<globLib:FirstName>Lakshmi</globLib:FirstName>
<globLib:MiddleName>AB</globLib:MiddleName>
<globLib:LastName>Sørensen</globLib:LastName>
</RR_KeyPersonExpanded_2_0:Name>
</RR_KeyPersonExpanded_2_0:Profile>
</RR_KeyPersonExpanded_2_0:KeyPerson>
</RR_KeyPersonExpanded_2_0:RR_KeyPersonExpanded_2_0>
removeemptytags.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output indent="yes" omit-xml-declaration="yes" encoding="UTF-8" method="xml"/>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="*[not(descendant-or-self::*[text()[normalize-space()] | @*])]"/>
</xsl:stylesheet>
Java code:
public String removeEmptyTags(String xml) {
    String filteredXML = "";
    try (OutputStream bos = new ByteArrayOutputStream();) {
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        StreamSource inputXMLSource = new StreamSource(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        StreamSource xsltSource = new StreamSource(getClass().getClassLoader().getResourceAsStream("removeemptytags.xsl"));
        Transformer transformer = transformerFactory.newTransformer(xsltSource);
        StreamResult result = new StreamResult(bos);
        transformer.transform(inputXMLSource, result);
        bos.flush();
        filteredXML = bos.toString();
    } catch (Exception e) {
        logger.log(Level.SEVERE, "Exception while removing empty tags : ", e);
        throw new ParsingException(e.getMessage());
    }
    return filteredXML;
}
Output XML:
<RR_KeyPersonExpanded_2_0:RR_KeyPersonExpanded_2_0 xmlns:RR_KeyPersonExpanded_2_0="http://apply.grants.gov/forms/RR_KeyPersonExpanded_2_0-V2.0" xmlns:att="http://apply.grants.gov/system/Attachments-V1.0" xmlns:glob="http://apply.grants.gov/system/Global-V1.0" xmlns:globLib="http://apply.grants.gov/system/GlobalLibrary-V2.0" RR_KeyPersonExpanded_2_0:FormVersion="2.0">
<RR_KeyPersonExpanded_2_0:KeyPerson>
<RR_KeyPersonExpanded_2_0:Profile>
<RR_KeyPersonExpanded_2_0:Name>
<globLib:PrefixName>候.</globLib:PrefixName>
<globLib:FirstName>Lakshmi</globLib:FirstName>
<globLib:MiddleName>AB</globLib:MiddleName>
<globLib:LastName>Sørensen</globLib:LastName>
</RR_KeyPersonExpanded_2_0:Name>
</RR_KeyPersonExpanded_2_0:Profile>
</RR_KeyPersonExpanded_2_0:KeyPerson>
</RR_KeyPersonExpanded_2_0:RR_KeyPersonExpanded_2_0>
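As you can see, the non-English words just turn into a bunch of meaningless characters. I tried changing the encoding in the XSLT to "UTF-16", but that did not work. Has anyone here run into the same problem?
Is your output encoding set to UTF-8? – Compass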
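One possible cause, assuming the JVM's default charset is not UTF-8: the transformer writes UTF-8 bytes into the ByteArrayOutputStream (as requested by xsl:output), but bos.toString() with no argument decodes those bytes using the platform default charset. The sketch below decodes the bytes explicitly as UTF-8 instead. The class name EncodingSafeTransform and the inlined stylesheet string are illustrative assumptions, not part of the original code; the stylesheet content itself is copied from removeemptytags.xsl above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for illustration.
public class EncodingSafeTransform {

    // Inlined copy of removeemptytags.xsl so the sketch is self-contained.
    private static final String XSLT =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:strip-space elements=\"*\"/>"
        + "<xsl:output indent=\"yes\" omit-xml-declaration=\"yes\" encoding=\"UTF-8\" method=\"xml\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"*[not(descendant-or-self::*[text()[normalize-space()] | @*])]\"/>"
        + "</xsl:stylesheet>";

    public static String removeEmptyTags(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));

            // Feed the input as UTF-8 bytes, as in the original code.
            StreamSource input = new StreamSource(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            transformer.transform(input, new StreamResult(bos));

            // Decode with the same charset that xsl:output declares,
            // rather than relying on the platform default charset.
            return new String(bos.toByteArray(), StandardCharsets.UTF_8);
        } catch (TransformerException e) {
            throw new IllegalStateException("Exception while removing empty tags", e);
        }
    }
}
```

In the original method the equivalent change would be to declare bos as a ByteArrayOutputStream and call bos.toString("UTF-8"). Another option under the same assumption is to pass a StringWriter to StreamResult, so that no byte-to-string decoding happens at all.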