Java JAXB UTF-8/ISO conversion

I have an XML file that contains non-standard characters (such as odd "quotes").

I read the XML using UTF-8/ISO/ASCII and unmarshal it:

// Decode the HTTP response body as ISO-8859-1
BufferedReader br = new BufferedReader(new InputStreamReader(
        conn.getInputStream(), "ISO-8859-1"));
String output;
StringBuilder sb = new StringBuilder();
while ((output = br.readLine()) != null) {
    // fetch XML (readLine() drops the line terminators)
    sb.append(output);
}

try {
    jc = JAXBContext.newInstance(ServiceResponse.class);
    Unmarshaller unmarshaller = jc.createUnmarshaller();

    // Unmarshal from the buffered string
    ServiceResponse OWrsp = (ServiceResponse) unmarshaller
            .unmarshal(new InputSource(new StringReader(sb.toString())));

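I have an Oracle function that takes ISO-8859-1 codes and converts/maps them to their "literal" symbols, e.g. "&#x2019;" => the right single quotation mark (’).

JAXB unmarshals using ISO and displays the ISO-converted characters fine, i.e. all the odd single quotes are encoded as "&#x2019;".

So suppose my string is: class of 10–11‐year‐olds (note the odd hyphen between 11 and year).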

jc = JAXBContext.newInstance(ScienceProductBuilderInfoType.class);
Marshaller m = jc.createMarshaller();
m.setProperty(Marshaller.JAXB_ENCODING, "ISO-8859-1");
// save a temp file
File file2 = new File("tmp.xml");

This saves the file as:

class of 10&#8211;11&#8208;year&#8208;olds. (This is what I want, so saving the file works!)
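The numeric references in the saved file can be reproduced without JAXB. The sketch below is an illustration on a plain JDK, not the asker's code: it swaps in the JDK's identity `Transformer` for the JAXB `Marshaller`, on the assumption that both follow the usual serializer rule of writing any character the output encoding cannot represent as a numeric character reference. The class name `Latin1SerializationDemo` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class Latin1SerializationDemo {
    private Latin1SerializationDemo() {}

    // Re-serializes an XML string with ISO-8859-1 as the output encoding.
    static String serializeAsLatin1(String xml) {
        try {
            // A Transformer created without a stylesheet copies input to output.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(bytes));
            return new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // U+2013 (en dash) and U+2010 (hyphen) do not exist in ISO-8859-1.
        String xml = "<title>class of 10\u201311\u2010year\u2010olds</title>";
        System.out.println(serializeAsLatin1(xml));
    }
}
```

If the assumption holds, the printed document should contain `10&#8211;11&#8208;year&#8208;olds`, matching the saved file.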

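[Side note: I have read the file with a Java file reader, and it outputs the string above fine.]

The problem I have is that the String representation produced by the JAXB unmarshaller has odd output; for some reason I can't seem to get the string to represent –.

When I 1: inspect the unmarshalled XML output: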

class of 10?11?year?olds

2: the file output:

class of 10&#8211;11&#8208;year&#8208;olds
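One plausible reading of the two outputs above is that the unmarshalled string does hold the real dash characters, and the `?` only appears when they pass through an encoding that cannot represent them (for example a console set to ISO-8859-1). The sketch below illustrates that effect in isolation; the class and method names are hypothetical, and it makes no claim about how the asker's console was actually configured.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ConsoleEncodingDemo {
    private ConsoleEncodingDemo() {}

    // Round-trips text through a charset, roughly what an output stream
    // with that encoding does before the text reaches the screen.
    static String asSeenThrough(String text, Charset charset) {
        // String.getBytes replaces every unmappable character with the
        // charset's replacement byte, which is '?' for ISO-8859-1.
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+2013 is an en dash, U+2010 a hyphen; neither is in ISO-8859-1.
        String unmarshalled = "class of 10\u201311\u2010year\u2010olds";

        // prints "class of 10?11?year?olds"
        System.out.println(asSeenThrough(unmarshalled, StandardCharsets.ISO_8859_1));

        // prints "true": UTF-8 can represent both characters, so nothing is lost
        System.out.println(asSeenThrough(unmarshalled, StandardCharsets.UTF_8)
                .equals(unmarshalled));
    }
}
```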

I even tried reading from the saved XML file and then unmarshalling that (in the hope of getting – in my string):

String sCurrentLine;
BufferedReader br = new BufferedReader(new FileReader("tmp.xml"));
StringBuilder sb = new StringBuilder();
while ((sCurrentLine = br.readLine()) != null) {
    sb.append(sCurrentLine);
}

ScienceProductBuilderInfoType rsp = (ScienceProductBuilderInfoType) unm
        .unmarshal(new InputSource(new StringReader(sb.toString())));

To no avail.

Any ideas on how to get the ISO-8859-1 encoded characters in JAXB?

2013-08-22 nate

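What software are you using to display/view the unmarshalled string representation (the "10?11?year?olds" text)? – Joni

The Eclipse console. I can't figure out why JAXB is converting – – nate

How are you printing the string to the console, with System.out? JAXB decodes entity references because that is what an XML parser is supposed to do, though IIRC it can be configured not to. – Joni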
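The point about decoding can be checked with any conforming XML parser. As a minimal sketch, the snippet below uses the JDK's DOM parser rather than JAXB (an assumption made so that it runs on a plain JDK); the class name `EntityDecodingDemo` is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class EntityDecodingDemo {
    private EntityDecodingDemo() {}

    // Parses the XML and returns the text content of its root element.
    static String textOfRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String text = textOfRoot(
                "<title>class of 10&#8211;11&#8208;year&#8208;olds</title>");

        // prints "true": the parser turned &#8211; into the character U+2013
        System.out.println(text.indexOf('\u2013') >= 0);

        // prints "false": no "&#" sequence survives parsing
        System.out.println(text.contains("&#"));
    }
}
```

Under this reading, a string that must contain the literal `&#8211;` text has to be re-escaped after unmarshalling, since the parser has already replaced the reference with the character itself.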

Solved, using this code found on Stack Overflow:

final class HtmlEncoder {
    private HtmlEncoder() {}

    public static <T extends Appendable> T escapeNonLatin(CharSequence sequence,
            T out) throws java.io.IOException {
        for (int i = 0; i < sequence.length(); i++) {
            char ch = sequence.charAt(i);
            if (Character.UnicodeBlock.of(ch) == Character.UnicodeBlock.BASIC_LATIN) {
                out.append(ch);
            } else {
                int codepoint = Character.codePointAt(sequence, i);
                // handle supplementary range chars
                i += Character.charCount(codepoint) - 1;
                // emit entity
                out.append("&#x");
                out.append(Integer.toHexString(codepoint));
                out.append(";");
            }
        }
        return out;
    }
}

HtmlEncoder.escapeNonLatin(myString, new StringBuilder()).toString()
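The helper above escapes everything outside the Basic Latin block, so characters that ISO-8859-1 can represent (such as é) are escaped too. If only the unrepresentable characters should become references, a variant along the following lines may be closer to what is needed. It is a sketch, not part of the original answer: the class and method names are hypothetical, and like the original it does not escape markup characters such as `&` or `<`.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class Latin1Escaper {
    private Latin1Escaper() {}

    // Replaces each code point that ISO-8859-1 cannot encode with a
    // hexadecimal numeric character reference; everything else is copied.
    static String escapeUnmappable(CharSequence text) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = Character.codePointAt(text, i);
            int width = Character.charCount(codePoint);
            if (width == 1 && latin1.canEncode((char) codePoint)) {
                out.append((char) codePoint);
            } else {
                // Supplementary characters (width 2) are never in ISO-8859-1.
                out.append("&#x").append(Integer.toHexString(codePoint)).append(';');
            }
            i += width;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "caf\u00e9 10&#x2013;11&#x2010;year&#x2010;olds" with a literal é
        System.out.println(escapeUnmappable("caf\u00e9 10\u201311\u2010year\u2010olds"));
    }
}
```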

2013-08-23 14:25:38 nate