Mojibake in a SOAP message

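In my Java web service I implement the WebServiceProvider class and try to get the original request that the client sent. The problem is that I get unreadable characters inside the XML tags of the SOAP message body, such as <Applicant_Place_Born>Ð&#156;Ð¾Ñ&#129;ÐºÐ²Ð°</Applicant_Place_Born>, instead of normal Cyrillic letters. So I am looking for a way to solve this. Perhaps I could use the <Source> generic type instead of <SOAPMessage>, but I don't know how to convert it to bytes.
Q1: Is it possible to get the original byte array (the raw binary data) of the client's request, so that I can decode it manually?
Q2: Is there a direct way to fix the wrong characters by specifying a decoding charset for the SOAP message?

My current code is given below: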

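import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPMessage;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.ws.BindingType;
import javax.xml.ws.Provider;
import javax.xml.ws.ServiceMode;
import javax.xml.ws.WebServiceProvider;

@WebServiceProvider(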
    portName="SoaprequestImplPort", 
    serviceName="services/soaprequest", 
    targetNamespace="http://tempuri.org/soaprequest", 
    wsdlLocation="/wsdl/SoaprequestImpl.wsdl" 
) 
@BindingType(value="http://schemas.xmlsoap.org/wsdl/soap/http") 
@ServiceMode(value=javax.xml.ws.Service.Mode.MESSAGE) 
public class SoaprequestImpl implements Provider<SOAPMessage> { 

    private static final String hResponse = "<soapenv:Envelope xmlns:soapenv=\\"; 

    public SOAPMessage invoke(SOAPMessage req) { 
     getSOAPMessage(req); 
      SOAPMessage res = null; 
     try { 
       res = makeSOAPMessage(hResponse); 
     } catch (Exception e) { 
      System.out.println("Exception: occurred " + e); 
     } 
     return res; 
    } 

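    private String getSOAPMessage(SOAPMessage msg) {
        String s = null;
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            msg.writeTo(baos);
            // Dump the serialized request to a file for inspection;
            // try-with-resources closes the stream even on failure.
            try (OutputStream outputStream = new FileOutputStream("/opt/data/tomcat/end.txt")) {
                baos.writeTo(outputStream);
            }
            // Assumes the message was serialized as UTF-8 (the SAAJ default).
            s = baos.toString("UTF-8");
        } catch (Exception e) {
            e.printStackTrace();
        }
        return s;
    }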

    private SOAPMessage makeSOAPMessage(String msg) { 
     try { 
       MessageFactory factory = MessageFactory.newInstance(); 
       SOAPMessage message = factory.createMessage(); 
       message.getSOAPPart().setContent((Source)new StreamSource(new StringReader(msg))); 
       message.saveChanges(); 
       return message; 
     } catch (Exception e) { 
      return null; 
     } 
    } 
}


2015-06-30 griboedov

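What you have shown is exactly the UTF-8 encoded representation of "Москва". Your SOAP data is most likely an XML document that starts with <?xml version='1.0' encoding='UTF-8' ?>, which indicates that the content is encoded as UTF-8. To convert such data back to Unicode you need to decode it. You also have some HTML escapes in there, so you must unescape those first. I used Tcl to test this: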

# The original string reported 
set s "Ð&#156;Ð¾Ñ&#129;ÐºÐ²Ð°" 
# substituting the html escapes 
set t "Ð\x9cÐ¾Ñ\x81ÐºÐ²Ð°" 
# decode from utf-8 into Unicode 
encoding convertfrom utf-8 "Ð\x9cÐ¾Ñ\x81ÐºÐ²Ð°" 
Москва

So your SOAP message is probably fine, but you will likely need to deal with the HTML escapes before attempting to decode any strings from UTF-8.
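
The same two steps can be sketched in Java, since that is the language of the question. This is only an illustration, not a drop-in fix: the class and method names (MojibakeDecoder, unescapeNumericRefs, latin1ToUtf8) are made up for this example, and it assumes the garbled text came from UTF-8 bytes that were read as ISO-8859-1, with some bytes shown as decimal references such as &#156;. If the bytes were actually read as Windows-1252 or another charset, the second step would need that charset instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of the original service.
class MojibakeDecoder {

    // Matches decimal numeric character references such as &#156;
    private static final Pattern NUMERIC_REF = Pattern.compile("&#(\\d+);");

    /** Step 1: replace each decimal reference with the character it names. */
    static String unescapeNumericRefs(String text) {
        Matcher m = NUMERIC_REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String ch = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(ch));
        }
        m.appendTail(out);
        return out.toString();
    }

    /**
     * Step 2: turn each char back into the single byte it stands for
     * (assumption: ISO-8859-1), then decode those bytes as UTF-8.
     */
    static String latin1ToUtf8(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The reported string, written with Unicode escapes so that the
        // encoding of this source file does not matter.
        String reported =
            "\u00D0&#156;\u00D0\u00BE\u00D1&#129;\u00D0\u00BA\u00D0\u00B2\u00D0\u00B0";
        System.out.println(latin1ToUtf8(unescapeNumericRefs(reported)));
    }
}
```

The order matters: the references have to be turned back into single characters first, otherwise the byte sequence handed to the UTF-8 decoder is incomplete.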


2015-06-30 10:00:07 patthoyts

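Thank you for the answer. I understand your explanation, and that is what I was asking about. I need to decode the string from UTF-8 back to the original encoding used by the client. But I don't know exactly which encoding was originally used, or how to do the decoding in Java code. – griboedov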
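
On Q1 in the question (turning a Source into bytes if the provider were typed as Provider<Source>), one possible approach is a JAXP identity transform. The sketch below is an assumption-laden illustration: the class name SourceBytes and the method toBytes are hypothetical. Note that the result is a re-serialization of the parsed XML, not the literal bytes that arrived on the wire, so if the container already decoded the request with the wrong charset, the damage will still be present in these bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class for illustration; not part of the original service.
class SourceBytes {

    /** Serializes a JAXP Source to a byte array in the requested encoding. */
    static byte[] toBytes(Source source, String encoding) {
        try {
            // newTransformer() without a stylesheet gives an identity transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            identity.transform(source, new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize Source", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the Source that invoke(Source) would receive.
        String xml = "<Applicant_Place_Born>"
            + "\u041C\u043E\u0441\u043A\u0432\u0430"
            + "</Applicant_Place_Born>";
        byte[] bytes = toBytes(new StreamSource(new StringReader(xml)), "UTF-8");
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Inside a Provider<Source>, the request argument of invoke would take the place of the StreamSource built in main here.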
