2013-08-29

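gSOAP DOM parser issue

I am trying to use the gSOAP 2.8.10 DOM parser to parse a simple XML document that contains UTF-8 encoded Cyrillic text. I created a VC++ console application and added soapC.cpp and soapns.cpp to the project.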

soapns.cpp:

#include <soap.nsmap> 

soap.nsmap:

#include "soapH.h" 
SOAP_NMAC struct Namespace namespaces[] = 
{ 
    {"SOAP-ENV", "http://schemas.xmlsoap.org/soap/envelope/", "http://www.w3.org/*/soap-envelope", NULL},
    {"SOAP-ENC", "http://schemas.xmlsoap.org/soap/encoding/", "http://www.w3.org/*/soap-encoding", NULL}, 
    {"xsi", "http://www.w3.org/2001/XMLSchema-instance", "http://www.w3.org/*/XMLSchema-instance", NULL}, 
    {"xsd", "http://www.w3.org/2001/XMLSchema", "http://www.w3.org/*/XMLSchema", NULL}, 
    {"ns2", "http://schemas.microsoft.com/2003/10/Serialization/", NULL, NULL}, 
    {"ns1", "http://asp.net/ApplicationServices/v200", NULL, NULL}, 
    {"ns3", "http://tempuri.org/", NULL, NULL}, 
    {NULL, NULL, NULL, NULL} 
}; 

soapC.cpp, soapH.h, and soap.nsmap were generated with the soapcpp2.exe utility.

main.cpp:

#include <stdsoap2.h> 
#include <string> 
#include <sstream> 
#include <iomanip> 
#include <iostream> 
#include <tchar.h> 

// Print every byte of str as two uppercase hex digits, separated by spaces.
void print_in_hex(const std::string& str)
{
    std::string::const_iterator ch;
    for (ch = str.begin(); ch != str.end(); ++ch)
    {
        std::cout << std::hex
                  << std::setw(2) << std::setfill('0') << std::uppercase
                  << static_cast<unsigned int>(static_cast<unsigned char>(*ch)) << " ";
    }
    std::cout << std::endl;
}

// Sample XML content

const std::string Xml =
"<?xml version=\"1.0\" encoding=\"utf-8\"?>\
<entry>\
<properties>\
<Id>a8a4cf87-9497-4078-9166-0737a55ca7fc</Id>\
<Name>\xD0\x9D\xD0\xBE\xD0\xB2\xD0\xB0\xD1\x8F\x20\xD0\xBA\
\xD0\xBE\xD0\xBB\xD0\xBB\xD0\xB5\xD0\xBA\xD1\x86\xD0\xB8\xD1\x8F</Name>\
</properties>\
</entry>";

const std::string correctName = "\xD0\x9D\xD0\xBE\xD0\xB2\xD0\xB0\xD1\x8F\x20\xD0\xBA\
\xD0\xBE\xD0\xBB\xD0\xBB\xD0\xB5\xD0\xBA\xD1\x86\xD0\xB8\xD1\x8F";

int _tmain(int argc, _TCHAR* argv[])
{
    std::stringstream inputStream;
    inputStream.str(Xml);
    // Parse the XML into a DOM tree, keeping character data as UTF-8.
    struct soap_dom_element entry(soap_new());
    soap_set_mode(entry.soap, SOAP_DOM_TREE | SOAP_C_UTFSTRING);
    inputStream >> entry;
    // Find the <Name> element and compare its bytes with the expected ones.
    soap_dom_element_iterator it = entry.find(NULL, "Name");
    if (it != entry.end())
    {
        std::cout << "Original content:" << std::endl;
        print_in_hex(correctName);
        std::string name = (*it).data;
        std::cout << "Parsed content:" << std::endl;
        print_in_hex(name);
    }
    return 0;
}

Output:

Original content: 
D0 9D D0 BE D0 B2 D0 B0 D1 8F 20 D0 BA D0 BE D0 BB D0 BB D0 B5 D0 BA D1 86 D0 B8 D1 8F 
Parsed content: 
C3 90 9D D0 BE D0 B2 D0 B0 D1 8F 20 D0 BA D0 BE D0 BB D0 BB D0 B5 D0 BA D1 86 D0 B8 D1 8F 

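When the XML is read from the stream, gSOAP replaces the first byte of the <Name> element's original content, 0xD0, with the two bytes 0xC3 0x90. As a result, when the text is converted from UTF-8 to Windows-1251, I see '??овая коллекция' instead of 'Новая коллекция'. Does anyone know how to fix this? Thanks!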

Answer


This issue was fixed in gSOAP 2.8.16.