How to convert a UTF-16 string to UTF-8 in C++

Consider:

STDMETHODIMP CFileSystemAPI::setRRConfig(BSTR config_str, VARIANT* ret) 
{ 
mReportReaderFactory.reset(new sbis::report_reader::ReportReaderFactory()); 

USES_CONVERSION; 
std::string configuration_str = W2A(config_str);

However, config_str arrives as a UTF-16 string. How can I convert it to UTF-8 in this piece of code?

Source

2014-01-30 user3252635

If you are using C++11, you can take a look at this:

http://www.cplusplus.com/reference/codecvt/codecvt_utf8_utf16/

Source

2014-01-30 12:49:58 beardedN5rd

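Could you show me an example? I don't know how to work with it. The BSTR input parameter is in UTF-16LE. – user3252635

I don't have time to create one, but I found [this link](https://stackoverflow.com/questions/7232710/convert-between-string-u16string-u32string), which covers it very clearly. I hope this helps. – beardedN5rd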
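Since an example was requested, here is a minimal sketch of the C++11 approach, assuming the input is held in a std::u16string. The helper name utf16_to_utf8 is hypothetical and not taken from the thread. Note that std::wstring_convert and std::codecvt_utf8_utf16 are deprecated since C++17, so this may produce deprecation warnings on newer compilers.

```cpp
#include <codecvt>  // std::codecvt_utf8_utf16 (deprecated since C++17)
#include <locale>   // std::wstring_convert
#include <string>

// Hypothetical helper: converts UTF-16 code units to a UTF-8 byte string.
// wstring_convert throws std::range_error if the conversion fails.
std::string utf16_to_utf8(const std::u16string& utf16)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    return converter.to_bytes(utf16);
}
```

On Windows, where wchar_t is 16 bits wide and a BSTR holds UTF-16 code units, the same pattern is commonly written as std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> and applied to a std::wstring.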

-2

// Encodes one Unicode code point as UTF-8. The parameter is char32_t because
// wchar_t is only 16 bits wide on Windows, so values above 0xFFFF could never
// be passed in. UTF-16 surrogate pairs must be combined into a single code
// point by the caller before calling this function.
void encode_unicode_character(char* buffer, int* offset, char32_t ucs_character) 
{ 
    if (ucs_character <= 0x7F) 
    { 
     // Plain single-byte ASCII. 
     buffer[(*offset)++] = (char) ucs_character; 
    } 
    else if (ucs_character <= 0x7FF) 
    { 
     // Two bytes. 
     buffer[(*offset)++] = (char) (0xC0 | (ucs_character >> 6)); 
     buffer[(*offset)++] = (char) (0x80 | ((ucs_character >> 0) & 0x3F)); 
    } 
    else if (ucs_character >= 0xD800 && ucs_character <= 0xDFFF) 
    { 
     // Lone surrogate; not a valid code point, so don't encode anything. 
    } 
    else if (ucs_character <= 0xFFFF) 
    { 
     // Three bytes. 
     buffer[(*offset)++] = (char) (0xE0 | (ucs_character >> 12)); 
     buffer[(*offset)++] = (char) (0x80 | ((ucs_character >> 6) & 0x3F)); 
     buffer[(*offset)++] = (char) (0x80 | ((ucs_character >> 0) & 0x3F)); 
    } 
    else if (ucs_character <= 0x10FFFF) 
    { 
     // Four bytes. UTF-8 as defined by RFC 3629 stops at U+10FFFF, so the 
     // older five- and six-byte forms are not generated. 
     buffer[(*offset)++] = (char) (0xF0 | (ucs_character >> 18)); 
     buffer[(*offset)++] = (char) (0x80 | ((ucs_character >> 12) & 0x3F)); 
     buffer[(*offset)++] = (char) (0x80 | ((ucs_character >> 6) & 0x3F)); 
     buffer[(*offset)++] = (char) (0x80 | ((ucs_character >> 0) & 0x3F)); 
    } 
    else 
    { 
     // Invalid char; don't encode anything. 
    } 
}

ISO/IEC 10646:2012 is all you need in order to understand UCS.

Source

2014-01-30 13:00:15 kvv

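UCS is not part of the question, and UCS is not UTF-16. Does your code work with UTF-16? – rubenvb

@rubenvb, it definitely works; you should try it. – kvv

I didn't say UTF-16 is UCS, but it is part of UTF-8. – kvv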
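The question raised in these comments comes down to surrogate pairs: in UTF-16, code points above U+FFFF are stored as two 16-bit code units, which have to be combined into one code point before the UTF-8 bytes are produced. The following is a minimal sketch of that step, assuming the input is a std::u16string. The names append_utf8 and utf16_to_utf8_manual are hypothetical, and replacing unpaired surrogates with U+FFFD is a design choice of this sketch rather than anything stated in the thread.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends one valid code point (<= 0x10FFFF, not a
// surrogate) to 'out' as one to four UTF-8 bytes.
static void append_utf8(std::string& out, char32_t cp)
{
    if (cp <= 0x7F) {
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decodes UTF-16 code units (combining surrogate pairs)
// and re-encodes them as UTF-8. Unpaired surrogates become U+FFFD.
std::string utf16_to_utf8_manual(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: a low surrogate must follow.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;  // the low surrogate has been consumed
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }
        append_utf8(out, cp);
    }
    return out;
}
```

A function that encodes each 16-bit unit on its own would turn a surrogate pair into two three-byte sequences, which is not valid UTF-8; combining the pair first yields the single four-byte sequence.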

You can do something like this:

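#include <windows.h> 
#include <stdexcept> 
#include <string> 

std::string WstrToUtf8Str(const std::wstring& wstr) 
{ 
    std::string retStr; 
    if (!wstr.empty()) 
    { 
     // Pass the length explicitly so that no terminating null ends up in 
     // the result and embedded nulls are preserved. 
     const int wideLen = static_cast<int>(wstr.size()); 
     int sizeRequired = WideCharToMultiByte(CP_UTF8, 0, wstr.data(), wideLen, NULL, 0, NULL, NULL); 

     if (sizeRequired > 0) 
     { 
      // Convert straight into the std::string's own buffer. 
      retStr.resize(sizeRequired); 
      int bytesConverted = WideCharToMultiByte(CP_UTF8, 0, wstr.data(), 
           wideLen, &retStr[0], sizeRequired, NULL, 
           NULL); 
      if (bytesConverted == 0) 
      { 
       // A wide string cannot be streamed into a narrow message, so 
       // report the Win32 error code instead of the input text. 
       throw std::runtime_error(std::string(__FUNCTION__) 
        + " failed to convert wstring, error " 
        + std::to_string(GetLastError())); 
      } 
     } 
    } 
    return retStr; 
}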

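You can pass your BSTR to the function as a std::wstring. On Windows a BSTR is a length-prefixed array of 16-bit UTF-16 code units, so if it may contain embedded nulls, construct the std::wstring from the pointer together with SysStringLen(config_str).

Source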

2017-01-15 19:15:28
