2014-11-23

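fastcgipp: out << does not output UTF-8 characters

Edit

I solved this by writing out << L"Swedish: å ä ö Å Ä Ö", that is, by putting the L prefix in front of the string literal, as explained in this answer: What exactly is the L prefix in C++? My question now is whether this is a good solution, or whether there is a preferred way to solve this problem.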


Code

The following method, edited from http://www.nongnu.org/fastcgipp/doc/2.1/a00004.html:

bool response() 
    { 
     wchar_t russian[]={ 0x041f, 0x0440, 0x0438, 0x0432, 0x0435, 0x0442, 0x0020, 0x043c, 0x0438, 0x0440, 0x0000 }; 
     wchar_t chinese[]={ 0x4e16, 0x754c, 0x60a8, 0x597d, 0x0000 }; 
     wchar_t greek[]={ 0x0393, 0x03b5, 0x03b9, 0x03b1, 0x0020, 0x03c3, 0x03b1, 0x03c2, 0x0020, 0x03ba, 0x03cc, 0x03c3, 0x03bc, 0x03bf, 0x0000 }; 
     wchar_t japanese[]={ 0x4eca, 0x65e5, 0x306f, 0x4e16, 0x754c, 0x0000 }; 
     wchar_t runic[]={ 0x16ba, 0x16d6, 0x16da, 0x16df, 0x0020, 0x16b9, 0x16df, 0x16c9, 0x16da, 0x16de, 0x0000 }; 
     out << "Content-Type: text/html; charset=utf-8\r\n\r\n"; 
     out << "<html><head><meta http-equiv='Content-Type' content='text/html; charset=utf-8' />"; 
     out << "<title>fastcgi++: Hello World in UTF-8</title></head><body>"; 
     out << "English: Hello World<br />"; 
     out << "Russian: " << russian << "<br />"; 
     out << "Greek: " << greek << "<br />"; 
     out << "Chinese: " << chinese << "<br />"; 
     out << "Japanese: " << japanese << "<br />"; 
     out << "Runic English?: " << runic << "<br />"; 
     out << "Swedish: å ä ö Å Ä Ö<br />"; 
     out << "</body></html>"; 
     return true; 
    } 

Raw output

Content-Type: text/html; charset=utf-8 

<html><head><meta http-equiv='Content-Type' content='text/html; charset=utf-8' /><title>fastcgi++: Hello World in UTF-8</title></head><body>English: Hello World<br />Russian: Привет мир<br />Greek: Γεια σας κόσμο<br />Chinese: 世界您好<br />Japanese: 今日は世界<br />Runic English?: ᚺᛖᛚᛟ ᚹᛟᛉᛚᛞ<br />Swedish:  <br /></body></html> 

As rendered by the browser

English: Hello World 
Russian: Привет мир 
Greek: Γεια σας κόσμο 
Chinese: 世界您好 
Japanese: 今日は世界 
Runic English?: ᚺᛖᛚᛟ ᚹᛟᛉᛚᛞ 
Swedish: 

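As shown above, the last line, Swedish, is expected to output "åäöÅÄÖ". For some reason, however, those characters are replaced with spaces. There must be a way to output these letters directly, without spelling out the Unicode hexadecimal value of each one.

After some Googling, I tried adding setlocale at the beginning of the main script, without success.

Why does this happen?
How can I fix it so that I can freely use any UTF-8 character when writing code as above?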

Answer

This works on Linux:

#include <iostream> 
#include <locale> 

    bool response() 
    { 
     wchar_t russian[]={ 0x041f, 0x0440, 0x0438, 0x0432, 0x0435, 0x0442, 0x0020, 0x043c, 0x0438, 0x0440, 0x0000 }; 
     wchar_t chinese[]={ 0x4e16, 0x754c, 0x60a8, 0x597d, 0x0000 }; 
     wchar_t greek[]={ 0x0393, 0x03b5, 0x03b9, 0x03b1, 0x0020, 0x03c3, 0x03b1, 0x03c2, 0x0020, 0x03ba, 0x03cc, 0x03c3, 0x03bc, 0x03bf, 0x0000 }; 
     wchar_t japanese[]={ 0x4eca, 0x65e5, 0x306f, 0x4e16, 0x754c, 0x0000 }; 
     wchar_t runic[]={ 0x16ba, 0x16d6, 0x16da, 0x16df, 0x0020, 0x16b9, 0x16df, 0x16c9, 0x16da, 0x16de, 0x0000 }; 
     std::wcout << "Content-Type: text/html; charset=utf-8\r\n\r\n" << std::endl; 
     std::wcout << "<html><head><meta http-equiv='Content-Type' content='text/html; charset=utf-8' />" << std::endl; 
     std::wcout << "<title>fastcgi++: Hello World in UTF-8</title></head><body>" << std::endl; 
     std::wcout << "English: Hello World<br />" << std::endl; 
     std::wcout << "Russian: " << russian << "<br />" << std::endl; 
     std::wcout << "Greek: " << greek << "<br />" << std::endl; 
     std::wcout << "Chinese: " << chinese << "<br />" << std::endl; 
     std::wcout << "Japanese: " << japanese << "<br />" << std::endl; 
     std::wcout << "Runic English?: " << runic << "<br />" << std::endl; 
     std::wcout << L"Swedish: å ä ö Å Ä Ö<br />" << std::endl; 
     std::wcout << "</body></html>" << std::endl; 
     return true; 
    } 

int main() 
{ 
    std::locale::global(std::locale("")); 
    response(); 
} 

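Note that (1) the output is a wide stream and (2) the Swedish string literal is wide (L"whatever"). The L prefix ("Long") before a string literal means that the literal is a wide-character literal (wchar_t[]) rather than a regular string literal (char[]).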
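The difference between the two kinds of literal can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the function names narrow_elements and wide_elements are made up for this illustration, and the literals are plain ASCII so that the result does not depend on the source file's encoding.

```cpp
#include <cstddef>
#include <type_traits>

// A narrow literal is an array of const char ...
static_assert(
    std::is_same<std::remove_reference<decltype("a"[0])>::type, const char>::value,
    "narrow literal elements are const char");

// ... while the L prefix makes it an array of const wchar_t.
static_assert(
    std::is_same<std::remove_reference<decltype(L"a"[0])>::type, const wchar_t>::value,
    "wide literal elements are const wchar_t");

// Hypothetical helpers: element counts, including the terminating null.
constexpr std::size_t narrow_elements() { return sizeof("abc") / sizeof(char); }
constexpr std::size_t wide_elements()   { return sizeof(L"abc") / sizeof(wchar_t); }
```

For ASCII text both literals have the same number of elements; only the element type, and therefore the size in bytes, differs.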

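A narrow string literal does not work here, because the narrow character set defaults to UTF-8, and by default there is no conversion from UTF-8 to whatever the wide encoding is (probably UCS4). Each byte is simply widened on its own, which is completely wrong. If you want, you can convert the string yourself, or use one of the standard conversion functions: mbstowcs (not really portable) or C++11 wstring_convert (which doesn't really work with gcc/libstdc++; use clang/libc++).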
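To make the difference between widening bytes and actually decoding them concrete, here is a small hand-written sketch. It is not what fastcgi++ or the standard library does internally; widen_bytes and decode_utf8 are hypothetical helpers written for this illustration, and the decoder is simplified (it does not reject overlong forms or surrogates, and replaces malformed input with U+FFFD).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: widens each byte on its own, with no decoding.
// This mimics what happens to a narrow UTF-8 string sent to a wide stream.
std::u32string widen_bytes(const std::string& utf8) {
    std::u32string out;
    for (unsigned char byte : utf8) {
        out.push_back(static_cast<char32_t>(byte));
    }
    return out;
}

// Hypothetical helper: decodes UTF-8 into Unicode code points.
// Simplified sketch: malformed or truncated sequences become U+FFFD.
std::u32string decode_utf8(const std::string& utf8) {
    std::u32string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // invalid lead byte

        bool ok = (i + extra < utf8.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(utf8[i + k]);
            if ((cont & 0xC0) != 0x80) ok = false;      // not a continuation byte
            else cp = (cp << 6) | (cont & 0x3F);
        }
        if (ok) { out.push_back(cp); i += extra + 1; }
        else    { out.push_back(0xFFFD); ++i; }
    }
    return out;
}
```

For the two UTF-8 bytes of "å" (0xC3 0xA5), widen_bytes yields two separate values, 0xC3 and 0xA5, whereas decode_utf8 yields the single code point U+00E5.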

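How to make this work on Windows is anyone's guess.

I recommend sticking with either char and UTF-8, or wchar_t and UCS4 (on Linux). Since you want to output UTF-8, using char rather than wchar_t is reasonable.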