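I am reading someone else's code and trying to understand the function below. Does it convert a UCS (Universal Character Set) character to Unicode?
According to its comment, the function is meant to "Convert a UCS character to an UTF-8 string". But what is a UCS character, what are the rules for converting UCS to Unicode, and where can I find documentation for this?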
/*
 * Convert a UCS character to an UTF-8 string
 *
 * Returns the string length of the result
 */
size_t
tUcs2Utf8(ULONG ulChar, char *szResult, size_t tMaxResultLen)
{
	if (szResult == NULL || tMaxResultLen == 0) {
		return 0;
	}
	/* 1 byte: 0xxxxxxx */
	if (ulChar < 0x80 && tMaxResultLen >= 2) {
		szResult[0] = (char)ulChar;
		szResult[1] = '\0';
		return 1;
	}
	/* 2 bytes: 110xxxxx 10xxxxxx */
	if (ulChar < 0x800 && tMaxResultLen >= 3) {
		szResult[0] = (char)(0xc0 | (ulChar >> 6));
		szResult[1] = (char)(0x80 | (ulChar & 0x3f));
		szResult[2] = '\0';
		return 2;
	}
	/* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
	if (ulChar < 0x10000 && tMaxResultLen >= 4) {
		szResult[0] = (char)(0xe0 | (ulChar >> 12));
		szResult[1] = (char)(0x80 | ((ulChar >> 6) & 0x3f));
		szResult[2] = (char)(0x80 | (ulChar & 0x3f));
		szResult[3] = '\0';
		return 3;
	}
	/* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
	if (ulChar < 0x200000 && tMaxResultLen >= 5) {
		szResult[0] = (char)(0xf0 | (ulChar >> 18));
		szResult[1] = (char)(0x80 | ((ulChar >> 12) & 0x3f));
		szResult[2] = (char)(0x80 | ((ulChar >> 6) & 0x3f));
		szResult[3] = (char)(0x80 | (ulChar & 0x3f));
		szResult[4] = '\0';
		return 4;
	}
	/* Value out of range or buffer too small */
	szResult[0] = '\0';
	return 0;
} /* end of tUcs2Utf8 */
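Really? [this](https://www.google.com/search?q=ucs+character&oq=ucs+character&aqs=chrome..69i57j69i60&sourceid=chrome&es_sm=122&ie=UTF-8) didn't help? –
@SouravGhosh, I can read this code, but why is it written this way? What I want to know is the rule behind the conversion. – roger
Please don't roll your own code when tested and stable alternatives exist. If this is Windows-specific, you can use `MultiByteToWideChar` and/or `WideCharToMultiByte`. Otherwise, you can use ICU. – szczurcio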
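On the rule roger asks about: UCS is the character set defined by ISO/IEC 10646, and it shares its code points with Unicode, so the function is not converting between character sets. It serializes one code point into UTF-8 bytes, and that byte layout is specified in RFC 3629 and in the Unicode Standard. The sketch below restates the same bit-packing with the range checks RFC 3629 requires (no surrogates, nothing above U+10FFFF), which the posted function does not perform. The name `utf8_encode` and its signature are made up for this illustration and are not part of the original code or of any library.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): encode one Unicode code point
 * as a NUL-terminated UTF-8 string.
 *
 *   U+0000..U+007F      0xxxxxxx
 *   U+0080..U+07FF      110xxxxx 10xxxxxx
 *   U+0800..U+FFFF      1110xxxx 10xxxxxx 10xxxxxx
 *   U+10000..U+10FFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
 *
 * Returns the number of bytes written (excluding the NUL), or 0 if the
 * code point is invalid or the buffer is too small.
 */
size_t utf8_encode(uint32_t cp, char *out, size_t out_len)
{
    size_t n;

    if (out == NULL || out_len == 0) {
        return 0;
    }
    out[0] = '\0';

    /* Surrogates and values above U+10FFFF are not encodable in UTF-8. */
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
        return 0;
    }

    /* Pick the sequence length from the code point's magnitude. */
    if (cp < 0x80) {
        n = 1;
    } else if (cp < 0x800) {
        n = 2;
    } else if (cp < 0x10000) {
        n = 3;
    } else {
        n = 4;
    }

    /* Need room for n bytes plus the terminating NUL. */
    if (out_len < n + 1) {
        return 0;
    }

    switch (n) {
    case 1:
        out[0] = (char)cp;
        break;
    case 2:
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        break;
    case 3:
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        break;
    default:
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        break;
    }
    out[n] = '\0';
    return n;
}
```

For example, U+20AC (the euro sign) falls in the three-byte range: its 16 bits split into 4 + 6 + 6 bits, giving the bytes E2 82 AC. As the comment above says, a tested library remains the safer choice for production code.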