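C++ - Removing invalid characters when a user pastes into a grid
Here's my situation. I need to filter out invalid characters that users may paste in from Word or Excel documents.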
Here is what I'm doing.
First, I convert any Unicode characters to ASCII:
extern "C" COMMON_STRING_FUNCTIONS long ConvertUnicodeToAscii(wchar_t * pwcUnicodeString, char* &pszAsciiString)
{
int nBufLen = WideCharToMultiByte(CP_ACP, 0, pwcUnicodeString, -1, NULL, 0, NULL, NULL)+1;
pszAsciiString = new char[nBufLen];
WideCharToMultiByte(CP_ACP, 0, pwcUnicodeString, -1, pszAsciiString, nBufLen, NULL, NULL);
return nBufLen;
}
Next, I filter out any character whose value is not between 31 and 127:
String __fastcall TMainForm::filterInput(String l_sConversion)
{
// Used to store every character that was stripped out.
String filterChars = "";
// Not Used. We never received the whitelist
String l_SWhiteList = "";
// Our String without the invalid characters.
AnsiString l_stempString;
// convert the string into an array of chars
wchar_t* outputChars = l_sConversion.w_str();
char * pszOutputString = NULL;
//convert any unicode characters to ASCII
ConvertUnicodeToAscii(outputChars, pszOutputString);
l_stempString = (AnsiString)pszOutputString;
//We're going backwards since we are removing characters which changes the length and position.
for (int i = l_stempString.Length(); i > 0; i--)
{
char l_sCurrentChar = l_stempString[i];
//If we don't have a valid character, filter it out of the string.
if (((unsigned int)l_sCurrentChar < 31) ||((unsigned int)l_sCurrentChar > 127))
{
String l_sSecondHalf = "";
String l_sFirstHalf = "";
l_sSecondHalf = l_stempString.SubString(i + 1, l_stempString.Length() - i);
l_sFirstHalf = l_stempString.SubString(0, i - 1);
l_stempString = l_sFirstHalf + l_sSecondHalf;
filterChars += "\'" + ((String)(unsigned int)(l_sCurrentChar)) + "\' ";
}
}
if (filterChars.Length() > 0)
{
LogInformation(__LINE__, __FUNC__, Utilities::LOG_CATEGORY_GENERAL, "The Following ASCII Values were filtered from the string: " + filterChars);
}
// Delete the char* to avoid memory leaks.
delete [] pszOutputString;
return l_stempString;
}
This seems to work, except when you try to copy and paste bullet points from a Word document:
o Bullet1:
▪subbullet1.
You get something like this:
oBullet1?subbullet1.
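My filter function is called from the OnChange event.
The bullets are replaced with the letter o and a question mark.
What am I doing wrong, and is there a better way to do this?
I'm using C++Builder XE5, so please no Visual C++ solutions.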
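'CP_ACP' does not mean ASCII. It refers to the operating system's current locale, which could be for any language. ASCII itself is code page 20127. Defining your own conversion function is also redundant when you can simply use 'AnsiStringT<20127>' and let the RTL handle the conversion for you. – 2014-10-29 17:02:46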
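Following on from the comment above, the question mark most likely comes from the conversion step itself: when a character has no representation in the target code page, WideCharToMultiByte substitutes a default character (usually '?', value 63), and that value then passes the range check. One way around this is to filter the wide string before any narrowing conversion. The following is only a minimal sketch in portable C++, not the poster's code: the helper name filterPrintableAscii and the choice of 0x20 to 0x7E as the allowed range are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical helper (not part of the original post).
// Keeps only printable ASCII (U+0020..U+007E) from a wide string and
// appends every character that was dropped to `removed`, so the caller
// can log it. Filtering happens on the wide characters themselves, so
// nothing is ever converted into a '?' placeholder first.
std::wstring filterPrintableAscii(const std::wstring& input,
                                  std::wstring& removed)
{
    std::wstring kept;
    kept.reserve(input.size());

    for (std::wstring::size_type i = 0; i < input.size(); ++i)
    {
        // Cast to an unsigned type so the range check behaves the same
        // whether wchar_t is signed or unsigned on the platform.
        const unsigned long code = static_cast<unsigned long>(input[i]);

        if (code >= 0x20UL && code <= 0x7EUL)
        {
            kept.push_back(input[i]);     // printable ASCII: keep it
        }
        else
        {
            removed.push_back(input[i]);  // control or non-ASCII: drop it
        }
    }
    return kept;
}
```

In C++Builder the input could presumably be built from the existing String with std::wstring(l_sConversion.w_str()), and the result assigned back to a String afterwards. Because the string is built forward into a new buffer, there is no need to iterate backwards or splice substrings.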