Q

How do I write a language-aware lowercase function?

2014-04-24 49 views 2 likes

2

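I want a function 'lower' (applied to a word) that works correctly for two languages, for example English and Russian. How should I do this? Should I use std::wstring, or can I use std::string? I also want it to be cross-platform, and I would rather not reinvent the wheel.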

2014-04-24 Ava_Katushka

+0

This is a complicated problem. Make sure you know about locales and that you have read this: http://www.joelonsoftware.com/articles/Unicode.html –

+1

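In the end, to do this right you have to use Unicode strings in an encoding of your choice (prefer UTF-8). Changing case (lower, upper, title, fold) is not properly defined for single Unicode code points. On top of that, quite a few languages define these conversions in conflicting ways. – Deduplicator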

+0

So I should use Unicode, and what else? I know exactly which languages I will have: one of them. Can't that help somehow? –

A

Answers

6

The canonical library for this kind of thing is ICU:

http://site.icu-project.org/

There is also a Boost wrapper (Boost.Locale):

http://www.boost.org/doc/libs/1_55_0/libs/locale/doc/html/index.html

See also this question: Is there an STL and UTF-8 friendly C++ Wrapper for ICU, or other powerful Unicode library

First, make sure you understand the concept of locales and have a firm grasp of Unicode and, more generally, of character encodings.

Some good reads to get started quickly:

http://joelonsoftware.com/articles/Unicode.html

http://en.wikipedia.org/wiki/Locale
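
To make the encoding point concrete, here is a minimal sketch (not part of ICU or Boost) showing why a std::string holding UTF-8 cannot be lowercased one char at a time: each Russian letter occupies two bytes. The helper name count_code_points is made up for illustration, and the function assumes its input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 string by
// skipping continuation bytes (those of the form 10xxxxxx).
// Assumes the input is valid UTF-8; no validation is performed.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++n;  // a lead byte or an ASCII byte starts a new code point
        }
    }
    return n;
}
```

For the UTF-8 encoding of "Привет", size() reports 12 bytes while count_code_points reports 6 letters, so a per-char std::tolower would be handed halves of letters rather than whole ones.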

2014-04-24 19:16:05

0

I think this solution is OK. I'm not sure it covers every case, but it most likely does.

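#include <codecvt>  // std::codecvt_utf8 (deprecated since C++17)
#include <cstddef>
#include <locale>
#include <string>

// Lowercases a UTF-8 string by converting it to wide characters first.
// Throws std::runtime_error if the named locale is not installed;
// locale names are platform-specific.
std::string toLowerCase(const std::string& word) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    std::locale loc("en_US.UTF-8");
    std::wstring wword = conv.from_bytes(word);
    for (std::size_t i = 0; i < wword.length(); ++i) {
        // Lowercase the wide character, not the raw UTF-8 byte word[i].
        wword[i] = std::tolower(wword[i], loc);
    }
    return conv.to_bytes(wword);
}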
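
If the set of languages really is fixed to English and Russian, a locale-free variant is also possible. The following is only a sketch under that assumption: the names lower_en_ru and to_lower_en_ru are invented here, only A-Z, А-Я and Ё are mapped, and everything else (including malformed UTF-8) is copied through unchanged. It is not a substitute for ICU once other languages come into play.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: lowercases one code point, English and Russian only.
char32_t lower_en_ru(char32_t cp) {
    if (cp >= U'A' && cp <= U'Z') return cp + 0x20;      // A-Z -> a-z
    if (cp >= 0x0410 && cp <= 0x042F) return cp + 0x20;  // А-Я -> а-я
    if (cp == 0x0401) return 0x0451;                     // Ё -> ё
    return cp;
}

// Hypothetical helper: lowercases a UTF-8 string without using any locale.
std::string to_lower_en_ru(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) {
            // Single-byte (ASCII) code point.
            out += static_cast<char>(lower_en_ru(b0));
        } else if ((b0 & 0xE0) == 0xC0 && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            // Two-byte sequence: decode, map, re-encode. All Russian letters
            // live here, and the mapping never leaves the two-byte range.
            const unsigned char b1 = static_cast<unsigned char>(in[i + 1]);
            const char32_t cp = lower_en_ru(((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu));
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
            ++i;  // the continuation byte has been consumed
        } else {
            // Longer sequences and stray bytes are copied unchanged.
            out += in[i];
        }
    }
    return out;
}
```

Because it never touches std::locale, this behaves the same on every platform, at the cost of handling only the letters listed above.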

2014-04-26 13:56:15
