2016-09-23

Why does "Ū" come before "U"? Culture-aware ordering does not work as expected

CultureInfo ci = CultureInfo.GetCultureInfo("lt-LT");
bool ignoreCase = true; // true = case-insensitive comparison
StringComparer comp = StringComparer.Create(ci, ignoreCase);
string[] unordered = { "Za", "Žb", "Ūa", "Ub" };
var ordered = unordered.OrderBy(s => s, comp);

Result: Ūa Ub Za Žb

Expected: Ub Ūa Za Žb

Here is the Lithuanian alphabet order: https://www.assorti.lt/userfiles/uploader/no/norvegiska-lietuviska-delione-abecele-maxi-3-7-m-vaikams-larsen.jpg

http://stackoverflow.com/questions/1371813/why-does-string-compare-seem-to-handle-accented-characters-inconsistently

Answer

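I just put together something that may be a (limited) solution to your problem. It is not optimized, but it should give you an idea of how to solve it. I created a LithuanianString class whose only job is to wrap your string. The class implements IComparable<LithuanianString>, so a list of LithuanianString can be sorted.

Here is what the class could look like: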

using System;

public class LithuanianString : IComparable<LithuanianString>
{
    // Lithuanian alphabet in dictionary order.
    // Both constants must list the same letters in the same order.
    const string UpperAlphabet = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ";
    const string LowerAlphabet = "aąbcčdeęėfghiįyjklmnoprsštuųūvzž";
    public string String;

    public LithuanianString(string inputString)
    {
        this.String = inputString;
    }

    public int CompareTo(LithuanianString other)
    {
        // By convention, any instance sorts after null.
        if (other == null)
            return 1;

        var maxIndex = Math.Min(this.String.Length, other.String.Length);
        for (var i = 0; i < maxIndex; i++)
        {
            var indexOfThis = IndexInAlphabet(this.String[i]);
            var indexOfOther = IndexInAlphabet(other.String[i]);

            if (indexOfOther != indexOfThis)
                return indexOfThis - indexOfOther;
        }
        // All shared positions are equal: the shorter string sorts first.
        return this.String.Length - other.String.Length;
    }

    // Case-insensitive lookup: a letter has the same index in both alphabets.
    // Characters outside the alphabet return -1, so they all compare as equal.
    private static int IndexInAlphabet(char c)
    {
        var index = LowerAlphabet.IndexOf(c);
        return index >= 0 ? index : UpperAlphabet.IndexOf(c);
    }
}

Here is the sample I used to try it out:

// Requires: using System; using System.Linq;
static void Main(string[] args)
{
    string[] unordered = { "Za", "Žb", "Ūa", "Ub" };

    // Create a list of LithuanianString from your array
    var lithuanianStringList = (from unorderedString in unordered
                                select new LithuanianString(unorderedString)).ToList();
    // Sort it
    lithuanianStringList.Sort();

    // Display it
    Console.WriteLine(Environment.NewLine + "My Comparison");
    lithuanianStringList.ForEach(c => Console.WriteLine(c.String));
}

The output is the expected one:

Ub Ūa Za Žb

The class also lets you define a custom alphabet simply by replacing the characters in the two constants defined at the top.
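
If wrapping every string is inconvenient, the same idea could probably be packaged as an IComparer<string> and passed straight to OrderBy, as in the question. The following is only a sketch under some assumptions: AlphabetComparer and Demo.SortLithuanian are hypothetical names that do not appear in the answer above, input strings are assumed to use precomposed letters (for example "Ū" as one character, not "U" plus a combining macron), and characters outside the alphabet are arbitrarily placed before all letters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the original answer): compares strings
// letter by letter using each letter's position in a custom alphabet.
public sealed class AlphabetComparer : IComparer<string>
{
    private readonly Dictionary<char, int> _rank = new Dictionary<char, int>();

    // upper and lower must list the same letters in the same order.
    public AlphabetComparer(string upper, string lower)
    {
        if (upper.Length != lower.Length)
            throw new ArgumentException("Alphabets must have the same length.");

        for (var i = 0; i < upper.Length; i++)
        {
            // Same rank for both cases, which makes the comparison case-insensitive.
            _rank[upper[i]] = i;
            _rank[lower[i]] = i;
        }
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        var common = Math.Min(x.Length, y.Length);
        for (var i = 0; i < common; i++)
        {
            var diff = Rank(x[i]).CompareTo(Rank(y[i]));
            if (diff != 0) return diff;
        }
        // All shared positions are equal: the shorter string sorts first.
        return x.Length.CompareTo(y.Length);
    }

    // Assumption: characters outside the alphabet get a negative rank, so they
    // sort before every letter and among themselves by code point.
    private int Rank(char c)
    {
        return _rank.TryGetValue(c, out var rank) ? rank : c - 0x10000;
    }
}

public static class Demo
{
    // Sorts the items with the Lithuanian alphabet used in the answer above.
    public static string[] SortLithuanian(IEnumerable<string> items)
    {
        var comparer = new AlphabetComparer(
            "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ",
            "aąbcčdeęėfghiįyjklmnoprsštuųūvzž");
        return items.OrderBy(s => s, comparer).ToArray();
    }
}
```

Calling Demo.SortLithuanian(new[] { "Za", "Žb", "Ūa", "Ub" }) should give the same order as the sample above: Ub, Ūa, Za, Žb. Unlike the class in the answer, this variant ranks unknown characters by code point instead of treating them all as equal, which is a design choice of the sketch and not something the answer prescribes.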