Find unique rows in a jagged array

I have a jagged array of strings, and I need to find all of its unique rows. For example, given:

[ 
["A","B"] , 
["C","D","E"], 
["B", "A"], 
["E","A"] 
]

This should return row 1 and row 3, since row 0 and row 2 are duplicates of each other. How can I do this? Can I use HashSets?

Source

2012-12-26 annantDev

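As arrays, row 0 and row 2 are not duplicates. They just have the same set of elements. – SWeko

Yes, you can use a HashSet. Either create a wrapper type for each row, or use an IEqualityComparer with the [HashSet constructor](http://msdn.microsoft.com/zh-cn/library/bb359438.aspx). (Make sure to apply the business rules you need: for example, sort before computing the hash or checking sequence equality.) – 2012-12-26 22:32:27

(Even if you don't use a HashSet, creating an [IEqualityComparer](http://msdn.microsoft.com/zh-cn/ us/library/ms132151.aspx) is probably wise, since it can be used with other methods that need to test "equality" according to your business rules.) – 2012-12-26 22:39:20

First of all, as arrays, row 0 and row 2 are not duplicates. They just have the same set of elements. However, if you just want to remove rows like these, you could do something like this: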

using System.Collections.Generic;
using System.Linq;

string[][] GetNonDuplicates(string[][] jagged)
{
    // Not a HashSet but a Dictionary. A value of false means the row has no
    // duplicate; a value of true means at least one duplicate was found.
    Dictionary<string[], bool> dict =
        new Dictionary<string[], bool>(new RowEqualityComparer());

    foreach (string[] row in jagged)
    {
        // A duplicate is detected via the comparer's GetHashCode and Equals.
        if (dict.ContainsKey(row))
        {
            dict[row] = true; // mark the row as duplicated
        }
        else
        {
            dict.Add(row, false); // first time we see this row, add it
        }
    }

    // Just pop out all the keys that have a value of false.
    string[][] result = dict.Where(item => !item.Value)
                            .Select(item => item.Key)
                            .ToArray();
    return result;
}

... 
string[][] jagged = new []{new []{"A","B"} , 
          new []{"C","D","E"}, 
          new []{"B", "A"}, 
          new []{"E","A"}}; 

string[][] nonDuplicates = GetNonDuplicates(jagged);

where RowEqualityComparer is:

using System;
using System.Collections.Generic;
using System.Linq;

class RowEqualityComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] first, string[] second)
    {
        // Different lengths - different rows.
        if (first.Length != second.Length)
            return false;

        // Sort copies, because sorting in place would change the original rows.
        // An ordinal sort matches the ordinal != comparison used below.
        var flist = first.ToList();
        flist.Sort(StringComparer.Ordinal);
        var slist = second.ToList();
        slist.Sort(StringComparer.Ordinal);

        // Loop and compare one by one.
        for (int i = 0; i < flist.Count; i++)
        {
            if (flist[i] != slist[i])
                return false;
        }
        return true;
    }

    public int GetHashCode(string[] row)
    {
        // Just a generic hash code calculation: XOR the element hashes.
        // XOR is order-independent, so rows with the same elements in a
        // different order get the same hash.
        if (row.Length == 0)
            return 0;
        int hash = row[0].GetHashCode();
        for (int i = 1; i < row.Length; i++)
            hash ^= row[i].GetHashCode();
        return hash;
    }
}

Source

2012-12-26 23:05:34 SWeko

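I assumed that not only is the order irrelevant, but that `Length` is also meaningless, since duplicates within an array don't count (a `HashSet` would eliminate them). –

I was envisioning that [A, B, B] and [A, A, B] would be considered different. In that case, this comparison makes sense. Otherwise, a HashSet would be the right approach. – SWeko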
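
For the other reading discussed in these comments, where duplicates inside a row are ignored and [A, B, B] equals [A, A, B], a minimal sketch could lean on the built-in `HashSet<string>.CreateSetComparer()`. The class and method names here (`SetBasedRows`, `GetUniqueRows`) are hypothetical and not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetBasedRows
{
    // Hypothetical helper: returns the rows whose set of distinct
    // elements is shared by no other row.
    public static string[][] GetUniqueRows(string[][] jagged)
    {
        // Built-in comparer that treats two HashSet<string> instances as
        // equal when they contain the same elements.
        IEqualityComparer<HashSet<string>> setComparer =
            HashSet<string>.CreateSetComparer();

        return jagged
            .GroupBy(row => new HashSet<string>(row), setComparer)
            .Where(group => group.Count() == 1)   // keep rows with no twin
            .Select(group => group.First())
            .ToArray();
    }
}
```

With the question's sample data this should keep ["C","D","E"] and ["E","A"]; it would also treat ["A","B","B"] and ["A","A","B"] as duplicates of each other, which the RowEqualityComparer above deliberately does not.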

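As for an algorithmic solution, I would:

Sort the rows (you can use any sort metric you like, as long as it distinguishes any two distinct rows).
Pick the rows that have no identical adjacent row.

If you do this, you should be able to finish in O(m * n * lg(n)), where m is the length of your rows and n is the number of rows.

Given that an equal set of values implies equality, you can sort the cells of each row first to help you sort the list of rows. That results in O(n * m * lg(m) + m * n * lg(n)).

Source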
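
The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `SortBasedRows`, `GetUniqueRows` and `CompareKeys` are made up for the example, cells are compared ordinally, and repeated cells within a row are kept (so [A, B, B] and [A, A, B] stay distinct).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortBasedRows
{
    public static string[][] GetUniqueRows(string[][] jagged)
    {
        // Pair each row with a sorted copy of its cells: O(n * m * lg(m)).
        var keyed = jagged
            .Select(row => new
            {
                Row = row,
                Key = row.OrderBy(s => s, StringComparer.Ordinal).ToArray()
            })
            .ToList();

        // Sort the rows by their sorted-cell keys: O(m * n * lg(n)).
        keyed.Sort((a, b) => CompareKeys(a.Key, b.Key));

        // Keep only rows whose neighbours in sorted order both differ.
        var result = new List<string[]>();
        for (int i = 0; i < keyed.Count; i++)
        {
            bool sameAsPrevious = i > 0
                && CompareKeys(keyed[i - 1].Key, keyed[i].Key) == 0;
            bool sameAsNext = i < keyed.Count - 1
                && CompareKeys(keyed[i].Key, keyed[i + 1].Key) == 0;
            if (!sameAsPrevious && !sameAsNext)
                result.Add(keyed[i].Row);
        }
        return result.ToArray();
    }

    // Lexicographic, ordinal comparison of two sorted key arrays;
    // a shorter array that is a prefix of a longer one sorts first.
    private static int CompareKeys(string[] x, string[] y)
    {
        int common = Math.Min(x.Length, y.Length);
        for (int i = 0; i < common; i++)
        {
            int c = string.CompareOrdinal(x[i], y[i]);
            if (c != 0) return c;
        }
        return x.Length.CompareTo(y.Length);
    }
}
```

Note that the rows come back in sorted order rather than in their original order; if the input order matters, the original index would have to be carried along with each key.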

2012-12-26 22:33:13

Could you post an example? – annantDev

I would calculate a hash for each row, as follows:

[ 
["A","B"] , // hash of this row :10 as example 
["C","D","E"], // hash of this row : 20 
["B", "A"], // hash of this row would be 10 as well 
["E","A"] 
]

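Since they are all strings, you can compute hash values and create one hash per row.

You can use a HashSet as follows: create a hash set per row, then find the difference between each row and every other row; if the difference is empty, the rows are the same.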

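You can also use the intersection, although a non-empty intersection alone is not enough (["E","A"] and ["A","B"] share "A" yet are different rows): a row is not unique if its intersection with another row contains every element of both rows.

Source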
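
A rough sketch of the per-row HashSet idea, with hypothetical names (`PairwiseSetRows`, `GetUniqueRows`). It uses `HashSet<T>.SetEquals`, which is true only when the difference is empty in both directions; a one-sided difference being empty would only show that one row is a subset of the other.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairwiseSetRows
{
    public static string[][] GetUniqueRows(string[][] jagged)
    {
        // One hash set per row; order and repeated cells are ignored.
        List<HashSet<string>> sets =
            jagged.Select(row => new HashSet<string>(row)).ToList();

        var result = new List<string[]>();
        for (int i = 0; i < sets.Count; i++)
        {
            bool hasDuplicate = false;
            for (int j = 0; j < sets.Count && !hasDuplicate; j++)
            {
                // Compare with every other row; stop at the first match.
                if (i != j && sets[i].SetEquals(sets[j]))
                    hasDuplicate = true;
            }
            if (!hasDuplicate)
                result.Add(jagged[i]);
        }
        return result.ToArray();
    }
}
```

This compares every pair of rows, so it is quadratic in the number of rows; grouping by a per-row hash first, as suggested at the start of this answer, would avoid most of those comparisons.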

2012-12-26 23:14:27 DarthVader

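Assuming you want to ignore order and duplicates (since you've mentioned HashSet), and that the result should contain only the arrays that have no duplicate.

You could implement a custom IEqualityComparer<String[]> for Enumerable.GroupBy and select only the arrays that are unique (group count == 1):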

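class IgnoreOrderComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        if (x == null || y == null) return false;
        // Equal when both arrays contain the same set of distinct elements.
        // (Checking only x.Except(y) would also accept x being a subset of y.)
        return new HashSet<string>(x).SetEquals(y);
    }

    public int GetHashCode(string[] arr)
    {
        if (arr == null) return int.MinValue;
        int hash = 19;
        foreach (string s in arr.Distinct())
        {
            // Addition is order-independent; unchecked lets it wrap on overflow.
            hash = unchecked(hash + s.GetHashCode());
        }
        return hash;
    }
}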

The rest is simple:

String[][] uniques = arrays.GroupBy(arr => arr, new IgnoreOrderComparer()) 
          .Where(g => g.Count() == 1) 
          .Select(g => g.First()) 
          .ToArray();

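Edit: here's a possibly more efficient version using the same comparer (it avoids building the groups, but note that it compares rows pairwise, which is quadratic in the number of rows in the worst case):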

IEqualityComparer<string[]> comparer = new IgnoreOrderComparer(); 
String[][] uniques = arrays.Where(a1 => 
    !arrays.Any(a2 => a1 != a2 && comparer.Equals(a1, a2))) 
          .ToArray();

Source

2012-12-26 23:18:16
