如何在二維int數組中找到最常見的int？

好的，所以我剛開始考慮如何爲Paint.NET實現一個新的圖形插件，我需要知道如何在二維整數數組中找到最常見的整數。有沒有內置的C＃方式來做到這一點？或者，有沒有人有一個爽快的方式來做到這一點？如何在二維int數組中找到最常見的int？

數組將是這個樣子：

300 300 300 300 300 300 300 
    0 150 300 300 300 300 300 
    0 0 150 300 300 300 300 
    0 0 0 0 300 300 300 
    0 0 0 0 150 300 300 
    0 0 0 0 0 150 300 
    0 0 0 0 0 0 300

我需要知道，300是陣列中最常用的號碼。如果沒有「最常見」，那麼只需返回中心數（陣列減少總是奇數x奇數）0.

我將使用「強力」算法來實現這一點，除非您的專家能夠出現用更快的東西。

任何幫助將非常感激。

謝謝！

編輯：更多信息...

值幾乎總是非常不同的（比我的例子數組更加多樣化）。值將在0-360的範圍內。根據算法的速度，陣列的大小將爲5x5至17x17。對於大圖像中的每個像素，結果將會計算一次...所以速度越快越好。 ;）

來源

2009-01-29 BoltBait

聽起來像一個有趣的問題 - 我敢打賭有一個答案。讓我感興趣。 – Jeffrey 2009-01-29 18:59:39

如果是平局（例如300和125都有相同的命中次數），你會怎麼做？ – 2009-01-29 19:39:23

@邁克爾，在最初的問題中說：「如果沒有」最常見的「，那麼只需返回中心號碼」，這意味着迄今爲止發佈的解決方案都不符合要求。 – BoltBait 2009-01-29 19:42:07

查看Paint.NET中的LocalHistogramEffect代碼，特別是LocalHistorgramEffect.RenderRect。

我走過輸入圖像，用目標像素的'r'像素維持每個源像素的強度直方圖。當輸出像素被遍歷時，它將前沿添加到直方圖並減去後沿。它能很好地處理所有的邊緣情況，而且速度非常快。這是「中性」，「未聚焦」，「輪廓」和「去除噪音」效果的基礎。

調整這個以支持Hue而不是RGB強度將是相當微不足道的。

性能非常好，而且爲了您的目的，它在O（r^2 + w r + n w）中運行，其中r是半徑，w是圖像的寬度，n是直方圖中的層數。

-tjackson

來源

2009-04-03 17:30:03 ddrcoder

它至少是O（n * m）任何你切片的方式 - 你將不得不看每個單元格至少一次。節省的地方是在你尋找最常見之前積累每個價值的數量的地方;如果你的整數在一個相對較小的範圍內變化（它們都是uint16，我們假設），那麼你可以簡單地使用平面數組而不是地圖。

我猜你也可以保留，只要你已經小於（N * M的運行計數X，目前的頂部和第二最近的候選爲「最常見的」和早期輸出的ÿ ） - （xy）個細胞留下來看，因爲在那時，亞軍無法超越最佳人選。

這樣的整數運算速度相當快;即使對於百萬像素圖像，蠻力算法也只需要幾毫秒。

我注意到你已經編輯你的原始問題，說像素值從0..255 - 在這種情況下，肯定要用一個簡單的平面數組;這個尺寸足夠小，可以輕鬆適應l1 dcache，並且在平面陣列中的查找速度非常快。

編輯：一旦你建立了直方圖數組，處理「沒有最常見的數字」的情況是非常簡單的：你所要做的就是遍歷它，找到「最」和「第二大多數「常見數字;如果他們同樣頻繁，那麼根據定義，沒有人最常見。

const int numLevels = 360; // you said each cell contains a number [0..360) 
int levelFrequencyCounts[numLevels]; // assume this has been populated such that levelFrequencyCounts[i] = number of cells containing "i" 
int mostCommon = 0, runnerUp = 0; 
for (int i = 1 ; i < numLevels ; ++i) 
{ 
    if (levelFrequencyCounts[i] > levelFrequencyCounts[mostCommon]) 
    { 
    runnnerUp = mostCommon; 
    mostCommon = i; 
    } 
} 

if (levelFrequencyCounts[mostCommon] != levelFrequencyCounts[runnerUp]) 
{ 
    return mostCommon; 
} 
else 
{ 
    return CenterOfInputData; // (something like InputData[n/2][m/2]) 
}

來源

2009-01-29 19:04:38 Crashworks

我會怎麼做在C＃這樣的事情？

事情是這樣的：

Dictionary<int, int> d = new Dictionary<int, int>(); 
foreach (int value in matrix) 
{ 
if (!d.ContainsKey(value)) 
    d.Add(value, 1); 
else 
    d[value] = d[value] + 1; 
} 
KeyValuePair<int, int> biggest = null; 
foreach (KeyValuePair<int, int> found in d) 
{ 
    if ((biggest == null) || (biggest.Value < found.Value)) 
    biggest = found; 
}

來源

2009-01-29 19:06:59 ChrisW

一種選擇是LINQ - 有點效率低下，而且確定非巨大的陣列：

var max = (from cell in data.Cast<int>() 
       group cell by cell into grp 
       select new { Key = grp.Key, Count = grp.Count() } into agg 
       orderby agg.Count descending 
       select agg).First(); 
    Console.WriteLine(max.Key + ": " + max.Count);

或用交錯數組：

var max = (from row in data 
       from cell in row 
       group cell by cell into grp 
       select new {Key = grp.Key, Count = grp.Count()} into agg 
       orderby agg.Count descending 
       select agg).First(); 
    Console.WriteLine(max.Key + ": " + max.Count);

實際上，我可能會使用字典/計數。這個沒有LINQ的例子，只是「因爲」：

Dictionary<int, int> counts = new Dictionary<int, int>(); 
    foreach (int value in data) 
    { 
     int count; 
     counts.TryGetValue(value, out count); 
     counts[value] = count + 1; 
    } 
    int maxCount = -1, maxValue = 0; 
    foreach (KeyValuePair<int, int> pair in counts) 
    { 
     if (pair.Value > maxCount) 
     { 
      maxCount = pair.Value; 
      maxValue = pair.Key; 
     } 
    } 
    Console.WriteLine(maxCount + ": " + maxValue);

來源

2009-01-29 19:07:37

如果速度是你最關心的問題，不要使用字典。堅持一個字節數組。試試這個：

// stores hit counts (0-360) 
short[] hitCounts = new short[361]; 

// iterate through 2d array and increment hit counts 
for (int i = 0; i < toEvaluate.Length; i++) 
{ 
    for (int j = 0; j < toEvaluate[i].Length; j++) 
     hitCounts[toEvaluate[i][j]]++; 
} 

int greatestHitCount = 0; // the hit count of the current greatest value 
int greatest = -1; // the current greatest valeu 

// iterate through values (0-360) and evalute hit counts 
for (int i = 0; i < hitCounts.Length; i++) 
{ 
    // the hit count of hitCounts[i] is higher than the current greatest hit count value 
    if (hitCounts[i] > greatestHitCount) 
    { 
     greatestHitCount = vals[i]; // store the new hit count 
     greatest = i; // store the greatest value 
    } 
    // there is already a value with the same hit count (which is the greatest) 
    else if (hitCounts[i] == greatestHitCount) 
     greatest = -1; // there are more than one value, we can't use this if it ends up being the greatest 
} 

if (greatest >= 0) // no greatest value found 
    return greatest; 

// figure out the middle x and y value 
int x = (toEvaluate.Length - 1)/2 + 1; 
int y = (toEvaluate[x].Length - 1)/2 + 1; 

// return the value at the center of the 2d array as the value 
return toEvaluate[x][y];

當速度成爲關注可讀性的問題時，最終會遇到一些難看的代碼。以上這些肯定可以從重構中受益（因此過分註釋），但它應該運行得很快。如果速度不夠快，可以通過將其移至非託管代碼來獲得更多優化。

來源

2009-01-29 19:54:30

邁克爾打我的帖子，但我這樣做，是這樣的：

 int MaxValueIn2dArray(int[,] matrix) 
    { 
     var d = new int[360]; 
     int MaxValue = 0; 
     for (int x = 0; x <= matrix.GetUpperBound(0); x++) 
     { 
      for (int y = 0; y <= matrix.GetUpperBound(1); y++) 
      { 
       d[matrix[x, y]]++; 
      } 
     } 
     foreach (int value in d) 
     { 
      if (value > MaxValue) MaxValue = value; 
     } 
     return MaxValue; 
    }

這將需要爲您的特定需求進行優化。

來源

2009-01-29 20:03:37 pro3carp3

您的圖片：

300+ 300+ 300+ 300 300 300 300 
    0+ 150+ 300+ 300 300 300 300 
    0+ 0+ 150+ 300 300 300 300 
    0 0 0 0 300 300 300 
    0 0 0 0 150 300 300 
    0 0 0 0 0 150 300 
    0 0 0 0 0 0 300

標記（+）號是你的窗口。 w，h是你的窗口尺寸。申請bucket sorting（正如其他人所建議的，因爲你的數值範圍相當有限）。請不要像Crashworks暗示的那樣減半評估。不要拋棄你的結果。這是第一步。

300- 300- 300- 300 300 300 300 
    0. 150. 300. 300 300 300 300 
    0. 0. 150. 300 300 300 300 
    0+ 0+ 0+ 0 300 300 300 
    0 0 0 0 150 300 300 
    0 0 0 0 0 150 300 
    0 0 0 0 0 0 300

改變你的窗戶。而不是添加，減去您傳遞的最後一行/列中的存儲桶並添加新的存儲桶。通過這種方式，你可以檢查每個像素2（w + h）次，即當它穿過窗口邊界時，而不是w * h次，即當該像素在窗口中時，在天真的實現中。

換句話說，你需要將你的窗口是這樣的：

| ^->|^
| | | | 
| | | | 
V->| V->|

我假設你正在試圖實現非線性卷積過濾器。

更正歡迎。

來源

2009-01-29 22:43:29 artificialidiot

所有我提供的是該檢查每一個細胞的任何算法（這是一個很值得你期待怎樣做）做兩件多餘的東西：

1）確保程序退出時計對於當前最常用的值>（M x N/2）。如果網格上的內容覆蓋率> 50％，那麼這是最常見的值，不需要繼續。如果你的例程只需要正確的時間大部分時間，那麼你可以降低百分比，並把它當作啓發式。你甚至可以運行一些分析，如果覆蓋率> 37.6％，那麼它會成爲最常見的值的99.9％，然後使用該百分比。2.如果有任何方法可以確定最常見的值可能在哪一邊，角落或一般位置（外邊緣，中間等），那麼可以按照這個順序一起掃描使用上面的優化1可以減少很多掃描。例如，在你的例子中，右上角對共同價值很重要。如果這可以通過某種啓發式方法確定，則可以以某種方式從右上角到左下角進行掃描。如果需要的掃描模式很複雜，請預先生成它。

來源

2009-02-20 17:22:15

如何在二維int數組中找到最常見的int？

回答

相關問題