我一直在思考這個問題。正如你所提到的Levenshtein距離,我假設你想通過將他們從D2中的位置移動到D2中的某個其他位置來獲得這些元素,您將以最少的移動次數從D2中獲得D1(忽略不存在於兩個序列)。
我寫了一個貪婪算法,它可能足以滿足您的需求,但它可能不一定會在所有情況下都給出最佳結果。我真的不確定,可能會在稍後(最早的週末)回來查看正確性。然而,如果你真的需要在300萬個元素的序列上做這件事,我相信沒有任何一種算法在這方面做得很好,因爲我看不到一個O(n)算法,好工作,即使在一些微不足道的輸入上也不會失敗。
該算法嘗試將每個元素移動到其預期位置,並計算移動後誤差的總和(每個元素距原始位置的距離)。導致錯誤總和最小的元素被宣佈爲斷路器並移動。重複此操作直到序列返回到D1。我認爲它具有O(n^3)的複雜性,儘管元素有時需要被移動多次,所以它可能是O(n^4)最壞的情況,我不確定,但是在100萬個隨機數有50個元素的例子中,外環運行的最大數量是51(n^4意味着它可以是2500,並且在我所有的百萬測試中我都很幸運)。只有鑰匙,沒有價值。這是因爲這個值在這一步是無關緊要的,所以沒有必要存儲它們。
編輯:我爲此寫了一個反例生成器,事實上它並不總是最優的。破壞者越多,獲得最佳解決方案的可能性就越小。例如,在1000個元素與50次隨機移動的人它通常會發現一組55-60斷路器,當最佳的解決方案是至多50
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Breakers
{
class Program
{
static void Main(string[] args)
{
//test case 1
//List<string> L1 = new List<string> { "R5", "R1", "R2", "R3", "R6", "R4", "R10", "R8", "R9", "R7" };
//List<string> L2 = new List<string> { "R9", "R5", "R1", "R2", "R8", "R3", "R6", "R4", "R10", "R11", "R7" };
//GetBreakers<string>(L1, L2);
//test case 2
//List<string> L1 = new List<string> { "R5", "R1", "R2", "R3", "R6", "R4", "R10", "R8", "R9", "R7" };
//List<string> L2 = new List<string> { "R5", "R9", "R1", "R6", "R2", "R3", "R4", "R10", "R8", "R7" };
//GetBreakers<string>(L1, L2);
//test case 3
List<int> L1 = new List<int>();
List<int> L2 = new List<int>();
Random r = new Random();
int n = 100;
for (int i = 0; i < n; i++)
{
L1.Add(i);
L2.Add(i);
}
for (int i = 0; i < 5; i++) // number of random moves, this is the upper bound of the optimal solution
{
int a = r.Next() % n;
int b = r.Next() % n;
if (a == b)
{
i--;
continue;
}
int x = L2[a];
Console.WriteLine(x);
L2.RemoveAt(a);
L2.Insert(b, x);
}
for (int i = 0; i < L2.Count; i++) Console.Write(L2[i]);
Console.WriteLine();
GetBreakers<int>(L1, L2);
}
static void GetBreakers<T>(List<T> L1, List<T> L2)
{
Dictionary<T, int> Appearances = new Dictionary<T, int>();
for (int i = 0; i < L1.Count; i++) Appearances[L1[i]] = 1;
for (int i = 0; i < L2.Count; i++) if (Appearances.ContainsKey(L2[i])) Appearances[L2[i]] = 2;
for (int i = L1.Count - 1; i >= 0; i--) if (!(Appearances.ContainsKey(L1[i]) && Appearances[L1[i]] == 2)) L1.RemoveAt(i);
for (int i = L2.Count - 1; i >= 0; i--) if (!(Appearances.ContainsKey(L2[i]) && Appearances[L2[i]] == 2)) L2.RemoveAt(i);
Dictionary<T, int> IndInL1 = new Dictionary<T, int>();
for (int i = 0; i < L1.Count; i++) IndInL1[L1[i]] = i;
Dictionary<T, int> Breakers = new Dictionary<T, int>();
int steps = 0;
int me = 0;
while (true)
{
steps++;
int minError = int.MaxValue;
int minErrorIndex = -1;
for (int from = 0; from < L2.Count; from++)
{
T x = L2[from];
int to = IndInL1[x];
if (from == to) continue;
L2.RemoveAt(from);
L2.Insert(to, x);
int error = 0;
for (int i = 0; i < L2.Count; i++)
error += Math.Abs((i - IndInL1[L2[i]]));
L2.RemoveAt(to);
L2.Insert(from, x);
if (error < minError)
{
minError = error;
minErrorIndex = from;
}
}
if (minErrorIndex == -1) break;
T breaker = L2[minErrorIndex];
int breakerOriginalPosition = IndInL1[breaker];
L2.RemoveAt(minErrorIndex);
L2.Insert(breakerOriginalPosition, breaker);
Breakers[breaker] = 1;
me = minError;
}
Console.WriteLine("Breakers: " + Breakers.Count + " Steps: " + steps);
foreach (KeyValuePair<T, int> p in Breakers)
Console.WriteLine(p.Key);
Console.ReadLine();
}
}
}
是它允許修剪D2爲D1的大小(因爲D2在這種情況下更大)在排序之前? – 2012-04-06 07:04:38
不允許。你必須像現在一樣使用D2。 – 2012-04-06 07:05:50
在排序之前,D1中第0個元素的鍵名「R1」是否總是與D2中第0個元素的鍵名匹配? – 2012-04-06 07:08:54