2013-01-04 26 views
6

我要尋找一個解決以下問題:陣列變異算法

給定一個數組「a」和陣列「B」,發現了一組操作,當其應用到「一」,轉變「一個「變成」b「。

因此,舉例來說,假設我有:

a = [1,2,3] 
b = [3,2,4] 
c = transmute(a, b) 

現在我希望C至包含類似:

[["remove", 0], ["add", 2, 4], ["swap", 0, 1]] 

添加上「一」這些操作,在給定的順序,應該產生「b」:

[1,2,3] => [2,3] => [2,3,4] => [3,2,4] 

在Ruby中這是一個非常天真的實現:https://gist.github.com/4455256。這個假設在數組中沒有重複(這不是一個好的假設)。我認爲這也是O(n²),如果有更多的表現會更好。

是否有任何已知的算法做到這一點?我可以做更多的閱讀嗎?你有什麼可以改進的建議嗎?

+0

這是相當類似的編輯距離。我認爲你可以比Levenshtein算法修改得更好。 –

+0

我不知道這有一個多項式時間算法。看到[這個相關的問題](http://stackoverflow.com/questions/6037481/string-to-string-correction-problem-np-completeness-proof)。 –

+0

到目前爲止,問題似乎是一個向量和一個常量向量之間的相加,這會導致另一個向量。我想我不完全瞭解這個問題,所以請帶上一些解釋。 – user1929959

回答

1

這裏有一個解決方案:

get the token-index pairs of all tokens in the source string 
get the token-index pairs of all tokens in the target string 

from both sets remove the values that are in the other set. 

foreach token-index in the source set 
    if the target set has the token at the same location 
     remove it from both sets, this is a match created by a previous swap 
    get the target token at the source index 
    if the source set has the target token (at any index) 
     create a swap command to swap the source token at source index with 
     the target token at its index in the source 
     remove the token-index from the source set 
     remove the target token-index from the target set 
     add a token-index for the target token at the new index to the source set 
     loop without moving to the next token-index 

create remove commands for any remaining token-indexes in the source set 
create add commands for any remaining token-indexes in the target set 

這裏有一個快速的C#實現:

private static IEnumerable<ICommand> GetChangeCommands(string source, string target) 
{ 
    var unmatchedSourceTokens = GetUnmatchedTokenIndexes(source, target); 
    var unmatchedTargetTokens = GetUnmatchedTokenIndexes(target, source); 

    var commands = new List<ICommand>(); 

    foreach (var tokenIndexList in unmatchedSourceTokens) 
    { 
     var sourceToken = tokenIndexList.Key; 
     var sourceStringSourceTokenIndexes = unmatchedSourceTokens[sourceToken]; 

     foreach (var sourceLoopIndex in tokenIndexList.Value.ToList()) 
     { 
      var sourceIndex = sourceLoopIndex; 
      bool swapped; 
      do 
      { 
       swapped = false; 
       if (sourceIndex >= target.Length) 
       { 
        continue; 
       } 
       var targetToken = target[sourceIndex]; 
       if (targetToken == sourceToken) 
       { 
        sourceStringSourceTokenIndexes.Remove(sourceIndex); 
        unmatchedTargetTokens[targetToken].Remove(sourceIndex); 
        continue; 
       } 
       List<int> sourceStringTargetTokenIndexes; 
       if (!unmatchedSourceTokens.TryGetValue(targetToken, out sourceStringTargetTokenIndexes) || 
        !sourceStringTargetTokenIndexes.Any()) 
       { 
        continue; 
       } 
       var targetIndex = sourceStringTargetTokenIndexes.First(); 
       commands.Add(new SwapCommand(sourceIndex, targetIndex)); 
       sourceStringTargetTokenIndexes.RemoveAt(0); 
       sourceStringSourceTokenIndexes.Remove(sourceIndex); 
       sourceStringSourceTokenIndexes.Add(targetIndex); 
       unmatchedTargetTokens[targetToken].Remove(sourceIndex); 
       swapped = true; 
       sourceIndex = targetIndex; 
      } while (swapped); 
     } 
    } 

    var removalCommands = unmatchedSourceTokens 
     .SelectMany(x => x.Value) 
     .Select(x => new RemoveCommand(x)) 
     .Cast<ICommand>() 
     .OrderByDescending(x => x.Index) 
     .ToList(); 

    commands.AddRange(removalCommands); 

    var insertCommands = unmatchedTargetTokens 
     .SelectMany(x => x.Value.Select(y => new InsertCommand(y, x.Key))) 
     .Cast<ICommand>() 
     .OrderBy(x => x.Index) 
     .ToList(); 

    commands.AddRange(insertCommands); 

    return commands; 
} 

private static IDictionary<char, List<int>> GetUnmatchedTokenIndexes(string source, string target) 
{ 
    var targetTokenIndexes = target.Select((x, i) => new 
                  { 
                   Token = x, 
                   Index = i 
                  }) 
            .ToLookup(x => x.Token, x => x.Index) 
            .ToDictionary(x => x.Key, x => x.ToList()); 

    var distinctSourceTokenIndexes = new Dictionary<char, List<int>>(); 
    foreach (var tokenInfo in source.Select((x, i) => new 
                  { 
                   Token = x, 
                   Index = i 
                  })) 
    { 
     List<int> indexes; 
     if (!targetTokenIndexes.TryGetValue(tokenInfo.Token, out indexes) || 
      !indexes.Contains(tokenInfo.Index)) 
     { 
      if (!distinctSourceTokenIndexes.TryGetValue(tokenInfo.Token, out indexes)) 
      { 
       indexes = new List<int>(); 
       distinctSourceTokenIndexes.Add(tokenInfo.Token, indexes); 
      } 
      indexes.Add(tokenInfo.Index); 
     } 
    } 
    return distinctSourceTokenIndexes; 
} 

internal class InsertCommand : ICommand 
{ 
    private readonly char _token; 

    public InsertCommand(int index, char token) 
    { 
     Index = index; 
     _token = token; 
    } 

    public int Index { get; private set; } 

    public string Change(string input) 
    { 
     var chars = input.ToList(); 
     chars.Insert(Index, _token); 
     return new string(chars.ToArray()); 
    } 

    public override string ToString() 
    { 
     return string.Format("[\"add\", {0}, '{1}']", Index, _token); 
    } 
} 

internal class RemoveCommand : ICommand 
{ 
    public RemoveCommand(int index) 
    { 
     Index = index; 
    } 

    public int Index { get; private set; } 

    public string Change(string input) 
    { 
     var chars = input.ToList(); 
     chars.RemoveAt(Index); 
     return new string(chars.ToArray()); 
    } 

    public override string ToString() 
    { 
     return string.Format("[\"remove\", {0}]", Index); 
    } 
} 

internal class SwapCommand : ICommand 
{ 
    private readonly int _targetIndex; 

    public SwapCommand(int sourceIndex, int targetIndex) 
    { 
     Index = sourceIndex; 
     _targetIndex = targetIndex; 
    } 

    public int Index { get; private set; } 

    public string Change(string input) 
    { 
     var chars = input.ToArray(); 
     var temp = chars[Index]; 
     chars[Index] = chars[_targetIndex]; 
     chars[_targetIndex] = temp; 
     return new string(chars); 
    } 

    public override string ToString() 
    { 
     return string.Format("[\"swap\", {0}, {1}]", Index, _targetIndex); 
    } 
} 

internal interface ICommand 
{ 
    int Index { get; } 
    string Change(string input); 
} 

樣品用量:

const string source = "123"; 
const string target = "324"; 
var commands = GetChangeCommands(source, target); 
Execute(source, target, commands); 

private static void Execute(string current, string target, IEnumerable<ICommand> commands) 
{ 
    Console.WriteLine("converting".PadRight(19) + current + " to " + target); 
    foreach (var command in commands) 
    { 
     Console.Write(command.ToString().PadRight(15)); 
     Console.Write(" => "); 
     current = command.Change(current); 
     Console.WriteLine(current); 
    } 
} 

輸出樣本:

converting   123 to 324 
["swap", 0, 2] => 321 
["remove", 2] => 32 
["add", 2, '4'] => 324 

converting   hello to world 
["swap", 1, 4] => holle 
["remove", 4] => holl 
["remove", 2] => hol 
["remove", 0] => ol 
["add", 0, 'w'] => wol 
["add", 2, 'r'] => worl 
["add", 4, 'd'] => world 

converting   something to smith 
["swap", 1, 2] => smoething 
["swap", 2, 6] => smiethong 
["swap", 3, 4] => smitehong 
["swap", 4, 5] => smitheong 
["remove", 8] => smitheon 
["remove", 7] => smitheo 
["remove", 6] => smithe 
["remove", 5] => smith 

converting   something to sightseeing 
["swap", 1, 6] => simethong 
["swap", 6, 3] => simotheng 
["swap", 3, 5] => simhtoeng 
["swap", 2, 8] => sightoenm 
["remove", 8] => sightoen 
["remove", 7] => sightoe 
["remove", 5] => sighte 
["add", 5, 's'] => sightse 
["add", 7, 'e'] => sightsee 
["add", 8, 'i'] => sightseei 
["add", 9, 'n'] => sightseein 
["add", 10, 'g'] => sightseeing 

這裏有一些低效率以上的樣品中明顯的: 它交換被要被除去 它消除令牌,然後重新添加令牌

2

你可以分階段進行來解決這個問題。根據我的想法,應該有3個階段。 O(N)解決方案將成爲可能。

複印數組A成陣列C和數組B插入到陣列D.

  1. 現在,比較2個陣列Ç& D. 取消C不存在於D.這些都是元素需要如果我們使用該步驟一個HashMap從陣列A. 刪除的元素,那麼可以爲O完成(N)
  2. 再次比較C和d,刪除d不在C.這些元素內容將主要是我們將添加到陣列 A.元素
  3. 所以,現在我們有2個陣列 - Ç& d基本上具有相同的元件。我們只有需要看看我們如何交換這些元素,以使它們看起來像
  4. 一旦他們看上去都差不多,我們可以添加從A缺少的元素融入d。將步驟2中刪除的元素添加到數組D中。可以通過與原始數組A相比以正確的順序添加它們。

因爲步驟1,2,4的操作相當簡單。我會解釋如何來與交換的順序。我們舉一個例子。如果目前我們的數組C看起來像1,3,2和D看起來像3,2,1。 我們在與C中的對應的值d的每個索引處的比較值如果它們是不同的,那麼我們紀念從元件在C元件D. 所以有向邊,在索引0,C具有在1和d具有3。它們是不同的,因此我們畫出1到3的有向邊。1-> 3。 同樣,我們從3從2繪製的邊緣到2索引1 和邊緣爲1索引2

這導致我們一個DAG。你可以嘗試像DFS這樣的各種東西,看看,我只是在這裏陳述結果。沒有。掉期將爲(圖中節點數-1)圖形的DFS遍歷將告訴交換將發生的順序

注:如果在數組重複元素,則多了幾分簿記將必需的,但相同的解決方案會奏效。

如果你不能夠通過DAG交換算法的階段。請看@ handcraftsman引用的問題,這是string transposition algorithm。它提出了一個類似的方法來解決同樣的問題。