Efficiently computing the unique permutations of a set

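I'm currently computing the unique permutations of an array of data. The code below works, but it isn't as efficient as I'd like. Once I get past 6 or 8 items it becomes very slow and I start running into memory problems.

Here is the code, followed by an explanation.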

<?php 
function permuteUnique($items, $count = false, $perms = [], &$return = []) { 
    if ($count && count($return) == $count) return $return; 

    if (empty($items)) { 
     $duplicate = false; 

     foreach ($return as $a) { 
      if ($a === $perms) { 
       $duplicate = true; 
       break; 
      } 
     } 
     if (!$duplicate) $return[] = $perms; 
    } else { 
     for ($i = count($items) - 1; $i >= 0; --$i) { 
      $newitems = $items; 
      $newperms = $perms; 
      list($tmp) = array_splice($newitems, $i, 1); 
      array_unshift($newperms, $tmp); 
      permuteUnique($newitems, $count, $newperms, $return); 
     } 
     return $return; 
    } 
} 

function factorial($n) { 
    $f = 1; 
    for ($i = 2; $i <= $n; $i++) $f *= $i; 
    return $f; 
}

Given the input [1, 1, 2], I get the following output, as expected:

array (size=3) 
    0 => 
    array (size=3) 
     0 => int 1 
     1 => int 1 
     2 => int 2 
    1 => 
    array (size=3) 
     0 => int 1 
     1 => int 2 
     2 => int 1 
    2 => 
    array (size=3) 
     0 => int 2 
     1 => int 1 
     2 => int 1

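The $count parameter is there so that I can pass the function the number of unique permutations I expect; once it has found that many, it can stop calculating and return the data. That number is the factorial of the total number of items divided by the product of the factorials of the counts of each repeated item. I'm not sure I said that right, so let me give an example.

Given the set [1, 2, 2, 3, 4, 4, 4, 4], the number of unique permutations is 8!/(2!4!) = 840, because there are 8 items in total, one of which appears twice and another of which appears 4 times.

Now, if I translate that into PHP code...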

<?php 
$set = [1, 2, 2, 3, 4, 4, 4, 4]; 
$divisor = 1; 

foreach (array_count_values($set) as $v) { 
    $divisor *= factorial($v); 
} 

$count = factorial(count($set))/$divisor; 
$permutations = permuteUnique($set, $count);

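This is very slow. If I put a counter in the permuteUnique function, it runs over 100,000 times before it finds the 840 unique permutations.

I'd like to find a way to reduce this and take the shortest path to the unique permutations. I appreciate any help or advice you can give.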

Source

2013-09-21 Rob

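Take a look at C++'s `std::next_permutation` (http://en.cppreference.com/w/cpp/algorithm/next_permutation) and find or implement something similar for PHP. – MvG

So I spent some more time thinking about this, and here is what I came up with.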

<?php 
function permuteUnique($items, $perms = [], &$return = []) { 
    if (empty($items)) { 
     $return[] = $perms; 
    } else { 
     sort($items); 
     $prev = false; 
     for ($i = count($items) - 1; $i >= 0; --$i) { 
      $newitems = $items; 
      $tmp = array_splice($newitems, $i, 1)[0]; 
      if ($tmp != $prev) { 
       $prev = $tmp; 
       $newperms = $perms; 
       array_unshift($newperms, $tmp); 
       permuteUnique($newitems, $newperms, $return); 
      } 
     } 
     return $return; 
    } 
} 

$permutations = permuteUnique([1, 2, 2, 3, 4, 4, 4, 4]);

Previous stats

Uniques: 840 
Calls to permuteUnique: 107,591 
Duplicates found: 38737 
Execution time (seconds): 4.898668050766

New stats

Uniques: 840 
Calls to permuteUnique: 2647 
Duplicates found: 0 
Execution time (seconds): 0.0095300674438477

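So all I really did was sort the data set, keep track of the previous item, and skip computing permutations whenever the current item matches the previous one. I also no longer need to pre-calculate the number of unique permutations or iterate over the results to check for duplicates. It made a world of difference.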

Source

2013-09-25 03:49:33 Rob

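On the line *if ($tmp != $prev)* you should use *!==* instead of *!=*. With loose comparison it breaks when the set contains 0, e.g. **$permutations = permuteUnique([0, 1, 1]);** – f1ames

What did you use to profile this and gather these statistics? – rbz

I just tried the "Generation in lexicographic order" approach from the wiki, and it produces the same result for your "1,2,2,3,4,4,4,4" sample, so I guess it is correct. Here is the code: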
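
As a sketch of the strict-comparison fix suggested in the comment above: the variant below uses `!==` and, as an addition of mine rather than something from the thread, a separate `$hasPrev` flag so that even a literal `false` item is not confused with the "no previous item yet" state. The name `permuteUniqueStrict` is hypothetical. Note that strict comparison treats `1` and `'1'` as different items.

```php
<?php
// Hypothetical variant of the recursive permuteUnique from the answer above.
// Uses strict comparison plus a flag instead of the `$prev = false` sentinel.
function permuteUniqueStrict(array $items, array $perms = [], array &$return = []): array {
    if (empty($items)) {
        $return[] = $perms;          // one complete permutation
        return $return;
    }
    sort($items);                    // equal items become adjacent
    $hasPrev = false;                // true once an item was used at this level
    $prev = null;
    for ($i = count($items) - 1; $i >= 0; --$i) {
        $newitems = $items;
        $tmp = array_splice($newitems, $i, 1)[0];
        // Skip an item identical to the one just used at this position.
        if (!$hasPrev || $tmp !== $prev) {
            $hasPrev = true;
            $prev = $tmp;
            $newperms = $perms;
            array_unshift($newperms, $tmp);
            permuteUniqueStrict($newitems, $newperms, $return);
        }
    }
    return $return;
}

$permutations = permuteUniqueStrict([0, 1, 1]);
```

With the loose `!=` check, `0 != false` is false, so the item 0 is skipped at the first position it is tried in, which is the failure the comment describes.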

function &permuteUnique($items) { 
    sort($items); 
    $size = count($items); 
    $return = []; 
    while (true) { 
     $return[] = $items; 
     $invAt = $size - 2; 
     for (;;$invAt--) { 
      if ($invAt < 0) { 
       break 2; 
      } 
      if ($items[$invAt] < $items[$invAt + 1]) { 
       break; 
      } 
     } 
     $swap1Num = $items[$invAt]; 
     $inv2At = $size - 1; 
     while ($swap1Num >= $items[$inv2At]) { 
      $inv2At--; 
     } 
     $items[$invAt] = $items[$inv2At]; 
     $items[$inv2At] = $swap1Num; 
     $reverse1 = $invAt + 1; 
     $reverse2 = $size - 1; 
     while ($reverse1 < $reverse2) { 
      $temp = $items[$reverse1]; 
      $items[$reverse1] = $items[$reverse2]; 
      $items[$reverse2] = $temp; 
      $reverse1++; 
      $reverse2--; 
     } 
    } 
    return $return; 
}

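Profiled times for your sample input: the method above: 2600, 3000, 3000, 2400, 2400, 3000; your "Calls to permuteUnique: 2647" method: 453425.6, 454425.4, 454625.8. On your sample input it is about 500 times faster :) If you are processing the results one at a time (I guess you will), this non-recursive method lets you process one generated permutation and then generate the next, instead of generating and storing all of them before processing.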
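
The "process one, then generate the next" idea could be sketched with a PHP generator. This is my own wrapper around the same lexicographic steps, not code from the answer; the name `uniquePermutations` is hypothetical.

```php
<?php
// Hypothetical generator: yields each unique permutation in lexicographic
// order, holding only the current one in memory.
function uniquePermutations(array $items): Generator {
    sort($items);                    // start from the smallest arrangement
    $size = count($items);
    while (true) {
        yield $items;
        // Find the rightmost position whose value is smaller than its successor.
        $i = $size - 2;
        while ($i >= 0 && $items[$i] >= $items[$i + 1]) {
            $i--;
        }
        if ($i < 0) {
            return;                  // non-increasing sequence: that was the last one
        }
        // Find the rightmost value strictly greater than $items[$i] and swap.
        $j = $size - 1;
        while ($items[$j] <= $items[$i]) {
            $j--;
        }
        [$items[$i], $items[$j]] = [$items[$j], $items[$i]];
        // Reverse the tail so it is in ascending order again.
        for ($a = $i + 1, $b = $size - 1; $a < $b; $a++, $b--) {
            [$items[$a], $items[$b]] = [$items[$b], $items[$a]];
        }
    }
}

$count = 0;
foreach (uniquePermutations([1, 2, 2, 3, 4, 4, 4, 4]) as $perm) {
    $count++;                        // process $perm here
}
```

Because the comparisons are `>=` and `<=`, equal items never swap past each other, so no permutation is produced twice and no duplicate check is needed.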

Source

2013-09-25 14:12:27 daifei4321

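I'm not sure where the 500x speed-up comes from. It's about 3-5 times faster in my tests, even with larger sets. Still a very good answer. Care to provide a link to the wiki you referenced? – Rob

@Rob: Sure. It is http://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order. I have since found a way to argue that it is correct (before, I was only guessing). The 500x figure comes from profiling. – daifei4321

Try this modified iterative version. It has no recursion overhead.

Found at: http://docstore.mik.ua/orelly/webprog/pcook/ch04_26.htm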

ORIGINAL：

function pc_next_permutation($p, $size) { 
    // slide down the array looking for where we're smaller than the next guy 
    for ($i = $size - 1; $p[$i] >= $p[$i+1]; --$i) { } 

    // if this doesn't occur, we've finished our permutations 
    // the array is reversed: (1, 2, 3, 4) => (4, 3, 2, 1) 
    if ($i == -1) { return false; } 

    // slide down the array looking for a bigger number than what we found before 
    for ($j = $size; $p[$j] <= $p[$i]; --$j) { } 

    // swap them 
    $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp; 

    // now reverse the elements in between by swapping the ends 
    for (++$i, $j = $size; $i < $j; ++$i, --$j) { 
     $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp; 
    } 

    return $p; 
} 

$set = explode(' ', 'she sells seashells'); // like array('she', 'sells', 'seashells'); split() was removed in PHP 7
$size = count($set) - 1; 
$perm = range(0, $size); 
$j = 0; 

do { 
    foreach ($perm as $i) { $perms[$j][] = $set[$i]; } 
} while ($perm = pc_next_permutation($perm, $size) and ++$j); 

foreach ($perms as $p) { 
    print join(' ', $p) . "\n"; 
}

Here is one idea for modifying it to produce distinct permutations, but I think there are faster solutions....

function pc_next_permutation($p, $size) { 
    for ($i = $size - 1; $p[$i] >= $p[$i+1]; --$i) { } 
    if ($i == -1) { return false; } 
    for ($j = $size; $p[$j] <= $p[$i]; --$j) { } 
    $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp; 
    for (++$i, $j = $size; $i < $j; ++$i, --$j) { 
     $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp; 
    } 

    return $p; 
} 

$uniqueMap=array(); 
$set = explode(' ', '1 2 2 3 4 4 4 4'); // split() was removed in PHP 7
$size = count($set) - 1; 
$perm = range(0, $size); 
$j=0; 

do { 
    $uniqueSetString=""; 
    foreach ($perm as $i) 
     $uniqueSetString .= "|".$set[$i]; 

    if (!isset($uniqueMap[$uniqueSetString])) 
    { 
     foreach ($perm as $i) 
      $perms[$j][] = $set[$i]; 

     $uniqueMap[$uniqueSetString]=1; 
    } 
} while ($perm = pc_next_permutation($perm, $size) and ++$j); 

foreach ($perms as $p) { 
    print join(' ', $p) . "\n"; 
}

Source

2013-09-27 19:24:45 Soulfreezer

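Undefined offset: -1 on line 3? :o – hanshenrik

What you need is the factoradic (factorial number system), which lets you generate the nth permutation without generating all the ones before or after it. I have written code for this in PHP, but I don't have it with me at the moment, sorry.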
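
The notice mentioned in the comment comes from the first loop reading `$p[-1]` once the array is fully reversed. A minimal sketch of a guarded version follows; the name `pc_next_permutation_guarded` is hypothetical and the rest mirrors the function above.

```php
<?php
// Same steps as pc_next_permutation above, with an explicit bounds check
// so the first loop never reads $p[-1].
function pc_next_permutation_guarded(array $p, int $size) {
    // Walk left until an element is smaller than its right neighbour.
    for ($i = $size - 1; $i >= 0 && $p[$i] >= $p[$i + 1]; --$i) { }

    // No such element: the array is fully reversed, so we are done.
    if ($i < 0) { return false; }

    // Walk left from the end to the first element bigger than $p[$i].
    for ($j = $size; $p[$j] <= $p[$i]; --$j) { }

    // Swap them.
    $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;

    // Reverse the elements after position $i.
    for (++$i, $j = $size; $i < $j; ++$i, --$j) {
        $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;
    }

    return $p;
}
```

As in the original, `$size` is the last index (`count($set) - 1`), not the element count.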

Edit: Here you go, it should get you started.
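
A minimal sketch of the factoradic idea, not the code this answer refers to: it decodes an index into factorial-base digits and uses each digit to pick from the remaining items. The name `nthPermutation` is hypothetical, and the sketch assumes all items are distinct; with repeated items, different indexes would return identical permutations.

```php
<?php
// Hypothetical helper: returns the $n-th permutation (0-based) of $items,
// in the order the items are given, using the factorial number system.
// Assumes distinct items and 0 <= $n < count($items)!.
function nthPermutation(array $items, int $n): array {
    $items = array_values($items);
    $count = count($items);

    // Precompute 0! .. count!
    $fact = [1];
    for ($i = 1; $i <= $count; $i++) {
        $fact[$i] = $fact[$i - 1] * $i;
    }
    if ($n < 0 || $n >= $fact[$count]) {
        throw new OutOfRangeException('permutation index out of range');
    }

    $result = [];
    for ($k = $count - 1; $k >= 0; $k--) {
        $digit = intdiv($n, $fact[$k]);  // factoradic digit = index into remaining items
        $n %= $fact[$k];
        $result[] = array_splice($items, $digit, 1)[0];
    }
    return $result;
}

$third = nthPermutation([1, 2, 3], 3);
```

The factorials grow quickly, so on 64-bit PHP this only stays within integer range for up to 20 items.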

Source

2013-09-28 23:55:53
