2017-09-06 111 views
5

我想生成一個給定數組大小爲n的所有可能的4元組對的列表。 n至少8個,所以始終可以找到至少1對。從n個元素生成所有4元組對

舉個例子,有助於理解我使用一個較小版本的問題,2元組對給定大小5的arrayy問題。 2元組對預期的結果將導致15個項目(元組是有序的,不重複):

[(1,2), (3,4)], [(1,2), (3,5)], [(1,2), (4,5)], [(1,3), (2,4)], [(1,3), (2,5)], [(1,3), (4,5)], [(1,4), (2,3)], [(1,4), (2,5)], [(1,4), (3,5)], [(1,5), (2,3)], [(1,5), (2,4)], [(1,5), (3,4)], [(2,3), (4,5)], [(2,4), (3,5)], [(2,5), (3,4)]

我目前在做這個的方法是使用itertools從蟒蛇去通過返回的所有元素itertools.combinations,做2個循環,找到2對不共享單個元素,然後使用該元素。

表達這種在Python代碼我編寫了一個小片段:

arr = list(range(30)) # example list 
comb = list(itertools.combinations(range(0, len(arr)), 4)) 

for c1 in comb: 
    for c2 in comb: # go through all possible pairs 
     if len([val for val in c1 if val in c2]) == 0: # intersection of both sets results in 0, so they don't share an element 
      ... # do something and check for duplicates 

這種方法做它的工作,但效率不高,由於2路,只有在給定的時間內適用於小型n。這可以做得更有效率嗎?


更新:在一些答案後,我評估了這些建議。我的具體情況下,最好的辦法是通過MSeifert的(現已刪除)回答提供的(擴展)算法,其執行速度最快:

def generate_four_pairs(n): 
    valids = range(0, n) 
    for x00, x01, x02, x03, x10, x11, x12, x13 in itertools.combinations(valids, 8): 
     yield [x00, x01, x02, x03], [x10, x11, x12, x13] 
     yield [x00, x01, x02, x10], [x03, x11, x12, x13] 
     yield [x00, x01, x02, x11], [x03, x10, x12, x13] 
     yield [x00, x01, x02, x12], [x03, x10, x11, x13] 
     yield [x00, x01, x02, x13], [x03, x10, x11, x12] 
     yield [x00, x01, x03, x10], [x02, x11, x12, x13] 
     yield [x00, x01, x03, x11], [x02, x10, x12, x13] 
     yield [x00, x01, x03, x12], [x02, x10, x11, x13] 
     yield [x00, x01, x03, x13], [x02, x10, x11, x12] 
     yield [x00, x01, x10, x11], [x02, x03, x12, x13] 
     yield [x00, x01, x10, x12], [x02, x03, x11, x13] 
     yield [x00, x01, x10, x13], [x02, x03, x11, x12] 
     yield [x00, x01, x11, x12], [x02, x03, x10, x13] 
     yield [x00, x01, x11, x13], [x02, x03, x10, x12] 
     yield [x00, x01, x12, x13], [x02, x03, x10, x11] 
     yield [x00, x02, x03, x10], [x01, x11, x12, x13] 
     yield [x00, x02, x03, x11], [x01, x10, x12, x13] 
     yield [x00, x02, x03, x12], [x01, x10, x11, x13] 
     yield [x00, x02, x03, x13], [x01, x10, x11, x12] 
     yield [x00, x02, x10, x11], [x01, x03, x12, x13] 
     yield [x00, x02, x10, x12], [x01, x03, x11, x13] 
     yield [x00, x02, x10, x13], [x01, x03, x11, x12] 
     yield [x00, x02, x11, x12], [x01, x03, x10, x13] 
     yield [x00, x02, x11, x13], [x01, x03, x10, x12] 
     yield [x00, x02, x12, x13], [x01, x03, x10, x11] 
     yield [x00, x03, x10, x11], [x01, x02, x12, x13] 
     yield [x00, x03, x10, x12], [x01, x02, x11, x13] 
     yield [x00, x03, x10, x13], [x01, x02, x11, x12] 
     yield [x00, x03, x11, x12], [x01, x02, x10, x13] 
     yield [x00, x03, x11, x13], [x01, x02, x10, x12] 
     yield [x00, x03, x12, x13], [x01, x02, x10, x11] 
     yield [x00, x10, x11, x12], [x01, x02, x03, x13] 
     yield [x00, x10, x11, x13], [x01, x02, x03, x12] 
     yield [x00, x10, x12, x13], [x01, x02, x03, x11] 
     yield [x00, x11, x12, x13], [x01, x02, x03, x10] 
     yield [x01, x02, x03, x00], [x10, x11, x12, x13] 
     yield [x01, x02, x03, x10], [x00, x11, x12, x13] 
     yield [x01, x02, x03, x11], [x00, x10, x12, x13] 
     yield [x01, x02, x03, x12], [x00, x10, x11, x13] 
     yield [x01, x02, x03, x13], [x00, x10, x11, x12] 
     yield [x01, x02, x10, x00], [x03, x11, x12, x13] 
     yield [x01, x02, x10, x11], [x00, x03, x12, x13] 
     yield [x01, x02, x10, x12], [x00, x03, x11, x13] 
     yield [x01, x02, x10, x13], [x00, x03, x11, x12] 
     yield [x01, x02, x11, x00], [x03, x10, x12, x13] 
     yield [x01, x02, x11, x12], [x00, x03, x10, x13] 
     yield [x01, x02, x11, x13], [x00, x03, x10, x12] 
     yield [x01, x02, x12, x00], [x03, x10, x11, x13] 
     yield [x01, x02, x12, x13], [x00, x03, x10, x11] 
     yield [x01, x02, x13, x00], [x03, x10, x11, x12] 
     yield [x01, x03, x10, x00], [x02, x11, x12, x13] 
     yield [x01, x03, x10, x11], [x00, x02, x12, x13] 
     yield [x01, x03, x10, x12], [x00, x02, x11, x13] 
     yield [x01, x03, x10, x13], [x00, x02, x11, x12] 
     yield [x01, x03, x11, x00], [x02, x10, x12, x13] 
     yield [x01, x03, x11, x12], [x00, x02, x10, x13] 
     yield [x01, x03, x11, x13], [x00, x02, x10, x12] 
     yield [x01, x03, x12, x00], [x02, x10, x11, x13] 
     yield [x01, x03, x12, x13], [x00, x02, x10, x11] 
     yield [x01, x03, x13, x00], [x02, x10, x11, x12] 
     yield [x01, x10, x11, x00], [x02, x03, x12, x13] 
     yield [x01, x10, x11, x12], [x00, x02, x03, x13] 
     yield [x01, x10, x11, x13], [x00, x02, x03, x12] 
     yield [x01, x10, x12, x00], [x02, x03, x11, x13] 
     yield [x01, x10, x12, x13], [x00, x02, x03, x11] 
     yield [x01, x10, x13, x00], [x02, x03, x11, x12] 
     yield [x01, x11, x12, x00], [x02, x03, x10, x13] 
     yield [x01, x11, x12, x13], [x00, x02, x03, x10] 
     yield [x01, x11, x13, x00], [x02, x03, x10, x12] 
     yield [x01, x12, x13, x00], [x02, x03, x10, x11] 

對於一般的方法,我會建議由NPE作爲這個答案是這個問題的最短和最簡單的可讀答案。

+0

什麼是'arr',你確定'itertools'解決方案真的有效嗎?另外爲什麼15個元素而不是30個?例如'[(4,5)(2,3)]'有什麼問題? – MSeifert

+0

@ MSeifert,它相當於已經在列表中的'[(2,3),(4,5)]'。 「元組是有序的,沒有重複」。 – skovorodkin

+0

但是如果'itertools.combinations(range(0,len(arr)),4)'替換爲'itertools','itertools'解決方案實際上會給出'[(4,5),(2,3)]' (範圍(1,6),2)',這就是爲什麼我想知道該解決方案是全部還是正確工作的原因。當然,如果我們假設'len(arr)'是'5',那肯定不會。 – MSeifert

回答

3

您正在通過生成所有組合對來執行大量不必要的工作,然後丟棄幾乎所有組合,因爲它們包含常用元素。

下面的地址本,採取先所有子集的四個數字(在2元組爲例),然後拆分每個到所有可能對:

import itertools 

def gen_pairs(n, m): 
    for both_halves in itertools.combinations(xrange(1, n + 1), 2 * m): 
    for first_half in itertools.combinations(both_halves, m): 
     second_half = tuple(sorted(set(both_halves) - set(first_half))) 
     yield [first_half, second_half] 

print sorted(gen_pairs(5, 2)) 

請注意,這並不能消除重複(爲例如,[(4, 5) (2, 3)] vs [(2, 3), (4, 5)]),因此會產生兩倍的期望元素數量。

但是,刪除重複項很重要。這留給讀者作爲練習。

+1

不錯的解決方案!爲讀者提示:如果first_half的第一個元素小於second_half的第一個元素,只調用yield。 –

0

可以使用置換和分裂,其可能會更快:

array = ... 
size = 4 
c = itertools.permutations(array) 
for t in c: 
    a = [] 
    for i in range(0, len(t), size): 
     if i + size <= len(t): 
      a.append(t[i:i+size]) 
    yield a 

注意:在該陣列的長度不是大小的倍數這種解決方案工作,但產生重複的情況。

0

我會怎麼做:

from itertools import combinations 

sample = range(1,6) 
x1 = [subset for subset in combinations(sample,2)] #getting the set of tuples 
x2 = [list(subset) for subset in combinations(x1,2)] #getting the pair of tuples 
x3 = [x for x in x2 if (set(x[0]) & set(x[1]) == set())] #finally filtering the tuples with no intersection 

輸出:

[[(1, 2), (3, 4)], 
[(1, 2), (3, 5)], 
[(1, 2), (4, 5)], 
[(1, 3), (2, 4)], 
[(1, 3), (2, 5)], 
[(1, 3), (4, 5)], 
[(1, 4), (2, 3)], 
[(1, 4), (2, 5)], 
[(1, 4), (3, 5)], 
[(1, 5), (2, 3)], 
[(1, 5), (2, 4)], 
[(1, 5), (3, 4)], 
[(2, 3), (4, 5)], 
[(2, 4), (3, 5)], 
[(2, 5), (3, 4)]] 
0

而這裏的代碼來生成MSeifert的收益報表:)(而且只得到其中的35,這意味着有沒有重複:)

def g(L, n, k, A, B): 
    if len(A) == k: 
     return [[tuple(A), tuple([L[i] for i in xrange(0, n + 1)] + B)]] 

    elif len(B) == k: 
     return [[tuple([L[i] for i in xrange(0, n + 1)] + A), tuple(B)]] 

    return g(L, n - 1, k, A, [L[n]] + B[0:]) + g(L, n - 1, k, [L[n]] + A[0:], B) 

def f(L): 
    assert(len(L) > 3 and len(L) % 2 == 0) 
    return g(L, len(L) - 2, len(L)/2, [], [L[-1]]) 

for i in f(['x00','x01','x02','x03','x10','x11','x12','x13']): 
    print(i)