2017-02-16 88 views
0

我喜歡這些的元組的若干名單(它們代表座標):在元組列表中找到相同的元素?

a = [(100,100), (50,60)] 
b = [(100,50), (50,60)] 
c = [(100,100), (20,60)] 
d = [(70,100), (50,10)] 
e = [(100,80), (70,100)] 

我想知道如何有效地管理他們在一個單獨的列表中找到相同的值,然後存儲整個列表。

由於它們是座標,所以每個元組的X和Y在另一個元組中不能相同(即相同的X,但不同的Y)。

對於上面的例子,我想終於有這樣的事情(如列表,而且還以更有效的方式,如果有可能):

new_list1 = [a, b, c] 
new_list2 = [d, e] 

有沒有更有效的方式來沒有在幾個列表之間進行一對一解析的情況下得到這個結果?

+1

爲什麼'a','b'和'c'分組在一起,即什麼決定了哪一組中的元組列表結束? – Cleb

+0

@Cleb將每個列表看作一條線(因爲它有2個點)。那麼,我會嘗試構建一個「鏈」,即至少有兩個共享點。例如,對於'new_list1','a'和'b'共享'(50,60)',而'a'和'c'共享'(100,100)'。我希望現在會更清楚。 – mgri

+0

但在'd'中還有一個'(50,60)'元組。爲什麼它在list2中而不是在list1中? – Cleb

回答

1

好的,這是一個numpy tastic vectorised方法。似乎相當快。雖然沒有徹底測試。它顯式假設每個列表都有兩個元組和每個元組2的座標。

import time 
import numpy as np 

def find_chains_nmp(lists): 
    lists = np.asanyarray(lists) 
    lists.shape = -1,2 
    dtype = np.rec.fromrecords(lists[:1, :]).dtype 
    plists = lists.view(dtype) 
    lists.shape = -1, 2, 2 
    uniq, inv = np.unique(plists, return_inverse=True) 
    uniqf = uniq.view(lists.dtype).reshape(-1, 2) 
    inv.shape = -1, 2 
    to_flip = inv[:, 0] > inv[:, 1] 
    inv[to_flip, :] = inv[to_flip, ::-1].copy() 
    sl = np.lexsort(inv.T[::-1]) 
    sr = np.lexsort(inv.T) 
    lj = inv[sl, 0].searchsorted(np.arange(len(uniq)+1)) 
    rj = inv[sr, 1].searchsorted(np.arange(len(uniq)+1)) 
    mask = np.ones(uniq.shape, bool) 
    mask[0] = False 
    rooted = np.zeros(uniq.shape, int) 
    l, r = 0, 1 
    blocks = [0] 
    rblocks = [0] 
    reco = np.empty_like(lists) 
    reci = 0 
    while l < len(uniq): 
     while l < r: 
      ll = r 
      for c in rooted[l:r]: 
       if (rj[c]==rj[c+1]) and (lj[c]==lj[c+1]): 
        continue 
       connected = np.r_[inv[sr[rj[c]:rj[c+1]], 0], 
            inv[sl[lj[c]:lj[c+1]], 1]] 
       reco[reci:reci+lj[c+1]-lj[c]] = uniqf[inv[sl[lj[c]:lj[c+1]], :]] 
       reci += lj[c+1]-lj[c] 
       connected = np.unique(connected[mask[connected]]) 
       mask[connected] = False 
       rr = ll + len(connected) 
       rooted[ll:rr] = connected 
       ll = rr 
      l, r = r, rr 
     blocks.append(l) 
     rblocks.append(reci) 
     if l == len(uniq): 
      break 
     r = l + 1 
     rooted[l] = np.where(mask)[0][0] 
     mask[rooted[l]] = 0 
    return blocks, rblocks, reco, uniqf[rooted] 


# obsolete 
def find_chains(lists): 
    outlist = [] 
    outinds = [] 
    outset = set() 
    for j, l in enumerate(lists): 
     as_set = set(l) 
     inds = [] 
     for k in outset.copy(): 
      if outlist[k] & as_set: 
       outset.remove(k) 
       as_set |= outlist[k] 
       inds.extend(outinds[k]) 
     outset.add(j) 
     outlist.append(as_set) 
     outinds.append(inds + [j]) 
    outinds = [outinds[j] for j in outset] 
    del outset, outlist 
    result = [[lists[j] for j in k] for k in outinds] 
    return result, outinds 


if __name__ == '__main__': 
    a = [(100,100), (50,60)] 
    b = [(100,50), (50,60)] 
    c = [(100,100), (20,60)] 
    d = [(70,100), (50,10)] 
    e = [(100,80), (70,100)] 

    lists = [a, b, c, d, e] 
    print(find_chains(lists)) 



    lists = np.array(lists) 
    tblocks, lblocks, lreco, treco = find_chains_nmp(lists) 


    coords = np.random.random((12_000, 2)) 
    pairs = np.random.randint(0, len(coords), (12_000, 2)) 
    pairs = np.delete(pairs, np.where(pairs[:, 0] == pairs[:, 1]), axis=0) 
    pairs = coords[pairs, :] 
    t0 = time.time() 
    tblocks, lblocks, lreco, treco = find_chains_nmp(pairs) 
    t0 = time.time() - t0 
    print('\n\nproblem:') 
    print('\n\ntuples {}, lists {}'.format(len(coords), len(pairs))) 
    if len(pairs) < 40: 
     for k, l in enumerate(pairs): 
      print('[({:0.6f}, {:0.6f}), ({:0.6f}, {:0.6f})] ' 
        .format(*l.ravel()), end='' if k % 2 != 1 else '\n') 
    print('\n\nsolution:') 
    for j, (lists, tuples) in enumerate(zip(
      np.split(lreco, lblocks[1:-1], axis=0), 
      np.split(treco, tblocks[1:-1], axis=0))): 
     print('\n\ngroup #{}: {} tuples, {} list{}'.format(
      j + 1, len(tuples), len(lists), 
      's' if len(lists) != 1 else '')) 
     if len(pairs) < 40: 
      print('\ntuples:') 
      for k, t in enumerate(tuples): 
       print('({:0.6f}, {:0.6f}) '.format(*t), 
         end='' if k % 4 != 3 else '\n') 
      print('\nlists:') 
      for k, l in enumerate(lists): 
       print('[({:0.6f}, {:0.6f}), ({:0.6f}, {:0.6f})] ' 
         .format(*l.ravel()), end='' if k % 2 != 1 else '\n') 
    print('\n\ncomputation time', t0) 
+0

謝謝!我會盡快嘗試! – mgri

+0

我沒有測試代碼,但它似乎拆分了原來的列表。你可否確認?我想保留它們作爲一個整體 – mgri

+0

代碼通過一個罰款齒梳,但它不會改變它們。它創建了對列表元素的新引用,但不會改變它們。順便說一句。代碼假設你把它們全部放在一個大列表中,即'lists = [a,b,c,d,e]'。 –

相關問題