2016-07-27 72 views
0

Finding duplicate values in a list of tuples in Python

How can I find the duplicate values in the following list of tuples?

[(1622, 4081), (1622, 4082), (1624, 4083), (1626, 4085), (1650, 4086), (1650, 4090)] 

I would like to get a list such as:

[4081, 4082, 4086, 4090] 

I have been trying itemgetter followed by a group-by approach, but it has not worked.

How can I do this?

Can you post what you have tried? – 0xtvarun

The following links may help you: [link1](http://stackoverflow.com/questions/32464290/python-find-tuples-from-a-list-of-tuples-having-duplicate-data-in-the-0th-elem) [link2](http://stackoverflow.com/questions/17482944/find-duplicate-items-within-a-list-of-list-of-tuples-python) –

Answers

2

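Use an ordered dictionary with the first item of each tuple as the key and a list of the second items as the value (built with dict.setdefault()), then pick the lists that have more than one element: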

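>>> from itertools import chain
>>> from collections import OrderedDict
>>> lst = [(1622, 4081), (1622, 4082), (1624, 4083), (1626, 4085), (1650, 4086), (1650, 4090)]
>>> d = OrderedDict()
>>> for i, j in lst:
...     d.setdefault(i, []).append(j)  # group second items under each first item
...
>>> list(chain.from_iterable(j for i, j in d.items() if len(j) > 1))
[4081, 4082, 4086, 4090]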
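
The same grouping idea can be sketched as a reusable function. This is only an illustration, not the answerer's code: the function name `values_with_duplicate_keys` is made up here, `defaultdict` is swapped in for `setdefault()`, and a plain dict is assumed to keep insertion order (Python 3.7+).

```python
from collections import defaultdict


def values_with_duplicate_keys(pairs):
    """Return the second items of all tuples whose first item occurs more than once."""
    groups = defaultdict(list)  # first item -> list of second items
    for key, value in pairs:
        groups[key].append(value)
    # Keep only groups with more than one value and flatten them.
    return [v for values in groups.values() if len(values) > 1 for v in values]


lst = [(1622, 4081), (1622, 4082), (1624, 4083), (1626, 4085), (1650, 4086), (1650, 4090)]
print(values_with_duplicate_keys(lst))  # [4081, 4082, 4086, 4090]
```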
+1

+5 Thanks! It works perfectly! I do not have enough reputation to upvote you! –

0

I have not tested this.... (Edit: yes, it works)

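l = [(1622, 4081), (1622, 4082), (1624, 4083), (1626, 4085), (1650, 4086), (1650, 4090)]

dup = []

# Compare every tuple with each later tuple; a first element that occurs
# more than twice would add the same values repeatedly.
for i, t1 in enumerate(l):
    for t2 in l[i+1:]:
        if t1[0] == t2[0]:
            dup.extend([t1[1], t2[1]])
print(dup)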
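
If a first element can occur three or more times, a pairwise comparison like the one above reports the same value more than once. One possible way around that, sketched here with a hypothetical helper name `dup_values_by_count` and made-up sample data, is to count the first elements first and then filter in a second pass:

```python
from collections import Counter


def dup_values_by_count(pairs):
    """Return second items of tuples whose first item appears more than once."""
    counts = Counter(key for key, _ in pairs)  # how often each first item occurs
    # A single pass over the input keeps the original order and emits each value once.
    return [value for key, value in pairs if counts[key] > 1]


# Hypothetical data with a key that appears three times.
pairs = [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'd')]
print(dup_values_by_count(pairs))  # ['a', 'b', 'c']
```

This does two linear passes instead of comparing every pair of tuples.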
1

As an alternative, if you want to use groupby, here is one way to do it:

In [1]: from itertools import groupby 

In [2]: ts = [(1622, 4081), (1622, 4082), (1624, 4083), (1626, 4085), (1650, 4086), (1650, 4090)] 

In [3]: dups = [] 

In [4]: for _, g in groupby(ts, lambda x: x[0]):
   ...:     grouped = list(g)
   ...:     if len(grouped) > 1:
   ...:         dups.extend([dup[1] for dup in grouped])
   ...:

In [5]: print(dups) 
[4081, 4082, 4086, 4090] 

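You can use groupby to group the tuples by their first element and add the second values of every group with more than one member to the list. Note that groupby only groups consecutive items, so this relies on the list already being sorted by the first element.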
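
For input that is not already ordered by the first element, a sketch along the following lines sorts before grouping. The function name `dup_values_groupby` and the shuffled sample list are illustrative assumptions, not part of the original answer; `itemgetter(0)` plays the role of the lambda used above.

```python
from itertools import groupby
from operator import itemgetter


def dup_values_groupby(pairs):
    """Group tuples by first item (after sorting) and collect values of repeated keys."""
    first = itemgetter(0)
    result = []
    # sorted() is stable, so values keep their relative order within each key.
    for _, group in groupby(sorted(pairs, key=first), key=first):
        values = [value for _, value in group]
        if len(values) > 1:
            result.extend(values)
    return result


# Hypothetical unsorted variant of the data from the question.
unsorted_ts = [(1650, 4086), (1622, 4081), (1650, 4090), (1624, 4083), (1622, 4082)]
print(dup_values_groupby(unsorted_ts))  # [4081, 4082, 4086, 4090]
```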

1

Another approach (without any imports):

In [896]: lot = [(1622, 4081), (1622, 4082), (1624, 4083), (1626, 4085), (1650, 4086), (1650, 4090)] 

In [897]: d = dict() 

In [898]: for key, value in lot:
     ...:     d[key] = d.get(key, []) + [value]
     ...:

In [899]: d 
Out[899]: {1622: [4081, 4082], 1624: [4083], 1626: [4085], 1650: [4086, 4090]} 

In [900]: [d[key] for key in d if len(d[key]) > 1] 
Out[900]: [[4086, 4090], [4081, 4082]] 

In [901]: sorted(num for lst in d.values() if len(lst) > 1 for num in lst)
Out[901]: [4081, 4082, 4086, 4090]