2012-11-20 52 views
1

How do I find pairs using Python?

Say I have data in this format (assume it is tab-delimited):

1 10,11,15 
2 12 
3 12,11 
4 10,11 

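How do I iterate over the list and find the most common pairs of items in the second column? Assume the second column can hold an unlimited number of items.

The ideal output would look something like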

pairs count 
10,11 (2) 
10,15 (1) 
11,15 (1) 
11,12 (1) 
+0

This doesn't look like a 'list'?

+0

Is this homework?

+1

Why use two language tags?

Answers

5

Assuming you can get your input into a list of lists, something like this should work:

If you have Python 2.7, try a Counter combined with itertools:

>>> from collections import Counter 
>>> from itertools import combinations 
>>> l = [[10, 11, 15], [12], [12, 11], [10, 11]] 
>>> c = Counter(x for sub in l for x in combinations(sub, 2)) 
>>> for k, v in c.iteritems(): 
...     print k, v 
... 
(10, 15) 1 
(11, 15) 1 
(10, 11) 2 
(12, 11) 1 
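
The session above is Python 2 (`iteritems`, the `print` statement). A minimal sketch of the same idea in Python 3 follows, assuming the rows are already lists of integers. Sorting each row before pairing is an addition not in the answer above: it makes `12,11` and `11,12` count as the same pair, which is what the question's desired output suggests. `count_pairs` is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations

def count_pairs(rows):
    """Count how often each unordered pair of values appears together in a row."""
    counts = Counter()
    for row in rows:
        # Sorting first (an assumption about the desired behavior) means
        # (12, 11) and (11, 12) are recorded under the single key (11, 12).
        counts.update(combinations(sorted(row), 2))
    return counts

rows = [[10, 11, 15], [12], [12, 11], [10, 11]]
pair_counts = count_pairs(rows)

# most_common() yields (pair, count) tuples, highest count first.
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Rows with a single item, such as `[12]`, contribute no pairs, because `combinations` of a one-element sequence taken two at a time is empty.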

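If you have a Python older than 2.7 (Counter was added in 2.7), you can use a defaultdict together with itertools; note that itertools.combinations itself requires Python 2.6. (I'm sure one of the gurus will provide a cleaner solution.)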

In [1]: from collections import defaultdict 

In [2]: from itertools import combinations 

In [3]: l = [[10, 11, 15], [12], [12, 11], [10, 11]] 

In [4]: counts = defaultdict(int) 

In [5]: for x in l: 
    ...:  for item in combinations(x, 2): 
    ...:   counts[item] += 1 
    ...: 
    ...: 

In [6]: for k, v in counts.iteritems(): 
    ...:  print k, v 
    ...: 
    ...: 
(10, 15) 1 
(11, 15) 1 
(10, 11) 2 
(12, 11) 1 
0
In [7]: with open("data1.txt") as f: 
   ...:     lis = [map(int, x.split(",")) for x in f] 
   ...: 

In [8]: Counter(chain(*[combinations(x,2) for x in lis])) 
Out[8]: Counter({(10, 11): 2, (10, 15): 1, (11, 15): 1, (12, 11): 1}) 
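
The session above appears to assume that `data1.txt` holds only the comma-separated column, and that `Counter`, `chain`, and `combinations` were imported earlier. As a rough Python 3 sketch for the tab-delimited format in the question, the leading column can be dropped before the values are parsed. `read_rows` is a hypothetical helper, the `io.StringIO` object stands in for the real file, and every non-blank line is assumed to contain a tab followed by at least one integer.

```python
import io
from collections import Counter
from itertools import chain, combinations

def read_rows(lines):
    """Yield the second (comma-separated) column of each line as a list of ints."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # partition returns (before, separator, after); keep the part after the tab
        _, _, items = line.partition("\t")
        yield [int(x) for x in items.split(",")]

# Stand-in for open("data1.txt"); the contents mirror the question's sample data.
sample = io.StringIO("1\t10,11,15\n2\t12\n3\t12,11\n4\t10,11\n")
rows = list(read_rows(sample))

# chain.from_iterable flattens the per-row pair generators into one stream.
counts = Counter(chain.from_iterable(combinations(r, 2) for r in rows))
print(counts)
```

`chain.from_iterable` does the same job as `chain(*[...])` in the session above, without building the intermediate list first.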
0

You can use combinations and Counter:

from itertools import combinations 
import collections 

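newinput = [] 

# Keep only the second column: the text after the first tab.
# oldinput is assumed to be an iterable of the raw input lines.
for line in oldinput: 
    newinput.append(line.partition("\t")[2]) 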

# set up the counter 
c = collections.Counter() 

for line in newinput: 
    # Split by comma 
    a = line.split(',') 
    # make into integers from string 
    a = map(int, a) 
    # add to counter 
    c.update(combinations(a, 2)) 

You then end up with a Counter holding all your counts: `(10, 15): 1`, and so on.
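
To get from such a Counter to the layout the question asks for, one possible sketch (Python 3) is shown below. `format_counts` is a hypothetical helper, and the tie-breaking order among pairs with equal counts is my own choice, since the question does not specify one.

```python
from collections import Counter

def format_counts(counts):
    """Return report lines like '10,11 (2)', most frequent pair first."""
    lines = ["pairs count"]
    # Sort by descending count, then by the pair itself so ties are stable.
    for pair, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append("{},{} ({})".format(pair[0], pair[1], n))
    return lines

# Example counts, with each pair already stored in ascending order.
c = Counter({(10, 11): 2, (10, 15): 1, (11, 15): 1, (11, 12): 1})
report = format_counts(c)
print("\n".join(report))
```

If only the top few pairs matter, `c.most_common(n)` returns the `n` highest counts directly.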