Counting tuples in a Python list

I have the following list:

data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')]

The order of the first and second elements does not matter, so, for example, ('A', 'B') and ('B', 'A') are treated as the same. The desired result is:

('A','B') 2 
('C','D') 2 
('E','F') 1 
('G','H') 1

I tried this (adapted from How to count number of duplicates in a list of tuples?):

data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')] 
from collections import Counter 
for i, j in Counter(data).most_common(): 
    print i, j

The result looks like this:

('G', 'H') 1 
('B', 'A') 1 
('E', 'F') 1 
('A', 'B') 1 
('D', 'C') 1 
('C', 'D') 1

Any suggestions?

2016-12-24 kevin

Does each tuple always have exactly two items? – Ronikos

Yes. Each tuple has two elements. – kevin

See e.g. http://stackoverflow.com/q/41259493/3001761, which you can use together with Counter. Or just `map(frozenset, data)`. – jonrsharpe
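The `map(frozenset, data)` idea from the comment above could look roughly like the sketch below. The name `pair_counts` is only illustrative, and the sketch assumes the two elements of each pair are distinct (a pair such as ('A', 'A') would collapse to a one-element frozenset).

```python
from collections import Counter

data = [('A', 'B'), ('C', 'D'), ('E', 'F'), ('G', 'H'), ('B', 'A'), ('D', 'C')]

# frozenset is hashable and ignores element order, so ('A', 'B') and
# ('B', 'A') map to the same key. Assumes both elements of a pair differ.
pair_counts = Counter(map(frozenset, data))

for pair, n in pair_counts.most_common():
    # Convert back to a sorted tuple only for display.
    print(tuple(sorted(pair)), n)
```

The keys stay frozensets inside the Counter; sorting happens only when printing, so the counting step itself never needs to order the elements.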

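One way to solve this is to go through each tuple and sort it alphabetically with sorted(). That way ("B", "A") becomes ("A", "B") and so on, and you can then keep using the code you wrote earlier to count the occurrences: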

from collections import Counter 

data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')] 

data = [tuple(sorted(item)) for item in data] # sorts each tuple alphabetically 

for i, j in Counter(data).most_common(): 
    print(i, j)

Or, without a list comprehension (and using Python 2.x syntax):

from collections import Counter 

data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')] 

for i in range(0, len(data)): 
    data[i] = tuple(sorted(data[i])) 

for i, j in Counter(data).most_common(): 
    print i, j
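Both variants above first build a normalised list and then count it. As a possible alternative (a sketch in Python 3 syntax, not part of the original answer), the sorting can be folded into the counting step by handing Counter a generator expression; the name `counts` is just illustrative.

```python
from collections import Counter

data = [('A', 'B'), ('C', 'D'), ('E', 'F'), ('G', 'H'), ('B', 'A'), ('D', 'C')]

# The generator expression normalises each pair lazily, so no intermediate
# list is built and the original data is left unmodified.
counts = Counter(tuple(sorted(pair)) for pair in data)

for pair, n in counts.most_common():
    print(pair, n)
```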

2016-12-24 23:15:50 Ronikos

Thanks @Ronikos, it works. I will accept this answer. – kevin

Too many loops. :) – 2016-12-25 02:24:41

One way to do this is to turn each inner tuple into a Counter, like this:

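from collections import Counter
data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')]
data = [Counter(x) for x in data]
# sorted() is needed because elements() yields items in insertion order on Python 3.7+,
# so ('B', 'A') would otherwise produce "B, A" instead of "A, B"
print(Counter([", ".join(sorted(x.elements())) for x in data]).most_common())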

2016-12-24 23:21:07 Natecat

If for some reason you don't want to use Counter:

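data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')]
data_dict = {}
for d in data:
    temp_d = tuple(sorted(d))  # normalise the order within each pair
    if temp_d in data_dict:
        data_dict[temp_d] += 1
    else:
        data_dict[temp_d] = 1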

Output

{('A', 'B'): 2, ('C', 'D'): 2, ('E', 'F'): 1, ('G', 'H'): 1}

If you use pandas:

import pandas as pd 
pd.Series(data).map(lambda x: tuple(sorted(x))).value_counts()

Output

(C, D) 2 
(A, B) 2 
(G, H) 1 
(E, F) 1 
dtype: int64

2016-12-24 23:21:54

You have to sort them before counting.

from collections import Counter

data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')]

def count():
    sorted_data = [tuple(sorted(d)) for d in data]  # normalise each pair's order
    for i, j in Counter(sorted_data).most_common():
        print(i, j)

count()

2016-12-24 23:32:02

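Bad solution, it wastes computation time. – Natecat

I don't think so. Talking about computation time when you don't know exactly what you are looking for is ridiculous. Look more carefully next time: kevin asked this question and ended up choosing a solution exactly like the one I suggested. –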

A tuple is not the best type for your use case. Consider using a set instead.

For example,

(1, 2) == (2, 1) # False 
{1, 2} == {2, 1} # True
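One caveat when applying this to counting: a plain set is unhashable, so it cannot serve as a Counter or dict key. A minimal sketch of the difference follows; the sample list `pairs` and the name `counts` are assumptions made for illustration.

```python
from collections import Counter

pairs = [(1, 2), (2, 1), (3, 4)]  # hypothetical sample data

# A plain set is mutable and therefore unhashable, so Counter rejects it.
try:
    Counter(set(p) for p in pairs)
except TypeError as exc:
    print("set fails:", exc)

# frozenset keeps the order-insensitive equality and is hashable.
counts = Counter(frozenset(p) for p in pairs)
print(counts[frozenset((2, 1))])  # 2
```

Note that any set type also drops duplicates, so a pair like (1, 1) would become a one-element set.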

2016-12-24 23:34:01 Eddie

A simple solution without loading the Counter module:

data = [('A', 'B'), ('C','D'), ('E','F'), ('G','H'), ('B','A'), ('D','C')] 
counts = {} 
for t in data: 
    k = tuple(sorted(t)) 
    counts[k] = counts.get(k, 0) + 1 

print(counts)

Output:

{('C', 'D'): 2, ('G', 'H'): 1, ('E', 'F'): 1, ('A', 'B'): 2}

2016-12-24 23:39:58 RomanPerekhrest

Try it with pandas. :)

import pandas as pd 
pd.Series([('a','b'),('b','a'),('c','d')]).apply(lambda x: tuple(sorted(x))).value_counts()

#output 
(a, b) 2 
(c, d) 1 
dtype: int64

2016-12-24 23:45:25
