2016-11-24 34 views

How to apply groupby to a list of tuples in Python?

In my function, I create different tuples and append them to an empty list:

tup = (pattern,matchedsen) 
matchedtuples.append(tup) 

The patterns are regular expressions. I am looking for a way to apply groupby() to matchedtuples as follows.

For example:

matchedtuples = [(p1, s1) , (p1,s2) , (p2, s5)] 

And I am looking for this result:

result = [ (p1,(s1,s2)) , (p2, s5)] 

This way, the sentences that share the same pattern would be grouped together. How can I do this?

Answers


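If you need exactly that output, you have to loop over the groups of matchedtuples manually and build your list.

First, of course, if the matchedtuples list is not already sorted, sort it using itemgetter: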

from operator import itemgetter as itmg 

li = sorted(matchedtuples, key=itmg(0)) 
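
A small sketch of why the sort matters: `itertools.groupby` only merges consecutive items whose keys are equal, so an unsorted list can yield the same key more than once. The sample data below is made up for illustration and uses plain strings in place of the real patterns and sentences.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data: "p1" appears in two non-adjacent positions.
pairs = [("p1", "s1"), ("p2", "s5"), ("p1", "s2")]
first = itemgetter(0)

# Without sorting, "p1" shows up as two separate groups.
unsorted_keys = [key for key, _ in groupby(pairs, key=first)]

# After sorting on the first element, each key appears exactly once.
sorted_keys = [key for key, _ in groupby(sorted(pairs, key=first), key=first)]

print(unsorted_keys)  # ['p1', 'p2', 'p1']
print(sorted_keys)    # ['p1', 'p2']
```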

Then iterate over the groups that groupby yields and append to a list r, shaping each entry according to the size of the group:

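from itertools import groupby

r = []
# Group the sorted list `li`, not the original matchedtuples.
for i, j in groupby(li, key=itmg(0)):
    j = list(j)
    # A single match stays a bare sentence; several become a tuple.
    ap = (i, j[0][1]) if len(j) == 1 else (i, tuple(s[1] for s in j))
    r.append(ap)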
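
For convenience, the two steps above can be combined into one self-contained function. This is only a sketch: the name `group_sentences` is hypothetical, and the strings stand in for the real patterns and sentences.

```python
from itertools import groupby
from operator import itemgetter


def group_sentences(matchedtuples):
    """Group (pattern, sentence) pairs by pattern.

    A pattern with one sentence maps to that sentence itself; a pattern
    with several sentences maps to a tuple of them.
    """
    first = itemgetter(0)
    result = []
    # sorted() is stable, so sentences keep their original relative order.
    for pattern, group in groupby(sorted(matchedtuples, key=first), key=first):
        sentences = tuple(sentence for _, sentence in group)
        result.append((pattern, sentences[0] if len(sentences) == 1 else sentences))
    return result


# Hypothetical string stand-ins for the patterns and sentences in the question.
print(group_sentences([("p1", "s1"), ("p2", "s5"), ("p1", "s2")]))
# [('p1', ('s1', 's2')), ('p2', 's5')]
```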

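My answer should work with the input structures shown below and prints the same output you gave, as long as tuples with the same pattern sit next to each other in the list (groupby only groups consecutive items). I will only use groupby from the itertools module: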

# Let's suppose your input is something like this 
a = [("p1", "s1"), ("p1", "s2"), ("p2", "s5")] 

from itertools import groupby 

result = [] 

for key, values in groupby(a, lambda x: x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(b[0])

print(result) 

Output:

[('p1', ('s1', 's2')), ('p2', 's5')] 

The same solution works if you add more values to your input:

# When you add more values to your input 
a = [("p1", "s1"), ("p1", "s2"), ("p2", "s5"), ("p2", "s6"), ("p3", "s7")] 

from itertools import groupby 

result = [] 

for key, values in groupby(a, lambda x: x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(b[0])

print(result) 

Output:

[('p1', ('s1', 's2')), ('p2', ('s5', 's6')), ('p3', 's7')] 

Now, if you modify the input structure:

# Let's suppose your modified input is something like this 
a = [(["p1"], ["s1"]), (["p1"], ["s2"]), (["p2"], ["s5"])] 

from itertools import groupby 

result = [] 

for key, values in groupby(a, lambda x: x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(b[0])

print(result) 

Output:

[(['p1'], (['s1'], ['s2'])), (['p2'], ['s5'])] 
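
A short sketch of why list keys work here: groupby compares consecutive keys with `==`, so the keys do not have to be hashable. A dict-based grouping, by contrast, would reject a list as a key. The variable names below are illustrative only.

```python
from itertools import groupby

pairs = [(["p1"], ["s1"]), (["p1"], ["s2"]), (["p2"], ["s5"])]

# groupby compares consecutive keys with ==, so list keys are fine.
keys = [key for key, _ in groupby(pairs, key=lambda pair: pair[0])]
print(keys)  # [['p1'], ['p2']]

# A dict needs hashable keys; using a list as a key raises TypeError.
try:
    {pairs[0][0]: pairs[0][1]}
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)  # False
```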

Likewise, the same solution works if you add more values to the new input structure:

# When you add more values to your new input 
a = [(["p1"], ["s1"]), (["p1"], ["s2"]), (["p2"], ["s5"]), (["p2"], ["s6"]), (["p3"], ["s7"])] 

from itertools import groupby 

result = [] 

for key, values in groupby(a, lambda x: x[0]):
    b = tuple(values)
    if len(b) >= 2:
        result.append((key, tuple(j[1] for j in b)))
    else:
        result.append(b[0])

print(result) 

Output:

[(['p1'], (['s1'], ['s2'])), (['p2'], (['s5'], ['s6'])), (['p3'], ['s7'])] 

PS: Test this code, and if it breaks with any other kind of input, please let me know.
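
One kind of input that is likely to give unexpected results with the loop above is a list in which tuples with the same pattern are not adjacent. A possible alternative, which is not part of the original answer, swaps groupby for `collections.defaultdict`. The function name `group_unsorted` is hypothetical, and this sketch assumes the patterns are hashable (strings, for example, but not lists).

```python
from collections import defaultdict


def group_unsorted(pairs):
    """Group (pattern, sentence) pairs without requiring sorted input."""
    grouped = defaultdict(list)
    for pattern, sentence in pairs:
        grouped[pattern].append(sentence)
    # Dicts keep insertion order (Python 3.7+), so patterns come out
    # in the order they were first seen.
    return [
        (pattern, sentences[0] if len(sentences) == 1 else tuple(sentences))
        for pattern, sentences in grouped.items()
    ]


print(group_unsorted([("p1", "s1"), ("p2", "s5"), ("p1", "s2")]))
# [('p1', ('s1', 's2')), ('p2', 's5')]
```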