2017-05-23 60 views
0

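I have a list of numbers in Python that may contain duplicates. I need to subtotal the duplicate values, then later unpack the subtotals to get back to the original list, keeping track of the values that went into each subtotal. The problem I've run into is that the first round of subtotaling can create new duplicates that must themselves be subtotaled. For example, the list [10, 10, 20, 50, 50, 75] should be reduced to [40, 100, 75], because subtotaling the duplicate 10s produces a new duplicate 20 that also needs to be subtotaled. The values are subtotaled repeatedly until no duplicates remain.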

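I tried the code below, which builds a dictionary of the duplicates and tracks how many times each one occurs, but this approach doesn't work in this case.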

import collections 

def compress_dups(values):
    compressed_indices = []
    for val in set(values):
        indices = [i for i, x in enumerate(values) if x == val]
        compressed_indices.append(indices)
    return(compressed_indices)

compress_dict = collections.OrderedDict() 
initial_list = [10, 10, 20, 50, 50, 75] 
compressed_list = [] 
g = compress_dups(initial_list) 

print(initial_list) 

for item in g: 
    compressed_list.append(len(item)*initial_list[min(item)]) 
    compress_dict[(len(item)*initial_list[min(item)])] = len(item) 

print(sorted(compressed_list)) #this is the subtotaled list I'll work with 

for k,v in reversed(compress_dict.items()):
    del compressed_list[compressed_list.index(k)]
    for x in xrange(v):
        compressed_list.append(k/v)

print(sorted(compressed_list)) # this is the list after it's unpacked 

Desired output:

[10, 10, 20, 50, 50, 75] 
[40, 75, 100] 
[10, 10, 20, 50, 50, 75] 
+0

Why are you doing 'min(item)'? –

+0

What should the output be for '[10,20,20,10]'? –

Answers

2

Here is a simple function I came up with for your task:

def count(lst):
    counter = []
    for e in sorted(lst):
        if e in counter:
            counter.remove(e)
            counter.append(e*2)
        else:
            counter.append(e)
    return counter # or return sorted(counter) if you want it to be sorted

initial_list = [10, 10, 20, 50, 50, 75] 
print(count(initial_list)) # prints [40, 100, 75] or [40, 75, 100] if its sorted 

second_list = [5, 5, 10, 20, 40, 80] 
print(count(second_list)) # prints [160] 

third_list = [100, 50, 25, 25, 78] 
print(count(third_list)) # prints [78, 200] 

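Explanation: the function creates a new list, then iterates over initial_list in sorted order, checking whether each value is already in the new list. If it is, the value is removed from the new list and twice that value is appended. If not, the value is simply appended. Finally, the new list is returned.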
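To also keep track of which original values went into each subtotal, as the question asks, the same pairwise idea can carry a second list alongside the totals. This is only a sketch: `count_with_sources` and the `sources` list are names made up here for illustration and are not part of the answer above.

```python
def count_with_sources(lst):
    """Pairwise merging as in count(), plus a record of the inputs behind each subtotal."""
    totals = []   # current subtotals
    sources = []  # sources[i] holds the original values summed into totals[i]
    for e in sorted(lst):
        if e in totals:
            i = totals.index(e)
            merged = sources.pop(i) + [e]  # inputs of the old subtotal plus the new value
            totals.pop(i)
            totals.append(e * 2)
            sources.append(merged)
        else:
            totals.append(e)
            sources.append([e])
    return totals, sources

totals, sources = count_with_sources([10, 10, 20, 50, 50, 75])
print(totals)   # [40, 100, 75]
print(sources)  # [[10, 10, 20], [50, 50], [75]]

# Unpacking: flatten the recorded inputs to get the original values back.
print(sorted(v for group in sources for v in group))  # [10, 10, 20, 50, 50, 75]
```

Keeping the two lists index-aligned means each subtotal can be unpacked on its own without recomputing anything.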

0

Verbose, but it works:

def getNextLevel(a):
    b = []
    visited = []
    found = False
    for i in a:
        if i not in visited:
            c = a.count(i)       # how many times this value occurs
            if c > 1:
                found = True     # at least one value was subtotaled
            b.append(i*c)
            visited.append(i)
    return [b, found]

if __name__ == '__main__':
    a = [10, 10, 20, 50, 50, 75]
    m = {}                       # level number -> list at that level
    level = 1
    m[0] = a
    while True:
        [b, f] = getNextLevel(a)
        if f:
            m[level] = b
            level += 1
        else:
            break
        a = b

    # print each level going down (compressing)
    for i in range(level):
        print(m[i])

    # then back up again (unpacking)
    for i in range(level-2, -1, -1):
        print(m[i])

Output:

[10, 10, 20, 50, 50, 75] 
[20, 20, 100, 75] 
[40, 100, 75] 
[20, 20, 100, 75] 
[10, 10, 20, 50, 50, 75] 
0

You can use a function like this:

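def sum_dup(l):
    l = sorted(l)                # sorted copy, so equal values are adjacent
    for i in range(len(l)-1):
        if l[i] == l[i+1]:
            l[i] += l[i+1]
            l[i+1] = 0           # mark the consumed slot with 0
            l.sort()
    # drop the 0 markers; sorted() gives a stable order for the set
    return sorted(x for x in set(l) if x != 0)

l = [10, 10, 20, 50, 50, 75]
print(sum_dup(l))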

Returns

[40, 75, 100] 
0

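I need to subtotal the duplicate values, then unpack the duplicates to return to the original list and keep track of the values used in each subtotal.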

In Python 3, it is possible to yield both results:

# Python 3 
import collections as ct 

def compress(initial, saved=None):
    """Yield a compressed list of summed repeated values and a Counter, else compress again."""
    c = ct.Counter(initial)
    if saved is None: saved = c       # store starting Counter
    if len(initial) == len(set(initial)):
        yield initial
        yield saved
    else:
        compressed = sorted(k*v for k, v in c.items())
        yield from compress(compressed, saved=saved)

lst = [10, 10, 20, 50, 50, 75] 
tuple(compress(lst)) 
# ([40, 75, 100], Counter({10: 2, 50: 2, 20: 1, 75: 1}))

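Here we get both the compressed list and the starting Counter. Note: the term "compress" here is unrelated to itertools.compress. Now we can recover the original list by iterating over the Counter: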

clst, counter = tuple(compress(lst))
rlst = sorted(counter.elements())       # sorted(k for k, v in counter.items() for _ in range(v))

print("Original list  :", lst)
print("Counter        :", counter)
print("Compressed list:", clst)
print("Recovered list :", rlst)
# Original list  : [10, 10, 20, 50, 50, 75]
# Counter        : Counter({10: 2, 50: 2, 20: 1, 75: 1})
# Compressed list: [40, 75, 100]
# Recovered list : [10, 10, 20, 50, 50, 75]

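Summary: this example uses recursion, yield from, and a stored starting Counter to trace back to the original list with its repeated elements. It works for values that occur more than twice, not only for pairs. Although it is about 10x slower, its behavior parallels @abccd's tests, differing only when a value is repeated more than twice: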

lst = [10, 10, 10, 20, 50, 50, 75] 
tuple(compress(lst))[0] 
# [20, 30, 75, 100] 
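If the intermediate rounds matter as well (for example, to undo one round of subtotaling at a time), one possible variation is to loop instead of recursing and keep a Counter per round. This is a rough sketch under that assumption; `compress_levels`, `undo_last_round` and `history` are hypothetical names, not part of the answer above.

```python
from collections import Counter

def compress_levels(values):
    """Sum groups of equal values round by round until no value repeats.

    Returns the final list and a list holding one Counter per round.
    """
    history = []
    current = sorted(values)
    while len(current) != len(set(current)):   # some value still repeats
        counts = Counter(current)
        history.append(counts)                 # remember this round's input
        current = sorted(k * v for k, v in counts.items())
    return current, history

def undo_last_round(history):
    """Rebuild the list as it was before the most recent round."""
    return sorted(history[-1].elements())

final, history = compress_levels([10, 10, 20, 50, 50, 75])
print(final)                          # [40, 75, 100]
print(undo_last_round(history))       # [20, 20, 75, 100]
print(sorted(history[0].elements()))  # [10, 10, 20, 50, 50, 75]
```

Each round shortens the list by at least one element, so the loop always terminates. The first Counter in `history` plays the same role as the saved starting Counter in `compress`.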
+0

This answer is helpful, and it also works for the scenario where any number has more than one duplicate. Thanks! – David