2016-01-15 91 views
0

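How do I count duplicate keys across multiple dictionaries?

Suppose I have a large number of dictionaries (perhaps 10,000). I want to count how many times each key appears across all of them. For example, given these 3 dictionaries: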

  • {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'}
  • {1: 'url1', 7: 'url3'}
  • {5: 'url4', 10: 'url5'}

then the result should be {1: [2, 'url1'], 10: [1, 'url5'], 3: [1, 'url2'], 5: [2, 'url4'], 7: [2, 'url3']}

I came up with the following code:

lists = [{1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'}, {1: 'url1', 7: 'url3'}, {5: 'url4', 10: 'url5'}]
result = {}
for l in lists:
    for i in l:
        if i in result:
            result[i][0] += 1
        else:
            result[i] = [1, l[i]]

Is there a better (faster) way to do this?

+0

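In your example, the dictionaries all hold equal values for the same key. That is, `dict1[1] == dict2[1]`. Will that also be true for your large set of dictionaries?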

+0

@Robᵩ, yes, the value is the same wherever a given key appears.

Answers

2

If you can accept a slightly different output, this might work for you:

from collections import Counter

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

# Count how many dictionaries each key appears in.
result = Counter()
for d in dicts:
    result.update(d.keys())

print(dict(result))

Note that this gives you the keys and their counts, but not the values.

Or:

from collections import Counter
from itertools import chain

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

# Iterating a dict yields its keys, so this chains all keys into one stream.
result = Counter(chain.from_iterable(dicts))

print(dict(result))
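
The two versions above return only keys and counts. If the values are needed as well, one option is to collect them in a separate dictionary during the same pass. This is a minimal sketch that assumes, as confirmed in the comments on the question, that a key always maps to the same value; the names `counts`, `values` and `combined` are just illustrative.

```python
from collections import Counter

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

counts = Counter()  # key -> number of dictionaries containing that key
values = {}         # key -> value (assumed identical wherever the key appears)
for d in dicts:
    # Pass the keys explicitly: Counter.update() given a mapping would
    # treat the mapping's values as counts, which is not what we want here.
    counts.update(d.keys())
    values.update(d)

# Optionally merge the two into the {key: [count, value]} shape from the question.
combined = {k: [counts[k], values[k]] for k in counts}
print(combined)
```

Keeping `counts` and `values` apart avoids building a list per key, and the merge step can be skipped if two lookups are acceptable.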

Final version: this one produces exactly the output you asked for:

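from collections import Counter
from itertools import chain

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

# Count (key, value) pairs; this relies on the values being hashable
# and on each key always mapping to the same value.
result = Counter(chain.from_iterable(d.items() for d in dicts))
# Reshape {(key, value): count} into {key: [count, value]}.
result = {k: [n, v] for ((k, v), n) in result.items()}

print(result)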
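
Whether this is actually faster than the plain loop from the question is best measured on your own data. Below is a rough timing sketch using `timeit`. The synthetic data (10,000 dictionaries drawn from 50 possible keys) and the function names `plain_loop` and `counter_items` are assumptions made for illustration, not part of the question.

```python
import random
import timeit
from collections import Counter
from itertools import chain

# Synthetic stand-in data (an assumption): 10,000 dictionaries, each holding
# a random subset of 50 possible keys; a key always maps to the same value.
random.seed(0)
dicts = [
    {k: 'url%d' % k for k in random.sample(range(50), random.randint(1, 10))}
    for _ in range(10000)
]


def plain_loop(dicts):
    """The nested loop from the question."""
    result = {}
    for d in dicts:
        for k in d:
            if k in result:
                result[k][0] += 1
            else:
                result[k] = [1, d[k]]
    return result


def counter_items(dicts):
    """The Counter-over-items version."""
    counted = Counter(chain.from_iterable(d.items() for d in dicts))
    return {k: [n, v] for (k, v), n in counted.items()}


# Both approaches must agree before their timings are worth comparing.
assert plain_loop(dicts) == counter_items(dicts)

for fn in (plain_loop, counter_items):
    seconds = timeit.timeit(lambda: fn(dicts), number=10)
    print('%s: %.3f s for 10 runs' % (fn.__name__, seconds))
```

The numbers depend on the Python version and on how many keys each dictionary holds, so treat the result as a guide for your own setup rather than a general ranking.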
+0

Thanks! But I also need the values (they could go in a separate dictionary if that makes this approach faster).

+0

Would `result = Counter(chain.from_iterable(d.items() for d in dicts))` meet your needs?

+0

@LA_: see my third solution above.