2013-03-14
2

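Counting repeated values in a Python dictionary

I have a list of dictionaries in the following format. Several zone types appear in it, each more than once. From this I want to generate another list in which every dictionary has an extra key, 'count', holding the number of times that entry's zone (e.g. 'Full Run', 'Half Run', or 'Semi Run') occurs in the whole list.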

[ 
{'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154}, 
{'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734}, 
{'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761}, 
{'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974}, 
{'zip_zone': u'Full Run', 'zipcode': u'14714', 'longitude': -78.256921}, 
{'zip_zone': u'Half Run', 'zipcode': u'14715', 'longitude': -78.157392}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14717', 'longitude': -78.210567}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14719', 'longitude': -78.86951}, 
{'zip_zone': u'Half Run', 'zipcode': u'14727', 'longitude': -78.268103}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14731', 'longitude': -78.658909}, 
{'zip_zone': u'Half Run', 'zipcode': u'14735', 'longitude': -78.087607}, 
{'zip_zone': None, 'zipcode': u'14737', 'longitude': -78.431625}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14739', 'longitude': -78.139046}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14741', 'longitude': -78.5907}, 
{'zip_zone': u'Special Run', 'zipcode': u'14743', 'longitude': -78.4098}, 
{'zip_zone': u'Special Run', 'zipcode': u'14744', 'longitude': -78.167853}, 
{'zip_zone': u'Half Run', 'zipcode': u'14748', 'longitude': -78.639987}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14753', 'longitude': -78.640416}, 
{'zip_zone': u'Special Run', 'zipcode': u'14754', 'longitude': -78.18395}, 
{'zip_zone': u'Special Run', 'zipcode': u'14755', 'longitude': -78.800866}, 
{'zip_zone': u'Half Run', 'zipcode': u'14760', 'longitude': -78.426015}, 
] 

The output should look like this:

[ 
{'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154, 'count': 4}, 
{'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734, 'count': 4}, 
{'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974, 'count': 7}, 
{'zip_zone': u'Full Run', 'zipcode': u'14714', 'longitude': -78.256921, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14715', 'longitude': -78.157392, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14717', 'longitude': -78.210567, 'count': 7}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14719', 'longitude': -78.86951, 'count': 7}, 
{'zip_zone': u'Half Run', 'zipcode': u'14727', 'longitude': -78.268103, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14731', 'longitude': -78.658909, 'count': 7}, 
{'zip_zone': u'Half Run', 'zipcode': u'14735', 'longitude': -78.087607, 'count': 6}, 
{'zip_zone': None, 'zipcode': u'14737', 'longitude': -78.431625, 'count': 0}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14739', 'longitude': -78.139046, 'count': 7}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14741', 'longitude': -78.5907, 'count': 7}, 
{'zip_zone': u'Special Run', 'zipcode': u'14743', 'longitude': -78.4098, 'count': 4}, 
{'zip_zone': u'Special Run', 'zipcode': u'14744', 'longitude': -78.167853, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14748', 'longitude': -78.639987, 'count': 6}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14753', 'longitude': -78.640416, 'count': 7}, 
{'zip_zone': u'Special Run', 'zipcode': u'14754', 'longitude': -78.18395, 'count': 4}, 
{'zip_zone': u'Special Run', 'zipcode': u'14755', 'longitude': -78.800866, 'count': 4}, 
{'zip_zone': u'Half Run', 'zipcode': u'14760', 'longitude': -78.426015, 'count': 6}, 
] 

If the count field is supposed to be incremented for Full Run, Half Run, or Semi Run, are you sure zip_zone should contain 'Full Run'? – GodMan 2013-03-14 06:33:25

Answers

5

This is a good use case for the Counter class in Python's collections module.

import collections 

# u is your input list of dictionaries, entries in u will be modified in place 

c = collections.Counter(e["zip_zone"] for e in u) 
for e in u: 
    e["count"] = c[e["zip_zone"]] 
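Note that a plain Counter treats None like any other value, so the entry whose zip_zone is None would get a count of 1, whereas the expected output in the question shows 0. A minimal sketch of one way to handle that, assuming entries without a zone should simply report 0 (the helper name `add_zone_counts` and the shortened sample data are illustrative, not part of the original answer):

```python
import collections


def add_zone_counts(entries, key="zip_zone"):
    """Add a 'count' field to each dict in place; entries whose key is None get 0."""
    # Count only real zone names, skipping None.
    counts = collections.Counter(e[key] for e in entries if e[key] is not None)
    for e in entries:
        # A Counter returns 0 for keys it has never seen, which covers None here.
        e["count"] = counts[e[key]]
    return entries


u = [
    {"zip_zone": "Full Run", "zipcode": "14042"},
    {"zip_zone": "Full Run", "zipcode": "14101"},
    {"zip_zone": "Half Run", "zipcode": "14709"},
    {"zip_zone": None, "zipcode": "14737"},
]
add_zone_counts(u)
```

Filtering in the generator keeps the special case in one place, so no separate fix-up step is needed after counting.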
0

Maybe not very pretty, but you could try using defaultdict:

from collections import defaultdict 

output = defaultdict(list) 

for line in origData: 
    output[line['zip_zone']].append(line) 

for line in origData: 
    line['Count'] = len(output[line['zip_zone']]) 

print(origData)  # the parenthesized form works on both Python 2 and Python 3
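If only the totals are needed, the grouped lists do not have to be stored; a `defaultdict(int)` can hold the running counts directly. A sketch under that assumption, using a shortened `origData` (the name `totals` is illustrative):

```python
from collections import defaultdict

origData = [
    {"zip_zone": "Semi Run", "zipcode": "14711"},
    {"zip_zone": "Half Run", "zipcode": "14709"},
    {"zip_zone": "Semi Run", "zipcode": "14717"},
]

# Missing keys start at int() == 0, so each zone can be incremented directly.
totals = defaultdict(int)
for line in origData:
    totals[line["zip_zone"]] += 1

# Second pass: copy the total for each entry's zone onto the entry.
for line in origData:
    line["Count"] = totals[line["zip_zone"]]
```

This keeps memory use proportional to the number of distinct zones rather than the number of entries.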

Hi Artsiom, what does data mean here in append(data)? – sandeep 2013-03-14 06:52:58


@sandeep My mistake, corrected. – 2013-03-14 06:53:45

0

I'm not entirely sure I understand your question, but the following code does what the question describes:

input = [ 
{'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154}, 
{'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734}, 
{'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761}, 
{'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974}, 
{'zip_zone': u'Full Run', 'zipcode': u'14714', 'longitude': -78.256921}, 
{'zip_zone': u'Half Run', 'zipcode': u'14715', 'longitude': -78.157392}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14717', 'longitude': -78.210567}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14719', 'longitude': -78.86951}, 
{'zip_zone': u'Half Run', 'zipcode': u'14727', 'longitude': -78.268103}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14731', 'longitude': -78.658909}, 
{'zip_zone': u'Half Run', 'zipcode': u'14735', 'longitude': -78.087607}, 
{'zip_zone': None, 'zipcode': u'14737', 'longitude': -78.431625}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14739', 'longitude': -78.139046}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14741', 'longitude': -78.5907}, 
{'zip_zone': u'Special Run', 'zipcode': u'14743', 'longitude': -78.4098}, 
{'zip_zone': u'Special Run', 'zipcode': u'14744', 'longitude': -78.167853}, 
{'zip_zone': u'Half Run', 'zipcode': u'14748', 'longitude': -78.639987}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14753', 'longitude': -78.640416}, 
{'zip_zone': u'Special Run', 'zipcode': u'14754', 'longitude': -78.18395}, 
{'zip_zone': u'Special Run', 'zipcode': u'14755', 'longitude': -78.800866}, 
{'zip_zone': u'Half Run', 'zipcode': u'14760', 'longitude': -78.426015}, 
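]
zipZoneCnt = {}
# First pass: count how many times each zip_zone occurs.
for item in input:
    if item['zip_zone'] in zipZoneCnt:
        zipZoneCnt[item['zip_zone']] += 1
    else:
        zipZoneCnt[item['zip_zone']] = 1
# The question expects a count of 0 for entries whose zip_zone is None.
zipZoneCnt[None] = 0
# Second pass: attach the count to every entry in place.
for item in input:
    item['count'] = zipZoneCnt[item['zip_zone']]
print(zipZoneCnt)
for item in input:
    print(item)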
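The if/else in the counting loop can be collapsed with `dict.get`, which returns a default value when the key is absent. A sketch with shortened sample data (`rows` and `zone_counts` are illustrative names, not from the answer):

```python
rows = [
    {"zip_zone": "Special Run", "zipcode": "14743"},
    {"zip_zone": None, "zipcode": "14737"},
    {"zip_zone": "Special Run", "zipcode": "14744"},
]

zone_counts = {}
for row in rows:
    zone = row["zip_zone"]
    # get() falls back to 0 the first time a zone is seen.
    zone_counts[zone] = zone_counts.get(zone, 0) + 1

# As in the answer's code, force the count for a missing zone to 0.
zone_counts[None] = 0

for row in rows:
    row["count"] = zone_counts[row["zip_zone"]]
```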
0

collections.Counter to the rescue.

from collections import Counter 
a = [ 
{'zip_zone': u'Full Run', 'zipcode': u'14042', 'longitude': -78.516154}, 
{'zip_zone': u'Full Run', 'zipcode': u'14101', 'longitude': -78.51734}, 
{'zip_zone': u'Full Run', 'zipcode': u'14706', 'longitude': -78.493761}, 
{'zip_zone': u'Half Run', 'zipcode': u'14709', 'longitude': -78.024817}, 
{'zip_zone': u'Semi Run', 'zipcode': u'14711', 'longitude': -78.119974}, 
] 

# to obtain the counts: 
c = Counter(x['zip_zone'] for x in a) 
# c is now:
# Counter({u'Full Run': 3, u'Semi Run': 1, u'Half Run': 1})

# to update original structure in place: 
for x in a: 
    x['count'] = c[x['zip_zone']] 

# a now contains (pretty-printed):

[{'count': 3, 
    'longitude': -78.516154, 
    'zip_zone': u'Full Run', 
    'zipcode': u'14042'}, 
{'count': 3, 
    'longitude': -78.51734, 
    'zip_zone': u'Full Run', 
    'zipcode': u'14101'}, 
{'count': 3, 
    'longitude': -78.493761, 
    'zip_zone': u'Full Run', 
    'zipcode': u'14706'}, 
{'count': 1, 
    'longitude': -78.024817, 
    'zip_zone': u'Half Run', 
    'zipcode': u'14709'}, 
{'count': 1, 
    'longitude': -78.119974, 
    'zip_zone': u'Semi Run', 
    'zipcode': u'14711'}]
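The loop above modifies `a` in place. Since the question asks for a new structure, one option is to build copies instead. A sketch, assuming shallow copies are enough because the values are plain strings and numbers (`with_counts` is a hypothetical helper name):

```python
from collections import Counter


def with_counts(entries, key="zip_zone"):
    """Return new dicts with an added 'count'; the input list is left untouched."""
    c = Counter(x[key] for x in entries)
    # dict(x, count=...) makes a shallow copy of x with one extra key.
    return [dict(x, count=c[x[key]]) for x in entries]


a = [
    {"zip_zone": "Full Run", "zipcode": "14042"},
    {"zip_zone": "Full Run", "zipcode": "14101"},
    {"zip_zone": "Half Run", "zipcode": "14709"},
]
b = with_counts(a)
```

Here `b` carries the counts while `a` keeps its original keys, which may be preferable when the source data is reused elsewhere.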