2016-11-29 73 views

Adding an increment counter to a loop in a Python parsing script

I'm currently using the following Python script:

import json 
from collections import defaultdict 
from pprint import pprint 

with open('prettyPrint.txt') as data_file: 
    data = json.load(data_file) 

locations = defaultdict(list) 


for item in data['data']: 
    location = item['relationships']['location']['data']['id'] 
    locations[location].append(item['id']) 

pprint(locations) 

It parses some fairly messy JSON data that looks like this:

{ 
    "links": { 
     "self": "http://localhost:2510/api/v2/jobs?skills=data%20science" 
    }, 
    "data": [ 
     { 
      "id": 121, 
      "type": "job", 
      "attributes": { 
       "title": "Data Scientist", 
       "date": "2014-01-22T15:25:00.000Z", 
       "description": "Data scientists are in increasingly high demand amongst tech companies in London. Generally a combination of business acumen and technical skills are sought. Big data experience ..." 
      }, 
      "relationships": { 
       "location": { 
        "links": { 
         "self": "http://localhost:2510/api/v2/jobs/121/location" 
        }, 
        "data": { 
         "type": "location", 
         "id": 3 
        } 
       }, 
       "country": { 
        "links": { 
         "self": "http://localhost:2510/api/v2/jobs/121/country" 
        }, 
        "data": { 
         "type": "country", 
         "id": 1 
        } 
       }, 

At this point the output looks like this:

  85: [36026, 
       36028, 
       36032, 
       36027, 
       217897, 
       286398, 
       315064, 
       320879, 
       322303, 
       322608, 
       322611, 
       323199, 
       325659, 
       327652], 
     88: [13690, 
       13693, 
       13689, 
       13692, 
       13691, 
       16454, 
       16453, 
       28002, 
       28003, 
       28004, 
       28001, 
       114667, 
       233319, 
       233329, 
       263814, 
       271490, 
       271571, 
       271569, 
       271570, 
       291274, 
       291275, 
       300376, 
       300373, 
       301293, 
       301295, 
       304286, 
       304285, 
       320425, 
       320426, 
       320424, 
       320431, 
       320430, 
       321284, 
       321281, 
       321283, 
       321282, 
       321280, 
       324345, 
       327926, 
       347985, 
       358537, 
       358549, 
       357807, 
       364541, 
       358431, 
       334990, 
       359241], 

But I'd like to change it so that the output looks like this:

  ... 
     87: 02 
     88: 73 
     89: 15 
     90: 104 
     ... 

I know I need to put some kind of i=0, i++ into the loop somewhere, but I can't figure it out. How can I do this?


How does the expected output relate to the original data?


Generally, if you're thinking of using a C-style increment like 'i++', there is almost always a better way in Python, e.g. 'enumerate', 'itertools.count', 'collections.Counter'. These will usually do the job. – pylang

Answers


You only need the count of items per key in the dictionary, rather than the actual items being stored in the locations dict. Use defaultdict with int:

locations = defaultdict(int) 
# makes default value of each key as `0` 

Then write your for loop as:

for item in data['data']: 
    location = item['relationships']['location']['data']['id'] 
    locations[location] += 1 # increase the count by `1` 
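As a minimal, self-contained sketch of the two snippets above put together: the sample data below is hypothetical (made-up job and location ids shaped like the JSON in the question), and the name counts is only introduced for illustration.

```python
from collections import defaultdict

# Hypothetical sample shaped like the question's JSON; the ids are made up.
data = {
    "data": [
        {"id": 121, "relationships": {"location": {"data": {"type": "location", "id": 3}}}},
        {"id": 122, "relationships": {"location": {"data": {"type": "location", "id": 3}}}},
        {"id": 123, "relationships": {"location": {"data": {"type": "location", "id": 7}}}},
    ]
}

# Missing keys start at int() == 0, so += 1 works on first sight of a location.
locations = defaultdict(int)
for item in data["data"]:
    location = item["relationships"]["location"]["data"]["id"]
    locations[location] += 1

# Convert to a plain dict for display (assumed name, for illustration only).
counts = dict(locations)
print(counts)  # {3: 2, 7: 1}
```

No explicit i = 0 or i++ is needed here, because each location key carries its own running total.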

Or, better still, use collections.Counter() with a generator expression, as mentioned by @TigerhawkT3:

from collections import Counter 

Counter(item['relationships']['location']['data']['id'] for item in data['data'])
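A rough sketch of how the Counter result could be printed in the "location: count" layout the question asks for. The sample data is hypothetical (made-up ids), and sorting by location id is an assumption about the desired order.

```python
from collections import Counter

# Hypothetical sample shaped like the question's JSON; the ids are made up.
data = {
    "data": [
        {"id": 1, "relationships": {"location": {"data": {"type": "location", "id": 88}}}},
        {"id": 2, "relationships": {"location": {"data": {"type": "location", "id": 88}}}},
        {"id": 3, "relationships": {"location": {"data": {"type": "location", "id": 87}}}},
        {"id": 4, "relationships": {"location": {"data": {"type": "location", "id": 88}}}},
    ]
}

# Count how many jobs reference each location id.
counts = Counter(
    item["relationships"]["location"]["data"]["id"] for item in data["data"]
)

# One "location: count" line per location, ordered by location id (assumed order).
lines = [f"{location}: {count}" for location, count in sorted(counts.items())]
print("\n".join(lines))
# 87: 1
# 88: 3
```

Counter is a dict subclass, so counts[88] works like a normal lookup, and counts.most_common() is available if ordering by frequency is preferred.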

Or 'collections.Counter', which is designed specifically for counting. – TigerhawkT3


'collections.Counter(item['relationships']['location']['data']['id'] for item in data['data'])'? – TigerhawkT3