2015-12-14 258 views

Group by dictionary key on nested lists in Python

I'm having a hard time grouping ids by a dictionary key that sits inside a nested list.

The code below works for me; it groups the id and st values by location:

import itertools

null = ''  # placeholder so the JSON-style null literals below are valid Python
dataset={"users": [ 
    {"id": 20, "loc": "Chicago", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]}, 
    {"id": 21, "loc": "Frankfurt", "st":"4", "sectors": [{"sname": null}]}, 
    {"id": 22, "loc": "Berlin", "st":"6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"},{"sname": "Agri"}]}, 
    {"id": 23, "loc": "Chicago", "st":"2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]}, 
    {"id": 24, "loc": "Bern", "st":"1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]}, 
    {"id": 25, "loc": "Bern", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]} 
    ]} 

byloc = lambda x: x['loc'] 

it = (
    (loc, list(user_grp)) 
    for loc, user_grp in itertools.groupby(
     sorted(dataset['users'], key=byloc), key=byloc 
    ) 
) 
fs_loc = [ 
    {'loc': loc, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)} 
    for loc, grp in it 
] 

print(fs_loc) 

fs_loc gives me a list of the ids and their respective st values (together with the id count), as follows:

[ 
    {"loc": "Chicago","count":2,"ids": [{"id":"20","st":"4"}, {"id":"23","st":"2"}]}, 
    {"loc": "Bern","count":2,"ids": [{"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"loc": "Frankfurt","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"loc": "Berlin","count":1,"ids": [{"id":"22","st":"6"}]}
] 

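Now I'm trying to group by sname from sectors. I tried the code further below and it fails; I can't figure out how to get the following result.

Desired result: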

[ 
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"}, {"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"sname": "Manufacturing","count":2,"ids": [{"id":"20","st":"4"}, {"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":3,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"25","st":"4"}]}, 
    {"sname": "Agri","count":4,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}  
] 

I tried the code below, which does not work for dictionary keys inside a nested list:

bysname = lambda x: x['sectors'][0]['sname'] 

it = (
    (sname, list(user_grp)) 
    for sname, user_grp in itertools.groupby(
     sorted(dataset['users'], key=bysname), key=bysname 
    ) 
) 
fs_sname= [ 
    {'sname': sname, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)} 
    for sname, grp in it 
] 

print(fs_sname) 

Edit - the code above runs, but it only considers the first item in each sectors list, i.e. it gives the following result:

[ 
    {"sname": "","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"sname": "Manufacturing","count":1,"ids": [{"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":1,"ids": [{"id":"23","st":"2"}]}, 
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}  
] 

How can I achieve the desired result?

I don't understand what your desired result is. Can you provide a minimal example? – timgeb

I have added it. Could you check the highlighted part? –

What is 'null'? – timgeb

Answers

This should work as required:

allsectornames = set(sec['sname'] for record in dataset['users'] for sec in record['sectors']) 

summarize = lambda record: record[ 'id' ] # customize this to return whatever details you want (even just return the whole record itself if you prefer) 

result = [ 
    { 
     'sname':sname, 
     'count':len(matches), 
     'matches':[ summarize(match) for match in matches ] 
    } 
    for sname in allsectornames 
    for matches in [[ 
     record for record in dataset['users'] if sname in [ sec['sname'] for sec in record['sectors'] ] 
    ]] 
] 

print(result) 
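
For comparison, here is a minimal single-pass sketch of the same grouping, not the answerer's code. It assumes the question's dataset and its `null = ''` placeholder, skips empty sector names, and emits the `ids` and `count` keys from the desired result. The function name `group_by_sname` is made up for illustration. Note that user 25 also lists Banking in the dataset, so it appears in that group.

```python
from collections import defaultdict

null = ''  # same placeholder the question uses for missing sector names
dataset = {"users": [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]}


def group_by_sname(users):
    """Group users under every non-empty sector name they list (hypothetical helper)."""
    groups = defaultdict(list)
    for user in users:
        seen = set()  # guards against a user listing the same sector twice
        for sector in user["sectors"]:
            sname = sector["sname"]
            if not sname or sname in seen:
                continue  # skip null/empty names and duplicates
            seen.add(sname)
            groups[sname].append({"id": user["id"], "st": user["st"]})
    # On Python 3.7+ dicts keep insertion order, so groups appear in first-seen order.
    return [
        {"sname": sname, "count": len(ids), "ids": ids}
        for sname, ids in groups.items()
    ]


fs_sname = group_by_sname(dataset["users"])
print(fs_sname)
```

This visits each user once instead of rescanning the whole list for every sector name, which may matter on larger inputs. Unlike iterating over a set, the output order is stable.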

Thank you very much! I added the st field to get the collection of id and st. –
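
The comment does not show the changed code, so the following is only a guess at what adding the st field might look like. It assumes `summarize` is redefined to return both fields. The filter on empty names and the `sorted` call are additions for a reproducible order, and the data is a trimmed three-user sample of the question's dataset.

```python
null = ''  # placeholder for missing sector names, as in the question

# Trimmed sample of the question's dataset (three of the six users).
users = [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]


def summarize(record):
    """Assumed adaptation: return id and st instead of the id alone."""
    return {"id": record["id"], "st": record["st"]}


# Collect the distinct non-empty sector names; sorting makes the order reproducible.
allsectornames = sorted({
    sec["sname"]
    for record in users
    for sec in record["sectors"]
    if sec["sname"]
})

result = []
for sname in allsectornames:
    # Every user that lists this sector anywhere in its sectors list.
    matches = [
        record for record in users
        if any(sec["sname"] == sname for sec in record["sectors"])
    ]
    result.append({
        "sname": sname,
        "count": len(matches),
        "ids": [summarize(match) for match in matches],
    })

print(result)
```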