2015-12-02 76 views

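Python: How to sort and organize dictionary data

I want to total crimes by zip code and count victims by crime type. I built a dictionary keyed by report number. Here is the output for a small sample of my data when I print the dictionary: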

{'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']} 

The dictionary is structured as follows: {Report_number: [zip code, offense type, one 'VIC' entry per victim]}

I'm brand new to coding and I'm just learning dictionaries. How would I go through the dictionary to organize the data into this format?

Zip Codes  Crime totals
====================
64126      809
64127      3983
64128      1749
64129      1037
64130      4718
64131      2080
64132      2060
64133      2005
64134      2928

Any help would be greatly appreciated. Below is my code so far. I'm reading two files with about 50,000 rows of data, so efficiency is very important.

from collections import Counter

incidents_f = open('incidents.csv', mode="r")

crime_dict = dict()

for line in incidents_f:
    line_1st = line.strip().split(",")
    # Skip the header row.
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        offense = line_1st[3]
        zip_code = line_1st[4]
        # Treat missing or malformed zip codes as "99999".
        if len(zip_code) < 5:
            zip_code = "99999"

        if report_no in crime_dict:
            # list.append returns None, so it cannot be chained; use extend.
            crime_dict[report_no].extend([zip_code, offense])
        else:
            crime_dict[report_no] = [zip_code, offense]

# Close the file (close is a method and must be called).
incidents_f.close()

details_f = open('details.csv', mode='r')
for line in details_f:
    line_1st = line.strip().split(",")
    # Skip the header row.
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        involvement = line_1st[1]
        # Record a victim only for 'VIC' rows whose report is already known;
        # otherwise a stale or undefined value would be appended.
        if involvement.upper() == 'VIC' and report_no in crime_dict:
            crime_dict[report_no].append("VIC")

# Close the file (close is a method and must be called).
details_f.close()

print(crime_dict)

It would help if you could edit the question to include a few sample rows from your CSV files.

Answers


Here is a way to do this with a bit more code than @Alexander's solution:

crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Count one crime per report, keyed by zip code (the first list element).
crimes_by_zip = {}
for report_no, record in crime_dict.items():
    zip_code = record[0]  # named zip_code to avoid shadowing the built-in zip()
    crimes_by_zip[zip_code] = crimes_by_zip.get(zip_code, 0) + 1

# Print the totals in zip-code order.
for zip_code in sorted(crimes_by_zip):
    print(zip_code, crimes_by_zip[zip_code])

64108 1 
64130 3 
99999 1 
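
Since the question already imports Counter, the same tally could probably be written more compactly with it. This is only a sketch on the sample data from the question; the column widths in the printout are my own guess at the desired layout.

```python
from collections import Counter

# Sample data copied from the question:
# {report_no: [zip_code, offense, 'VIC', ...]}
crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Count one crime per report, keyed by the zip code in position 0.
crimes_by_zip = Counter(record[0] for record in crime_dict.values())

# Print the totals in zip-code order, roughly matching the requested layout.
print("Zip Codes  Crime totals")
print("=" * 20)
for zip_code, total in sorted(crimes_by_zip.items()):
    print(f"{zip_code:<10} {total}")
```

Counter makes a single pass over the dictionary, so it should scale to the roughly 50,000 rows mentioned in the question in much the same way as the explicit loop.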

Thanks, Steve. This works perfectly and makes complete sense. I appreciate your help. – Wakedude

D = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Pair each zip code with its report number, ordered by zip code.
data_with_zip_duplicate = [(D[key][0], key) for key in sorted(D, key=lambda x: D[x][0])]
print(*data_with_zip_duplicate, sep="\n")
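
The question also asks for victim counts by crime type. A sort key like the lambda above can order those totals too. The following is a rough sketch on the question's sample data, assuming every 'VIC' entry after the offense code stands for one victim; the names victims_by_offense and ranked are just illustrative.

```python
from collections import Counter

# Sample data copied from the question:
# {report_no: [zip_code, offense, 'VIC', ...]}
crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Sum the 'VIC' entries per offense type (assumption: one entry per victim).
victims_by_offense = Counter()
for zip_code, offense, *involvements in crime_dict.values():
    victims_by_offense[offense] += involvements.count('VIC')

# Sort by victim count, largest first; ties are ordered by offense code.
ranked = sorted(victims_by_offense.items(), key=lambda item: (-item[1], item[0]))
for offense, victims in ranked:
    print(offense, victims)
```

The starred name in the loop header collects everything after the first two list elements, so the zip code and offense code never get counted as victims.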

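Thanks for your help. We haven't started using lambda functions yet, but I have seen solutions like this while researching the problem. I'm curious: is the lambda function more efficient than the zip-code solution Steve proposed? – Wakedude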
