2015-05-15 49 views
1

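Creating a JSON file in a specific format from a CSV

I want to start by apologizing if my whole approach to creating the JSON file is wrong; I'm just trying to piece together what I can. If there is a better way, please suggest it. Here is my problem: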

I'm trying to create a JSON file from a CSV that has 3 columns, like this:

000024F14CF24E42A5F36D7CB7A07C26,Name One,action-1 
000024F14CF24E42A5F36D7CB7A07C26,Name One Variant,action-1 
000042F8F69C4A048DDD4770DB7966C8,Name Two,action-2 

The JSON format I need to produce is:

{
    "topics": [
        {
            "id": "000024f14cf24e42a5f36d7cb7a07c26",
            "label": [
                "Name One",
                "Name One Variant"
            ],
            "meta": {
                "action": "action-1"
            }
        },
        {
            "id": "000042f8f69c4a048ddd4770db7966c8",
            "label": [
                "Name Two"
            ],
            "meta": {
                "action": "action-2"
            }
        }
    ]
}

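So basically I need to merge the names into a single list that holds all the variants sharing the same ID, and I only need to keep one action, because the action is always the same for a given ID.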

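The script pasted below gets close, but I'm stuck. It outputs JSON like the following, where you can see that the actions are being added to the label array. How can I separate the action out?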

{ 
    "topics": [ 
     { 
      "id": "000024f14cf24e42a5f36d7cb7a07c26", 
      "label": [ 
       "Name One", 
       "action-1", 
       "Name One Variant", 
       "action-1" 
      ] 
     } 
    ] 
} 

The script:

import csv
import json
from collections import defaultdict

def convert2json():
    # open the CSV file and loop through each row and append to the uniques list
    uniques = []
    with open('uploads/test.csv','rb') as data_file:
        reader = csv.reader(data_file)
        for row in reader:
            itemids = row[0]
            values = row[1]
            actions = row[2]
            uniques.append((itemids, values, actions))

    # using defaultdict create a list, then loop through uniques and append
    output = defaultdict(list)
    for itemid, value, action in uniques:
        output[itemid].append(value)
        output[itemid].append(action)

    # loop through the defaultdict list and append values to a dictionary
    # then add values with labels to the done list
    done = []
    for out in output.items():
        jsonout = {}
        ids = out[0]
        jsonout['id'] = ids.lower()
        vals = out[1]
        jsonout['label'] = vals
        done.append(jsonout)

    # create a dictionary and add the "done" list to it so it outputs
    # an object with a JSON array named 'topics'
    dones = {}
    dones['topics'] = done

    print json.dumps(dones, indent=4, encoding='latin1')

if __name__ == "__main__":
    convert2json()
+1

Does your CSV file have a first row with column names? –

+1

You don't even attempt to build a 'meta' section. You just dump everything into `out[1]`. –

+0

@StefanPochmann There is no header row, but I can add one if it helps. – zrdunlap

Answers

3

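You were indeed close. I would just build the final structure right away. The first time an itemid is seen, prepare its entry and remember it; on later occurrences, just add the value to that entry's label list.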

import csv

summary = {}
with open('test.csv', 'rb') as data_file:
    reader = csv.reader(data_file)
    for itemid, value, action in reader:
        if itemid not in summary:
            summary[itemid] = dict(id=itemid, label=[value], meta={'action': action})
        else:
            summary[itemid]['label'].append(value)

data = {"topics": list(summary.values())}
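
For anyone on Python 3, here is a minimal self-contained sketch of the same idea. It is an adaptation rather than the code above verbatim: the `build_topics` helper name and the `SAMPLE` constant are made up for illustration, the ids are lowercased to match the format asked for in the question, and the file is replaced by `io.StringIO` so the snippet runs without `test.csv`.

```python
import csv
import io
import json

# Sample rows from the question, held in memory instead of test.csv (assumption).
SAMPLE = (
    "000024F14CF24E42A5F36D7CB7A07C26,Name One,action-1\n"
    "000024F14CF24E42A5F36D7CB7A07C26,Name One Variant,action-1\n"
    "000042F8F69C4A048DDD4770DB7966C8,Name Two,action-2\n"
)


def build_topics(rows):
    """Group (itemid, label, action) rows by itemid (hypothetical helper)."""
    summary = {}
    for itemid, value, action in rows:
        itemid = itemid.lower()
        if itemid not in summary:
            # First time this id is seen: create its entry, action included.
            summary[itemid] = {"id": itemid, "label": [value], "meta": {"action": action}}
        else:
            # Later rows for the same id only contribute another label.
            summary[itemid]["label"].append(value)
    return {"topics": list(summary.values())}


data = build_topics(csv.reader(io.StringIO(SAMPLE)))
print(json.dumps(data, indent=4))
```

With a real file on Python 3, the text-mode form `open('test.csv', newline='')` would take the place of `'rb'`. Because dictionaries keep insertion order in Python 3.7 and later, the topics come out in the order their ids first appear in the CSV.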
+0

Wow, thank you! That was so simple. – zrdunlap

2

I changed things around a bit:

import csv
import json

def convert2json2():
    # Map each lowercased id to its finished topic dict. Dictionary keys are
    # already unique, so the separate uniques list is no longer needed.
    topics = dict()

    with open('uploads/test.csv','rb') as data_file:
        reader = csv.reader(data_file)

        # example row: 000024F14CF24E42A5F36D7CB7A07C26,Name One,action-1
        for row in reader:
            # id is a builtin function, so use id_ instead; the other names
            # match your final JSON attribute names
            id_ = row[0].lower()
            # you might have had the columns wrong before
            label = row[1]
            action = row[2]

            # On first sight of an id, store a dict shaped like the desired
            # JSON entry. The action is always the same per id, so it is
            # populated on the first pass, and so is id_.
            topic = topics.setdefault(id_, dict(
                id=id_,
                label=[],
                meta=dict(action=action)
            ))

            # the label list starts out empty; add to it on every pass
            topic["label"].append(label)

    # create a dictionary holding the topic dicts so it outputs an object
    # with a JSON array named 'topics'; use the values, not the id-keyed
    # topics dictionary itself
    dones = {}
    dones['topics'] = topics.values()

    print json.dumps(dones, indent=4, encoding='latin1')

And the output:

{ 
    "topics": [ 
     { 
      "meta": { 
       "action": "action-1" 
      }, 
      "id": "000024f14cf24e42a5f36d7cb7a07c26", 
      "label": [ 
       "Name One", 
       "Name One Variant" 
      ] 
     }, 
     { 
      "meta": { 
       "action": "action-2" 
      }, 
      "id": "000042f8f69c4a048ddd4770db7966c8", 
      "label": [ 
       "Name Two" 
      ] 
     } 
    ] 
} 
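
The function above is Python 2 code. As a hedged sketch of what the `setdefault` approach might look like on Python 3: `json.dumps` there has no `encoding` argument, and `dict.values()` returns a view that the `json` module cannot serialize, so it is copied into a list first. The in-memory `rows` list stands in for the CSV file and is an assumption made only to keep the snippet self-contained.

```python
import json

# Rows as csv.reader would yield them; held in memory for illustration only.
rows = [
    ("000024F14CF24E42A5F36D7CB7A07C26", "Name One", "action-1"),
    ("000024F14CF24E42A5F36D7CB7A07C26", "Name One Variant", "action-1"),
    ("000042F8F69C4A048DDD4770DB7966C8", "Name Two", "action-2"),
]

topics = {}
for raw_id, label, action in rows:
    id_ = raw_id.lower()
    # setdefault stores the new entry only if id_ is not present yet,
    # and returns the stored entry either way.
    topic = topics.setdefault(id_, {"id": id_, "label": [], "meta": {"action": action}})
    topic["label"].append(label)

# list(...) is required on Python 3: a dict_values view is not JSON serializable.
result = json.dumps({"topics": list(topics.values())}, indent=4)
print(result)
```

One trade-off of `setdefault` is that the default dictionary is built on every row even when the id already exists; for a small CSV like this one that cost is negligible.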
+0

This is much simpler than where I was headed. Thank you so much for this! – zrdunlap