2017-02-13 72 views
0

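Importing Excel data into nested dictionaries

I have a spreadsheet whose data I am trying to import into a set of nested Python dictionaries. Essentially, the spreadsheet has columns for site, building, floor, room, row, and rack. I would like the data structure to look like this: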

sites = [
    {
        "name": "",
        "descr": "",
        "buildings": [
            {
                "name": "",
                "descr": "",
                "floors": [
                    {
                        "name": "",
                        "descr": "",
                        "rooms": [
                            {
                                "name": "",
                                "descr": "",
                                "rows": [
                                    {
                                        "name": "",
                                        "descr": "",
                                        "racks": [
                                            {
                                                "name": "",
                                                "descr": ""
                                            }
                                        ]
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
]

An example of the spreadsheet:

+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+
| site | site_descr    | building | building_descr | floor | floor_descr | room | room_descr | row | row_descr | rack | rack_descr | rack_dn                                                           |
+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+
| dc1  | Data Center 1 | alpha    | Alpha Building | 1     | Floor 1     | 100  | Room 100   | A   | Row A     | A5   | Rack A5    | uni/fabric/site-dc1/building-alpha/floor-1/room-100/row-A/rack-A5 |
+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+
| dc1  | Data Center 1 | alpha    | Alpha Building | 1     | Floor 1     | 100  | Room 100   | A   | Row A     | A5   | Rack A5    | uni/fabric/site-dc1/building-alpha/floor-1/room-100/row-A/rack-A5 |
+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+
| dc1  | Data Center 1 | alpha    | Alpha Building | 1     | Floor 1     | 200  | Room 200   | A   | Row A     | A5   | Rack A5    | uni/fabric/site-dc1/building-alpha/floor-1/room-200/row-A/rack-A5 |
+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+
| dc1  | Data Center 1 | alpha    | Alpha Building | 1     | Floor 1     | 100  | Room 100   | B   | Row B     | B5   | Rack B5    | uni/fabric/site-dc1/building-alpha/floor-1/room-100/row-B/rack-B5 |
+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+
| dc1  | Data Center 1 | alpha    | Alpha Building | 2     | Floor 2     | 100  | Room 100   | A   | Row A     | A7   | Rack A7    | uni/fabric/site-dc1/building-alpha/floor-2/room-100/row-A/rack-A7 |
+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+
| dc2  | Data Center 2 | beta     | Beta Building  | 5     | Floor 5     | 200  | Room 200   | B   | Row B     | B5   | Rack B5    | uni/fabric/site-dc2/building-beta/floor-5/room-200/row-B/rack-B5  |
+------+---------------+----------+----------------+-------+-------------+------+------------+-----+-----------+------+------------+-------------------------------------------------------------------+

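What is the best way to get this data into my structure? The pyexcel module can import records, which essentially produces a list of dictionaries, one entry per spreadsheet row. I am having trouble working out the logic to restructure them...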
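To make the shape of those records concrete, here is a minimal sketch that mimics it with the standard library's `csv.DictReader` rather than pyexcel itself. The CSV text, the reduced set of columns, and the `load_records` helper are hypothetical stand-ins for the real spreadsheet, and every value comes back as a string in this version.

```python
import csv
import io

# Hypothetical two-row sample in CSV form, using a subset of the spreadsheet's columns.
SAMPLE = """site,site_descr,building,building_descr
dc1,Data Center 1,alpha,Alpha Building
dc2,Data Center 2,beta,Beta Building
"""


def load_records(text):
    """Return one dictionary per data row, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


records = load_records(SAMPLE)
for record in records:
    print(record["site"], record["building_descr"])
```

Each record is flat, so the nesting has to be rebuilt from the repeated site, building, floor, room, and row values.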

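Should I initialize the structure before the for loop, or should I build the structure as I populate it? If I initialize it with the blank values shown, I would need to make sure my first row fills in those blanks, which makes me think the latter is probably the better option.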

+0

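This question would benefit greatly from a [Minimal, Complete, and Verifiable](http://stackoverflow.com/help/mcve) example, which makes it much easier for us to help you. In particular, there is no sample data and no code that has been tried.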

+0

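@StephenRauch I have updated my post with the spreadsheet I am working with. I do not have any real code yet, because I am having trouble even getting started. I feel like I need to use the dictionary "setdefault" method for some of this... – mikey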
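One way the "setdefault" idea could look is sketched below. It is only a sketch under assumptions: the intermediate tree is keyed by name with a hypothetical "children" key (not the list layout from the question), and `add_record`, `to_lists`, `LEVELS`, and `CHILD_KEYS` are illustrative names. A second pass converts the tree into the list-of-dictionaries layout. It relies on dictionaries preserving insertion order (Python 3.7+).

```python
LEVELS = ["site", "building", "floor", "room", "row", "rack"]
CHILD_KEYS = ["buildings", "floors", "rooms", "rows", "racks"]


def add_record(tree, record):
    """Insert one spreadsheet row into a name-keyed nested dictionary."""
    children = tree
    for level in LEVELS:
        # setdefault returns the node already stored under this name,
        # or stores the new node and returns it.
        node = children.setdefault(
            record[level],
            {"descr": record[level + "_descr"], "children": {}},
        )
        children = node["children"]


def to_lists(children, depth=0):
    """Convert the name-keyed tree into nested lists of dictionaries."""
    result = []
    for name, node in children.items():
        entry = {"name": name, "descr": node["descr"]}
        if depth < len(CHILD_KEYS):  # racks (depth 5) have no child list
            entry[CHILD_KEYS[depth]] = to_lists(node["children"], depth + 1)
        result.append(entry)
    return result


# Two rows taken from the sample spreadsheet (rack_dn omitted).
first = {
    "site": "dc1", "site_descr": "Data Center 1",
    "building": "alpha", "building_descr": "Alpha Building",
    "floor": 1, "floor_descr": "Floor 1",
    "room": 100, "room_descr": "Room 100",
    "row": "A", "row_descr": "Row A",
    "rack": "A5", "rack_descr": "Rack A5",
}
second = dict(first, room=200, room_descr="Room 200")

tree = {}
for row in (first, second):
    add_record(tree, row)
sites = to_lists(tree)
```

Nothing is initialized up front except the empty top-level dictionary, so the structure is built while it is populated. Name lookups are dictionary accesses rather than list scans.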

Answers

0

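I think what we should do is iterate over the column names, look for the dictionary with the matching name for that column, create it if it does not exist, and then descend into its child array: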

import pprint

# Hierarchy levels, outermost first; each has a matching '<level>_descr' column.
columns = ['site', 'building', 'floor', 'room', 'row', 'rack']
# Key that holds the child list at each level; racks are leaves, so there are only five.
keys = ['buildings', 'floors', 'rooms', 'rows', 'racks']


def find(seq, pred):
    """Return the first item of seq for which pred(item) is true, else None."""
    return next((x for x in seq if pred(x)), None)


def add_record(sites, record):
    """Merge one spreadsheet row into the nested sites structure."""
    array = sites
    for index, column in enumerate(columns):
        name = record[column]
        descr = record[column + '_descr']
        # Reuse the entry with this name at the current level, or create it.
        dictionary = find(array, lambda x: x['name'] == name)
        if dictionary is None:
            dictionary = {'name': name, 'descr': descr}
            if column != 'rack':
                dictionary[keys[index]] = []
            array.append(dictionary)
        if column != 'rack':
            # Descend into this entry's child list for the next column.
            array = dictionary[keys[index]]
        else:
            dictionary['rack_dn'] = record['rack_dn']


def main():
    records = [
        {'site': 'dc1', 'site_descr': 'Data Center 1', 'building': 'alpha',
         'building_descr': 'Alpha Building', 'floor': 1, 'floor_descr': 'Floor 1',
         'room': 100, 'room_descr': 'Room 100', 'row': 'A', 'row_descr': 'Row A',
         'rack': 'A5', 'rack_descr': 'Rack A5',
         'rack_dn': 'uni/fabric/site-dc1/building-alpha/floor-1/room-100/row-A/rack-A5'},
        {'site': 'dc1', 'site_descr': 'Data Center 1', 'building': 'alpha',
         'building_descr': 'Alpha Building', 'floor': 1, 'floor_descr': 'Floor 1',
         'room': 200, 'room_descr': 'Room 200', 'row': 'A', 'row_descr': 'Row A',
         'rack': 'A5', 'rack_descr': 'Rack A5',
         'rack_dn': 'uni/fabric/site-dc1/building-alpha/floor-1/room-200/row-A/rack-A5'},
        {'site': 'dc2', 'site_descr': 'Data Center 2', 'building': 'beta',
         'building_descr': 'Beta Building', 'floor': 5, 'floor_descr': 'Floor 5',
         'room': 200, 'room_descr': 'Room 200', 'row': 'B', 'row_descr': 'Row B',
         'rack': 'B5', 'rack_descr': 'Rack B5',
         'rack_dn': 'uni/fabric/site-dc2/building-beta/floor-5/room-200/row-B/rack-B5'},
    ]
    sites = []
    for record in records:
        add_record(sites, record)
    pp = pprint.PrettyPrinter()
    pp.pprint(sites)


if __name__ == '__main__':
    main()

+0

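I think this is very close. When I run it, it appears to create the structure I need, but when I look at the output all of my 'descr' keys are empty. I tried modifying the columns object to add site_descr, building_descr, and so on, since those are columns too, but I got an "IndexError: list index out of range" error. – mikey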

+0

@mikey, I have updated my answer to retrieve the '*_descr' columns and 'rack_dn'.

+0

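This looks great. Thank you very much! I removed the rack_dn part, since I had been meaning to drop it from the spreadsheet anyway. Now I just need to see whether I can figure out what the code is actually doing... – mikey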
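As an aid to following what the code does, here is a minimal sketch of the same walk with the rack_dn part removed. It records, for each column, whether an entry was found or created. The name `add_record_traced` and the log format are hypothetical additions for illustration, not part of the answer.

```python
COLUMNS = ["site", "building", "floor", "room", "row", "rack"]
KEYS = ["buildings", "floors", "rooms", "rows", "racks"]


def add_record_traced(sites, record):
    """Merge one row into sites and return a (column, name, status) log."""
    log = []
    array = sites
    for index, column in enumerate(COLUMNS):
        name = record[column]
        # next() with a default returns None when no entry has this name.
        node = next((d for d in array if d["name"] == name), None)
        status = "found"
        if node is None:
            status = "created"
            node = {"name": name, "descr": record[column + "_descr"]}
            if index < len(KEYS):  # every level except rack gets a child list
                node[KEYS[index]] = []
            array.append(node)
        log.append((column, name, status))
        if index < len(KEYS):
            # Step down: the next column is searched in this node's children.
            array = node[KEYS[index]]
    return log


first = {
    "site": "dc1", "site_descr": "Data Center 1",
    "building": "alpha", "building_descr": "Alpha Building",
    "floor": 1, "floor_descr": "Floor 1",
    "room": 100, "room_descr": "Room 100",
    "row": "A", "row_descr": "Row A",
    "rack": "A5", "rack_descr": "Rack A5",
}
second = dict(first, room=200, room_descr="Room 200")

sites = []
first_log = add_record_traced(sites, first)
second_log = add_record_traced(sites, second)
for entry in second_log:
    print(entry)
```

For the first row every level is created. For the second row the site, building, and floor are found, and a new branch starts at room 200, so the two rows share everything above the room level.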