2014-04-15 116 views

Creating a nested dictionary from a flat list with Python

I have a list of files in the following form:

base/images/graphs/one.png 
base/images/tikz/two.png 
base/refs/images/three.png 
base/one.txt 
base/chapters/two.txt 

I want to convert them into a nested dictionary like this:

{"name": "base", "contents": [
    {"name": "images", "contents": [
        {"name": "graphs", "contents": [{"name": "one.png"}]},
        {"name": "tikz", "contents": [{"name": "two.png"}]}
    ]},
    {"name": "refs", "contents": [
        {"name": "images", "contents": [{"name": "three.png"}]}
    ]},
    {"name": "one.txt"},
    {"name": "chapters", "contents": [{"name": "two.txt"}]}
]}

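The trouble with my attempted solution is that, given input such as "images/datasetone/grapha.png" and "images/datasetone/graphb.png", each file ends up in a different dictionary named "datasetone". I want them under the same parent dictionary, since they are in the same directory. How can I create this nested structure without duplicating the parent dictionaries when several files share a common path?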

Here is what I came up with, which fails:

from pprint import pprint


def path_to_tree(params):
    start = {}
    for item in params:
        parts = item.split('/')
        depth = len(parts)
        if depth > 1:
            if "contents" in start.keys():
                start["contents"].append(create_base_dir(parts[0], parts[1:]))
            else:
                start["contents"] = [create_base_dir(parts[0], parts[1:])]
        else:
            if "contents" in start.keys():
                start["contents"].append(create_leaf(parts[0]))
            else:
                start["contents"] = [create_leaf(parts[0])]
    return start


def create_base_dir(base, parts):
    l = {}
    if len(parts) >= 1:
        l["name"] = base
        l["contents"] = [create_base_dir(parts[0], parts[1:])]
    elif len(parts) == 0:
        l = create_leaf(base)
    return l


def create_leaf(base):
    l = {}
    l["name"] = base
    return l


b = ["base/images/graphs/one.png", "base/images/graphs/oneb.png",
     "base/images/tikz/two.png", "base/refs/images/three.png",
     "base/one.txt", "base/chapters/two.txt"]
d = path_to_tree(b)
pprint(d)

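In this example you can see that we end up with as many dictionaries named "base" as there are files in the list, but only one is needed, and the subdirectories should be listed in its "contents" array.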
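The duplication can be reproduced in a few lines. The sketch below is a condensed restatement of the approach above, not the original code: `chain` is a hypothetical helper that mirrors what `create_base_dir` does, building a fresh chain of dictionaries for every path without checking whether a directory with the same name already exists.

```python
def chain(parts):
    """Build one unshared chain of dicts for a single path (no lookup of existing nodes)."""
    node = {"name": parts[0]}
    if len(parts) > 1:
        node["contents"] = [chain(parts[1:])]
    return node


paths = ["images/datasetone/grapha.png", "images/datasetone/graphb.png"]

# Each path gets its own chain, so the shared directories are created twice.
contents = [chain(p.split("/")) for p in paths]
names = [entry["name"] for entry in contents]
print(names)  # ['images', 'images']
```

Because nothing ever searches the existing "contents" list for a matching name, every file brings its own copy of each parent directory.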


Please check your example output - "contents":["name":"one.png"] doesn't make sense – jonrsharpe


Why does one.png have no contents while one.txt does? Shouldn't one.txt not be treated as a directory? – Matthew


@jonrsharpe Done – mike

Answers

1

This does not assume that all paths start with the same component, so the top level has to be a list:

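from pprint import pprint


def addBits2Tree(bits, tree):
    if len(bits) == 1:
        # Last component: treat it as a file (a leaf without "contents").
        tree.append({'name': bits[0]})
    else:
        # Reuse an existing directory entry with the same name, if there is one.
        for t in tree:
            if t['name'] == bits[0]:
                addBits2Tree(bits[1:], t['contents'])
                return
        # Otherwise create the directory and fill it recursively.
        newTree = []
        addBits2Tree(bits[1:], newTree)
        t = {'name': bits[0], 'contents': newTree}
        tree.append(t)


def addPath2Tree(path, tree):
    bits = path.split("/")
    addBits2Tree(bits, tree)


# b is the list of paths defined in the question.
tree = []
for p in b:
    print(p)
    addPath2Tree(p, tree)
pprint(tree)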

For your example list of paths it produces the following:

[{'contents': [{'contents': [{'contents': [{'name': 'one.png'},
                                           {'name': 'oneb.png'}],
                              'name': 'graphs'},
                             {'contents': [{'name': 'two.png'}],
                              'name': 'tikz'}],
                'name': 'images'},
               {'contents': [{'contents': [{'name': 'three.png'}],
                              'name': 'images'}],
                'name': 'refs'},
               {'name': 'one.txt'},
               {'contents': [{'name': 'two.txt'}], 'name': 'chapters'}],
  'name': 'base'}]
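
If a single root dictionary is wanted, as in the question, rather than a list, the list can be unwrapped when it holds exactly one entry. The following is a rough iterative sketch of the same lookup-before-create idea, not the code from the answer; `build_tree` and `single_root` are hypothetical names introduced here for illustration.

```python
def build_tree(paths):
    """Return a list of {'name': ..., 'contents': [...]} nodes from slash-separated paths."""
    roots = []
    for path in paths:
        level = roots                      # sibling nodes at the current depth
        parts = path.split("/")
        for directory in parts[:-1]:
            # Reuse an existing directory node with this name, if any.
            for node in level:
                if node["name"] == directory and "contents" in node:
                    break
            else:
                node = {"name": directory, "contents": []}
                level.append(node)
            level = node["contents"]
        level.append({"name": parts[-1]})  # the last part is treated as a file
    return roots


def single_root(roots):
    """Unwrap the list when every path shares one top-level directory."""
    if len(roots) != 1:
        raise ValueError("paths do not share a single top-level directory")
    return roots[0]


paths = ["base/images/graphs/one.png", "base/images/graphs/oneb.png", "base/one.txt"]
tree = single_root(build_tree(paths))
print(tree["name"])
```

The linear search through siblings is fine for small listings; for very wide directories a name-to-node index per level would avoid the repeated scans.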

Looks good, it does what I need – mike

0

If you leave out the redundant name labels, you can go with:

import json

result = {}

records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
           "base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]

recordsSplit = [record.split("/") for record in records]

for record in recordsSplit:
    here = result
    # Walk down the directory components, creating dictionaries as needed.
    for item in record[:-1]:
        if item not in here:
            here[item] = {}
        here = here[item]
    # Collect the file names of this directory under a marker key.
    if "###content###" not in here:
        here["###content###"] = []
    here["###content###"].append(record[-1])

print(json.dumps(result, indent=4))

The # characters are there for uniqueness (there could be a folder named content somewhere in the hierarchy). Just run it and look at the result.
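
To get from this name-keyed dictionary back to the name/contents layout asked for in the question, a recursive conversion is one option. This is a minimal sketch for Python 3, assuming the `###content###` marker key used above; `build_nested` and `to_named_tree` are hypothetical helper names, and `build_nested` is only a `setdefault`-based restatement of the loop above.

```python
FILES_KEY = "###content###"  # marker key holding the file names of a directory


def build_nested(paths):
    """Build the name-keyed dict: directories are keys, files sit under FILES_KEY."""
    result = {}
    for path in paths:
        *directories, filename = path.split("/")
        here = result
        for directory in directories:
            here = here.setdefault(directory, {})
        here.setdefault(FILES_KEY, []).append(filename)
    return result


def to_named_tree(nested):
    """Convert the name-keyed dict into a list of {'name': ..., 'contents': [...]} nodes."""
    nodes = []
    for name, value in nested.items():
        if name == FILES_KEY:
            nodes.extend({"name": filename} for filename in value)
        else:
            nodes.append({"name": name, "contents": to_named_tree(value)})
    return nodes


nested = build_nested(["base/images/one.png", "base/images/two.png", "base/one.txt"])
named = to_named_tree(nested)
print(named)
```

Keying directories by name makes the lookup for an existing parent a plain dictionary access, which is why no duplicate parents can appear in this representation.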

Edit: fixed some typos, added the output.


What is result at the start of the for loop? – mike


records is also undefined – Vorsprung


Is a typo such a major flaw that the answer needs to be downvoted? – Danstahr