2010-09-26 45 views
21

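Python: Easily access deeply nested dict (get and set)

I'm building some Python code to read and manipulate deeply nested dicts (ultimately for interacting with JSON services, though it would be great to have for other purposes too). I'm looking for a way to easily read/set/update values deep within the dict without needing a lot of code.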

@see also Python: Recursively access dict via attributes as well as index access? - Curt Hagenlocher's "DotDictify" solution is pretty eloquent. I also like what Ben Alman presents for JavaScript in http://benalman.com/projects/jquery-getobject-plugin/. It would be great to somehow combine the two.

Building on Curt Hagenlocher's and Ben Alman's examples, it would be great to have a capability in Python like this:

>>> my_obj = DotDictify() 
>>> my_obj.a.b.c = {'d':1, 'e':2} 
>>> print my_obj 
{'a': {'b': {'c': {'d': 1, 'e': 2}}}} 
>>> print my_obj.a.b.c.d 
1 
>>> print my_obj.a.b.c.x 
None 
>>> print my_obj.a.b.c.d.x 
None 
>>> print my_obj.a.b.c.d.x.y.z 
None 

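Is this possible, and if so, any thoughts on how to go about modifying the DotDictify solution?

Alternatively, the get method could be made to accept dot notation (with a complementary set method added), although the object notation is certainly cleaner.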

>>> my_obj = DotDictify() 
>>> my_obj.set('a.b.c', {'d':1, 'e':2}) 
>>> print my_obj 
{'a': {'b': {'c': {'d': 1, 'e': 2}}}} 
>>> print my_obj.get('a.b.c.d') 
1 
>>> print my_obj.get('a.b.c.x') 
None 
>>> print my_obj.get('a.b.c.d.x') 
None 
>>> print my_obj.get('a.b.c.d.x.y.z') 
None 

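This type of interaction would be great to have for dealing with deeply nested dicts. Does anybody know another strategy (or sample code snippet/library) to try?

Answers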

33

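Attribute Tree

The problem with your first specification is that Python can't tell in __getitem__ whether, at my_obj.a.b.c.d, you will next proceed farther down a nonexistent tree, in which case it needs to return an object with a __getitem__ method so that you don't get an AttributeError, or whether you want a value, in which case it needs to return None.

I would argue that in every case you list above, you should expect it to throw a KeyError instead of returning None. The reason is that you can't tell whether None means "no key" or "someone actually stored None at that location". For this behavior, all you have to do is take dotdictify, remove marker, and replace __getitem__ with: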

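def __getitem__(self, key):
    # Delegate to dict directly; self[key] here would recurse forever.
    return dict.__getitem__(self, key)

Because what you really want is a dict with __getattr__ and __setattr__.

There may be a way to remove __getitem__ entirely and say something like __getattr__ = dict.__getitem__, but I think this may be over-optimization, and it will be a problem if you later decide you want __getitem__ to create the tree as it goes, like dotdictify originally does, in which case you would change it to: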
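As a rough sketch of the "dict with __getattr__ and __setattr__" idea (the class name `AttrDict` is made up for this illustration, and the code is written for Python 3), missing keys raise KeyError, so a stored None can be told apart from an absent key:

```python
class AttrDict(dict):
    """Hypothetical name: a dict whose keys can also be used as attributes."""

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails, so dict
        # methods such as .keys() keep working. Raises KeyError if missing.
        return dict.__getitem__(self, key)

    def __setattr__(self, key, value):
        dict.__setitem__(self, key, value)


d = AttrDict()
d.a = AttrDict()
d.a.b = None            # None stored on purpose
print(d.a.b)            # None (a real stored value)
print("b" in d.a)       # True
try:
    d.a.missing
except KeyError:
    print("no such key")  # a missing key is an error, not None
```

Note that raising KeyError from attribute access is what the answer argues for; code that expects AttributeError (for example `hasattr`) will not treat a missing key as "attribute absent".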

def __getitem__(self, key):
    if key not in self:
        dict.__setitem__(self, key, dotdictify())
    return dict.__getitem__(self, key)

I don't like the marker business in the original dotdictify.

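Path Support

The problem with the second specification (override get() and set()) is that a normal dict has a get() that operates differently from what you describe, and it doesn't even have a set (though it has a setdefault(), which is the inverse operation to get()). People expect get to take two parameters, the second being a default to return if the key isn't found.

If you want to extend __getitem__ and __setitem__ to handle dotted-key notation, you'll need to modify dotdictify to: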
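For reference, this is how the built-in dict methods mentioned above behave (standard library only; the dictionary contents are made up for the example):

```python
settings = {"depth": None}

print(settings.get("depth"))             # None: key exists, value is None
print(settings.get("width"))             # None: key is missing
print(settings.get("width", 80))         # 80: explicit default, nothing stored
print(settings.setdefault("width", 80))  # 80, and the key is now stored
print(settings)                          # {'depth': None, 'width': 80}
```

The first two lines show why a None result from get() alone cannot distinguish a missing key from a stored None.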

class dotdictify(dict):
    def __init__(self, value=None):
        if value is None:
            pass
        elif isinstance(value, dict):
            # Go through __setitem__ so nested dicts are converted too.
            for key in value:
                self.__setitem__(key, value[key])
        else:
            raise TypeError('expected dict')

    def __setitem__(self, key, value):
        if '.' in key:
            # Peel off the first path component and recurse into it.
            myKey, restOfKey = key.split('.', 1)
            target = self.setdefault(myKey, dotdictify())
            if not isinstance(target, dotdictify):
                raise KeyError('cannot set "%s" in "%s" (%s)' % (restOfKey, myKey, repr(target)))
            target[restOfKey] = value
        else:
            if isinstance(value, dict) and not isinstance(value, dotdictify):
                value = dotdictify(value)
            dict.__setitem__(self, key, value)

    def __getitem__(self, key):
        if '.' not in key:
            return dict.__getitem__(self, key)
        myKey, restOfKey = key.split('.', 1)
        target = dict.__getitem__(self, myKey)
        if not isinstance(target, dotdictify):
            raise KeyError('cannot get "%s" in "%s" (%s)' % (restOfKey, myKey, repr(target)))
        return target[restOfKey]

    def __contains__(self, key):
        if '.' not in key:
            return dict.__contains__(self, key)
        myKey, restOfKey = key.split('.', 1)
        # A missing first component means the path is absent,
        # so report False instead of letting a KeyError escape.
        if not dict.__contains__(self, myKey):
            return False
        target = dict.__getitem__(self, myKey)
        if not isinstance(target, dotdictify):
            return False
        return restOfKey in target

    def setdefault(self, key, default):
        if key not in self:
            self[key] = default
        return self[key]

    __setattr__ = __setitem__
    __getattr__ = __getitem__

Test code:

>>> life = dotdictify({'bigBang': {'stars': {'planets': {}}}}) 
>>> life.bigBang.stars.planets 
{} 
>>> life.bigBang.stars.planets.earth = { 'singleCellLife' : {} } 
>>> life.bigBang.stars.planets 
{'earth': {'singleCellLife': {}}} 
>>> life['bigBang.stars.planets.mars.landers.vikings'] = 2 
>>> life.bigBang.stars.planets.mars.landers.vikings 
2 
>>> 'landers.vikings' in life.bigBang.stars.planets.mars 
True 
>>> life.get('bigBang.stars.planets.mars.landers.spirit', True) 
True 
>>> life.setdefault('bigBang.stars.planets.mars.landers.opportunity', True) 
True 
>>> 'landers.opportunity' in life.bigBang.stars.planets.mars 
True 
>>> life.bigBang.stars.planets.mars 
{'landers': {'opportunity': True, 'vikings': 2}} 
+0

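Thanks so much, Mike. I added a get function that accepts dot notation (and a default, as you noted). I think this new dotdictify class will make life much easier when dealing with deeply nested dicts. Thanks again. – Hal 2010-09-27 01:23:56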

+0

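Do you need a 'get()' function? What does it do that the existing 'get()' doesn't? The 'get()' you described in the question is equivalent to 'get(key, None)', which you get for free from 'dict'. – 2010-09-27 01:40:02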

+0

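When I used the dotdictify class "as-is" with my Python 2.5 install (Google App Engine SDK), the get function didn't handle dot-notation requests for some reason. So I wrote a quick wrapper for the get() function that checks for dot notation and, if present, passes the key to __getattr__ (returning the default on an exception), otherwise passes it to dict.get(self, key, default). – Hal 2010-09-27 18:15:44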
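A likely explanation for what the comment describes is that, at least in CPython, the built-in dict.get() does not go through an overridden __getitem__. The following is only a sketch of such a get() wrapper, not Hal's actual code; `DottedGetDict` is a made-up, stripped-down stand-in for dotdictify that supports dotted reads only (Python 3):

```python
class DottedGetDict(dict):
    """Hypothetical minimal stand-in for dotdictify (dotted reads only)."""

    def __getitem__(self, key):
        if "." not in key:
            return dict.__getitem__(self, key)
        first, rest = key.split(".", 1)
        target = dict.__getitem__(self, first)
        if not isinstance(target, DottedGetDict):
            raise KeyError(key)
        return target[rest]

    def get(self, key, default=None):
        # Route through our own __getitem__ so dotted keys are understood,
        # and fall back to the default when any path component is missing.
        try:
            return self[key]
        except KeyError:
            return default


inner = DottedGetDict(vikings=2)
life = DottedGetDict(mars=DottedGetDict(landers=inner))

print(life.get("mars.landers.vikings"))         # 2
print(life.get("mars.landers.spirit", "n/a"))   # n/a
print(dict.get(life, "mars.landers.vikings"))   # None: the built-in ignores the override
```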

2

I once used something similar to build a Trie for a comparable application. I hope it helps.

class Trie:
    """
    A Trie is like a dictionary in that it maps keys to values.
    However, because of the way keys are stored, it allows
    look up based on the longest prefix that matches.

    In this version the value kept at each node is a counter: the
    number of inserted keys that pass through that node.
    """

    def __init__(self):
        # Every node is a list with two positions. The first one
        # holds the counter, the second one a dictionary that
        # leads to the rest of the nodes.
        self.root = [0, {}]

    def insert(self, key):
        """
        Add the given key, incrementing the counter of every node
        along its path.

        >>> a = Trie()
        >>> a.insert('kalo')
        >>> print(a)
        [0, {'k': [1, {'a': [1, {'l': [1, {'o': [1, {}]}]}]}]}]
        >>> a.insert('kalo')
        >>> print(a)
        [0, {'k': [2, {'a': [2, {'l': [2, {'o': [2, {}]}]}]}]}]
        >>> b = Trie()
        >>> b.insert('ha')
        >>> b.insert('heh')
        >>> print(b)
        [0, {'h': [2, {'a': [1, {}], 'e': [1, {'h': [1, {}]}]}]}]

        """

        # Walk down the trie, creating missing nodes on the way.
        curr_node = self.root
        for k in key:
            curr_node = curr_node[1].setdefault(k, [0, {}])
            curr_node[0] += 1

    def find(self, key):
        """
        Return how many inserted keys start with the given key,
        or 0 if there are none.

        >>> a = Trie()
        >>> a.insert('ha')
        >>> a.insert('ha')
        >>> a.insert('he')
        >>> a.insert('ho')
        >>> print(a.find('h'))
        4
        >>> print(a.find('ha'))
        2
        >>> print(a.find('he'))
        1

        """

        curr_node = self.root
        for k in key:
            try:
                curr_node = curr_node[1][k]
            except KeyError:
                return 0
        return curr_node[0]

    def __str__(self):
        return str(self.root)

    def __getitem__(self, key):
        # Generator: yields (next character, counter) for every
        # child of the node reached by following key.
        curr_node = self.root
        for k in key:
            try:
                curr_node = curr_node[1][k]
            except KeyError:
                # Unknown prefix: stop without yielding anything.
                return
        for k in curr_node[1]:
            yield k, curr_node[1][k][0]

if __name__ == '__main__':
    a = Trie()
    a.insert('kalo')
    a.insert('kala')
    a.insert('kal')
    a.insert('kata')
    print(a.find('kala'))
    for b in a['ka']:
        print(b)
    print(a)
1

Fellow Googlers: we now have addict:

pip install addict 

>>> from addict import Dict
>>> mapping = Dict()
>>> mapping.a.b.c.d.e = 2
>>> mapping
{'a': {'b': {'c': {'d': {'e': 2}}}}}

I use it extensively.

To work with dotted paths, I found dotted:

obj = DottedDict({'hello': {'world': {'wide': 'web'}}}) 
obj['hello.world.wide'] == 'web' # true
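If pulling in a library is not an option, the same dotted-path idea can be approximated with two small helper functions over plain dicts. This is only a sketch: `get_path` and `set_path` are names invented here, not part of addict or dotted, and the code assumes Python 3 and string keys that contain no dots themselves.

```python
from functools import reduce


def get_path(data, path, default=None):
    """Hypothetical helper: follow 'a.b.c' through nested plain dicts."""
    try:
        return reduce(lambda node, key: node[key], path.split("."), data)
    except (KeyError, TypeError):
        # KeyError: a component is missing.
        # TypeError: we tried to index into a non-dict value.
        return default


def set_path(data, path, value):
    """Hypothetical helper: create intermediate dicts as needed, then assign."""
    *parents, last = path.split(".")
    node = data
    for key in parents:
        node = node.setdefault(key, {})
    node[last] = value


obj = {"hello": {"world": {"wide": "web"}}}
print(get_path(obj, "hello.world.wide"))      # web
print(get_path(obj, "hello.world.wide.x"))    # None
set_path(obj, "hello.moon.size", 1)
print(obj["hello"]["moon"])                   # {'size': 1}
```

Keeping these as free functions leaves the data as ordinary dicts, which is convenient when the result has to be passed straight to json.dumps.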