2017-04-13

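Parsing a devicetree into a structured dictionary with pyparsing

For my C++ RTOS I'm writing a parser for devicetree "source" files (.dts) in Python, using the pyparsing module. I'm able to parse the structure of the devicetree into a (nested) dictionary, where the property name or node name is the dictionary key (a string) and the property value or node is the dictionary value (either a string or a nested dictionary).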

Suppose I have the following example devicetree structure:

/ { 
    property1 = "string1"; 
    property2 = "string2"; 
    node1 { 
     property11 = "string11"; 
     property12 = "string12"; 
     node11 { 
      property111 = "string111"; 
      property112 = "string112"; 
     }; 
    }; 
    node2 { 
     property21 = "string21"; 
     property22 = "string22"; 
    }; 
}; 

I'm able to parse it into something like this:

{'/': {'node1': {'node11': {'property111': ['string111'], 'property112': ['string112']}, 
       'property11': ['string11'], 
       'property12': ['string12']}, 
     'node2': {'property21': ['string21'], 'property22': ['string22']}, 
     'property1': ['string1'], 
     'property2': ['string2']}} 

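However, I would prefer a different data structure. I'd like to have all properties as a nested dictionary under the key "properties", and all child nodes as a nested dictionary under the key "children". The reason is that a devicetree (nodes in particular) has some "metadata" that I would like to keep as plain key-value pairs, which requires moving the actual "contents" of a node one level "lower" to avoid any name conflicts between keys. So I would prefer the example above to look like this: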

{'/': {
    'properties': {
        'property1': ['string1'],
        'property2': ['string2']
    },
    'nodes': {
        'node1': {
            'properties': {
                'property11': ['string11'],
                'property12': ['string12']
            },
            'nodes': {
                'node11': {
                    'properties': {
                        'property111': ['string111'],
                        'property112': ['string112']
                    },
                    'nodes': {
                    }
                }
            }
        },
        'node2': {
            'properties': {
                'property21': ['string21'],
                'property22': ['string22']
            },
            'nodes': {
            }
        }
    }
}
}

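I tried adding "names" to the parsed tokens, but this results in "doubled" dictionary elements (which is expected, because this behaviour is described in the pyparsing documentation). That might not be a problem, but technically a node or a property can be named "properties" or "children" (or whatever I choose), so I don't think such a solution is robust.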

I also tried using setParseAction() to convert the tokens into dictionary fragments (I hoped I could transform {'key': 'value'} into {'properties': {'key': 'value'}}), but this didn't work at all...

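Is this possible at all directly with pyparsing? I'm prepared to just add a second phase that converts the original dictionary into whatever structure I need, but as a perfectionist I would prefer a single-run, pyparsing-only solution if possible.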
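For comparison, the "second phase" fallback mentioned above could look roughly like the sketch below. It is not a pyparsing solution; it assumes the flat layout shown earlier, where a nested dict is always a child node and anything else is a property value. The function name restructure and the key 'children' are choices made here for illustration.

```python
def restructure(node):
    """Split one parsed node into 'properties' and 'children' sub-dicts.

    Assumption: in the flat dictionary, nested dicts are child nodes and
    every other value (here, a list of strings) is a property value.
    """
    properties = {}
    children = {}
    for name, value in node.items():
        if isinstance(value, dict):
            # Child node: recurse so deeper levels get the same layout.
            children[name] = restructure(value)
        else:
            properties[name] = value
    return {'properties': properties, 'children': children}


# A shortened version of the flat dictionary from the question.
flat = {'/': {'node1': {'node11': {'property111': ['string111']},
                        'property11': ['string11']},
              'property1': ['string1']}}

structured = {name: restructure(node) for name, node in flat.items()}
```

Because node and property names only ever appear inside the 'properties' and 'children' sub-dicts, a node that happens to be called "properties" cannot collide with the structural keys.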

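For reference, here is sample code (Python 3) that converts devicetree source into an "unstructured" dictionary. Please note that this code is just a simplification and doesn't support all the features of .dts (any data type other than strings, value lists, unit addresses, labels and so on); it only supports string properties and node nesting.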

#!/usr/bin/env python 

import pyparsing 
import pprint 

nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31) 
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31) 
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes) 
property = pyparsing.Dict(pyparsing.Group(propertyName + pyparsing.Group(pyparsing.Literal('=').suppress() + 
     propertyValue) + pyparsing.Literal(';').suppress())) 
childNode = pyparsing.Forward() 
rootNode = pyparsing.Dict(pyparsing.Group(pyparsing.Literal('/') + pyparsing.Literal('{').suppress() + 
     pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) + 
     pyparsing.Literal('};').suppress())) 
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + pyparsing.Literal('{').suppress() + 
     pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) + 
     pyparsing.Literal('};').suppress())) 

dictionary = rootNode.parseString(""" 
/{ 
    property1 = "string1"; 
    property2 = "string2"; 
    node1 { 
     property11 = "string11"; 
     property12 = "string12"; 
     node11 { 
      property111 = "string111"; 
      property112 = "string112"; 
     }; 
    }; 
    node2 { 
     property21 = "string21"; 
     property22 = "string22"; 
    }; 
}; 
""").asDict() 
pprint.pprint(dictionary, width = 120) 

Answer


You are really close. I just did the following:

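  • added Groups and results names for your "properties" and "children" subsections
  • changed some of the punctuation literals to constants (Literal("};") will fail to match if there is whitespace between the closing brace and the semicolon, but RBRACE + SEMI will accommodate whitespace)
  • removed the outermost Dict on rootNode

Code: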

import pyparsing

# Suppressed punctuation: each token is matched but left out of the results.
LBRACE,RBRACE,SLASH,SEMI,EQ = map(pyparsing.Suppress, "{}/;=") 
nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31) 
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31) 
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes) 
property = pyparsing.Dict(pyparsing.Group(propertyName + EQ 
              + pyparsing.Group(propertyValue) 
              + SEMI)) 
childNode = pyparsing.Forward() 
rootNode = pyparsing.Group(SLASH + LBRACE 
          + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties") 
          + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children") 
          + RBRACE + SEMI) 
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + LBRACE 
              + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties") 
              + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children") 
              + RBRACE + SEMI)) 

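With result holding the value returned by rootNode.parseString() for the sample input, converting to a dict with asDict and printing with pprint gives: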

pprint.pprint(result[0].asDict()) 
{'children': {'node1': {'children': {'node11': {'children': [], 
               'properties': {'property111': ['string111'], 
                   'property112': ['string112']}}}, 
         'properties': {'property11': ['string11'], 
             'property12': ['string12']}}, 
       'node2': {'children': [], 
         'properties': {'property21': ['string21'], 
             'property22': ['string22']}}}, 
'properties': {'property1': ['string1'], 'property2': ['string2']}} 

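You can also use the dump() method that comes with pyparsing's ParseResults class to help visualize the list and dict/namespace-style access to the results as-is, without requiring any conversion call: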

print(result[0].dump()) 

[[['property1', ['string1']], ['property2', ['string2']]], [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]]] 
- children: [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]] 
  - node1: [[['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]]
    - children: [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]
      - node11: [[['property111', ['string111']], ['property112', ['string112']]], []]
        - children: []
        - properties: [['property111', ['string111']], ['property112', ['string112']]]
          - property111: ['string111']
          - property112: ['string112']
    - properties: [['property11', ['string11']], ['property12', ['string12']]]
      - property11: ['string11']
      - property12: ['string12']
  - node2: [[['property21', ['string21']], ['property22', ['string22']]], []]
    - children: []
    - properties: [['property21', ['string21']], ['property22', ['string22']]]
      - property21: ['string21']
      - property22: ['string22']
- properties: [['property1', ['string1']], ['property2', ['string2']]]
  - property1: ['string1']
  - property2: ['string2']
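
The remark in the list above about Literal("};") versus RBRACE + SEMI can be checked in isolation. This is a minimal sketch that assumes pyparsing is installed; the helper matches() and the two tiny grammars fused and split are made up for illustration and are not part of the devicetree grammar.

```python
import pyparsing

LBRACE, RBRACE, SEMI = map(pyparsing.Suppress, "{};")

# One literal for "};" : the two characters must be adjacent in the input.
fused = pyparsing.Literal("{") + pyparsing.Literal("};")
# Separate tokens: pyparsing skips whitespace before each of them.
split = LBRACE + RBRACE + SEMI


def matches(parser, text):
    """Return True if the parser accepts the whole text (illustrative helper)."""
    try:
        parser.parseString(text, parseAll=True)
        return True
    except pyparsing.ParseException:
        return False


print(matches(fused, "{ };"))   # brace and semicolon adjacent
print(matches(fused, "{ } ;"))  # space between brace and semicolon
print(matches(split, "{ } ;"))  # same input, separate tokens
```

Only the second call is rejected: the fused literal cannot span the space, while the separately defined tokens accept both spellings.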

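Thanks a lot! One more question: is it possible to have the empty keys as empty dictionaries '{}' instead of empty lists '[]' (as can be seen here: 'node11': {'children': [] ...')? Or maybe, if they are empty, not to have such keys at all?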
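
The thread shown here does not include a reply to this follow-up. One possible approach, outside pyparsing itself, is to post-process the asDict() output. The sketch below assumes the 'properties'/'children' layout printed in the answer; the function normalize_empty and its drop_empty flag are hypothetical names introduced for illustration.

```python
def normalize_empty(node, drop_empty=False):
    """Turn empty lists under 'properties'/'children' into empty dicts.

    With drop_empty=True, empty sections are left out altogether.
    Assumes each node is a dict with 'properties' and 'children' entries,
    as in the asDict() output shown in the answer.
    """
    result = {}
    for key in ('properties', 'children'):
        value = node.get(key) or {}  # [] and a missing key both become {}
        if key == 'children':
            value = {name: normalize_empty(child, drop_empty)
                     for name, child in value.items()}
        if value or not drop_empty:
            result[key] = value
    return result


# A shortened version of the asDict() output from the answer.
parsed = {'children': {'node2': {'children': [],
                                 'properties': {'property21': ['string21']}}},
          'properties': {'property1': ['string1']}}

as_dicts = normalize_empty(parsed)
without_empty = normalize_empty(parsed, drop_empty=True)
```

Here as_dicts keeps a 'children' key for node2 with the value {}, while without_empty omits that key entirely.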