2016-08-07

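Preserving text structure information - pyparsing

With pyparsing, is there a way to extract the context you are in during the recursive descent? Let me explain what I mean. I have the following code: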

import pyparsing as pp 

openBrace = pp.Suppress(pp.Literal("{")) 
closeBrace = pp.Suppress(pp.Literal("}")) 
ident = pp.Word(pp.alphanums + "_" + ".") 
comment = pp.Literal("//") + pp.restOfLine 
messageName = ident 
messageKw = pp.Suppress(pp.Keyword("msg")) 
text = pp.Word(pp.alphanums + "_" + "." + "-" + "+") 
otherText = ~messageKw + pp.Suppress(text) 
messageExpr = pp.Forward() 
messageExpr << (messageKw + messageName + openBrace + 
       pp.ZeroOrMore(otherText) + pp.ZeroOrMore(messageExpr) + 
       pp.ZeroOrMore(otherText) + closeBrace).ignore(comment) 
testStr = "msg msgName1 { some text msg msgName2 { some text } some text }" 
print(messageExpr.parseString(testStr))

This produces the following output: ['msgName1', 'msgName2']

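In the output, I would like to keep track of the structure of the nested matches. For example, with the test string above I would like to get ['msgName1', 'msgName1.msgName2'], so that the hierarchy in the text is preserved. However, I am new to pyparsing and have not yet found a way to extract the fact that "msgName2" is nested inside the structure of "msgName1".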

Is there a way to do this with the setParseAction() method of ParserElement, or with named results?

Any helpful suggestions would be much appreciated.


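Attach a parse action to 'messageName' that pushes the name onto an external stack, and attach a parse action to closeBrace that pops the last name off the stack. In the first parse action, after pushing the current name onto the stack, you can replace the name in the input tokens with 'tokens[0] = '.'.join(stack)'. – PaulMcG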
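The push half of this suggestion can be tried in isolation. The following is only a minimal sketch: the names `stack`, `qualify` and `name` are made up for illustration, and nothing is ever popped, so each name is simply prefixed with all the names seen before it. It shows that a parse action may modify `tokens` in place without returning anything.

```python
import pyparsing as pp

# Hypothetical module-level stack of the names seen so far.
stack = []


def qualify(s, loc, tokens):
    # Push the matched name, then overwrite the token in place
    # with the dotted path built from the whole stack.
    stack.append(tokens[0])
    tokens[0] = ".".join(stack)


name = pp.Word(pp.alphas).setParseAction(qualify)

result = pp.OneOrMore(name).parseString("a b c").asList()
print(result)
```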

Answer


Thanks to Paul McGuire for the advice. Here are the additions/modifications I made to solve the problem:

msgNameStack = [] 

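def pushMsgName(s, loc, tokens):
    # Push the current name, then replace the token with the dotted path.
    msgNameStack.append(tokens[0])
    tokens[0] = '.'.join(msgNameStack)

def popMsgName(s, loc, tokens):
    # A closing brace ends the innermost message, so drop its name.
    msgNameStack.pop()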

closeBrace = pp.Suppress(pp.Literal("}")).setParseAction(popMsgName) 
messageName = ident.setParseAction(pushMsgName) 

Here is the complete code:

import pyparsing as pp 

msgNameStack = [] 


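def pushMsgName(s, loc, tokens):
    # Push the current name, then replace the token with the dotted path.
    msgNameStack.append(tokens[0])
    tokens[0] = '.'.join(msgNameStack)


def popMsgName(s, loc, tokens):
    # A closing brace ends the innermost message, so drop its name.
    msgNameStack.pop()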

openBrace = pp.Suppress(pp.Literal("{")) 
closeBrace = pp.Suppress(pp.Literal("}")).setParseAction(popMsgName) 
ident = pp.Word(pp.alphanums + "_" + ".") 
comment = pp.Literal("//") + pp.restOfLine 
messageName = ident.setParseAction(pushMsgName) 
messageKw = pp.Suppress(pp.Keyword("msg")) 
text = pp.Word(pp.alphanums + "_" + "." + "-" + "+") 
otherText = ~messageKw + pp.Suppress(text) 
messageExpr = pp.Forward() 
messageExpr << (messageKw + messageName + openBrace + 
       pp.ZeroOrMore(otherText) + pp.ZeroOrMore(messageExpr) + 
       pp.ZeroOrMore(otherText) + closeBrace).ignore(comment) 

testStr = "msg msgName1 { some text msg msgName2 { some text } some text }" 
print(messageExpr.parseString(testStr))
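
One thing to keep in mind with the code above is that msgNameStack is module-level state, so a parse that fails part-way can leave stale names behind for the next parse. A possible alternative, shown here only as a sketch and not as the approach from the comment, is to let pp.Group preserve the nesting and build the dotted names after parsing. The helper `qualified_names` and the variable names are made up for this example, and comment handling is left out for brevity.

```python
import pyparsing as pp

open_brace = pp.Suppress("{")
close_brace = pp.Suppress("}")
ident = pp.Word(pp.alphanums + "_" + ".")
message_kw = pp.Suppress(pp.Keyword("msg"))
text = pp.Word(pp.alphanums + "_" + "." + "-" + "+")
other_text = ~message_kw + pp.Suppress(text)

# Each message becomes one nested group: [name, child_group, child_group, ...]
message = pp.Forward()
message <<= pp.Group(
    message_kw + ident + open_brace
    + pp.ZeroOrMore(other_text | message)
    + close_brace
)


def qualified_names(group, prefix=()):
    # Hypothetical helper: walk the nested lists and join names with dots.
    path = prefix + (group[0],)
    names = [".".join(path)]
    for child in group[1:]:
        names.extend(qualified_names(child, path))
    return names


test_str = "msg msgName1 { some text msg msgName2 { some text } some text }"
tree = message.parseString(test_str).asList()
names = qualified_names(tree[0])
print(names)
```

Because the hierarchy lives in the parse result rather than in a global list, the same grammar can be reused for several inputs without resetting anything.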