2013-06-30 82 views
2

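How to get the original text back from pyparsing tokens

I have text of the form name(sum(value1,sum(value2,value3)), "sumname"). pyparsing returns the corresponding tokens, but I am interested in getting the actual source text, and I can't find out how.

I have tried the setParseAction function, but since the parse action only receives the input string and the start location, I cannot tell where the match ends and I cannot cut off the trailing part. For example, all I get is: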

"sum(value2,value3)), "sumname")" 
"sum(value1,sum(value2,value3)), "sumname")" 
"name(sum(value1,sum(value2,value3)), "sumname")" 

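This is not ideal, and I don't want to re-parse the string by hand to get the actual original text.

The approach I am trying at the moment is: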

tokens = grammar.parseString(target_string) 
print >>sys.stderr, pyparsing.originalTextFor(tokens) 

But this doesn't actually work:

AttributeError: 'NoneType' object has no attribute 'setParseAction' 

Answers

3

Wrap your expression in the pyparsing helper originalTextFor.

from pyparsing import makeHTMLTags, originalTextFor 

sample = '<tag attr1="A1" attr2="B3">' 

openTag = makeHTMLTags('tag')[0] 

# the expression returned by makeHTMLTags parses the tag and 
# attributes into a list (along with a series of helpful 
# results names) 
print (openTag.parseString(sample).asList()) 

# prints 
# ['tag', ['attr1', 'A1'], ['attr2', 'B3'], False] 

# wrap in 'originalTextFor' to get back the original source text 
print (originalTextFor(openTag).parseString(sample).asList()) 

# prints 
# ['<tag attr1="A1" attr2="B3">'] 
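The same idea can be applied to the nested call syntax from the question. The sketch below is only an illustration: the question does not show its grammar, so `identifier`, `call_body`, `call`, `arg` and the `seen` list are assumed names, and the grammar is a guess at what the asker might be using. Wrapping the recursive expression in originalTextFor makes every matched call, including the nested ones, come back as its source text.

```python
from pyparsing import (Forward, Suppress, Word, ZeroOrMore,
                       alphanums, alphas, originalTextFor, quotedString)

# Hypothetical grammar for text such as name(sum(a,sum(b,c)), "label").
identifier = Word(alphas, alphanums + "_")
call = Forward()
arg = call | identifier | quotedString
call_body = (identifier + Suppress("(")
             + arg + ZeroOrMore(Suppress(",") + arg)
             + Suppress(")"))

# Wrap the expression (not the results) so each call yields its source text.
call_text = originalTextFor(call_body)

# Collect the text of every call as it is matched, innermost first.
seen = []
call_text.addParseAction(lambda t: seen.append(t[0]))
call <<= call_text

target = 'name(sum(value1,sum(value2,value3)), "sumname")'
result = call.parseString(target, parseAll=True)

for text in seen:
    print(text)
```

Because originalTextFor replaces the tokens of the wrapped expression with a single string, the structured tokens are no longer available here. If both the tokens and the text are needed, slicing the source with the positions reported by scanString is an alternative.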

Hi, this fails for me: 'tokens = grammar.parseString(target_string)' followed by 'print >>sys.stderr, pyparsing.originalTextFor(tokens)'. I suspect that is because parseString returns a ParseResults object rather than a ParserElement. –


'originalTextFor' wraps the grammar, not the results. Try 'originalTextFor(grammar).parseString(target_string)'. – PaulMcG


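I see, but you always supply the original string, whereas all I have is the result of parseString, and I need to get the string back. Passing a string to parseString makes sense, but I don't have a string, just a collection of tokens. –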

0

Depending on what you are trying to accomplish by getting the original matched text, you may find a better solution using scanString or transformString:

from pyparsing import makeHTMLTags, replaceWith 

sample = '<other><div></div><tag attr1="A1" attr2="B3"><something>' 
openTag = makeHTMLTags('tag')[0] 

# grammar.scanString is a generator, yielding tokens,start,end tuples 
# from the start:end values you can slice the original text from the 
# source string 
for tokens, start, end in openTag.scanString(sample):
    print(tokens.dump())
    print(sample[start:end])

# if your goal in getting the original data is to do some kind of string 
# replacement, use transformString - here we convert all <TAG> tags to <REPLACE> tags 
print(openTag.setParseAction(replaceWith("<REPLACE>")).transformString(sample))

Prints:

['tag', ['attr1', 'A1'], ['attr2', 'B3'], False] 
- attr1: A1 
- attr2: B3 
- empty: False 
- startTag: ['tag', ['attr1', 'A1'], ['attr2', 'B3'], False] 
    - attr1: A1 
    - attr2: B3 
    - empty: False 
    - tag: tag 
- tag: tag 
<tag attr1="A1" attr2="B3"> 
<other><div></div><REPLACE><something>
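For the function-call text in the question, scanString can give both the parsed tokens and the matching source slice. This is a minimal sketch under assumptions: the grammar (`identifier`, `call`, `arg`) and the `source` string are made up for illustration, since the question does not show the real grammar.

```python
from pyparsing import (Forward, Group, Suppress, Word, ZeroOrMore,
                       alphanums, alphas, quotedString)

# Hypothetical grammar for nested calls; Group keeps the nesting in the tokens.
identifier = Word(alphas, alphanums + "_")
call = Forward()
arg = call | identifier | quotedString
call <<= Group(identifier + Suppress("(")
               + arg + ZeroOrMore(Suppress(",") + arg)
               + Suppress(")"))

source = 'x = name(sum(value1,sum(value2,value3)), "sumname"); y = other(a)'

# scanString yields (tokens, start, end); the slice is the original text.
matches = []
for tokens, start, end in call.scanString(source):
    matches.append((tokens.asList(), source[start:end]))

for parsed, text in matches:
    print(parsed)
    print(text)
```

Only the outermost calls are reported by the scan, because scanning resumes after the end of each match. The nested calls are still present in the grouped tokens.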