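Correctly parsing exactly n parameters in antlr4
I am using Antlr4 with the python3 runtime. The language I am trying to parse has many operations (about 50) that each take a fixed number of parameters, in the form OPNAME [ parameter1, parameter2, parameter3 ]
I used to have a grammar with rules like this: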
statement: OP1 '[' NUM ']'
| OP2 '[' NUM ',' NUM ']'
| OP3 '[' NUM ',' NUM ',' NUM ']'
| OP2or3 (('[' NUM ',' NUM ']')|('[' NUM ',' NUM ',' NUM ']'))
;
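However, for the sake of clarity, I decided to create a single rule that accepts exactly n parameters. My grammar (a complete example) is therefore as follows: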
grammar test;
program: (statement? NEWLINE)* EOF;
statement: OP1 parameter[1]
| OP2 parameter[2]
| OP3 parameter[3]
| OP2or3 (parameter[2]|parameter[3])
;
parameter[n]
locals[i = 1]
: '[' NUM
(',' NUM {$i += 1})*
']'
{$i == $n}?
;
OP1 : 'OP1' ;
OP2 : 'OP2' ;
OP3 : 'OP3' ;
OP2or3 : 'OP2or3' ;
NUM : ('0'..'9')+;
NEWLINE : '\n' ;
WS : [ \t\r] -> channel(1);
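Running this grammar on the testfile.txt shown below almost works. I tested OP1, OP2 and OP3 with more or fewer parameters, and parsing fails whenever the number of parameters is not exactly the expected one. However, this does not work for OP2or3, which always fails with 3 parameters. My guess is that the Antlr parser first tries the alternative with 2 parameters, fails the predicate, and then cannot backtrack correctly (the error message is Error at [5:16] : rule parameter failed predicate: {$i == $n}?).
Contents of testfile.txt: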
OP1 [1]
OP2 [32, 52]
OP3 [1, 2, 3]
OP2or3 [1, 2]
OP2or3 [1, 2, 3]
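To make the expected outcome explicit, here is a minimal sketch in plain Python, independent of ANTLR, of what I want each line to do. The ARITY table, LINE_RE and has_valid_arity are hypothetical names used only for illustration; they are not part of the grammar or of the antlr4 runtime.

```python
import re

# Hypothetical arity table: operation name -> accepted parameter counts.
ARITY = {
    "OP1": {1},
    "OP2": {2},
    "OP3": {3},
    "OP2or3": {2, 3},
}

# Matches lines such as "OP2 [32, 52]"; \d+ mirrors the NUM token.
LINE_RE = re.compile(r"^\s*(\w+)\s*\[\s*(\d+(?:\s*,\s*\d+)*)\s*\]\s*$")


def has_valid_arity(line):
    """Return True if the line names a known operation with an accepted count."""
    match = LINE_RE.match(line)
    if match is None:
        return False
    name, params = match.group(1), match.group(2)
    count = len(params.split(","))
    return count in ARITY.get(name, set())


TESTFILE = """OP1 [1]
OP2 [32, 52]
OP3 [1, 2, 3]
OP2or3 [1, 2]
OP2or3 [1, 2, 3]
"""

results = [has_valid_arity(line) for line in TESTFILE.splitlines()]
print(results)  # every line of testfile.txt should be accepted
```

All five lines should be accepted, including the last one, which is the line my grammar currently rejects.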
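I tried replacing it with the more explicit rule below, which uses gated predicates, but that still does not work (the error message is Error at [5:7] : no viable alternative at input '['):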
parameter[n]
: {$n == 1}? '[' NUM ']'
| {$n == 2}? '[' NUM ',' NUM ']'
| {$n == 3}? '[' NUM ',' NUM ',' NUM ']'
;
For information, here is the Python code I use to test my grammar:
import sys

from antlr4 import *
from antlr4.error.ErrorListener import ErrorListener
from testParser import testParser as Parser
from testLexer import testLexer as Lexer


class SimpleErrorThrower(ErrorListener):
    # Turn every syntax error reported by the parser into an exception.
    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        msg = msg.replace('\n', '\\n')
        raise RuntimeError("Error at [%s:%s] : %s" % (line, column, msg))


def load_code(filename):
    # Read the whole source file as UTF-8 text.
    with open(filename, encoding='utf-8') as source_file:
        return source_file.read()


def ParseFromRule(input_string, rule_to_call='program'):
    '''Try to parse a given string (case insensitive) from a given rule.
    Raises 'AttributeError' if rule does not exist.
    Raises 'RuntimeError' if parsing failed.
    Returns the parse tree if parsing was successful.'''
    source = InputStream(input_string)
    lexer = Lexer(source)
    stream = CommonTokenStream(lexer)
    parser = Parser(stream)
    # Replace the default console listener with the throwing one.
    parser.removeErrorListeners()
    parser.addErrorListener(SimpleErrorThrower())
    parseTree = getattr(parser, rule_to_call)()
    return parseTree


if __name__ == '__main__':
    from argparse import ArgumentParser
    args = ArgumentParser()
    args.add_argument("-p", "--print", help="Print resulting tree.", action='store_true')
    args.add_argument("filename", metavar="Source filename", help="file containing the code to test.", type=str)
    options = args.parse_args()
    input_string = load_code(options.filename)
    try:
        tree = ParseFromRule(input_string, 'program')
    except RuntimeError as e:
        print(str(e))
        sys.exit(1)
    if options.print:
        print(tree.toStringTree(recog=tree.parser))
And here is my Makefile:
ANTLR_CP=/usr/local/bin/antlr-4.5.1-complete.jar
ANTLR=java -Xmx500M -cp "$(ANTLR_CP):$$CLASSPATH" org.antlr.v4.Tool
all: testParser.py
clean:
	rm -f *Lexer.py *Listener.py *Parser.py *.tokens *.pyc
testParser.py: *.g4
	$(ANTLR) -Dlanguage=Python3 test.g4
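Do you have any idea how I could write a rule parameter[n] that would also work for OP2or3? Having such a sub-rule would really help with clarity, because the language changes fairly often (some operators are added or removed every few months).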
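One fallback I am considering, if no such rule is possible, is to keep the explicit alternatives from my first grammar but generate them from a table, so that adding or removing an operator is a one-line change. The following is only a sketch under that assumption: ARITY, bracketed, alternative and statement_rule are hypothetical names, and the output would still have to be pasted or templated into test.g4.

```python
# Hypothetical arity table: operation name -> accepted parameter counts.
ARITY = {
    "OP1": [1],
    "OP2": [2],
    "OP3": [3],
    "OP2or3": [2, 3],
}


def bracketed(count):
    """Return the grammar fragment for exactly `count` NUM parameters."""
    return "'[' " + " ',' ".join(["NUM"] * count) + " ']'"


def alternative(name, counts):
    """Return one alternative of the statement rule for a single operation."""
    forms = [bracketed(count) for count in counts]
    if len(forms) == 1:
        return f"{name} {forms[0]}"
    # Several accepted counts: wrap each form in parentheses and join with '|'.
    return f"{name} (" + " | ".join(f"({form})" for form in forms) + ")"


def statement_rule(arity):
    """Return the full text of the statement rule for the given table."""
    alts = [alternative(name, counts) for name, counts in arity.items()]
    return "statement: " + "\n    | ".join(alts) + "\n    ;"


print(statement_rule(ARITY))
```

This reproduces the shape of my original statement rule, so it gives up the parameter[n] sub-rule entirely; I would still prefer a solution that stays inside the grammar.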