2013-03-29 37 views

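Do-while and while using ANTLR

Earlier, I created this question asking how to create if/else statements using ANTLR 4. I got a great answer that also showed how to do while loops. I have implemented this in my language, and now I am trying to create a do-while loop using almost the same principles.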

My syntax for while loops is as follows:

count is 0 
while count is less than 10 
    count+ 
    if count not equals 10 
    write " " + count + ": Getting there..." 
    else if count equals 10 
    write count + ": The end!" 
    end if 
end while 

This is what I would like a do-while loop to look like:

count is 0 
do 
    count+ 
    write "count is " + count 
    if count equals 10 
    write "The end!" 
    end if 
while count is less than 10 

I have tested both and each of them works, but not at the same time. Below is my grammar (sorry for posting all of it, but I think it is necessary).

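If my WHILE and END_WHILE tokens are placed above my DO_WHILE and DO_WHILE_CONDITION tokens, the while loop works. However, if I switch them, my do-while loop works instead. If I change the DO_WHILE_CONDITION token to something other than 'while', both work.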

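Is there any way I can make both of them work with the current syntax? I know the problem may be that I am using the same keyword for more than one thing, but I am hoping there is a way to do it.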

////////////////////////////////// 
// PARSER 
////////////////////////////////// 

program 
: block EOF 
; 

block 
: (statement (NEW_LINE+ | EOF))* 
; 

statement 
: assignment 
| if_statement 
| while_statement 
| until_statement 
| do_while_statement 
| write 
; 

assignment 
: ID ASSIGN expression # expressionAssignment 
| ID PLUS    # incrementAssignment 
| ID MINUS    # decrementAssignment 
; 

if_statement 
: IF condition_block (ELSE_IF condition_block)* (ELSE NEW_LINE statement_block)? END_IF 
; 

condition_block 
: expression NEW_LINE statement_block 
; 

statement_block 
: block 
; 

while_statement 
: WHILE expression NEW_LINE statement_block END_WHILE 
; 

until_statement 
: UNTIL expression NEW_LINE statement_block END_UNTIL 
; 

do_while_statement 
: DO_WHILE NEW_LINE statement_block DO_WHILE_CONDITION expression 
; 

expression 
: atom            # atomExpression 
| expression PLUS expression      # plusExpression 
| expression MINUS expression      # minusExpression 
| expression MULTIPLY expression     # multiplicationExpression 
| expression DIVIDE expression      # divisionExpression 
| expression PLUS         # incrementExpression 
| expression MINUS         # decrementExpression 
| expression AND expression      # andExpression 
| expression OR expression       # orExpression 
| expression EQUALS expression      # equalityExpression 
| expression NOT_EQUALS expression     # notEqualityExpression 
| expression LESS_THAN expression     # lessThanExpression 
| expression NOT_LESS_THAN expression    # notLessThanExpression 
| expression GREATER_THAN expression    # greaterThanExpression 
| expression NOT_GREATER_THAN expression   # notGreaterThanExpression 
| expression GREATER_THAN_OR_EQUAL expression  # greaterThanOrEqualExpression 
| expression LESS_THAN_OR_EQUAL expression   # lessThanOrEqualExpression 
; 

atom 
: INT        # integerAtom 
| FLOAT       # floatAtom 
| BOOLEAN       # boolAtom 
| ID        # idAtom 
| STRING       # stringAtom 
| OPEN_PAR expression CLOSE_PAR # expressionAtom 
; 

write 
: WRITE expression 
; 

////////////////////////////////// 
// LEXER 
////////////////////////////////// 

PLUS      : '+'; 
MINUS      : '-'; 
MULTIPLY     : '*'; 
DIVIDE      : '/'; 

ASSIGN      : 'is'; 
OPEN_CURLY     : '{'; 
CLOSE_CURLY     : '}'; 
OPEN_PAR     : '('; 
CLOSE_PAR     : ')'; 
COLON      : ':'; 
NEW_LINE     : '\r'? '\n'; 

IF       : 'if'; 
ELSE_IF      : 'else if'; 
ELSE      : 'else'; 
END_IF      : 'end if'; 

WHILE      : 'while'; 
END_WHILE     : 'end while'; 

UNTIL      : 'until'; 
END_UNTIL     : 'end until'; 

DO_WHILE     : 'do'; 
DO_WHILE_CONDITION   : 'while'; 

EQUALS      : 'equals'; 
NOT_EQUALS     : 'not equals'; 
LESS_THAN     : 'is less than'; 
NOT_LESS_THAN    : 'is not less than'; 
GREATER_THAN    : 'is greater than'; 
NOT_GREATER_THAN   : 'is not greater than'; 
GREATER_THAN_OR_EQUAL  : 'is greater than or equals'; 
LESS_THAN_OR_EQUAL   : 'is less than or equals'; 
WRITE      : 'write'; 

AND       : 'and'; 
OR       : 'or'; 
NOT       : 'not'; 

BOOLEAN 
: 'TRUE' | 'true' | 'YES' | 'yes' 
| 'FALSE' | 'false' | 'NO' | 'no' 
; 

INT 
: (PLUS | MINUS)? NUMBER+ 
; 

FLOAT 
: (PLUS | MINUS)? NUMBER+ ('.' | ',') (NUMBER+)? 
| (PLUS | MINUS)? (NUMBER+)? ('.' | ',') NUMBER+ 
; 

NUMBER 
: '0'..'9' 
; 

STRING 
: '"' ('\\"' | ~["])* '"' 
; 

ID 
: ('a'..'z' | 'A'..'Z' | '0'..'9')+ 
; 

WHITESPACE 
: [ \t]+ -> skip 
; 

COMMENT 
: (';;' .*? ';;' | ';' ~[\r\n]*) -> skip 
; 

Answers


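When creating tokens, the lexer does not take into account what the parser might need at a given point. See this Q&A, which describes the rules (for both v3 and v4): Antlr v3 error with parser/lexer rules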

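This means that, in your case, the rule DO_WHILE_CONDITION in

WHILE      : 'while'; 
... 
DO_WHILE_CONDITION   : 'while'; 

will never be matched: both rules match exactly the same characters, and the lexer picks the one defined first.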
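To make the tie-breaking concrete, here is a minimal Python sketch of a toy lexer. It is not ANTLR's implementation, only a model of the two rules described above (longest match wins; on a tie, the rule defined first wins). The `tokenize` helper and the reduced rule list are assumptions made up for this illustration.

```python
import re

def tokenize(text, rules):
    """Toy lexer: the longest match wins; ties go to the rule listed first."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WHITESPACE":  # mimics '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Hypothetical, reduced version of the lexer rules from the question.
RULES = [
    ("WHILE", r"while"),
    ("DO_WHILE", r"do"),
    ("DO_WHILE_CONDITION", r"while"),
    ("ID", r"[a-zA-Z0-9]+"),
    ("WHITESPACE", r"[ \t]+"),
]

tokens = tokenize("do while", RULES)
token_names = [name for name, _ in tokens]
print(token_names)  # ['DO_WHILE', 'WHILE']
```

In this model, moving `DO_WHILE_CONDITION` above `WHILE` simply makes `WHILE` the unreachable rule, which matches the behaviour seen in the question when the token order is switched.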

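Besides that, gluing keywords together with a space in a single token is usually not a good idea. Consider what happens when the input is "end  if" (2 spaces). It is better to create 2 tokens, END and IF, and use them in your parser rules.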
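The "end  if" case can be reproduced with a small Python model of a lexer. This is only a sketch under the assumption that longest match wins and ties go to the first rule; the `tokenize` helper and both rule lists are hypothetical and cut down to the tokens that matter here.

```python
import re

def tokenize(text, rules):
    """Toy lexer: longest match wins, ties go to the first rule, whitespace is skipped."""
    names, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WHITESPACE":
            names.append(best_name)
        pos += best_len
    return names

COMMON = [("ID", r"[a-zA-Z0-9]+"), ("WHITESPACE", r"[ \t]+")]

# Keyword glued together with exactly one space, as in the question.
GLUED = [("END_IF", r"end if")] + COMMON
# Two separate keyword tokens; the parser rule would then use: END IF
SPLIT = [("END", r"end"), ("IF", r"if")] + COMMON

glued_one_space = tokenize("end if", GLUED)    # ['END_IF']
glued_two_spaces = tokenize("end  if", GLUED)  # ['ID', 'ID']
split_two_spaces = tokenize("end  if", SPLIT)  # ['END', 'IF']
print(glued_one_space, glued_two_spaces, split_two_spaces)
```

With the glued token, a second space turns the keyword into two identifiers, so the parser never sees an END_IF. With separate tokens, the skipped whitespace between them no longer matters.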

Try something like this:

program 
: block 
; 

block 
: NEW_LINE* (statement (NEW_LINE+ | EOF))* 
; 

statement 
: assignment 
| if_statement 
| while_statement 
| until_statement 
| do_while_statement 
| write 
; 

assignment 
: ID IS expression # expressionAssignment 
| ID PLUS   # incrementAssignment 
| ID MINUS   # decrementAssignment 
; 

if_statement 
: IF condition_block (ELSE IF condition_block)* (ELSE NEW_LINE statement_block)? END IF 
; 

condition_block 
: expression NEW_LINE statement_block 
; 

statement_block 
: block 
; 

while_statement 
: WHILE expression NEW_LINE statement_block END WHILE 
; 

until_statement 
: UNTIL expression NEW_LINE statement_block END UNTIL 
; 

do_while_statement 
: DO NEW_LINE statement_block WHILE expression 
; 

// Added unary expressions instead of combining them in the lexer. 
expression 
: atom           # atomExpression 
| MINUS expression        # unaryMinusExpression 
| PLUS expression         # unaryPlusExpression 
| expression PLUS expression      # plusExpression 
| expression MINUS expression      # minusExpression 
| expression MULTIPLY expression     # multiplicationExpression 
| expression DIVIDE expression     # divisionExpression 
| expression PLUS         # incrementExpression 
| expression MINUS        # decrementExpression 
| expression AND expression      # andExpression 
| expression OR expression      # orExpression 
| expression EQUALS expression     # equalityExpression 
| expression NOT EQUALS expression    # notEqualityExpression 
| expression IS LESS THAN expression    # lessThanExpression 
| expression IS NOT LESS THAN expression   # notLessThanExpression 
| expression IS GREATER THAN expression   # greaterThanExpression 
| expression IS NOT GREATER THAN expression  # notGreaterThanExpression 
| expression IS GREATER THAN OR EQUALS expression # greaterThanOrEqualExpression 
| expression IS LESS THAN OR EQUALS expression # lessThanOrEqualExpression 
; 

atom 
: INT        # integerAtom 
| FLOAT       # floatAtom 
| bool        # boolAtom 
| ID        # idAtom 
| STRING       # stringAtom 
| OPEN_PAR expression CLOSE_PAR # expressionAtom 
; 

write 
: WRITE expression 
; 

// By making this a parser rule, you needn't inspect the lexer rule 
// to see if it's true or false. 
bool 
: TRUE 
| FALSE 
; 

////////////////////////////////// 
// LEXER 
////////////////////////////////// 

PLUS      : '+'; 
MINUS      : '-'; 
MULTIPLY     : '*'; 
DIVIDE      : '/'; 

OPEN_CURLY     : '{'; 
CLOSE_CURLY     : '}'; 
OPEN_PAR     : '('; 
CLOSE_PAR     : ')'; 
COLON      : ':'; 
NEW_LINE     : '\r'? '\n'; 

IF       : 'if'; 
ELSE      : 'else'; 
END       : 'end'; 
WHILE      : 'while'; 
UNTIL      : 'until'; 
DO       : 'do'; 
EQUALS      : 'equals'; 
NOT       : 'not'; 
IS       : 'is'; 
LESS      : 'less'; 
THAN      : 'than'; 
GREATER      : 'greater'; 
WRITE      : 'write'; 
AND       : 'and'; 
OR       : 'or'; 

TRUE : 'TRUE' | 'true' | 'YES' | 'yes'; 
FALSE : 'FALSE' | 'false' | 'NO' | 'no'; 

INT 
: DIGIT+ 
; 

// (DIGIT+)? is the same as: DIGIT* 
FLOAT 
: DIGIT+ [.,] DIGIT* 
| DIGIT* [.,] DIGIT+ 
; 

// If a rule can never become a token on its own (an INT will always 
// be created instead of a DIGIT), mark it as a 'fragment'. 
fragment DIGIT 
: [0-9] 
; 

// Added support for escaped backslashes. 
STRING 
: '"' ('\\"' | '\\\\' | ~["\\])* '"' 
; 

// Can it start with a digit? Maybe this is better: [a-zA-Z] [a-zA-Z0-9]* 
ID 
: [a-zA-Z0-9]+ 
; 

WHITESPACE 
: [ \t]+ -> skip 
; 

COMMENT 
: (';;' .*? ';;' | ';' ~[\r\n]*) -> skip 
; 

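This parses both while constructs without problems. Also note that I made minor adjustments to the grammar (see the inline comments). The unary expressions are important: with the sign inside the INT lexer rule, 1-2 would be tokenized as 2 INT tokens, which cannot be matched as an expression in the parser!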
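The 1-2 problem can be seen in a short Python sketch. Again, this is a toy longest-match lexer, not ANTLR itself, and the `tokenize` helper and the two reduced rule lists are assumptions made for the illustration.

```python
import re

def tokenize(text, rules):
    """Toy lexer: longest match wins, ties go to the first rule, whitespace is skipped."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WHITESPACE":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

OPERATORS = [("PLUS", r"\+"), ("MINUS", r"-"), ("WHITESPACE", r"[ \t]+")]

# Sign handled inside the lexer rule, as in the question: INT : (PLUS | MINUS)? NUMBER+
SIGNED_INT = [("INT", r"[+-]?[0-9]+")] + OPERATORS
# Sign left to the parser, as in the answer: INT : DIGIT+
PLAIN_INT = [("INT", r"[0-9]+")] + OPERATORS

signed = tokenize("1-2", SIGNED_INT)
plain = tokenize("1-2", PLAIN_INT)
print(signed)  # [('INT', '1'), ('INT', '-2')]
print(plain)   # [('INT', '1'), ('MINUS', '-'), ('INT', '2')]
```

In the first result there is no MINUS token left for the `expression MINUS expression` alternative, which is why the sign is moved into the parser as a unary expression.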


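Thanks! That solved the problem, and those are indeed some great corrections to the grammar. I had not thought about the need to split tokens containing spaces into multiple tokens, nor about the unary expressions. I am glad you caught that. I am not entirely sure what exactly solved the while/do-while problem, though. Is it that the two statements now use the same token? – simonbs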


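@simonbs, the 'do-while' parser rule used the token 'DO_WHILE_CONDITION', but that token would never be created by the lexer, because the rule 'WHILE' matches the same characters *and* is placed before 'DO_WHILE_CONDITION'. –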


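OK, I see. Thanks! On a side note, I found that the rules must be 'program: block; block: NEW_LINE* (statement (NEW_LINE+ | EOF))*' if the input does not end with a new line. Having 'EOF' twice makes the parser expect it twice. – simonbs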