2012-04-14 81 views
2

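ANTLR parser grammar -> tree grammar

The final assignment in our Compiler Theory class is to create a compiler for a small subset of Java (not MiniJava). Our professor let us use whatever tools we wanted, and after a lot of poking around I settled on ANTLR. I managed to get the scanner and parser up and running, and the parser outputs an AST. I'm now stuck trying to get a tree grammar file to compile. As I understand it, the basic idea is to copy the grammar rules from the parser and strip out most of the code, leaving the rewrite rules in place, but it doesn't seem to want to compile (offendingToken error). Am I on the right track? Am I missing something trivial?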

Tree grammar:

tree grammar J0_SemanticAnalysis; 

options { 
    language = Java; 
    tokenVocab = J0_Parser; 
    ASTLabelType = CommonTree; 
} 

@header 
{ 
    package ritterre.a4; 
    import java.util.Map; 
    import java.util.HashMap; 
} 

@members 
{ 

} 

walk 
    : compilationunit 
    ; 

compilationunit 
    : ^(UNIT importdeclaration* classdeclaration*) 
    ; 

importdeclaration 
    : ^(IMP_DEC IDENTIFIER+) 
    ; 

classdeclaration 
    : ^(CLASS IDENTIFIER ^(EXTENDS IDENTIFIER)? fielddeclaration* methoddeclaration*) 
    ; 

fielddeclaration 
    : ^(FIELD_DEC IDENTIFIER type visibility? STATIC?) 
    ; 

methoddeclaration 
    : ^(METHOD_DEC IDENTIFIER type visibility? STATIC? ^(PARAMS parameter+)? body) 
    ; 

visibility 
    : PRIVATE 
    | PUBLIC 
    ; 

parameter 
    : ^(PARAM IDENTIFIER type) 
    ; 

body 
    : ^(BODY ^(DECLARATIONS localdeclaration*) ^(STATEMENTS statement*)) 
    ; 

localdeclaration 
    : ^(DECLARATION type IDENTIFIER) 
    ; 

statement 
    : assignment  
    | ifstatement  
    | whilestatement 
    | returnstatement 
    | callstatement 
    | printstatement 
    | block 
    ; 

assignment 
    : ^(ASSIGN IDENTIFIER+ expression? expression) 
    ; 

ifstatement 
    : ^(IF relation statement ^(ELSE statement)?) 
    ; 

whilestatement 
    : ^(WHILE relation statement) 
    ; 

returnstatement 
    : ^(RETURN expression?) 
    ; 

callstatement 
    : ^(CALL IDENTIFIER+ expression+) 
    ; 

printstatement 
    : ^(PRINT expression) 
    ; 

block 
    : ^(STATEMENTS statement*) 
    ; 

relation 
// : expression (LTHAN | GTHAN | EQEQ | NEQ)^ expression 
    : ^(LTHAN expression expression) 
    | ^(GTHAN expression expression) 
    | ^(EQEQ expression expression) 
    | ^(NEQ expression expression) 
    ; 

expression 
// : (PLUS | MINUS)? term ((PLUS | MINUS)^ term)* 
    : ^(PLUS term term) 
    | ^(MINUS term term) 
    ; 

term 
// : factor ((MULT | DIV)^ factor)* 
    : ^(MULT factor factor) 
    | ^(DIV factor factor) 
    ; 

factor 
    : NUMBER 
    | IDENTIFIER (DOT IDENTIFIER | LBRAC expression RBRAC)? 
    | NULL 
    | NEW IDENTIFIER LPAREN RPAREN 
    | NEW (INT | IDENTIFIER) (LBRAC RBRAC)? 
    ; 

type 
    : (INT | IDENTIFIER) (LBRAC RBRAC)? 
    | VOID 
    ; 

Parser grammar:

parser grammar J0_Parser; 

options 
{ 
    output = AST;    // Output an AST 
    tokenVocab = J0_Scanner; // Pull Tokens from Scanner 
    //greedy = true; // forcing this throughout?! success!! 
    //cannot force greedy true throughout. bad things happen and the parser doesnt build 
} 

tokens 
{ 
    UNIT; 
    IMP_DEC; 
    FIELD_DEC; 
    METHOD_DEC; 
    PARAMS; 
    PARAM; 
    BODY; 
    DECLARATIONS; 
    STATEMENTS; 
    DECLARATION; 
    ASSIGN; 
    CALL; 
} 

@header { package ritterre.a4; } 

// J0 - Extended Specification - EBNF 
parse 
    : compilationunit EOF -> compilationunit 
    ; 

compilationunit 
    : importdeclaration* classdeclaration* 
    -> ^(UNIT importdeclaration* classdeclaration*) 
    ; 

importdeclaration 
    : IMPORT IDENTIFIER (DOT IDENTIFIER)* SCOLON 
    -> ^(IMP_DEC IDENTIFIER+) 
    ; 

classdeclaration 
    : (PUBLIC)? CLASS n=IDENTIFIER (EXTENDS e=IDENTIFIER)? LBRAK (fielddeclaration|methoddeclaration)* RBRAK 
    -> ^(CLASS $n ^(EXTENDS $e)? fielddeclaration* methoddeclaration*) 
    ; 

fielddeclaration 
    : visibility? STATIC? type IDENTIFIER SCOLON 
    -> ^(FIELD_DEC IDENTIFIER type visibility? STATIC?) 
    ; 

methoddeclaration 
    : visibility? STATIC? type IDENTIFIER LPAREN (parameter (COMMA parameter)*)? RPAREN body 
    -> ^(METHOD_DEC IDENTIFIER type visibility? STATIC? ^(PARAMS parameter+)? body) 
    ; 

visibility 
    : PRIVATE 
    | PUBLIC 
    ; 

parameter 
    : type IDENTIFIER 
    -> ^(PARAM IDENTIFIER type) 
    ; 

body 
    : LBRAK localdeclaration* statement* RBRAK 
    -> ^(BODY ^(DECLARATIONS localdeclaration*) ^(STATEMENTS statement*)) 
    ; 

localdeclaration 
    : type IDENTIFIER SCOLON 
    -> ^(DECLARATION type IDENTIFIER) 
    ; 

statement 
    : assignment 
    | ifstatement 
    | whilestatement 
    | returnstatement 
    | callstatement 
    | printstatement 
    | block 
    ; 

assignment 
    : IDENTIFIER (DOT IDENTIFIER | LBRAC a=expression RBRAC)? EQ b=expression SCOLON 
    -> ^(ASSIGN IDENTIFIER+ $a? $b) 
    ; 

ifstatement 
    : IF LPAREN relation RPAREN statement (options {greedy=true;} : ELSE statement)? 
    -> ^(IF relation statement ^(ELSE statement)?) 
    ; 

whilestatement 
    : WHILE LPAREN relation RPAREN statement 
    -> ^(WHILE relation statement) 
    ; 

returnstatement 
    : RETURN expression? SCOLON 
    -> ^(RETURN expression?) 
    ; 

callstatement 
    : IDENTIFIER (DOT IDENTIFIER)? LPAREN (expression (COMMA expression)*)? RPAREN SCOLON 
    -> ^(CALL IDENTIFIER+ expression+) 
    ; 

printstatement 
    : PRINT LPAREN expression RPAREN SCOLON 
    -> ^(PRINT expression) 
    ; 

block 
    : LBRAK statement* RBRAK 
    -> ^(STATEMENTS statement*) 
    ; 

relation 
    : expression (LTHAN | GTHAN | EQEQ | NEQ)^ expression 
    ; 

expression 
    : (PLUS | MINUS)? term ((PLUS | MINUS)^ term)* 
    ; 

term 
    : factor ((MULT | DIV)^ factor)* 
    ; 

factor 
    : NUMBER 
    | IDENTIFIER (DOT IDENTIFIER | LBRAC expression RBRAC)? 
    | NULL 
    | NEW IDENTIFIER LPAREN RPAREN 
    | NEW (INT | IDENTIFIER) (LBRAC RBRAC)? 
    ; 

type 
    : (INT | IDENTIFIER) (LBRAC RBRAC)? 
    | VOID 
    ; 

If you also posted the whole error message, we would be able to help more :) – huon 2012-04-14 11:08:29

Unfortunately, the ANTLR plugin for Eclipse apparently doesn't give line numbers in its error messages. All I get is the very cryptic: java.lang.NoSuchFieldError: offendingToken. – Eric 2012-04-14 11:16:15

Have you tried removing or commenting out sections to narrow down what is causing the problem? – huon 2012-04-14 11:25:05

Answers

3

The problem is that in your tree grammar you do the following (3 times, I believe):

classdeclaration 
    : ^(CLASS ... ^(EXTENDS IDENTIFIER)? ...) 
    ; 

The ^(EXTENDS IDENTIFIER)? part is wrong: you need to wrap the tree in parentheses first, and only then make it optional:

classdeclaration 
    : ^(CLASS ... (^(EXTENDS IDENTIFIER))? ...) 
    ; 
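By my count, the other two occurrences in the posted tree grammar are the optional ^(PARAMS parameter+) subtree in methoddeclaration and the optional ^(ELSE statement) subtree in ifstatement. A sketch of the same fix applied to those two rules, with everything else in them left exactly as posted:

```antlr
methoddeclaration
    // (^(PARAMS ...))? : parenthesize the subtree, then make it optional
    : ^(METHOD_DEC IDENTIFIER type visibility? STATIC? (^(PARAMS parameter+))? body)
    ;

ifstatement
    // same wrapping for the optional ELSE subtree
    : ^(IF relation statement (^(ELSE statement))?)
    ;
```

Only the tree grammar needs this change. The rewrite rules in the parser grammar, such as -> ^(CLASS $n ^(EXTENDS $e)? ...), can stay as they are, since that grammar already builds.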

But wouldn't it be too easy if that were all? :)

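Once you fix the issue mentioned above, ANTLR will complain that the tree grammar is ambiguous when it tries to generate a tree walker from it. ANTLR will throw the following at you: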

error(211): J0_SemanticAnalysis.g:61:26: [fatal] rule assignment has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.

It is complaining about the assignment rule in your grammar:

assignment 
    : ^(ASSIGN IDENTIFIER+ expression? expression) 
    ; 

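ANTLR is an LL parser generator: the parsers it generates consume tokens from left to right. When such a parser reaches the first expression, it has to decide whether that is the optional one or the mandatory one, and because expression is a recursive rule, ANTLR cannot build a lookahead decision for it (hence the non-LL(*) error). So the optional expression in IDENTIFIER+ expression? expression makes the grammar ambiguous. Fix this by moving the ? to the last expression: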

assignment 
    : ^(ASSIGN IDENTIFIER+ expression expression?) 
    ; 
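One consequence of the reordering is that the role of the first expression now depends on whether a second one follows. Going by the parser's rewrite rule ^(ASSIGN IDENTIFIER+ $a? $b), a lone expression child is the assigned value, while two expression children are the array index followed by the value. A sketch of the same rule with two labels, e1 and e2, which are names made up here for illustration and not part of the posted grammar:

```antlr
assignment
    // e1 and e2 are hypothetical labels added for illustration.
    // One expression child:    e1 is the assigned value ($b in the parser).
    // Two expression children: e1 is the index ($a), e2 is the value ($b).
    : ^(ASSIGN IDENTIFIER+ e1=expression e2=expression?)
    ;
```

Any action later attached to this rule would need to check whether e2 was matched before deciding which role e1 plays.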

Don't let the last two letters of the name ANTLR mislead you: they stand for Language Recognition, not for the class of parsers it generates!

Thanks a lot! Everything builds now and I can move on to the next stage! – Eric 2012-04-14 21:49:32

You're welcome, @Eric. – 2012-04-16 11:56:32