2017-06-30

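ANTLR grammar does not validate correctly

I have this ANTLR3 grammar, which takes an object called a title in plain-text form and builds a DOM structure from it. Here is a valid sample: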

Here is titlepart 1; (##BOLD##this is bold inside a reference text##/BOLD##) 

Here is an invalid title that should fail (it doesn't, which is why I'm posting):

Here is titlepart 1;(reference text with no ending parenthesis 

Here is the grammar I'm using:

grammar Title; 

options { 
    output = AST; 
    ASTLabelType=CommonTree; 
    backtrack=false; 
} 



tokens { 
LPAREN='('; 
RPAREN=')'; 
LCURLY='{'; 
RCURLY='}'; 
BOLDSTART='##BOLD##'; 
BOLDEND='##/BOLD##'; 
UNDERLINESTART='##UNDERLINE##'; 
UNDERLINEEND='##/UNDERLINE##'; 
SYMBOLSTART='##SYMBOL##'; 
SYMBOLEND='##/SYMBOL##'; 
SUBSCRIPTSTART='##SUBSCRIPT##'; 
SUBSCRIPTEND='##/SUBSCRIPT##'; 


SUPERSCRIPTSTART='##SUPERSCRIPT##'; 
SUPERSCRIPTEND='##/SUPERSCRIPT##'; 
IMAGESTART='##IMG##'; 
IMAGEEND='##/IMG##'; 
SEMICOLON=';'; 
} 


title: titlepart+; 

titlepart: ((bold|anytext|specialtext|underline|symbolref|subscript|superscript|image)+referencetext?(SEMICOLON|EOF)); 

ANYCHAR: ~(';' 
     | '(' 
     | '{' 
     | '}' 
     | ')'); 

anytext: ANYCHAR+; 
specialtext: LCURLY(bold|referencetext|anytext|underline|symbolref|superscript|subscript|SEMICOLON)*RCURLY; 
referencetext: LPAREN(referencepart+)RPAREN; 
referencepart: (anytext|underline|bold|symbolref|specialtext|superscript|subscript)+SEMICOLON?; 

superscript: SUPERSCRIPTSTART(anytext)*SUPERSCRIPTEND; 
image: IMAGESTART(anytext)*IMAGEEND; 
subscript: SUBSCRIPTSTART(anytext)*SUBSCRIPTEND; 
bold: BOLDSTART(anytext|underline|superscript|subscript)*BOLDEND; 
underline: UNDERLINESTART(anytext|bold|superscript|subscript)*UNDERLINEEND; 

symbolref: SYMBOLSTART(anytext)*SYMBOLEND; 

As you can see, the referencetext rule requires a closing paren, but if I omit it, parsing does not fail.

Here is the log of the parse:

enter ANYCHAR H line=1:0 
exit ANYCHAR e line=1:1 
enter title [@0,0:0='H',<4>,1:0] 
enter titlepart [@0,0:0='H',<4>,1:0] 
enter anytext [@0,0:0='H',<4>,1:0] 
enter ANYCHAR e line=1:1 
exit ANYCHAR r line=1:2 
enter ANYCHAR r line=1:2 
exit ANYCHAR e line=1:3 
enter ANYCHAR e line=1:3 
exit ANYCHAR line=1:4 
enter ANYCHAR line=1:4 
exit ANYCHAR i line=1:5 
enter ANYCHAR i line=1:5 
exit ANYCHAR s line=1:6 
enter ANYCHAR s line=1:6 
exit ANYCHAR line=1:7 
enter ANYCHAR line=1:7 
exit ANYCHAR t line=1:8 
enter ANYCHAR t line=1:8 
exit ANYCHAR i line=1:9 
enter ANYCHAR i line=1:9 
exit ANYCHAR t line=1:10 
enter ANYCHAR t line=1:10 
exit ANYCHAR l line=1:11 
enter ANYCHAR l line=1:11 
exit ANYCHAR e line=1:12 
enter ANYCHAR e line=1:12 
exit ANYCHAR p line=1:13 
enter ANYCHAR p line=1:13 
exit ANYCHAR a line=1:14 
enter ANYCHAR a line=1:14 
exit ANYCHAR r line=1:15 
enter ANYCHAR r line=1:15 
exit ANYCHAR t line=1:16 
enter ANYCHAR t line=1:16 
exit ANYCHAR line=1:17 
enter ANYCHAR line=1:17 
exit ANYCHAR 1 line=1:18 
enter ANYCHAR 1 line=1:18 
exit ANYCHAR ; line=1:19 
enter SEMICOLON ; line=1:19 
exit SEMICOLON ( line=1:20 
exit anytext [@19,19:19=';',<13>,1:19] 
enter LPAREN ( line=1:20 
exit LPAREN r line=1:21 
exit titlepart [@20,20:20='(',<10>,1:20] 
exit title [@20,20:20='(',<10>,1:20] 
2017-06-30 01:29:35,957 DEBUG [TitleConverter]:317 (<grammar title> (title (titlepart (anytext H e r e i s t i t l e p a r t 1) ;))) 

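As you can see, it reaches the ( after the ; and simply stops parsing. Interestingly, if I add a space after the ;, it fails as expected. Can anyone tell me what is going on?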

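If I move EOF to the end of title, the problem of the input not being fully parsed suddenly goes away, but then how do I define a titlepart that ends with either SEMICOLON or EOF? This matters because every titlepart except the last one must end with a SEMICOLON. How do I define that?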

'title: titlepart (SEMICOLON titlepart)* EOF;' and remove '(SEMICOLON | EOF)' from 'titlepart'.
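
To illustrate the shape of that rule, here is a toy Python model of 'titlepart (SEMICOLON titlepart)* EOF'. It is only a sketch under simplifying assumptions: it works on raw characters rather than ANTLR tokens, it reduces titlepart to plain text plus an optional parenthesised reference, and the name parse_title is made up for this example. It is not what ANTLR generates.

```python
def parse_title(text: str) -> list:
    """Toy model of: title: titlepart (SEMICOLON titlepart)* EOF;"""
    pos = 0

    def titlepart() -> str:
        nonlocal pos
        start = pos
        # Body: one or more characters that are not ';', '(' or ')'.
        while pos < len(text) and text[pos] not in ";()":
            pos += 1
        if pos == start:
            raise ValueError(f"expected text at offset {pos}")
        # Optional reference text: '(' ... ')'. The closing paren is mandatory.
        if pos < len(text) and text[pos] == "(":
            close = text.find(")", pos)
            if close == -1:
                raise ValueError(f"missing ')' for '(' at offset {pos}")
            pos = close + 1
        return text[start:pos]

    parts = [titlepart()]
    # (SEMICOLON titlepart)*: a semicolon only ever appears between two parts.
    while pos < len(text) and text[pos] == ";":
        pos += 1
        parts.append(titlepart())
    # EOF: anything left over is an error instead of being silently ignored.
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at offset {pos}")
    return parts
```

With this shape the semicolon is a separator rather than a terminator, so the last part needs no semicolon, and a trailing semicolon is rejected because another titlepart must follow it.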

Answer

This really ought to be an FAQ, if ANTLR had one. If you want your entire input to be parsed, add the end-of-input marker (the built-in EOF token) to your main rule:

title: titlepart+ EOF;
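
To see why the missing EOF matters, here is a toy Python model of the question's 'titlepart+' loop. It is a sketch under assumptions, not ANTLR's generated code: it works on characters instead of tokens, it reduces titlepart to plain text plus an optional parenthesised reference, and match_title and require_eof are names invented for this example.

```python
def match_title(text: str, require_eof: bool) -> int:
    """Return the number of characters consumed; raise ValueError on a syntax error."""
    pos = 0
    parts = 0
    # titlepart+ : keep looping while the next character can start a titlepart.
    while pos < len(text) and text[pos] not in ";()":
        # Body: plain text.
        while pos < len(text) and text[pos] not in ";()":
            pos += 1
        # referencetext? : once '(' is seen, the closing ')' is mandatory.
        if pos < len(text) and text[pos] == "(":
            close = text.find(")", pos)
            if close == -1:
                raise ValueError("missing ')'")
            pos = close + 1
        # (SEMICOLON | EOF)
        if pos < len(text):
            if text[pos] != ";":
                raise ValueError(f"expected ';' at offset {pos}")
            pos += 1
        parts += 1
    if parts == 0:
        raise ValueError("expected at least one titlepart")
    # Without this check the rule is satisfied by a prefix of the input.
    if require_eof and pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at offset {pos}")
    return pos
```

In this model, the invalid title from the question is accepted when require_eof is False: the loop ends right after the ; because ( cannot start a titlepart, so the rest of the input is never looked at. With require_eof set to True the same input is rejected. If there is a space after the ;, the space starts a second titlepart, the reference text is entered, and the missing ) is reported either way, which matches the behaviour described in the question.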