2017-07-07 56 views

Antlr4 - no viable alternative at input

I want to create a simple HOCON parser (starting from an existing JSON one).

The grammar is defined as:

/** Taken from "The Definitive ANTLR 4 Reference" by Terence Parr */ 

// Derived from http://json.org 
grammar HOCON; 

hocon 
    : value 
    | pair 
    ; 

obj 
    : object_begin pair (','? pair)* object_end 
    | object_begin object_end 
    ; 

pair 
    : STRING KV? value {fmt.Println("pairstr",$STRING.GetText())} 
    | KEY KV? value {fmt.Println("pairkey",$KEY.GetText())} 
    ; 

array 
    : array_begin value (',' value)* array_end 
    | array_begin array_end 
    ; 

value 
    : STRING {fmt.Println($STRING.GetText())} 
    | REFERENCE {fmt.Println($REFERENCE.GetText())} 
    | RAWSTRING {fmt.Println($RAWSTRING.GetText())} 
    | NUMBER {fmt.Println($NUMBER.GetText())} 
    | obj 
    | array 
    | 'true' 
    | 'false' 
    | 'null' 
    ; 

COMMENT 
    : '#' ~('\r' | '\n')* -> skip 
    ; 

STRING 
    : '"' (ESC | ~ ["\\])* '"' 
    | '\'' (ESC | ~ ['\\])* '\'' 
    ; 

RAWSTRING 
    : (ESC | ALPHANUM)+ 
    ; 

KEY 
    : ('.' | ALPHANUM | '-')+ 
    ; 

REFERENCE 
    : '${' (ALPHANUM|'.')+ '}' 
    ; 

fragment ESC 
    : '\\' (["\\/bfnrt] | UNICODE) 
    ; 


fragment UNICODE 
    : 'u' HEX HEX HEX HEX 
    ; 

fragment ALPHANUM 
    : [0-9a-zA-Z] 
    ; 

fragment HEX 
    : [0-9a-fA-F] 
    ; 

KV 
    : [=:] 
    ; 

array_begin 
    : '[' { fmt.Println("BEGIN [") } 
    ; 

array_end 
    : ']' { fmt.Println("] END") } 
    ; 

object_begin 
    : '{' { fmt.Println("OBJ {") } 
    ; 

object_end 
    : '}' { fmt.Println("} OBJ") } 
    ; 

NUMBER 
    : '-'? INT '.' [0-9] + EXP? | '-'? INT EXP | '-'? INT 
    ; 

fragment INT 
    : '0' | [1-9] [0-9]* 
    ; 

// no leading zeros 

fragment EXP 
    : [Ee] [+\-]? INT 
    ; 

// \- since - means "range" inside [...] 

WS 
    : [ \t\n\r] + -> skip 
    ; 

The error is:

line 2:2 no viable alternative at input '{journal' 
pairkey akka.persistence 

Sample input that triggers the error:

akka.persistence { 
    journal { 
    # Absolute path to the journal plugin configuration entry used by 
    # persistent actor or view by default. 
    # Persistent actor or view can override `journalPluginId` method 
    # in order to rely on a different journal plugin. 
    plugin = "" 
    } 
} 

But if I update it to use quoted strings:

akka.persistence { 
    'journal' { 
    # Absolute path to the journal plugin configuration entry used by 
    # persistent actor or view by default. 
    # Persistent actor or view can override `journalPluginId` method 
    # in order to rely on a different journal plugin. 
    'plugin' = "" 
    } 
} 

everything works as expected.

It looks like I am missing something in the KEY definition, but I can't spot exactly what.

The Go code to test it is:

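package main

import (
    "log"

    "github.com/antlr/antlr4/runtime/Go/antlr"
    "go-hocon/parser"
)

func main() {
    // Open the sample configuration; stop early if it cannot be read.
    is, err := antlr.NewFileStream("test/simple1.conf")
    if err != nil {
        log.Fatal(err)
    }

    lex := parser.NewHOCONLexer(is)
    p := parser.NewHOCONParser(antlr.NewCommonTokenStream(lex, 0))
    p.BuildParseTrees = true
    p.Hocon() // start parsing at the grammar's entry rule
}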

Answer


Your first input makes journal lex as RAWSTRING:

[@0,0:15='akka.persistence',<KEY>,1:0] 
[@1,17:17='{',<'{'>,1:17] 
[@2,22:28='journal',<RAWSTRING>,2:2] 
[@3,30:30='{',<'{'>,2:10] 
[@4,277:282='plugin',<RAWSTRING>,7:4] 
[@5,284:284='=',<KV>,7:11] 
[@6,286:287='""',<STRING>,7:13] 
[@7,292:292='}',<'}'>,8:2] 
[@8,295:295='}',<'}'>,9:0] 
[@9,298:297='<EOF>',<EOF>,10:0] 
line 2:2 no viable alternative at input '{journal' 

On the other hand, 'journal' lexes as a STRING, but with those single quotes, which you obviously don't want:

[@0,0:15='akka.persistence',<KEY>,1:0] 
[@1,17:17='{',<'{'>,1:17] 
[@2,22:30=''journal'',<STRING>,2:2] <-- now it's a string implicit token 
[@3,32:32='{',<'{'>,2:12] 
[@4,279:284='plugin',<RAWSTRING>,7:4] 
[@5,286:286='=',<KV>,7:11] 
[@6,288:289='""',<STRING>,7:13] 
[@7,294:294='}',<'}'>,8:2] 
[@8,297:297='}',<'}'>,9:0] 
[@9,300:299='<EOF>',<EOF>,10:0] 
line 7:4 no viable alternative at input '{plugin' 
line 8:2 mismatched input '}' expecting {'true', 'false', 'null', '[', '{', STRING, RAWSTRING, REFERENCE, KV, NUMBER} 

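Why? Because the lexer resolves competing rules in the following order:

1. The longest match wins.
2. For matches of equal length, implicit tokens (literals written directly in parser rules, such as 'true' in this grammar) take precedence.
3. Otherwise, for matches of equal length, the lexer rule defined first wins.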
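The ordering described above can be sketched without ANTLR at all. The following is a simplified simulation, not the code ANTLR generates: the `rule` type and the `classify` function are hypothetical names, the regular expressions only approximate the STRING, RAWSTRING and KEY rules from the question (ESC sequences are left out), and implicit tokens are not modeled. It only illustrates "longest match first, then rule order".

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a simplified stand-in for one ANTLR lexer rule (hypothetical type).
type rule struct {
	name string
	re   *regexp.Regexp
}

// Rules in the same relative order as in the question's grammar.
// Assumption: ESC sequences are omitted to keep the sketch short.
var rules = []rule{
	{"STRING", regexp.MustCompile(`^(?:"[^"\\]*"|'[^'\\]*')`)},
	{"RAWSTRING", regexp.MustCompile(`^[0-9a-zA-Z]+`)},
	{"KEY", regexp.MustCompile(`^[.0-9a-zA-Z-]+`)},
}

// classify returns the rule that wins at the start of input:
// the longest match wins, and on a tie the rule listed first wins.
func classify(input string) (name, text string) {
	for _, r := range rules {
		m := r.re.FindString(input)
		if len(m) > len(text) { // strict ">" keeps the earlier rule on ties
			name, text = r.name, m
		}
	}
	return name, text
}

func main() {
	for _, s := range []string{"akka.persistence", "journal", "'journal'", "plugin"} {
		name, text := classify(s)
		fmt.Printf("%-18s -> %s (%q)\n", s, name, text)
	}
}
```

Under these assumptions, akka.persistence comes out as KEY because KEY matches all 16 characters while RAWSTRING stops at the dot, whereas journal and plugin tie between RAWSTRING and KEY and go to RAWSTRING, the rule listed first. That matches the token types in the dumps shown in this answer.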

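In your case, quoting 'journal' makes it lex as a STRING, which pair accepts, so it seems to work fine. Without the quotes, journal and plugin match both RAWSTRING and KEY at the same length; RAWSTRING is defined first, so both tokens are matched as RAWSTRING, which doesn't fit the rule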

pair 
    : STRING KV? value //{fmt.Println("pairstr",$STRING.GetText())} 

Hence the error.

How to fix it? Well, I reversed the order of the lexer rules:

RAWSTRING 
    : (ESC | ALPHANUM)+ 
    ; 

STRING 
    : '"' (ESC | ~ ["\\])* '"' 
    | '\'' (ESC | ~ ['\\])* '\'' 
    ; 

and changed pair:

pair 
    : RAWSTRING KV? value //{fmt.Println("pairstr",$STRING.GetText())} 

Now it parses fine:

[@0,0:15='akka.persistence',<KEY>,1:0] 
[@1,17:17='{',<'{'>,1:17] 
[@2,22:28='journal',<RAWSTRING>,2:2] 
[@3,30:30='{',<'{'>,2:10] 
[@4,277:282='plugin',<RAWSTRING>,7:4] 
[@5,284:284='=',<KV>,7:11] 
[@6,286:287='""',<STRING>,7:13] 
[@7,292:292='}',<'}'>,8:2] 
[@8,295:295='}',<'}'>,9:0] 
[@9,298:297='<EOF>',<EOF>,10:0] 

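Thanks! I created the repo https://github.com/jdevelop/go-hocon - I was able to fix this by converting 'rawstring' from a token into a parser rule. That helped, but now I am fighting with a test case where a raw string should be captured as a value in an object. For example, this test [config](https://github.com/jdevelop/go-hocon/blob/master/test/simple1.conf) fails if line 11 is changed to be unquoted. – jdevelop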
