基本上,它速度慢,消耗的內存太多,因爲你的語法是非常低效的。
讓我們考慮的第二行:B = A:(1+2)
。它會嘗試解析此行是這樣的:
- term4
OR
term4然後term4。
- term3
AND
term3然後term3。
- term2
<>
term2,然後=
,然後NE
然後EQ
然後term2。
- term1 8個不同的運營商term1,then term1。
- 長期
+
來看,長期-
來看,長期:
項,然後項。
- 因子
*
因子,因子/
因子,因子MOD
因子,然後因子。
- 括號表達式,一元因子,函數,數字文字,字符串文字,ident。
的第一次嘗試是這樣的:
ident * factor + term < term1 <> term2 AND term3 OR term4
我跳過括號,一元,功能,數字和字符串,因爲它們不匹配A
- 雖然function
可能做,但它的定義不可用。現在,由於:
不匹配*
,它會在下一個嘗試:
ident/factor + term < term1 <> term2 AND term3 OR term4
ident MOD factor + term < term1 <> term2 AND term3 OR term4
ident + term < term1 <> term2 AND term3 OR term4
現在轉到下term1
:
ident * factor - term < term1 <> term2 AND term3 OR term4
ident/factor - term < term1 <> term2 AND term3 OR term4
ident MOD factor - term < term1 <> term2 AND term3 OR term4
ident - term < term1 <> term2 AND term3 OR term4
而接下來:
ident * factor : term < term1 <> term2 AND term3 OR term4
ident/factor : term < term1 <> term2 AND term3 OR term4
ident MOD factor : term < term1 <> term2 AND term3 OR term4
ident : term < term1 <> term2 AND term3 OR term4
啊哈!我們終於在term1上獲得了一場比賽!但(
不匹配<
,所以它必須嘗試下一個詞條2:
ident * factor + term > term1 <> term2 AND term3 OR term4
等等
所有,因爲每行的第一項每學期會始終保持一致!要匹配一個簡單的數字,它必須解析factor
2 * 2 * 5 * 9 * 4 * 4 = 2880次!
但這不是故事的一半!你看,因爲termX重複了兩次,它會在這兩個兩側重複所有這些東西。例如,對於A:(1+2)
的第一場比賽是這樣的:
ident : term < term1 <> term2 AND term3 OR term4
where ident = A
and term = (1+2)
這是不正確,所以它會嘗試>
代替<
,然後<=
,等等,等等
我把一個記錄版本下面的解析器。試着運行它,看看它試圖分析的所有東西。
同時,有很多關於如何編寫這些可用解析器的例子。使用sbaz
,嘗試:
sbaz install scala-devel-docs
然後看看doc/scala-devel-docs/examples/parsing
目錄Scala發行的內部,你會發現幾個例子。
這是你的解析器的版本(無function
),它記錄它會嘗試一切:
sealed trait Expression
case class Variable(id: String) extends Expression
case class Literal(s: String) extends Expression
case class Number(x: String) extends Expression
case class UnaryOp(op: String, rhs: Expression) extends Expression
case class BinaryOp(op: String, lhs: Expression, rhs: Expression) extends Expression
object TestParser extends scala.util.parsing.combinator.syntactical.StdTokenParsers {
import scala.util.parsing.combinator.lexical.StdLexical
type Tokens = StdLexical
val lexical = new StdLexical
lexical.delimiters ++= List("(", ")", "+", "-", "*", "/", "=", "OR", "AND", "NE", "EQ", "LT", "GT", "LE", "GE", ":", "MOD")
def stmts: Parser[Any] = log(expr.*)("stmts")
def stmt: Parser[Expression] = log(expr <~ "\n")("stmt")
def expr: Parser[Expression] = log(term5)("expr")
def term5: Parser[Expression] = (
log((term4 ~ "OR" ~ term4) ^^ { case lhs~o~rhs => BinaryOp("OR", lhs, rhs) })("term5 OR")
| log(term4)("term5 term4")
)
def term4: Parser[Expression] = (
log((term3 ~ "AND" ~ term3) ^^ { case lhs~a~rhs => BinaryOp("AND", lhs, rhs) })("term4 AND")
| log(term3)("term4 term3")
)
def term3: Parser[Expression] = (
log((term2 ~ "<>" ~ term2) ^^ { case lhs~ne~rhs => BinaryOp("NE", lhs, rhs) })("term3 <>")
| log((term2 ~ "=" ~ term2) ^^ { case lhs~eq~rhs => BinaryOp("EQ", lhs, rhs) })("term3 =")
| log((term2 ~ "NE" ~ term2) ^^ { case lhs~ne~rhs => BinaryOp("NE", lhs, rhs) })("term3 NE")
| log((term2 ~ "EQ" ~ term2) ^^ { case lhs~eq~rhs => BinaryOp("EQ", lhs, rhs) })("term3 EQ")
| log(term2)("term3 term2")
)
def term2: Parser[Expression] = (
log((term1 ~ "<" ~ term1) ^^ { case lhs~lt~rhs => BinaryOp("LT", lhs, rhs) })("term2 <")
| log((term1 ~ ">" ~ term1) ^^ { case lhs~gt~rhs => BinaryOp("GT", lhs, rhs) })("term2 >")
| log((term1 ~ "<=" ~ term1) ^^ { case lhs~le~rhs => BinaryOp("LE", lhs, rhs) })("term2 <=")
| log((term1 ~ ">=" ~ term1) ^^ { case lhs~ge~rhs => BinaryOp("GE", lhs, rhs) })("term2 >=")
| log((term1 ~ "LT" ~ term1) ^^ { case lhs~lt~rhs => BinaryOp("LT", lhs, rhs) })("term2 LT")
| log((term1 ~ "GT" ~ term1) ^^ { case lhs~gt~rhs => BinaryOp("GT", lhs, rhs) })("term2 GT")
| log((term1 ~ "LE" ~ term1) ^^ { case lhs~le~rhs => BinaryOp("LE", lhs, rhs) })("term2 LE")
| log((term1 ~ "GE" ~ term1) ^^ { case lhs~ge~rhs => BinaryOp("GE", lhs, rhs) })("term2 GE")
| log(term1)("term2 term1")
)
def term1: Parser[Expression] = (
log((term ~ "+" ~ term) ^^ { case lhs~plus~rhs => BinaryOp("+", lhs, rhs) })("term1 +")
| log((term ~ "-" ~ term) ^^ { case lhs~minus~rhs => BinaryOp("-", lhs, rhs) })("term1 -")
| log((term ~ ":" ~ term) ^^ { case lhs~concat~rhs => BinaryOp(":", lhs, rhs) })("term1 :")
| log(term)("term1 term")
)
def term: Parser[Expression] = (
log((factor ~ "*" ~ factor) ^^ { case lhs~times~rhs => BinaryOp("*", lhs, rhs) })("term *")
| log((factor ~ "/" ~ factor) ^^ { case lhs~div~rhs => BinaryOp("/", lhs, rhs) })("term /")
| log((factor ~ "MOD" ~ factor) ^^ { case lhs~div~rhs => BinaryOp("MOD", lhs, rhs) })("term MOD")
| log(factor)("term factor")
)
def factor: Parser[Expression] = (
log("(" ~> expr <~ ")")("factor (expr)")
| log(("+" | "-") ~ factor ^^ { case op~rhs => UnaryOp(op, rhs) })("factor +-")
//| function |
| log(numericLit ^^ { case x => Number(x/*.toFloat*/) })("factor numericLit")
| log(stringLit ^^ { case s => Literal(s) })("factor stringLit")
| log(ident ^^ { case id => Variable(id) })("factor ident")
)
def parse(s: String) = stmts(new lexical.Scanner(s))
}
什麼是測試案例?你使用什麼語法分析器? – 2011-12-19 14:18:40
我正在使用StdTokenParsers。 – 2011-12-19 14:36:14
測試用例:的println(TestParser.parse( \t 「\ NA = \」 MY測試字符串\ 「」 + \t 「\ NB = A:(1 + 2)」 + \t 「\ NC = 3」 + \t 「\ ND = -4」 + \t 「\ NE = 3 + 1 /(5 - 4)」 + \t 「\ NF = -5 + 1」 + \t「\ NG = -3 + 1 /( -5 - -3)「+ \t」\ nH = 4.99「+ \t」\ nI = -4.99「)) – 2011-12-19 14:38:50