2015-04-28 42 views
1

Scala regex matching

I need to use a regular expression to match a pattern in Scala. The regex I currently have is:

InputPattern: scala.util.matching.Regex = put (.*) in (.*) 

When I run the following, this happens:

scala> val InputPattern(verb, item, prep, obj) = "put a in b"; 
scala.MatchError: put a in b (of class java.lang.String) 
... 33 elided 

I would like it to end up with verb("put"), item("a"), prep("in"), and obj("b") for the input "put a in b", and with verb("put"), item(""), prep("in"), and obj("") for the input "put in".

Thanks

+0

I guess you need a pattern with 4 capture groups, e.g. `(?m)^(\S*)\s*(\S*?)\s*(\S*)\s*(\S*?)$`. Have a look at the demo: https://regex101.com/r/mC5eI5/1

Answers

1

This works for your specific case:

scala> val InputPattern = "(put) (.*?) ?(in) ?(.*?)".r 
InputPattern: scala.util.matching.Regex = (put) (.*?) ?(in) ?(.*?)

scala> val InputPattern(verb, item, prep, obj) = "put a in b" 
verb: String = put 
item: String = a 
prep: String = in 
obj: String = b 

scala> val InputPattern(verb, item, prep, obj) = "put in" 
verb: String = put 
item: String = "" 
prep: String = in 
obj: String = "" 

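Here `put` and `in` are placed in groups so that they take part in the pattern match and are captured as well. I also used the lazy expression `(.*?)` to capture as little as possible; you could replace it with `(\S*)`. The ` ?` gives you an optional space, so that "put in" also matches (one space between `put` and `in`, and no space at the end).

But note that: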
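To illustrate why the number of groups matters, here is a minimal sketch. The names `twoGroups`, `fourGroups`, and `describe` are made up for this example. An extractor pattern with four variables only succeeds when the regex has exactly four capture groups, which is why the two-group pattern from the question ends in a `MatchError`:

```scala
import scala.util.matching.Regex

// The pattern from the question: only two capture groups.
val twoGroups: Regex = "put (.*) in (.*)".r
// The pattern from this answer: four capture groups.
val fourGroups: Regex = "(put) (.*?) ?(in) ?(.*?)".r

// unapplySeq is what an extractor pattern relies on; it returns one
// string per capture group when the whole input matches.
val fromTwo: Option[List[String]] = twoGroups.unapplySeq("put a in b")
// fromTwo == Some(List("a", "b"))
val fromFour: Option[List[String]] = fourGroups.unapplySeq("put a in b")
// fromFour == Some(List("put", "a", "in", "b"))

// Hypothetical helper: binds four variables, so it can only succeed
// for a regex that has exactly four groups.
def describe(re: Regex, input: String): String = input match {
  case re(verb, item, prep, obj) => s"verb=$verb item=$item prep=$prep obj=$obj"
  case _                         => "no match"
}
```

With these definitions, `describe(twoGroups, "put a in b")` falls through to `"no match"`, while `describe(fourGroups, "put a in b")` binds all four names.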

scala> val InputPattern(verb, item, prep, obj) = "put ainb" 
verb: String = put 
item: String = a 
prep: String = in 
obj: String = b 

scala> val InputPattern(verb, item, prep, obj) = "put aininb" 
verb: String = put 
item: String = a 
prep: String = in 
obj: String = inb 

scala> val InputPattern(verb, item, prep, obj) = "put ain" 
verb: String = put 
item: String = a 
prep: String = in 
obj: String = "" 

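That may be good enough if you have a simple command interpreter; otherwise you should match your special cases separately.

To process a simple (non-natural) language, you could also consider StandardTokenParsers, since they can handle context-free grammars (Chomsky type 2):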
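If inputs such as "put ainb" should be rejected, one possible way to handle the cases more strictly is sketched below. This is an assumption about what "stricter" could mean, not part of the answer above: `StrictPattern` and `parseStrict` are hypothetical names, and the pattern requires a space before each optional word.

```scala
// Hypothetical stricter pattern: item and obj are optional, but when
// present each must be preceded by a single space.
val StrictPattern = """(put)(?: (\S+))? (in)(?: (\S+))?""".r

// A group that does not take part in the match is bound to null, so
// Option(...) is used to turn it into the empty string the question asks for.
def parseStrict(text: String): Option[(String, String, String, String)] =
  text match {
    case StrictPattern(verb, item, prep, obj) =>
      Some((verb, Option(item).getOrElse(""), prep, Option(obj).getOrElse("")))
    case _ => None
  }
```

Under this sketch, `parseStrict("put a in b")` and `parseStrict("put in")` succeed, while `parseStrict("put ainb")` and `parseStrict("put ain")` return `None`.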

import scala.util.parsing.combinator.syntactical._ 

val p = new StandardTokenParsers { 
    lexical.reserved ++= List("put", "in") 
    def p = "put" ~ opt(ident) ~ "in" ~ opt(ident) 
} 

scala> p.p(new p.lexical.Scanner("put a in b")) 
warning: there was one feature warning; re-run with -feature for details 
res13 = [1.11] parsed: (((put~Some(a))~in)~Some(b)) 

scala> p.p(new p.lexical.Scanner("put in")) 
warning: there was one feature warning; re-run with -feature for details 
res14 = [1.7] parsed: (((put~None)~in)~None) 
1

You could write a single regex that covers all cases, but I am not sure it would be readable and maintainable. I prefer a simple approach:

val pattern1 = "(put) (.*) (in) (.*)".r 
val pattern2 = "(put) (in)".r 
def parse(text: String) = text match { 
    case pattern1(verb, item, prep, obj) => (verb, item, prep, obj)
    case pattern2(verb, prep) => (verb, "", prep, "") 
} 
scala> parse("put a in b") 
res6: (String, String, String, String) = (put,a,in,b) 

scala> parse("put in") 
res7: (String, String, String, String) = (put,"",in,"") 
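Note that `parse` as written throws a `scala.MatchError` for any input that neither pattern matches. A minimal sketch of a total variant follows; `parseOption` is a hypothetical name, and returning `Option` is just one possible design choice:

```scala
val pattern1 = "(put) (.*) (in) (.*)".r
val pattern2 = "(put) (in)".r

// Hypothetical total variant of `parse`: unmatched input yields None
// instead of throwing scala.MatchError.
def parseOption(text: String): Option[(String, String, String, String)] =
  text match {
    case pattern1(verb, item, prep, obj) => Some((verb, item, prep, obj))
    case pattern2(verb, prep)            => Some((verb, "", prep, ""))
    case _                               => None
  }
```

The caller can then decide how to report an unknown command, for example with `parseOption(line).getOrElse(...)`.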

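And one extra note: I hope you know what you are doing! Regular expressions correspond to Chomsky Type 3 (regular) grammars, and natural language is far more complex. If you need a natural language parser, you can use an existing solution such as the Stanford NLP parser.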

+0

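Thank you! Although this works, the other answer worked better in my code. Thanks for the "extra note", but this is just a simple university assignment with a given list of variable commands. This is my first time using Scala, and I just ran into a few small problems with regexes.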