2016-08-28 47 views

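Scala triple-quoted strings and comments

When writing regular expressions I find Scala's """ syntax very handy, because I can build the regex up step by step, one piece per line.

For example: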

val foo =
    """(
    |(
    |\d{3}
    ||
    |\(\d{3}\)
    |)?
    |(
    |\s|-|\.
    |)?
    |\d{3}
    |(\s|-|\.)
    |\d{4}
    |(
    |\s*
    |(
    |ext|x|extn|extn.
    |)
    |\s*
    |\d{2,6}
    |)?
    |)""".stripMargin.replace("\n", "").r

But I would like to be able to write comments explaining what each line does, like this:

val foo =
    """(      // start group to capture the phone number
     |(      // start of optional area code choices
     |\d{3}     // bare three digits
     ||      // or
     |\(\d{3}\)    // three digits enclosed in parentheses
     |)?      // end of optional area code choices
     |(      // start of optional separator
     |\s|-|\.     // separator can be whitespace, dash or period
     |)?      // end of optional separator
     |\d{3}     // exchange number (required)
     |(\s|-|\.)    // same separator but required this time
     |\d{4}     // final digits (required)
     |(      // start of optional extension
     |\s*      // zero or more characters of white space
     |(      // start of extension indicator
     |ext.|x.|ext.|extn.  // extension can be indicated by "ext", "x", or "extn" followed by any character
     |)      // end of extension indicator
     |\s*      // zero or more characters of white space
     |\d{2,6}     // two to six digits of extension number
     |)?      // end of optional extension
     |)""".stripMargin.replace("\n", "").trim
    println(foo)
    val regex = foo.r
    val input = "(888)-456-7890 extn: 12345"
    regex.findAllIn(input).foreach(println)

But Scala makes the comments part of the string itself. So how can I write comments inside a multi-line string, as in this Python code?

import re

verboseRegex = re.compile(r'''
    (   # start group to capture the phone number
    (   # start of optional area code choices
    \d{3}   # bare three digits
    |    # or
    \(\d{3}\)  # three digits enclosed in parentheses
    )?   # end of optional area code choices
    (   # start of optional separator
    \s|-|\.  # separator can be whitespace, dash or period
    )?   # end of optional separator
    \d{3}   # exchange number (required)
    (\s|-|\.)  # same separator but required this time
    \d{4}   # final digits (required)
    (   # start of optional extension
    \s*   # zero or more characters of white space
    (   # start of extension indicator
    ext|x|ext. # extension can be indicated by "ext", "x", or
        #  "ext" followed by any character
    )    # end of extension indicator
    \s*   # zero or more characters of white space
    \d{2,5}  # two to five digits of extension number
    )?   # end of optional extension
    )    # end phone number capture group
    ''', re.VERBOSE)

So in the Python code above we use ''' just like Scala's """, but we are also able to write comments.

Answer


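Apparently, the embedded flag (?x) is supported; it makes the regex engine ignore whitespace and treat everything from # to the end of the line as a comment: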

scala> val r = """(?x)abc 
    | # works ok 
    | def""".r 
r: scala.util.matching.Regex = 
(?x)abc 
# works ok 
def 

scala> "abcdef" match { case r(_*) => } 

scala> val r = s"""(?x)abc\n |def #works, I hope\n |123""".stripMargin.r 
r: scala.util.matching.Regex = 
(?x)abc 
def #works, I hope 
123 

scala> "abcdef123" match { case r(_*) => } 
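
Applied to the phone-number pattern from the question, the (?x) flag might look like the sketch below. The object name `VerbosePhoneRegex` and the helper `findAll` are made up for illustration, and the extension indicator is simplified to `(extn|ext|x)\.?`, which is an assumption rather than the question's exact alternation. No stripMargin is used, because in (?x) mode the leading whitespace is ignored anyway and a margin `|` would be read as alternation.

```scala
import scala.util.matching.Regex

// Hypothetical demo object; only the (?x) flag itself comes from the answer.
object VerbosePhoneRegex {
  // (?x) enables java.util.regex.Pattern.COMMENTS: unescaped whitespace is
  // ignored and '#' starts a comment that runs to the end of the line.
  val phone: Regex = """(?x)
    (                      # start group to capture the phone number
      (                    # start of optional area code choices
        \d{3}              # bare three digits
        |                  # or
        \(\d{3}\)          # three digits enclosed in parentheses
      )?                   # end of optional area code choices
      (\s|-|\.)?           # optional separator: whitespace, dash or period
      \d{3}                # exchange number (required)
      (\s|-|\.)            # same separator but required this time
      \d{4}                # final digits (required)
      (                    # start of optional extension
        \s*                # zero or more characters of white space
        (extn|ext|x)\.?    # extension indicator, optionally followed by a period
        \s*                # zero or more characters of white space
        \d{2,5}            # two to five digits of extension number
      )?                   # end of optional extension
    )                      # end phone number capture group
  """.r

  // Convenience wrapper: all non-overlapping matches in the input.
  def findAll(input: String): List[String] = phone.findAllIn(input).toList
}
```

Note that in this mode a literal space or `#` in the pattern has to be escaped, for example as `\#`.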

Another idea:

scala> val r = s"abc${ "" // comment this 
    | }def${ "" // not pretty 
    | }".r 
r: scala.util.matching.Regex = abcdef 

scala> "abcdef" match { case r(_*) => } 

It might be handy to have a comment"..." interpolator that returns the empty string for those holes.

scala> val r = s"abc${ comment"empty words here" }".r 
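
A minimal sketch of such an interpolator, assuming nothing beyond the standard `StringContext` extension mechanism. The names `CommentInterpolator` and `CommentHelper` are hypothetical, and no `comment` interpolator ships with the standard library.

```scala
object CommentInterpolator {
  // Adds a comment"..." interpolator that discards its text and
  // always evaluates to the empty string.
  implicit class CommentHelper(val sc: StringContext) extends AnyVal {
    def comment(args: Any*): String = ""
  }
}

object CommentInterpolatorDemo {
  import CommentInterpolator._

  // Each ${ comment"..." } hole contributes nothing to the resulting pattern.
  val r = s"abc${ comment"first three letters" }def${ comment"next three" }".r
}
```

Here `CommentInterpolatorDemo.r` is built from the plain string "abcdef", so the comments exist only in the source code.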

If you are ignoring capture groups anyway, the extra parentheses are not a nuisance:

scala> val r = s"abc${ // comment 
    | }".r 
r: scala.util.matching.Regex = abc() 

scala> "abc" match { case r(_*) => } 

It's too bad that Unit is interpolated as "()" rather than as the empty string.


A regex interpolator r"pattern" could interpolate Unit as the empty string, turning r"abc${}" into "abc".r instead of "abc()".r.
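
One way the interpolator from this comment could be sketched, assuming Scala 2 and a custom `StringContext` extension. The names `RegexInterpolator` and `RegexHelper` are hypothetical, and the standard library provides no `r` interpolator.

```scala
import scala.util.matching.Regex

object RegexInterpolator {
  implicit class RegexHelper(val sc: StringContext) extends AnyVal {
    // r"..." builds a Regex; a hole that evaluates to Unit (for example a
    // block containing only a comment) is rendered as the empty string.
    def r(args: Any*): Regex = {
      val holes = args.map {
        case ()    => ""              // Unit contributes nothing
        case other => other.toString  // anything else is spliced in as text
      }
      // sc.parts holds the raw literal pieces, so backslashes such as \d
      // reach the regex engine unchanged.
      val sb = new StringBuilder(sc.parts.head)
      holes.zip(sc.parts.tail).foreach { case (hole, part) =>
        sb.append(hole).append(part)
      }
      sb.toString.r
    }
  }
}

object RegexInterpolatorDemo {
  import RegexInterpolator._

  val commented: Regex = r"abc${ /* first three letters */ }def"
  val digits: Regex    = r"\d{${3}}"
}
```

With this sketch `RegexInterpolatorDemo.commented` is compiled from "abcdef", so the comment-only holes leave no "()" behind.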