2014-09-23 53 views
0

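Why does my Iterator[Match] in Scala report a length but yield no data?

I want to use regex.findAllMatchIn and an Iterator[Match] to match the bracketed text below. The following code shows that in some cases matchesOne has a non-zero length, but then it behaves like an empty iterator. I feel like I'm missing something basic here. Any ideas?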

import scala.util.matching.Regex.Match
import scala.xml._

val xmldata = <document>
<content>
 <headers>
 </headers>
 <body>
 Foo [1], then another foo[2]; then lots of other things here
 And add a few other lines[2][3] of test data[3][5] (Foo 1234)
 </body>
</content>
</document>

val bodyIterator: Iterator[String] = ((xmldata \ "content" \ "body").text).linesWithSeparators

while (bodyIterator.hasNext) {
  val line = bodyIterator.next()

  println(s"***** Line is: $line")

  val citationOne = """(\[[0-9]+\])(,\[[0-9]+\])*""".r
  val citationTwo = """(\([A-Z, -.]+[0-9]{4}\))""".r
  /* search the line for citations */

  val matchesOne: Iterator[Match] = citationOne.findAllMatchIn(line)
  val matchesTwo: Iterator[Match] = citationTwo.findAllMatchIn(line)

  println("matchesOne found: " + matchesOne.length)
  println("matchesTwo found: " + matchesTwo.length)
  for (m <- matchesOne) {println(s"match is $m")}

  println("matchesOne Matches: ")
  matchesOne.foreach(x => println("1: " + x.matched))
  //while (matchesOne.hasNext) {
  //  println("matchesOne: " + matchesOne.next())
  //}

  while (matchesTwo.hasNext) {
    println("matchesTwo: " + matchesTwo.next().matched)
  }

  println("\n\n")
}

Output:

import scala.util.matching.Regex.Match
import scala.xml._

xmldata: scala.xml.Elem = <document>
<content>
 <headers>
 </headers>
 <body>
 Foo [1], then another foo[2]; then lots of other things here
 And add a few other lines[2][3] of test data[3][5] (Foo 1234)
 </body>
 </content>
</document>

bodyIterator: Iterator[String] = non-empty iterator

***** Line is:

matchesOne found: 0
matchesTwo found: 0
matchesOne Matches:



***** Line is:  Foo [1], then another foo[2]; then lots of other things here

matchesOne found: 2
matchesTwo found: 0
matchesOne Matches:



***** Line is:  And add a few other lines[2][3] of test data[3][5] (Foo 1234)

matchesOne found: 4
matchesTwo found: 0
matchesOne Matches:



***** Line is:
matchesOne found: 0
matchesTwo found: 0
+0

All set now. Thanks, everyone!! – 2014-09-23 20:30:27

Answers

5

Calling Iterator.length exhausts the Iterator, as the documentation says:

- Reuse: After calling this method, one should discard the iterator it was called on.
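To make the effect concrete, here is a minimal sketch of the same pattern as in the question, reduced to one line of input. The object and method names (`IteratorLengthDemo`, `lengthThenRemaining`) and the simplified regex are illustrative assumptions, not part of the original code. Since the documentation treats any use of the iterator after `length` as undefined, the second value should be read as "whatever is left", which in the question's run was nothing.

```scala
object IteratorLengthDemo {
  // Hypothetical helper: returns the count reported by `length` and
  // whatever can still be read from the same iterator afterwards.
  def lengthThenRemaining(line: String): (Int, List[String]) = {
    val citation = """\[[0-9]+\]""".r
    val matches = citation.findAllMatchIn(line)

    // `length` has to walk the iterator to its end in order to count.
    val count = matches.length

    // Reusing the iterator here repeats the mistake from the question.
    val remaining = matches.map(_.matched).toList

    (count, remaining)
  }

  def main(args: Array[String]): Unit = {
    val (count, remaining) = lengthThenRemaining("Foo [1], then another foo[2]")
    println(s"count = $count, remaining = $remaining")
  }
}
```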

+0

Ah, I missed that part of the documentation. Given that it's an iterator, that makes complete sense. Thanks!! – 2014-09-23 20:29:03

+0

Glad to help :) – 2014-09-23 20:29:42

3

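Computing the length of an iterator consumes it, because it has to step through every element to find out how many there are. So once you know the length, the iterator is empty!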
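If you really want to stay with iterators, one option is `Iterator.duplicate`, which returns two iterators over the same elements, so one copy can be spent on counting. This is only a sketch: the names `DuplicateDemo` and `countAndMatches` and the simplified regex are made up for illustration, and note that `duplicate` buffers elements internally, so it is not cheaper than building a collection.

```scala
object DuplicateDemo {
  // Hypothetical helper: count the matches and still return their text.
  def countAndMatches(line: String): (Int, List[String]) = {
    val citation = """\[[0-9]+\]""".r

    // Two independent iterators over the same matches; the original
    // iterator must not be used again after calling duplicate.
    val (forCounting, forReading) = citation.findAllMatchIn(line).duplicate

    val count = forCounting.length               // consumes only this copy
    val found = forReading.map(_.matched).toList // the other copy is still full

    (count, found)
  }

  def main(args: Array[String]): Unit = {
    val (count, found) = countAndMatches("Foo [1], then another foo[2]")
    println(s"count = $count, found = $found")
  }
}
```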

1

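By the time you have the length of your iterator, you are already at its end, so you cannot get any data out of it afterwards. In your case, one solution is to convert it to something like a List.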

val matchesOne: List[Match] = citationOne.findAllMatchIn(line).toList
val matchesTwo: List[Match] = citationTwo.findAllMatchIn(line).toList

Then you get the expected output, for example:

scala> val line = "Foo [1], then another foo[2]; then lots of other things here" 
line: String = Foo [1], then another foo[2]; then lots of other things here 

scala> val result = citationOne.findAllMatchIn(line).toList 
result: List[scala.util.matching.Regex.Match] = List([1], [2]) 

scala> val matchesOne = citationOne.findAllMatchIn(line).toList 
matchesOne: List[scala.util.matching.Regex.Match] = List([1], [2]) 

scala> println("matchesOne found: " + matchesOne.length) 
matchesOne found: 2 

scala> for (m <- matchesOne) {println(s"match is $m")} 
match is [1] 
match is [2] 
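Applied to the loop from the question, the same idea might look like the sketch below. It is a simplified stand-in rather than the asker's exact program: the body text is a plain string instead of the XML literal (to avoid the scala-xml dependency), only the first regex is used, and the names `CitationScan` and `citations` are hypothetical.

```scala
import scala.util.matching.Regex

object CitationScan {
  // Same pattern as citationOne in the question.
  private val citationOne: Regex = """(\[[0-9]+\])(,\[[0-9]+\])*""".r

  // Materialize the matches once; a List can be measured and
  // traversed as many times as needed.
  def citations(line: String): List[String] =
    citationOne.findAllMatchIn(line).map(_.matched).toList

  def main(args: Array[String]): Unit = {
    val body =
      """Foo [1], then another foo[2]; then lots of other things here
        |And add a few other lines[2][3] of test data[3][5] (Foo 1234)""".stripMargin

    for (line <- body.split("\n")) {
      val found = citations(line)
      println(s"matchesOne found: ${found.length}")
      found.foreach(m => println(s"match is $m"))
    }
  }
}
```

Hoisting the regex out of the loop is optional, but it avoids recompiling the pattern for every line.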