How to merge adjacent lines with scalaz-stream without losing the splitting line

Suppose I have an input file myInput.txt that looks as follows:

~~~ text1 
bla bla 
some more text 
~~~ text2 
lorem ipsum 
~~~ othertext 
the wikipedia 
entry is not 
up to date

That is, the file contains several documents, each introduced by a line starting with ~~~. The desired output is:

text1: bla bla some more text 
text2: lorem ipsum 
othertext: the wikipedia entry is not up to date

How do I go about this? The following seems rather unnatural, and on top of that I lose the titles:

import scalaz.concurrent.Task
import scalaz.stream._

val converter: Task[Unit] =
  io.linesR("myInput.txt")
    // split drops the delimiter line, and the title with it
    .split(line => line.startsWith("~~~"))
    .intersperse(Vector("\nNew document: "))
    .map(vec => vec.mkString(" "))
    .pipe(text.utf8Encode)
    .to(io.fileChunkW("flawedOutput.txt"))
    .run

converter.run

Source

2015-05-05 mitchus

This isn't a real answer, but I have [a small scalaz-stream splitting library](https://github.com/travisbrown/syzygist) that [makes this kind of thing pretty easy](https://gist.github.com/travisbrown/42f28afbc0bc4c5ff28a).

@TravisBrown That looks interesting. – mitchus

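The following works correctly, but it is surprisingly slow when I run it on anything larger than a toy example (about 5 minutes to process 70 MB). Is that because I am creating Processes all over the place? Also, it seems to use only one core.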

val converter2: Task[Unit] = {
  val docSep = "~~~"
  io.linesR("myInput.txt")
    // turn a separator line "~~~ title" into two elements: the bare separator and the title
    .flatMap { line =>
      val words = line.split(" ")
      if (words.length == 0 || words(0) != docSep) Process(line)
      else Process(docSep, words.tail.mkString(" "))
    }
    .split(_ == docSep)
    // drop empty chunks, such as the one emitted before the first separator
    .filter(_ != Vector())
    .map(lines => lines.head + ": " + lines.tail.mkString(" "))
    .intersperse("\n")
    .pipe(text.utf8Encode)
    .to(io.fileChunkW("correctButSlowOutput.txt"))
    .run
}

Source

2015-05-06 13:12:09 mitchus
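
For comparison, the grouping rule itself (a line starting with ~~~ opens a new document and supplies its title, and every following line is appended to the body) can be sketched with a plain Scala Iterator. This is not a scalaz-stream solution and says nothing about scalaz-stream performance; it is only a minimal sketch of the logic. The names MergeDocs and mergeDocs are hypothetical, and trimming each line and skipping blank lines are assumptions that the question does not state.

```scala
object MergeDocs {
  // Separator prefix, as used in the question.
  val DocSep = "~~~"

  // Lazily turns a stream of lines into one "title: body" string per document.
  // Assumption: lines are trimmed and blank lines are skipped.
  def mergeDocs(lines: Iterator[String]): Iterator[String] = new Iterator[String] {
    private val in = lines.map(_.trim).filter(_.nonEmpty).buffered

    def hasNext: Boolean = in.hasNext

    def next(): String = {
      val first = in.next()
      val body = Vector.newBuilder[String]
      // A separator line carries the title; text before any separator gets an empty title.
      val title =
        if (first.startsWith(DocSep)) first.stripPrefix(DocSep).trim
        else { body += first; "" }
      // Consume body lines up to, but not including, the next separator line.
      while (in.hasNext && !in.head.startsWith(DocSep)) body += in.next()
      s"$title: ${body.result().mkString(" ")}"
    }
  }

  def main(args: Array[String]): Unit = {
    // In-memory sample; a file could be fed in the same way via
    // scala.io.Source.fromFile("myInput.txt").getLines().
    val sample = Iterator("~~~ text1", "bla bla", "some more text", "~~~ text2", "lorem ipsum")
    mergeDocs(sample).foreach(println)
  }
}
```

Because the title is taken from the separator line before that line is discarded, nothing is lost at the split point, and only one document is held in memory at a time.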
