2012-01-30 28 views
17

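Replace/substitute with Haskell regex libraries

Is there a high-level API for doing search-and-replace with regexes in Haskell? In particular, I'm looking at the Text.Regex.TDFA or Text.Regex.Posix packages. I'd really like something of type: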

f :: Regex -> (ResultInfo -> m String) -> String -> m String 

So, for example, to replace "dog" with "cat" you could write

runIdentity . f "dog" (return . const "cat") -- :: String -> String 

or do more advanced things with the monad, like counting occurrences, etc.

The Haskell documentation for this is pretty lacking. Some low-level API notes are here.

Answers

4

I don't know of any existing function that provides this, but I think I'd end up using something like the AllMatches [] (MatchOffset, MatchLength) instance of RegexContext to simulate it:

{-# LANGUAGE FlexibleContexts #-}
import Control.Monad (foldM)
import Data.List (foldl')
import Text.Regex.Base -- RegexLike, match, getAllMatches, MatchOffset, MatchLength

replaceAll :: RegexLike r String => r -> (String -> String) -> String -> String
replaceAll re f s = start end
  where
    -- The annotation pins the list instance of AllMatches.
    (_, end, start) = foldl' go (0, s, id) (getAllMatches (match re s) :: [(MatchOffset, MatchLength)])
    go (ind, read, write) (off, len) =
      let (skip, start) = splitAt (off - ind) read
          -- Split the match off 'start'; the post originally had 'matched' here, which loops.
          (matched, remaining) = splitAt len start
      in (off + len, remaining, write . (skip ++) . (f matched ++))

replaceAllM :: (Monad m, RegexLike r String) => r -> (String -> m String) -> String -> m String
replaceAllM re f s = do
    let go (ind, read, write) (off, len) = do
          let (skip, start) = splitAt (off - ind) read
              (matched, remaining) = splitAt len start
          replacement <- f matched
          return (off + len, remaining, write . (skip ++) . (replacement ++))
    (_, end, start) <- foldM go (0, s, return) (getAllMatches (match re s) :: [(MatchOffset, MatchLength)])
    start end
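
To see the span-splicing idea from this answer in isolation, here is a minimal sketch that needs only base. It assumes the (offset, length) spans are already known, sorted and non-overlapping (the shape getAllMatches produces); `replaceSpansM`, `replaceSpans`, and `countingCat` are hypothetical names for illustration, not part of any regex package. Counting uses the writer-like Monad instance of pairs from base, which is one possible way to get the "count occurrences" behaviour the question asks about.

```haskell
import Data.Functor.Identity (Identity (..))
import Data.Monoid (Sum (..))

-- | Splice monadic replacements into a string at the given (offset, length)
-- spans. Spans are assumed to be sorted and non-overlapping.
replaceSpansM :: Monad m => [(Int, Int)] -> (String -> m String) -> String -> m String
replaceSpansM spans f = go 0 spans
  where
    -- 'pos' is the absolute offset of the first character of 'rest'.
    go _ [] rest = return rest
    go pos ((off, len) : more) rest = do
      let (skip, fromMatch) = splitAt (off - pos) rest
          (matched, remaining) = splitAt len fromMatch
      replacement <- f matched
      tailResult <- go (off + len) more remaining
      return (skip ++ replacement ++ tailResult)

-- | Pure variant, in the spirit of the question's "runIdentity . f".
replaceSpans :: [(Int, Int)] -> (String -> String) -> String -> String
replaceSpans spans f = runIdentity . replaceSpansM spans (Identity . f)

-- | Replace each match with "cat" and record one replacement in the
-- Sum component of the pair.
countingCat :: String -> (Sum Int, String)
countingCat _ = (Sum 1, "cat")

-- Hand-written spans for the two occurrences of "dog".
example :: (Sum Int, String)
example = replaceSpansM [(0, 3), (8, 3)] countingCat "dog and dog"
```

Loaded into GHCi, `example` evaluates to `(Sum {getSum = 2},"cat and cat")`. With a real regex library the span list would come from the matcher instead of being written by hand.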
28

How about subRegex in the Text.Regex package?

Prelude Text.Regex> :t subRegex 
subRegex :: Regex -> String -> String -> String 

Prelude Text.Regex> subRegex (mkRegex "foo") "foobar" "123" 
"123bar" 
1

Maybe this approach will work for you.

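import Data.Array (elems)
import Text.Regex.TDFA ((=~), MatchArray)

replaceAll :: String -> String -> String -> String
replaceAll regex new_str str =
    -- All (offset, length) spans reported by the matcher. elems also yields the
    -- spans of capture groups, so this assumes a regex without groups.
    let parts = concatMap elems (str =~ regex :: [MatchArray])
    -- Replace from the last match backwards so earlier offsets stay valid.
    in foldl (replace' new_str) str (reverse parts)
  where
    replace' :: [a] -> [a] -> (Int, Int) -> [a]
    replace' new list (shift, l) =
        let (pre, post) = splitAt shift list
        in pre ++ new ++ drop l post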
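
The reason for walking the matches in reverse is that each replacement can change the length of the string, which shifts every offset to its right. A small base-only sketch of that effect follows; `spliceAt` mirrors the helper above under a hypothetical name, and the spans are written by hand rather than produced by a regex.

```haskell
-- | Replace the span (offset, length) of a list with a new sublist.
spliceAt :: [a] -> [a] -> (Int, Int) -> [a]
spliceAt new list (off, len) =
  let (pre, post) = splitAt off list
  in pre ++ new ++ drop len post

-- Hand-written spans of the two occurrences of "dog" in "dog and dog".
spans :: [(Int, Int)]
spans = [(0, 3), (8, 3)]

-- Last match first: offsets of the remaining matches are still correct.
rightToLeft :: String
rightToLeft = foldl (spliceAt "kitten") "dog and dog" (reverse spans)

-- First match first: the second offset is stale after the string grows.
leftToRight :: String
leftToRight = foldl (spliceAt "kitten") "dog and dog" spans
```

Here `rightToLeft` is "kitten and kitten", while `leftToRight` comes out as "kitten akittendog" because the second span no longer lines up with the second "dog".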
3

Based on @rampion's answer, but with the typo fixed so it doesn't just <<loop>>:

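import Data.List (foldl')
import Text.Regex.TDFA -- Regex, match, getAllMatches, MatchOffset, MatchLength

replaceAll :: Regex -> (String -> String) -> String -> String
replaceAll re f s = start end
  where
    -- The annotation pins the list instance of AllMatches.
    (_, end, start) = foldl' go (0, s, id) (getAllMatches (match re s) :: [(MatchOffset, MatchLength)])
    go (ind, read, write) (off, len) =
      let (skip, start) = splitAt (off - ind) read
          (matched, remaining) = splitAt len start
      in (off + len, remaining, write . (skip ++) . (f matched ++))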
1

You can use replaceAll from the Data.Text.ICU.Replace module:

Prelude> :set -XOverloadedStrings 
Prelude> import Data.Text.ICU.Replace 
Prelude Data.Text.ICU.Replace> replaceAll "cat" "dog" "Bailey is a cat, and Max is a cat too." 
"Bailey is a dog, and Max is a dog too."