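Haskell parser with spell checking
To learn more Haskell (monads in particular), I am trying to build a spell checker. My goal is to be able to walk through a LaTeX document and act on the words that are not in a dictionary list.
I have already written the parser (String to AST) and pasted the code below. It basically returns the LaTeX source split into its relevant pieces (text, formulas, commands, and so on). I would like to know how to build a program that, for every word not found in the list, asks the user which word to replace it with.
(All we really care about in the LaTeX is that some parts of the source are text and must be spell-checked, while other parts are formulas and not plain English.)
Let me make this clearer with an example of the desired behavior (to keep things simple, a formula is whatever sits between dollar signs: $ HERE IS THE FORMULA $).
Source: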
This is my frst file and here
we have a formula: $\forall x \quad x$
Desired behavior:
In file 'first.tex' at line 1: 'frst' unknown
1 This is my **frst** file and here
2 we have a formula: $\forall x \quad x$
Action [Add word to dictionary/Change word]?
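The report above needs only the file name, the raw source, the line number and the offending word. A minimal sketch of how it could be rendered, assuming hypothetical names (`Misspelling`, `highlight`, `report`) that are not part of the parser code in this question:

```haskell
import Data.List (isPrefixOf)

-- Hypothetical record: an unknown word and the 1-based line it was found on.
data Misspelling = Misspelling
  { msLine :: Int
  , msWord :: String
  } deriving (Eq, Show)

-- Wrap the first occurrence of the word in ** markers.
-- The line is returned unchanged if the word does not occur in it.
-- Simplification: this is a plain substring match, not a whole-word match.
highlight :: String -> String -> String
highlight w = go
  where
    go [] = []
    go s@(c : cs)
      | w `isPrefixOf` s = "**" ++ w ++ "**" ++ drop (length w) s
      | otherwise        = c : go cs

-- Build the report: a header, every source line prefixed with its number
-- (the offending line highlighted), and the action prompt.
report :: FilePath -> String -> Misspelling -> String
report file src (Misspelling n w) =
  unlines $
    ("In file '" ++ file ++ "' at line " ++ show n ++ ": '" ++ w ++ "' unknown")
      : [ show i ++ " " ++ (if i == n then highlight w l else l)
        | (i, l) <- zip [1 :: Int ..] (lines src)
        ]
      ++ ["Action [Add word to dictionary/Change word]?"]
```

In GHCi, `putStr (report "first.tex" src (Misspelling 1 "frst"))` with `src` bound to the two source lines prints the layout shown above. Keeping the renderer pure leaves the actual prompting to a separate IO step.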
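The main problem is that once the file has been parsed I am left with an AST that no longer carries any reference to line numbers, so I cannot produce output like the example above.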
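One possible way around this, sketched here rather than taken from the code below, is to record the position while parsing. Parsec exposes the current position through `getPosition`, which returns a `SourcePos` that `sourceLine` and `sourceColumn` can inspect. The `Located` wrapper and the `located`, `word`, `separators`, `locatedWords` and `wordLines` names are hypothetical, and the word parser is deliberately much simpler than a full LaTeX parser:

```haskell
import Data.Char (isAlpha)
import Text.Parsec
  ( ParseError, SourcePos, eof, getPosition, letter, many, many1
  , parse, satisfy, skipMany, sourceLine )
import Text.Parsec.String (Parser)

-- Hypothetical wrapper: a value plus the position where it started.
data Located a = Located SourcePos a
  deriving (Eq, Show)

-- Run a parser and remember the position at which it started.
located :: Parser a -> Parser (Located a)
located p = Located <$> getPosition <*> p

-- Anything that is not a letter counts as a separator in this sketch.
separators :: Parser ()
separators = skipMany (satisfy (not . isAlpha))

-- A word is a run of letters, followed by optional separators.
word :: Parser (Located String)
word = located (many1 letter) <* separators

-- All words of the input, each tagged with its starting position.
locatedWords :: Parser [Located String]
locatedWords = separators *> many word <* eof

-- Convenience: pair every word with its 1-based line number.
wordLines :: String -> Either ParseError [(Int, String)]
wordLines src = map toPair <$> parse locatedWords "<input>" src
  where
    toPair (Located pos w) = (sourceLine pos, w)
```

The same `located` combinator could wrap the `text` parser, so that a `Text` node carries a `SourcePos` next to its string. That would mean changing the `TexTerm` type, which is an assumption about how the AST may be extended.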
Parser code:
import System.Environment
import Text.Parsec (ParseError)
import Text.Parsec.String (Parser, parseFromFile)
import Text.Parsec.String.Parsec (try)
import Text.Parsec.String.Char (oneOf, char, digit, string, letter, satisfy, noneOf, anyChar)
import Text.Parsec.String.Combinator (many1, choice, chainl1, between, count, option, optionMaybe, optional, manyTill, eof, lookAhead)
import Control.Applicative ((<$>), (<*>), (<*), (*>), (<|>), many, (<$))
import Control.Monad (void, ap, mzero)
import Data.Char (isLetter, isDigit)
import FunctionsAndTypesForParsing
data TexFile = Items [TexTerm]
  deriving (Eq, Show)
data TexTerm = Comment String
             | Formula String
             | Command String [TexFile]
             | Text String
             | Block TexFile
  deriving (Eq, Show)
-- We get the AST as output
texFile :: Parser TexFile
texFile = Items <$> (many texTerm) <* (optional (try $ eof))
texTerm :: Parser TexTerm
texTerm = lexeme $ (try comment <|> text <|> formula <|> command <|> block)
whitespace :: Parser()
whitespace = void $ try $ oneOf " \n\t"
lexeme :: Parser a -> Parser a
lexeme p = p <* (many $ whitespace)
comment :: Parser TexTerm
comment = Comment <$> between (string "%") (string "\n") (many $ noneOf "\n")
formula :: Parser TexTerm
formula = Formula <$> (try singledollar <|> doubledollar <|> equation <|> align)
  where
    singledollar = between (string "$") (string "$") (many1 $ noneOf "$")
    doubledollar = between (string "$$") (string "$$") (many1 $ noneOf "$$")
    equation = try $ between (try $ string "\\begin{equation}") (string "\\end{equation}") (manyTill anyChar (lookAhead $ try $ string "\\end{equation}"))
    align = try $ between (try $ string "\\begin{align*}") (string "\\end{align*}") (manyTill anyChar (lookAhead $ try $ string "\\end{align*}"))
command :: Parser TexTerm
command = Command <$> com <*> (many arg)
  where
    com = char '\\' *> (manyTill (try letter <|> oneOf "*") (lookAhead $ try $ oneOf "[{ \\\n\t"))
    arg = (try (between (string "{") (string "}") texFile)
           <|> (between (string "[") (string "]") texFile)
          )
text :: Parser TexTerm
text = Text <$> many1 textualchars
  where
    textualchars = try letter <|> digit <|> oneOf " \n\t\r,.*:;-<>#@()`_!'?"
block :: Parser TexTerm
block = Block <$> between (string "{") (string "}") texFile
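Given this AST, the correction step can be written as a traversal that hands every word of every `Text` node to an effectful function and rebuilds the tree from the answers. The sketch below is one way to do that and makes several assumptions: `chunks`, `fixWords` and `askUser` are hypothetical names, command arguments and blocks are traversed as if they held prose, and the dictionary is a plain list. The data types are repeated only to keep the example self-contained.

```haskell
import Data.Char (isLetter)

-- AST copied from the parser code in the question.
data TexFile = Items [TexTerm]
  deriving (Eq, Show)

data TexTerm
  = Comment String
  | Formula String
  | Command String [TexFile]
  | Text String
  | Block TexFile
  deriving (Eq, Show)

-- Split a string into alternating runs of letters and non-letters.
-- Concatenating the result gives back the original string.
chunks :: String -> [String]
chunks [] = []
chunks s@(c : _) = run : chunks rest
  where
    (run, rest) = span ((== isLetter c) . isLetter) s

-- Apply an effectful fix to every word of every Text node.
-- Comments and formulas are left untouched. Command arguments and blocks
-- are traversed (assumption: \section{...} holds prose, but \label{...}
-- does not, so a real checker would need a list of commands to skip).
fixWords :: Monad m => (String -> m String) -> TexFile -> m TexFile
fixWords fix (Items ts) = Items <$> mapM term ts
  where
    term (Text s)         = Text . concat <$> mapM chunk (chunks s)
    term (Command n args) = Command n <$> mapM (fixWords fix) args
    term (Block b)        = Block <$> fixWords fix b
    term other            = pure other
    chunk w
      | all isLetter w = fix w   -- a word: let the caller decide
      | otherwise      = pure w  -- whitespace and punctuation: keep as is

-- Hypothetical interactive fix: known words pass through, unknown ones are
-- replaced by whatever the user types (empty input keeps the word).
askUser :: [String] -> String -> IO String
askUser dict w
  | w `elem` dict = pure w
  | otherwise = do
      putStrLn ("'" ++ w ++ "' unknown. Replacement (empty to keep)?")
      r <- getLine
      pure (if null r then w else r)
```

With these definitions, `fixWords (askUser dict) ast` runs in `IO`. Because `fixWords` works in any monad, the "Add word to dictionary" action could be supported by running it in a state monad over `IO` that holds the dictionary, and the traversal itself would not need to change.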
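Accepted, because it fits better with what I have already done. When I have more time I will also look into megaparsec, as the other answer suggests. – trenta3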