2014-04-22 47 views
-2

I am a Haskell beginner. How can I parse a Yahoo historical CSV file with attoparsec into an open array, a high array, and so on?

module CsvParser (
     Quote (..) 
    , csvFile 
    , quote 
    ) where 
import System.IO 
import Data.Attoparsec.Text 
import Data.Attoparsec.Combinator 
import Data.Text (Text, unpack) 
import Data.Time 
import System.Locale 
import Data.Maybe 

data Quote = Quote { 
     qTime  :: LocalTime, 
     qAsk  :: Double, 
     qBid  :: Double, 
     qAskVolume :: Double, 
     qBidVolume :: Double 
    } deriving (Show, Eq) 

csvFile :: Parser [Quote] 
csvFile = do 
    q <- many1 quote 
    endOfInput 
    return q 

quote :: Parser Quote 
quote = do 
    time  <- qtime 
    qcomma 
    ask   <- double 
    qcomma 
    bid   <- double 
    qcomma 
    askVolume <- double 
    qcomma 
    bidVolume <- double 
    endOfLine 
    return $ Quote time ask bid askVolume bidVolume 

qcomma :: Parser() 
qcomma = do 
    char ',' 
    return() 

qtime :: Parser LocalTime 
qtime = do 
    tstring  <- takeTill (\x -> x == ',') 
    let time = parseTime defaultTimeLocale "%d.%m.%Y %H:%M:%S%Q" (unpack tstring) 
    return $ fromMaybe (LocalTime (fromGregorian 0001 01 01) (TimeOfDay 00 00 00)) time 

--testString :: Text 
--testString = "01.10.2012 00:00:00.741,1.28082,1.28077,1500000.00,1500000.00\n" 

quoteParser = parseOnly quote 

main = do 
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode 
    contents <- hGetContents handle 
    let allLines = lines contents 
    map (\line -> quoteParser line) allLines 
    --putStr contents 
    hClose handle 

Error message:

testhaskell.hs:89:5: 
    Couldn't match type `[]' with `IO' 
    Expected type: IO (Either String Quote) 
     Actual type: [Either String Quote] 
    In the return type of a call of `map' 
    In a stmt of a 'do' block: 
     map (\ line -> quoteParser line) allLines 
    In the expression: 
     do { handle <- openFile 
         "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode; 

      contents <- hGetContents handle; 
      let allLines = lines contents; 
      map (\ line -> quoteParser line) allLines; 
      .... } 

testhaskell.hs:89:37: 
    Couldn't match type `[Char]' with `Text' 
    Expected type: [Text] 
     Actual type: [String] 
    In the second argument of `map', namely `allLines' 
    In a stmt of a 'do' block: 
     map (\ line -> quoteParser line) allLines 
    In the expression: 
     do { handle <- openFile 
         "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode; 

      contents <- hGetContents handle; 
      let allLines = lines contents; 
      map (\ line -> quoteParser line) allLines; 
      .... } 
+0

Possible duplicate of [How to parse yahoo csv with parsec](http://stackoverflow.com/questions/23211685/how-to-parse-yahoo-csv-with-parsec) – Sibi

+0

This question is about a different library, attoparsec. Even after reading the examples I find it hard to use. Is there a simple example? – Moyes

+1

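As Michael Snoyman suggested in that answer, you should use 'csv-conduit'. 'csv-conduit' uses 'attoparsec' internally to do the parsing. If you are new to Haskell, I suggest you start with the basics and only then move on to these libraries. – Sibi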

Answers

0

You can use the attoparsec-csv package, or look at its source code for ideas on how to write your own parser.

The code would look something like this:

import Data.Text (Text) 
import qualified Data.Text.IO as T 
import Text.ParseCSV 

main :: IO () 
main = do 
    txt <- T.readFile "file.csv" 
    case parseCSV txt of 
        Left err -> error err 
        Right csv -> mapM_ (print . mkQuote) csv 

-- Quote is the record type defined in the question. 
mkQuote :: [Text] -> Quote 
mkQuote = error "Not implemented yet" 
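One way the `mkQuote` stub could be filled in is sketched below. It is only an illustration under assumptions: `SimpleQuote`, `field`, and `mkSimpleQuote` are hypothetical names, the record is a cut-down version of the question's `Quote`, the timestamp is kept as raw `Text`, and the function returns `Either` instead of a bare value so that a malformed row does not crash the program. It relies only on `Data.Text.Read.double` from the text package.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Read as R

-- Hypothetical, simplified record; the timestamp stays as raw Text.
data SimpleQuote = SimpleQuote
    { sqTime :: Text
    , sqAsk  :: Double
    , sqBid  :: Double
    } deriving (Show, Eq)

-- Parse one CSV field as a Double, rejecting trailing junk.
field :: Text -> Either String Double
field t = case R.double t of
    Right (x, rest)
        | T.null rest -> Right x
        | otherwise   -> Left ("unexpected trailing input: " ++ T.unpack rest)
    Left err -> Left err

-- Build a record from one row; extra columns are ignored.
mkSimpleQuote :: [Text] -> Either String SimpleQuote
mkSimpleQuote (time : ask : bid : _) =
    SimpleQuote time <$> field ask <*> field bid
mkSimpleQuote row =
    Left ("expected at least 3 fields, got " ++ show (length row))

main :: IO ()
main = print (mkSimpleQuote (map T.pack ["01.10.2012 00:00:00.741", "1.28082", "1.28077"]))
```

Returning `Either String` keeps the row conversion pure and lets the caller decide whether to skip bad rows or stop.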
2

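The error has nothing to do with parsec or attoparsec. The line the error message points at is not an IO action, so using it as a statement in an IO do block is a type error: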

main = do 
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode 
    contents <- hGetContents handle 
    let allLines = lines contents 
    map (\line -> quoteParser line) allLines -- <== This is not an IO action 
    --putStr contents 
    hClose handle 

You are ignoring the result of the map call. You should bind it to a variable with let, just as you did with the result of lines.
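A minimal sketch of that fix follows. It assumes a simplified stand-in parser (`numbers`, a hypothetical name) instead of the question's `quote`, and a literal `Text` value instead of the file contents; `parseLines` is also a name introduced here for illustration. The point is that the pure `map` result is bound with `let`, and only `mapM_ print` runs in IO.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Attoparsec.Text (Parser, char, double, endOfInput, parseOnly, sepBy1)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical stand-in for the question's `quote` parser:
-- one line of comma-separated numbers.
numbers :: Parser [Double]
numbers = double `sepBy1` char ',' <* endOfInput

-- Pure function: parse every line; no IO is involved.
parseLines :: Text -> [Either String [Double]]
parseLines = map (parseOnly numbers) . T.lines

main :: IO ()
main = do
    -- Stands in for the file contents read from disk.
    let contents = "1.28082,1.28077\n1.5,oops\n"
    -- Bind the pure result with let instead of using map as a statement.
    let results = parseLines contents
    -- mapM_ runs one IO action (print) per parse result.
    mapM_ print results
```

Each element of `results` is either a `Left` with a parse error or a `Right` with the parsed numbers, so bad lines can be reported without aborting the whole run.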

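The second error is because you are trying to use a String where a Text is expected. These are different types, even though both represent ordered collections of characters (they also have different internal representations). You can convert between the two types with pack and unpack: http://hackage.haskell.org/package/text/docs/Data-Text.html#g:5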
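The conversion can be sketched as follows, assuming the lines have already been read as `String` values the way the question's `main` does; `toTextLines` is a hypothetical helper name. `T.pack` and `T.unpack` are the functions from `Data.Text`.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical helper: convert String lines into the Text values
-- that an attoparsec Text parser expects.
toTextLines :: String -> [Text]
toTextLines = map T.pack . lines

main :: IO ()
main = do
    let s = "01.10.2012 00:00:00.741,1.28082" :: String
        t = T.pack s              -- String -> Text
    putStrLn (T.unpack t)         -- Text -> String
    print (length (toTextLines "a,b\nc,d\n"))
```

An alternative that avoids the conversion altogether is to read the file as `Text` in the first place with `Data.Text.IO.readFile`, as the other answer does.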

Also, you should always give main an explicit type signature: main :: IO (). Leaving it out can sometimes lead to subtle problems.

But as others have said, you should use a CSV parser package.