2014-04-07 94 views

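Haskell does not evaluate a block

I'm writing a simple sitemap.xml crawler. The code is below. My question is why the code at the end of main doesn't print anything. I suspect it's because of Haskell's laziness, but I don't know how to deal with it here: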

import Network.HTTP.Conduit 
import qualified Data.ByteString.Lazy as L 
import Text.XML.Light 
import Control.Monad.Trans (liftIO) 
import Control.Monad 
import Data.String.Utils 
import Control.Exception 

download :: Manager -> Request -> IO (Either HttpException L.ByteString) 
download manager req = do
    try $
        fmap responseBody (httpLbs req manager)

downloadUrl :: Manager -> String -> IO (Either HttpException L.ByteString) 
downloadUrl manager url = do 
    request <- parseUrl url 
    download manager request 

getPages :: Manager -> [String] -> IO [Either HttpException L.ByteString] 
getPages manager urls = 
    sequence $ map (downloadUrl manager) urls 

main = withManager $ \ manager -> do 
    -- I know simpleHttp is bad here 
    mapSource <- liftIO $ simpleHttp "http://example.com/sitemap.xml" 

    let elements = (parseXMLDoc mapSource) >>= Just . findElements (mapElement "loc")
        Just urls = liftM (map $ (replace "/#!" "?_escaped_fragment_=") . strContent) elements
        mapElement name = QName name (Just "http://www.sitemaps.org/schemas/sitemap/0.9") Nothing

    return $
        getPages manager urls >>= \ pages -> do
            print "evaluate me!"
            sequence $ map print pages
+0

Why are you wrapping 'getPages' in 'return'? It seems unnecessary. – arrowd

+0

@arrowdodger Without the return I get a compile error: 'Couldn't match type 'IO' with 'Control.Monad.Trans.Resource.Internal.ResourceT m' Expected type: Control.Monad.Trans.Resource.Internal.ResourceT m [Either HttpException L.ByteString] Actual type: IO [Either HttpException L.ByteString]' –

Answers

2

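You're running into the same issue I described here, at least as far as incorrect code that really ought to give a type error is concerned: Why is the type of "Main.main", "IO ()" and not "IO a"?. This is why you should always give main an explicit type signature, main :: IO ().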
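To make the point concrete, here is a minimal sketch using only base. The names bump, wrapped, and direct are hypothetical stand-ins and not part of the question's code; wrapped mirrors the `return $ ...` at the end of the question's main. An action wrapped in return is only handed back as a value, so it never runs, and this has nothing to do with laziness.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)

-- Hypothetical stand-in for the block at the end of the question's main.
bump :: IORef Int -> IO ()
bump ref = modifyIORef ref (+ 1)

-- Mirrors the question: the action is wrapped in return, so the caller
-- only receives it as a value of type IO (); it is never executed.
wrapped :: IORef Int -> IO (IO ())
wrapped ref = return $ bump ref

-- The action is sequenced directly, so it runs.
direct :: IORef Int -> IO ()
direct ref = bump ref

main :: IO ()
main = do
    ref <- newIORef 0
    _ <- wrapped ref            -- the inner action is discarded unexecuted
    readIORef ref >>= print     -- prints 0
    direct ref
    readIORef ref >>= print     -- prints 1
```

If main itself were defined as `wrapped ref` under the signature main :: IO (), the compiler would reject it, which is the type error the signature is meant to surface.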

To fix the problem, you'll want to replace return with lift (see http://hackage.haskell.org/package/transformers/docs/Control-Monad-Trans-Class.html#v:lift), and replace sequence $ map ... with mapM_. mapM_ f is equivalent to sequence_ . map f.
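The shape of that fix can be sketched without http-conduit. In the sketch below, ReaderT from transformers stands in for ResourceT, and withEnv and fetchAll are hypothetical stand-ins for withManager and getPages; none of these names come from the question or from http-conduit, and no network access is involved. It only illustrates how lift runs an IO action inside the transformer block and how mapM_ replaces sequence $ map.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, runReaderT)

-- Hypothetical stand-in for withManager: runs a transformer block over IO.
-- ReaderT is used here only as a substitute for ResourceT.
withEnv :: (String -> ReaderT String IO a) -> IO a
withEnv body = runReaderT (body "manager") "manager"

-- Hypothetical stand-in for getPages: pretends to download each URL.
fetchAll :: [String] -> IO [Either String String]
fetchAll = mapM fetch
  where
    fetch url
        | null url  = return (Left "empty url")
        | otherwise = return (Right ("body of " ++ url))

main :: IO ()
main = withEnv $ \_manager -> do
    -- lift runs the IO action inside the transformer instead of
    -- returning it unexecuted.
    pages <- lift (fetchAll ["http://example.com/a", ""])
    -- mapM_ discards the results, so the block ends in (), matching
    -- the signature main :: IO ().
    lift (mapM_ print pages)
```

Because main carries an explicit IO () signature, writing return in place of the final lift would no longer compile silently.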

+0

Thanks, that helped! Now I know that explicit typing and modularity everywhere is the way to win. Here is the working version: https://gist.github.com/wishbear/10119638 –

2

Replace your last return with runResourceT (http://hackage.haskell.org/package/resourcet-1.1.1/docs/Control-Monad-Trans-Resource.html#v:runResourceT). As its type suggests, it turns a ResourceT into an IO action.

+0

But the compiler (and the documentation) say I need a ResourceT there, not IO (http://hackage.haskell.org/package/http-conduit-1.2.0/docs/Network-HTTP-Conduit.html#g:4) –

+0

Oh, now I see. You probably need 'safeFromIOBase': http://hackage.haskell.org/package/conduit-0.1.1.1/docs/Control-Monad-Trans-Resource.html#v:safeFromIOBase – arrowd