2012-07-07

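I am using http://hackage.haskell.org/package/sqlite-0.5.2.2 to bind to a SQLite database. The *.db file contains UTF-8 encoded text, which I have confirmed both in a text editor and with the sqlite CLI tool. The Unicode text read from the SQLite database, however, appears to be corrupted.

When I connect to the database and retrieve the data, the text content comes out corrupted. A simple test follows: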

import qualified Database.SQLite as SQL
import Control.Applicative ((<$>))
import System.IO

-- Query all messages and write each message body to test.txt.
buildSkypeMessages dbh =
    (go <$> (SQL.execStatement dbh "select chatname,author,timestamp,body_xml from messages order by chatname, timestamp")) >>=
    writeIt
  where
    -- Each entry is [body, chatname, author], so c is bound to the body.
    writeIt content = withFile "test.txt" WriteMode (\handle -> mapM_ (\(c:a:t:[]) -> hPutStrLn handle c) content)
    go (Left msg) = fail msg
    go (Right rows) = map f $ concat rows
      where
        -- Split off the leading columns and return the remaining ones.
        f' (("chatname",SQL.Text chatname):
            ("author",SQL.Text author):
            ("timestamp",SQL.Int timestamp):
            r) = ([chatname, author], r)
        -- Prepend the message body (empty if NULL) to the partial entry.
        f xs = let (partEntry, (item:_)) = f' xs
               in case item of
                    ("body_xml",SQL.Text v) -> v:partEntry
                    ("body_xml",SQL.Null) -> "":partEntry
        -- Unused in this test.
        escape (_,SQL.Text v) = v
        escape (_,SQL.Null) = ""
        escape (_,SQL.Int v) = show v

What could be wrong here? Am I missing something about SQLite, or about Haskell I/O and encodings?


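One place this could go wrong is when writing the file: GHC picks the default encoding for that operation from the current locale. You can test whether this is the problem by calling [hSetEncoding](http://hackage.haskell.org/packages/archive/base/latest/doc/html/System-IO.html#v:hSetEncoding). – 2012-07-08 06:11:32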
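To make the suggestion above concrete, here is a minimal sketch of forcing a handle to UTF-8 instead of relying on the locale default. The helper names `writeLinesUtf8` and `readFileUtf8` and any file name passed to them are hypothetical and not part of the original post; only `hSetEncoding` and `utf8` come from `System.IO`.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: write lines with the handle encoding forced to UTF-8,
-- regardless of the current locale.
writeLinesUtf8 :: FilePath -> [String] -> IO ()
writeLinesUtf8 path ls =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    mapM_ (hPutStrLn h) ls

-- Hypothetical helper: read a file back, decoding it explicitly as UTF-8.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    s <- hGetContents h
    _ <- evaluate (length s)  -- force the lazy read before the handle closes
    return s
```

If the output looks the same with and without the explicit encoding, the locale default is probably not the cause.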


@DanielWagner My current locale is en_US.UTF-8, so that should not be it. The data in the text file looks as if it had been encoded as UTF-8 twice. – jdevelop 2012-07-08 06:34:33
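The double-encoding symptom can be reproduced in pure code. The sketch below assumes (this is not confirmed by the thread, only consistent with it) that the binding returns a `String` in which every `Char` holds one raw UTF-8 byte; writing such a string through a UTF-8 handle then encodes each byte a second time. `utf8Bytes` and `encodeOnce` are hypothetical names introduced for illustration.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Encode one code point as UTF-8, keeping every byte as a Char in 0..255.
utf8Bytes :: Char -> String
utf8Bytes c
  | n < 0x80    = [chr n]
  | n < 0x800   = map chr [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = map chr [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = map chr [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                          , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits.
    cont x = 0x80 .|. (x .&. 0x3F)

-- What a UTF-8 text handle does to a whole string.
encodeOnce :: String -> String
encodeOnce = concatMap utf8Bytes

-- encodeOnce "\233"              == "\195\169"          (correct UTF-8 for U+00E9)
-- encodeOnce (encodeOnce "\233") == "\195\131\194\169"  (double-encoded)
```

Under that assumption, a two-byte character in the database ends up as four bytes in the output file, which matches the description of text that looks double-encoded.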


@DanielWagner Setting binary mode helped. Thanks! – jdevelop 2012-07-08 07:04:34

Answer

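Actually, the problem has nothing to do with the SQLite bindings; it comes from how strings are handled in Haskell. What solved it was calling hSetBinaryMode on the handle before writing the data to it: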

writeIt content = withFile "test.txt" WriteMode (\handle -> hSetBinaryMode handle True >> mapM_ (\(c:a:t:[]) -> hPutStrLn handle c) content)
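A small self-contained check of why binary mode helps, assuming the strings coming from the database carry one raw UTF-8 byte per `Char` (an assumption consistent with the fix, not something the thread states). In binary mode each `Char` is written as a single byte with no re-encoding, so the file ends up as valid UTF-8. `writeRawBytes` and `readUtf8` are hypothetical helper names.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: write a string whose Chars are raw bytes (0..255)
-- without applying any text encoding.
writeRawBytes :: FilePath -> String -> IO ()
writeRawBytes path s =
  withFile path WriteMode $ \h -> do
    hSetBinaryMode h True
    hPutStr h s

-- Hypothetical helper: read the file back, decoding it as UTF-8 text.
readUtf8 :: FilePath -> IO String
readUtf8 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    s <- hGetContents h
    _ <- evaluate (length s)  -- force the lazy read before the handle closes
    return s
```

For example, writing `"\195\169"` with `writeRawBytes` and reading it back with `readUtf8` yields the single character `'\233'`. An alternative design would be to decode the byte-valued string into proper Unicode first and keep the handle in text mode, which avoids depending on binary mode for every write.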