2014-02-24 35 views
2

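Is factoring an arrow out of arrow do notation a valid transformation?

I'm trying to get my head around HXT, a Haskell library for parsing XML that uses arrows. For my specific use case I'd rather not use deep, because there are cases where <outer_tag><payload_tag>value</payload_tag></outer_tag> is distinct from <outer_tag><inner_tag><payload_tag>value</payload_tag></inner_tag></outer_tag>, but I ran into some weirdness that feels like it should work and doesn't.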

I've managed to come up with a test case based on this example from the docs:

{-# LANGUAGE Arrows, NoMonomorphismRestriction #-} 
module Main where 

import Text.XML.HXT.Core 

data Guest = Guest { firstName, lastName :: String } 
    deriving (Show, Eq) 


getGuest = deep (isElem >>> hasName "guest") >>> 
    proc x -> do 
    fname <- getText <<< getChildren <<< deep (hasName "fname") -< x 
    lname <- getText <<< getChildren <<< deep (hasName "lname") -< x 
    returnA -< Guest { firstName = fname, lastName = lname } 

getGuest' = deep (isElem >>> hasName "guest") >>> 
    proc x -> do 
    fname <- getText <<< getChildren <<< (hasName "fname") <<< getChildren -< x 
    lname <- getText <<< getChildren <<< (hasName "lname") <<< getChildren -< x 
    returnA -< Guest { firstName = fname, lastName = lname } 

getGuest'' = deep (isElem >>> hasName "guest") >>> getChildren >>> 
    proc x -> do 
    fname <- getText <<< getChildren <<< (hasName "fname") -< x 
    lname <- getText <<< getChildren <<< (hasName "lname") -< x 
    returnA -< Guest { firstName = fname, lastName = lname } 


driver finalArrow = runX (readDocument [withValidate no] "guestbook.xml" >>> finalArrow) 

main = do 
    guests <- driver getGuest 
    print "getGuest" 
    print guests 

    guests' <- driver getGuest' 
    print "getGuest'" 
    print guests' 

    guests'' <- driver getGuest'' 
    print "getGuest''" 
    print guests'' 

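Between getGuest and getGuest' I expand deep into the correct number of getChildren calls. The resulting function still works. I then factor the getChildren out of the do block, but this causes the resulting function to fail. The output is: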

"getGuest" 
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}] 
"getGuest'" 
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}] 
"getGuest''" 
[] 

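I feel like this should be a valid transformation to perform, but my understanding of arrows is a little shaky. Am I doing something wrong? Is this a bug that I should report?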

I'm using HXT version 9.3.1.3 (the latest at the time of writing). ghc --version prints "The Glorious Glasgow Haskell Compilation System, version 7.4.1". I've also tested on a box with ghc 7.6.3 and got the same result.

The XML file has the following repeating structure (the full file can be found here):

<guestbook>
    <guest>
        <fname>John</fname>
        <lname>Steinbeck</lname>
    </guest>
    <guest>
        <fname>Henry</fname>
        <lname>Ford</lname>
    </guest>
    <guest>
        <fname>Andrew</fname>
        <lname>Carnegie</lname>
    </guest>
</guestbook>
+1

Could you post an example XML file to go with this? – bheklilr

+0

@bheklilr OK, done. –

Answers

3

In getGuest'' you have

... (hasName "fname") -< x 
... (hasName "lname") -< x 

That is, you are restricting to the case where x is "fname" and x is also "lname", which is not satisfied by any x!
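The point above can be reproduced without HXT. The following is a minimal sketch that assumes Kleisli [] from base as a stand-in for HXT's list arrows; Node, named and bothOnSameNode are hypothetical names introduced only for this illustration.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow

-- Toy stand-in for an XML element: a tag name and its text content.
data Node = Node { tagName :: String, tagText :: String }

-- Toy analogue of hasName: pass the node through, or produce nothing.
named :: String -> Kleisli [] Node Node
named n = Kleisli (\node -> [node | tagName node == n])

-- Same shape as the body of getGuest'': both filters see the same x.
bothOnSameNode :: Kleisli [] Node (String, String)
bothOnSameNode = proc x -> do
  fname <- arr tagText <<< named "fname" -< x
  lname <- arr tagText <<< named "lname" -< x
  returnA -< (fname, lname)

main :: IO ()
main = do
  let children = [Node "fname" "John", Node "lname" "Steinbeck"]
  -- Each child is either an fname or an lname, never both, so nothing survives.
  print (concatMap (runKleisli bothOnSameNode) children)  -- prints []
```

Every x that reaches the proc block is a single child node, so one of the two filters always rejects it.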

+0

So, not a valid factoring then? I'll go and read the documentation on how arrow notation translates into regular Haskell. –

+0

Indeed. In this particular case the factoring cannot give a valid result in general. –

+0

If you want to think about the translation, the important point is the difference between 'f >>> (g1 &&& g2)' and '(f >>> g1) &&& (f >>> g2)'. –
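The difference between those two expressions can be sketched with Kleisli [] from base, used here as an assumed stand-in for HXT's arrows. The names children, only, factored and unfactored are hypothetical and exist only for this illustration.

```haskell
import Control.Arrow

-- Toy analogue of getChildren: emit the tag names found under one guest.
children :: Kleisli [] () String
children = Kleisli (\() -> ["fname", "lname"])

-- Toy analogue of hasName: keep the value only if it equals n.
only :: String -> Kleisli [] String String
only n = Kleisli (\s -> [s | s == n])

factored, unfactored :: Kleisli [] () (String, String)
-- f >>> (g1 &&& g2): both filters are applied to the same child.
factored   = children >>> (only "fname" &&& only "lname")
-- (f >>> g1) &&& (f >>> g2): each branch enumerates the children itself.
unfactored = (children >>> only "fname") &&& (children >>> only "lname")

main :: IO ()
main = do
  print (runKleisli factored ())    -- prints []
  print (runKleisli unfactored ())  -- prints [("fname","lname")]
```

In this model, factoring children out of both branches changes the result, which matches the behaviour seen with getGuest''.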

2

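I've managed to work out the specifics of why the construct is interpreted the way it is. The arrow translation found here provides a basis to work from: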

addA :: Arrow a => a b Int -> a b Int -> a b Int 
addA f g = proc x -> do 
       y <- f -< x 
       z <- g -< x 
       returnA -< y + z 

becomes:

addA :: Arrow a => a b Int -> a b Int -> a b Int 
addA f g = arr (\ x -> (x, x)) >>> 
      first f >>> arr (\ (y, x) -> (x, y)) >>> 
      first g >>> arr (\ (z, y) -> y + z) 

From this we can derive, by analogy:

getGuest''' = preproc >>>
      arr (\ x -> (x, x)) >>>
      first f >>> arr (\ (y, x) -> (x, y)) >>>
      -- y is the result of f (fname), z is the result of g (lname)
      first g >>> arr (\ (z, y) -> Guest {firstName = y, lastName = z})

    where preproc = deep (isElem >>> hasName "guest") >>> getChildren
          f = getText <<< getChildren <<< (hasName "fname")
          g = getText <<< getChildren <<< (hasName "lname")

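In HXT, arrows can be thought of as streams of values flowing through filters. arr (\x->(x,x)) does not "split the stream" as I had hoped. Instead, it creates a stream of tuples that is filtered by f, and the survivors are then filtered by g. Since f and g are mutually exclusive, there are no survivors.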
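The stream-of-tuples behaviour described above can be observed stage by stage. This is a minimal sketch that assumes Kleisli [] from base as a stand-in for HXT's arrows and plain tag names as a stand-in for XML nodes; only, dup, afterF and afterG are hypothetical names.

```haskell
import Control.Arrow

-- Toy analogue of hasName on a stream of tag names.
only :: String -> Kleisli [] String String
only n = Kleisli (\s -> [s | s == n])

-- The first step of the desugared proc block: pair each value with itself.
dup :: Kleisli [] String (String, String)
dup = arr (\x -> (x, x))

-- Stream after the first filter: only tuples whose value passed f remain.
afterF :: Kleisli [] String (String, String)
afterF = dup >>> first (only "fname") >>> arr (\(y, x) -> (x, y))

-- Stream after the second filter, applied to the survivors of the first.
afterG :: Kleisli [] String (String, String)
afterG = afterF >>> first (only "lname") >>> arr (\(z, y) -> (y, z))

main :: IO ()
main = do
  let stream = ["fname", "lname"]
  print (concatMap (runKleisli afterF) stream)  -- prints [("fname","fname")]
  print (concatMap (runKleisli afterG) stream)  -- prints []
```

The only tuple that survives the first filter still carries the fname value in its other slot, so the second filter rejects it.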

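The example with getChildren inside the do block works, seemingly by miracle, because the stream of tuples contains values further up the XML document that look like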

<guest> 
    <fname>John</fname> 
    <lname>Steinbeck</lname> 
</guest> 

and so the two filters are not mutually exclusive.
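To compare the two placements of getChildren side by side, here is a small model that uses a toy tree type and Kleisli [] from base in place of HXT. Tree, kids, named, text, inside and outside are hypothetical names that only mimic getChildren, hasName, getText, getGuest' and getGuest''; this is a sketch under those assumptions, not HXT itself.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow

-- Toy XML tree: elements with children, or text leaves.
data Tree = Elem String [Tree] | Text String

kids :: Kleisli [] Tree Tree            -- mimics getChildren
kids = Kleisli f
  where f (Elem _ cs) = cs
        f (Text _)    = []

named :: String -> Kleisli [] Tree Tree -- mimics hasName
named n = Kleisli f
  where f t@(Elem m _) | m == n = [t]
        f _                     = []

text :: Kleisli [] Tree String          -- mimics getText
text = Kleisli f
  where f (Text s) = [s]
        f _        = []

-- Like getGuest': x is the whole guest, each branch descends on its own.
inside :: Kleisli [] Tree (String, String)
inside = proc x -> do
  fname <- text <<< kids <<< named "fname" <<< kids -< x
  lname <- text <<< kids <<< named "lname" <<< kids -< x
  returnA -< (fname, lname)

-- Like getGuest'': x is a single child, so both filters test the same node.
outside :: Kleisli [] Tree (String, String)
outside = kids >>> proc x -> do
  fname <- text <<< kids <<< named "fname" -< x
  lname <- text <<< kids <<< named "lname" -< x
  returnA -< (fname, lname)

guest :: Tree
guest = Elem "guest" [Elem "fname" [Text "John"], Elem "lname" [Text "Steinbeck"]]

main :: IO ()
main = do
  print (runKleisli inside guest)   -- prints [("John","Steinbeck")]
  print (runKleisli outside guest)  -- prints []
```

In this model the guest element contains both an fname and an lname child, so binding x to the guest lets both branches succeed, while binding x to one child cannot.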