2014-04-22
1

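Parsing an element that may or may not exist with the HXT library

I have a problem with HXT. I want to parse an OWL file, and my arrow is giving me trouble because it refuses to parse one particular kind of tree. First, the input it handles and the code: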

<owl:Class rdf:about="Damien"> 
    <rdfs:subClassOf rdf:resource="PurchaseableItem"/> 
</owl:Class> 

import System.Environment -- for getArgs

import Data.List.Split (splitOn) 


data Class = Class { 
        name ::String, 
        subClassOf ::String 
       } deriving (Show,Eq) 


main = do 
    [src]<- getArgs 
    parser <- runX(readDocument [ withValidate no] src >>> getClass) 
    print parser 


parseClass = ifA (hasAttr "rdf:about") (getAttrValue "rdf:about") (getAttrValue "rdf:ID") 

parseSubClass = getAttrValue "rdf:resource" 



split l = if(length (splitOn "#" l) >1) then (splitOn "#" l !! 1) else l 


atTag tag = deep (isElem >>> hasName tag) 

getClass = atTag "owl:Class" >>> 
    proc l -> do 
    className <- parseClass -< l 
    s <- atTag "rdfs:subClassOf" -< l 
    subClass <- parseSubClass -< s 
    returnA -< Class { name = (split className), subClassOf = (split subClass) } 

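With this I should be able to parse every node in the OWL file that looks like this example. However, when I try to parse a tree like the following one, the arrow produces nothing for it and throws it away!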

<owl:Class rdf:about="&camera;BodyWithNonAdjustableShutterSpeed"> 
    <owl:equivalentClass> 
     <owl:Class> 
      <owl:intersectionOf rdf:parseType="Collection"> 
       <rdf:Description rdf:about="&camera;Body"/> 
       <owl:Restriction> 
        <owl:onProperty rdf:resource="&camera;shutter-speed"/> 
        <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">0</owl:cardinality> 
       </owl:Restriction> 
      </owl:intersectionOf> 
     </owl:Class> 
    </owl:equivalentClass> 
</owl:Class> 

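Why? Because the subclass node does not exist! But I want the class to still be available and stored in my data even when the subclass is missing. So how can I do that?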


My latest version:

import System.Environment -- for getArgs
import Data.List.Split (splitOn) 

data Class = Class { 
        name ::String, 
        subClassOf :: String 
       } deriving (Show,Eq) 

main = do 
    [src]<- getArgs 
    parser <- runX(readDocument [ withValidate no] src >>> getClass) 
    print parser 

parseClass = ifA (hasAttr "rdf:about") (getAttrValue "rdf:about") (getAttrValue "rdf:ID") 
parseSubClass = (getAttrValue "rdf:resource") `orElse` arr (const "") 

-- Test (this definition needs revisiting): it fails if the name itself contains "#"
split l = if(length (splitOn "#" l) >1) then (splitOn "#" l !! 1) else l 

atTag tag = deep (isElem >>> hasName tag) 
getClass = atTag "owl:Class" >>> 
    proc l -> do 
    className <- parseClass -< l 
    s <- atTag "rdfs:subClassOf" -< l 
    subClass <- parseSubClass -< s 
    returnA -< Class { name = (split className), subClassOf = split subClass } 

Answers

1

You need to decide what you want to happen when the subclass node is absent. As I see it, you have two options:

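  • A missing subclass node means subClass is the empty string. In that case, simply change your parser so that it falls back to the empty string when the arrow built around atTag "rdfs:subClassOf" fails: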

    getClass = atTag "owl:Class" >>> 
        proc l -> do 
        className <- parseClass -< l 
        subClass <- getSubClass -< l 
        returnA -< Class { name = split className, subClassOf = split subClass } 
        where 
         -- The continuation line must be indented deeper than the binding name.
         getSubClass = 
           (atTag "rdfs:subClassOf" >>> parseSubClass) `orElse` arr (const "") 
    
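  • A missing subclass node means subClass is Nothing. This requires changing your data definition so that subClassOf has type Maybe String, but after that it is quite similar to the previous option: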

    getClass = atTag "owl:Class" >>> 
        proc l -> do 
        className <- parseClass -< l 
        subClass <- getSubClass -< l 
        returnA -< Class { name = split className, subClassOf = fmap split subClass } 
        where 
         -- The continuation lines must be indented deeper than the binding name.
         getSubClass = 
           (atTag "rdfs:subClassOf" >>> parseSubClass >>> arr Just) 
           `orElse` arr (const Nothing) 
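
Why the fallback has to wrap the atTag step can be seen in a simplified model. The sketch below is not HXT code: it assumes an HXT list arrow behaves like a function returning a list of results, and the names ListArrow, andThen, orElseL, Node, childrenNamed and attrOrEmpty are hypothetical stand-ins. A lookup that yields no results empties the whole pipeline, while an attribute read that always succeeds (as getAttrValue does, returning "" for a missing attribute) never triggers a fallback by itself.

```haskell
import Data.Maybe (fromMaybe)

-- Simplified stand-in for a list arrow: a function returning all results.
type ListArrow a b = a -> [b]

-- Sequential composition: feed every result of f into g.
andThen :: ListArrow a b -> ListArrow b c -> ListArrow a c
andThen f g = concatMap g . f

-- Model of orElse: use g only when f yields no results at all.
orElseL :: ListArrow a b -> ListArrow a b -> ListArrow a b
orElseL f g x = case f x of
  [] -> g x
  rs -> rs

-- Toy XML node (hypothetical, for illustration only).
data Node = Node { tag :: String, attrs :: [(String, String)], children :: [Node] }

-- Yields no results when no child has the requested tag.
childrenNamed :: String -> ListArrow Node Node
childrenNamed t n = [c | c <- children n, tag c == t]

-- Always yields exactly one result, "" when the attribute is missing.
attrOrEmpty :: String -> ListArrow Node String
attrOrEmpty k n = [fromMaybe "" (lookup k (attrs n))]

subClassStep :: ListArrow Node String
subClassStep = childrenNamed "rdfs:subClassOf" `andThen` attrOrEmpty "rdf:resource"

-- Original shape: a missing child empties the whole result.
strict :: ListArrow Node (String, String)
strict n = [(nm, sc) | nm <- attrOrEmpty "rdf:about" n, sc <- subClassStep n]

-- Option 1: fall back to the empty string.
lenient :: ListArrow Node (String, String)
lenient n =
  [(nm, sc) | nm <- attrOrEmpty "rdf:about" n
            , sc <- (subClassStep `orElseL` const [""]) n]

-- Option 2: fall back to Nothing.
optional :: ListArrow Node (String, Maybe String)
optional n =
  [(nm, sc) | nm <- attrOrEmpty "rdf:about" n
            , sc <- ((subClassStep `andThen` (pure . Just)) `orElseL` const [Nothing]) n]

withSub, noSub :: Node
withSub = Node "owl:Class" [("rdf:about", "Damien")]
  [Node "rdfs:subClassOf" [("rdf:resource", "PurchaseableItem")] []]
noSub = Node "owl:Class" [("rdf:about", "Body")] []

main :: IO ()
main = do
  print (strict noSub)    -- []
  print (lenient noSub)   -- [("Body","")]
  print (optional noSub)  -- [("Body",Nothing)]
  print (optional withSub)
```

In this model, placing the fallback only around the attribute read (as in the question's latest version) changes nothing, because the step that comes back empty is the child lookup.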
    

Just so we are clear, since you said in the comments that this does not work, here is exactly the complete program I ran, which works fine for me:

{-# LANGUAGE Arrows #-} 
import System.Environment -- for getArgs
import Data.List.Split (splitOn) 
import Text.XML.HXT.Core 

data Class = Class { 
        name ::String, 
        subClassOf ::String 
       } deriving (Show,Eq) 

main = do 
    [src]<- getArgs 
    parser <- runX(readDocument [ withValidate no] src >>> getClass) 
    print parser 

parseClass = ifA (hasAttr "rdf:about") 
      (getAttrValue "rdf:about") 
      (getAttrValue "rdf:ID") 

parseSubClass = getAttrValue "rdf:resource" 

split l = if(length (splitOn "#" l) >1) then (splitOn "#" l !! 1) else l 

atTag tag = deep (isElem >>> hasName tag) 

getClass = atTag "owl:Class" >>> 
    proc l -> do 
    className <- parseClass -< l 
    subClass <- getSubClass -< l 
    returnA -< Class { name = split className, subClassOf = split subClass } 
    where 
     -- The continuation lines must be indented deeper than the binding name.
     getSubClass = 
       (atTag "rdfs:subClassOf" >>> parseSubClass) 
       `orElse` arr (const "") 
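
To compare behaviour without an OWL file on disk, one option is to run the same arrow over an in-memory string. This is only a sketch under a few assumptions that are not part of the program above: it uses HXT's xread and runLA instead of readDocument and runX, a hypothetical localName helper in place of split (so the split package is not needed), only the rdf:about attribute for the class name, and plain attribute values without entity references such as &camera;.

```haskell
{-# LANGUAGE Arrows #-}
import Text.XML.HXT.Core

data Class = Class
  { name       :: String
  , subClassOf :: String
  } deriving (Show, Eq)

-- Hypothetical stand-in for split: keep the part after the first '#', if any.
localName :: String -> String
localName l = case break (== '#') l of
  (_, '#' : rest) -> takeWhile (/= '#') rest
  _               -> l

atTag :: String -> LA XmlTree XmlTree
atTag tag = deep (isElem >>> hasName tag)

getClass :: LA XmlTree Class
getClass = atTag "owl:Class" >>>
  proc l -> do
    className <- getAttrValue "rdf:about" -< l
    subClass  <- getSubClass -< l
    returnA -< Class { name = localName className, subClassOf = localName subClass }
  where
    -- Fall back to "" when no rdfs:subClassOf element is found under the class.
    getSubClass =
      (atTag "rdfs:subClassOf" >>> getAttrValue "rdf:resource")
        `orElse` constA ""

-- Two classes: one with a subclass node, one without.
sample :: String
sample = concat
  [ "<owl:Class rdf:about=\"Damien\">"
  , "<rdfs:subClassOf rdf:resource=\"PurchaseableItem\"/>"
  , "</owl:Class>"
  , "<owl:Class rdf:about=\"Body\"/>"
  ]

main :: IO ()
main = mapM_ print (runLA (xread >>> getClass) sample)
```

If the fallback works as described, both classes should be printed, the second with an empty subClassOf; if only the first appears, the fallback is not wrapping the atTag step.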

Note that if you really do not want to combine multiple arrow steps with >>> or <<<, another possibility is to use an inner proc:

getClass = atTag "owl:Class" >>> 
    proc l -> do 
    className <- parseClass -< l 
    subClass <- (proc l' -> do 
     s <- atTag "rdfs:subClassOf" -< l' 
     parseSubClass -< s) 
     `orElse` constA "" -< l 
    returnA -< Class { name = split className, subClassOf = split subClass} 
+0

I tried your first version and it still has the same problem! It keeps throwing the class away. – Damiii

+0

And I tried your second version, using Maybe, and it does not work either. :/ – Damiii

+1

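My 'Maybe' version did have a bug (I forgot the 'fmap'; now fixed), but the empty-string version works fine for me. I have posted the whole code so that you can check what is different. –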
