2010-09-19

Groovy XmlSlurper problem

I want to use XmlSlurper to parse an HTML document that I read with HTTPBuilder. Initially I tried to do it this way:

def response = http.get(path: "index.php", contentType: TEXT) 
def slurper = new XmlSlurper() 
def xml = slurper.parse(response) 

But it throws an exception:

java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd 

I found a workaround: serving the DTD files from a local cache. I found a simple implementation of a class that should help with this here:

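import org.apache.log4j.Logger  // assumed: getRootLogger()/fatal() match the log4j 1.x API

class CachedDTD {
    /**
     * Resolves a DTD 'systemId' to an InputSource backed by a locally cached copy.
     * @param publicId public identifier of the entity (unused)
     * @param systemId system identifier (URL) of the requested DTD
     * @return InputSource for the locally cached DTD, or null on error
     */
    def static entityResolver = [
        resolveEntity: { publicId, systemId ->
            try {
                // Keep only the file name of the DTD URL and look it up under "dtd/"
                String dtd = "dtd/" + systemId.split("/").last()
                Logger.getRootLogger().debug "DTD path: ${dtd}"
                new org.xml.sax.InputSource(CachedDTD.class.getResourceAsStream(dtd))
            } catch (e) {
                //e.printStackTrace()
                Logger.getRootLogger().fatal "Fatal error", e
                null
            }
        }
    ] as org.xml.sax.EntityResolver
}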
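
To check the entity resolver mechanism on its own, without HTTPBuilder or any classpath resources, something like the following minimal sketch may help. It is not the cached-DTD approach itself: the resolver name `emptyDtdResolver` and the sample document are made up for illustration, and the resolver simply answers every DTD request with an empty in-memory document so that the parser never contacts www.w3.org. Named entities such as `&nbsp;` would not resolve with an empty DTD.

```groovy
import org.xml.sax.EntityResolver
import org.xml.sax.InputSource

// Note: on Groovy 4+ XmlSlurper lives in groovy.xml and needs
// `import groovy.xml.XmlSlurper`; older versions import groovy.util.XmlSlurper by default.

// Hypothetical resolver: returns an empty DTD for any requested entity,
// so no network request is made for the XHTML DTD.
def emptyDtdResolver = { String publicId, String systemId ->
    new InputSource(new StringReader(''))
} as EntityResolver

// Sample input (made up for this sketch) with the same DOCTYPE as in the error above
def xhtml = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><head><title>Demo</title></head><body><p>Hello</p></body></html>'''

def slurper = new XmlSlurper()
slurper.setEntityResolver(emptyDtdResolver)
def html = slurper.parseText(xhtml)

println html.head.title.text()
```

If this parses without a 503, the resolver hook itself works, and any remaining problem is in how the cached DTD file is located.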

My package tree looks as shown below:

[image: package tree]

I also modified the code that parses the response a bit, so it now looks like this:

def response = http.get(path: "index.php", contentType: TEXT) 
def slurper = new XmlSlurper() 
slurper.setEntityResolver(org.yuri.CachedDTD.entityResolver) 
def xml = slurper.parse(response) 

But now I get a java.net.MalformedURLException. The DTD path logged by the CachedDTD entityResolver is org/yuri/dtd/xhtml1-transitional.dtd, and I can't get it to work...
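
One thing that may be worth checking here is how the resource name is resolved. `Class.getResourceAsStream` treats a name without a leading `/` as relative to the package of the class it is called on, and returns null when nothing is found. The sketch below illustrates that rule using `String.class` only because that resource exists in every JVM; the `openResource` helper is hypothetical and not part of any library. Whether this is the actual cause of the MalformedURLException above is an assumption, not something the post confirms.

```groovy
// Names without a leading '/' are resolved relative to the class's package.
def relative = String.class.getResource('String.class')             // looked up as java/lang/String.class
def absolute = String.class.getResource('/java/lang/String.class')  // absolute name, same resource
def doubled  = String.class.getResource('java/lang/String.class')   // looked up as java/lang/java/lang/String.class

println "relative found: ${relative != null}"
println "absolute found: ${absolute != null}"
println "doubled found:  ${doubled != null}"

// Hypothetical helper: fail loudly instead of wrapping a null stream in an InputSource
def openResource = { Class owner, String name ->
    def stream = owner.getResourceAsStream(name)
    if (stream == null) {
        throw new FileNotFoundException("Resource not found on classpath: ${name}")
    }
    stream
}
```

Applied to the code above, a call such as `openResource(CachedDTD, 'dtd/xhtml1-transitional.dtd')` would look for org/yuri/dtd/xhtml1-transitional.dtd on the classpath, while passing `'org/yuri/dtd/xhtml1-transitional.dtd'` without a leading `/` would look under org/yuri/org/yuri/dtd/ instead.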

Answers