Groovy XmlSlurper issue

I want to parse an HTML document that I fetch with HTTPBuilder, using XmlSlurper. Initially I tried to do it like this:
def response = http.get(path: "index.php", contentType: TEXT)
def slurper = new XmlSlurper()
def xml = slurper.parse(response)
But it throws an exception:
java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
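I found a workaround: provide locally cached DTD files so the parser does not have to fetch them from www.w3.org. I found a simple implementation of a class that should help with this here: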
class CachedDTD {
    /**
     * Return DTD 'systemId' as InputSource.
     * @param publicId
     * @param systemId
     * @return InputSource for locally cached DTD.
     */
    def static entityResolver = [
        resolveEntity: { publicId, systemId ->
            try {
                // Map the remote DTD URL to a local resource, e.g. "dtd/xhtml1-transitional.dtd"
                String dtd = "dtd/" + systemId.split("/").last()
                Logger.getRootLogger().debug "DTD path: ${dtd}"
                new org.xml.sax.InputSource(CachedDTD.class.getResourceAsStream(dtd))
            } catch (e) {
                //e.printStackTrace()
                Logger.getRootLogger().fatal "Fatal error", e
                null
            }
        }
    ] as org.xml.sax.EntityResolver
}
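For reference, the general idea of overriding DTD resolution can be tried in isolation, without HTTPBuilder or classpath resources. The following is only a minimal sketch under several assumptions: it targets a recent Groovy (3 or later, where `groovy.xml.XmlSlurper` and the three-argument constructor that allows DOCTYPE declarations are available), the name `emptyDtdResolver` is hypothetical, and it returns an empty DTD instead of a cached copy, so named entities such as `&nbsp;` would not resolve.

```groovy
import groovy.xml.XmlSlurper
import org.xml.sax.EntityResolver
import org.xml.sax.InputSource

// Hypothetical resolver for illustration: answers every external entity
// request with an empty document, so no network request is attempted.
def emptyDtdResolver = { String publicId, String systemId ->
    new InputSource(new StringReader(''))
} as EntityResolver

// Assumption: recent Groovy versions reject DOCTYPE declarations by default,
// so the third argument (allowDocTypeDeclaration) is set to true here.
def slurper = new XmlSlurper(false, true, true)
slurper.setEntityResolver(emptyDtdResolver)

def html = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><head><title>Test</title></head><body><p>Hello</p></body></html>'''

// parseText treats the String as document content, not as a URI
def xml = slurper.parseText(html)
println xml.head.title.text()
```

If a cached copy of the DTD is used instead, note that `Class.getResourceAsStream` returns null when the resource is not found, so it may be worth checking the stream before wrapping it in an `InputSource`.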
My package tree looks like this:
I also modified the code that parses the response a bit, so it now looks like this:
def response = http.get(path: "index.php", contentType: TEXT)
def slurper = new XmlSlurper()
slurper.setEntityResolver(org.yuri.CachedDTD.entityResolver)
def xml = slurper.parse(response)
But now I get a java.net.MalformedURLException. The DTD path logged by the CachedDTD entity resolver is org/yuri/dtd/xhtml1-transitional.dtd, and I can't get it to work...