2011-08-19

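I'm using Flying Saucer and it works perfectly, but now I want to add bookmarks to my XHTML-to-PDF conversion, which means reading XHTML with custom tags into a DOM tree. According to the FS documentation, it should be done like this: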

<bookmarks> 
    <bookmark name='1. Foo bar baz' href='#1'> 
     <bookmark name='1.1 Baz quux' href='#1.2'> 
     </bookmark> 
    </bookmark> 
    <bookmark name='2. Foo bar baz' href='#2'> 
     <bookmark name='2.1 Baz quux' href='#2.2'> 
     </bookmark> 
    </bookmark> 
</bookmarks> 

This is supposed to go in the HEAD section, which I have done, but the SAXParser won't read the file any more, saying:

line 11 column 14 - Error: <bookmarks> is not recognized! 
line 11 column 25 - Error: <bookmark> is not recognized! 

I have a local entity resolver set up and have even added the bookmark elements to a DTD:

<!--flying saucer bookmarks --> 
<!ELEMENT bookmarks (#PCDATA)> 
<!ATTLIST bookmarks %attrs;> 

<!ELEMENT bookmark (#PCDATA)> 
<!ATTLIST bookmark %attrs;> 

But it just won't parse. I'm out of ideas, please help.

EDIT

I'm using the following code to parse:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); 
DocumentBuilder builder = dbf.newDocumentBuilder(); 
builder.setEntityResolver(new LocalEntityResolver()); 
document = builder.parse(is); 

EDIT

Here is the LocalEntityResolver:

class LocalEntityResolver implements EntityResolver { 

    private static final Logger LOG = ESAPI.getLogger(LocalEntityResolver.class); 
    private static final Map<String, String> DTDS; 
    static { 
     DTDS = new HashMap<String, String>(); 
     DTDS.put("-//W3C//DTD XHTML 1.0 Strict//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"); 
     DTDS.put("-//W3C//DTD XHTML 1.0 Transitional//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"); 
     DTDS.put("-//W3C//ENTITIES Latin 1 for XHTML//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"); 
     DTDS.put("-//W3C//ENTITIES Symbols for XHTML//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent"); 
     DTDS.put("-//W3C//ENTITIES Special for XHTML//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent"); 
    } 

    @Override 
    public InputSource resolveEntity(String publicId, String systemId) 
      throws SAXException, IOException { 
     InputSource input_source = null; 
     if (publicId != null && DTDS.containsKey(publicId)) { 
      LOG.debug(Logger.EVENT_SUCCESS, "Looking for local copy of [" + publicId + "]"); 

      final String dtd_system_id = DTDS.get(publicId); 
      final String file_name = dtd_system_id.substring(
        dtd_system_id.lastIndexOf('/') + 1, dtd_system_id.length()); 

      InputStream input_stream = FileUtil.readStreamFromClasspath(
        file_name, "my/class/path", 
        getClass().getClassLoader()); 
      if (input_stream != null) { 
       LOG.debug(Logger.EVENT_SUCCESS, "Found local file [" + file_name + "]!"); 
       input_source = new InputSource(input_stream); 
      } 
     } 

     return input_source; 
    } 
} 

My DocumentBuilderFactory implementation is: com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
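
For anyone trying to reproduce this, here is a minimal, self-contained sketch of the parse step. It is not the original code. The sample document, the class name `BookmarkParseSketch`, and the stand-in resolver (which returns an empty DTD instead of a local copy from the classpath) are all assumptions made so the example runs without `FileUtil`, ESAPI, or network access. It only checks whether a non-validating `DocumentBuilder` accepts the custom `<bookmarks>` elements on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class BookmarkParseSketch {

    // Hypothetical sample input: XHTML with Flying Saucer bookmarks in <head>.
    // It deliberately uses no named entities such as &nbsp;, because the
    // stand-in resolver below supplies an empty DTD.
    static final String XHTML =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
        + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
        + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title>t</title>"
        + "<bookmarks>"
        + "<bookmark name='1. Foo bar baz' href='#1'>"
        + "<bookmark name='1.1 Baz quux' href='#1.2'/>"
        + "</bookmark>"
        + "</bookmarks>"
        + "</head><body><p id='1'>Foo</p></body></html>";

    static Document parse(String content) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Validation is off by default; set it explicitly to make the point.
        dbf.setValidating(false);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        // Stand-in for LocalEntityResolver: answer every external entity
        // request with an empty stream so nothing is fetched from w3.org.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(
            new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XHTML);
        System.out.println("bookmark elements found: "
            + doc.getElementsByTagName("bookmark").getLength());
    }
}
```

If this parses cleanly in your environment, the custom elements are not what the XML parser is objecting to, and it is worth checking what actually reaches `builder.parse`.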

+0

I think you need to provide more details. How can I, or anyone else, reproduce the problem? – mzjn

+0

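Basically I want to parse valid XHTML that contains some unknown elements into a DOM tree, using the W3C Transitional DTD. If you want to reproduce it, take any valid XHTML, add the bookmarks markup and try to parse it into a DOM tree – epoch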

+0

What is LocalEntityResolver? Where does it come from? I can't find a message matching '{element} is not recognized!' in the Xerces source code. –

Answers

0

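Argh, I finally found the problem. Sorry for making you guys debug code. The problem was that my code called JTidy.parse before the DOM parsing took place, which left the content empty by the time it was parsed. I wasn't even catching that; the actual error was "Premature end of file", coming from SAX.

Thanks to Matt Gibson: I found the bug while going through the code to put together a short input document.

My code now includes a check to see whether the content is empty: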
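
To illustrate the failure mode, here is a small sketch, assuming only the JDK's built-in parser, of what happens when a `DocumentBuilder` is handed empty input. The class and method names (`EmptyContentSketch`, `parseError`) are made up for the example. The exact wording of the message depends on the parser implementation, so the code just reports whatever the parser says.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

class EmptyContentSketch {

    /**
     * Tries to parse the given bytes as XML.
     * Returns the parser's error message, or null if parsing succeeded.
     */
    static String parseError(byte[] content) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(content));
            return null;
        } catch (SAXParseException ex) {
            // Empty input has no root element, so the parser gives up at once.
            return ex.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulates an earlier processing step having produced no output.
        System.out.println("empty input:  " + parseError(new byte[0]));
        System.out.println("tiny document: "
            + parseError("<html/>".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The point is that the parser's complaint says nothing about which earlier step emptied the content, so checking the input before parsing gives a much clearer failure.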

/** 
* parses String content into a valid XML document. 
* @param content the content to be parsed. 
* @return the parsed document or <tt>null</tt> 
*/ 
private static Document parse(final String content) { 
    Document document = null; 
    try { 
     if (StringUtil.isNull(content)) { 
      throw new IllegalArgumentException("cannot parse null " 
        + "content into a DOM object!"); 
     } 

     InputStream is = new ByteArrayInputStream(content 
       .getBytes(CONTEXT.getEncoding())); 

     DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); 
     DocumentBuilder builder = dbf.newDocumentBuilder(); 
     builder.setEntityResolver(new LocalEntityResolver()); 
     document = builder.parse(is); 
    } catch (Exception ex) { 
     LOG.error(Logger.EVENT_FAILURE, "parsing failed " 
       + "for content[" + content + "]", ex); 
    } 

    return document; 
} 
+0

Another reason to post an [SSCCE](http://sscce.org/) ;-) I did try to reproduce your problem, and it was hard (which library does FileUtil come from, for example..) – Wivani

+0

No worries. Glad you found it! –