My task is to write a method (in Java) that takes a string of HTML and returns a

javax.swing.text.html.HTMLDocument

What is the most efficient way to build the HTMLDocument?

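The way I currently do this is to use a SAX parser to parse the HTML string. I keep track of when I hit an opening tag (for example, <i>). When I hit the corresponding closing tag (for example, </i>), I apply the italic style to the characters I encountered in between.

This certainly works, but it is not fast enough. Is there a faster way to do this?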


2011-07-14 Paul Reiners

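Try using the HTMLEditorKit class. It supports parsing HTML content, which can be read directly from a String (for example, via a StringReader). There seems to be an article about how to do this.

Edit: To give an example, I think it could basically be done like this (after the code has executed, htmlDoc should contain the loaded document...):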

Reader stringReader = new StringReader(string); 
HTMLEditorKit htmlKit = new HTMLEditorKit(); 
HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument(); 
HTMLEditorKit.Parser parser = new ParserDelegator(); 
parser.parse(stringReader, htmlDoc.getReader(0), true);


2011-07-14 18:28:17 mouser

This looks right, but it does not seem to work. Consider this test case:

public void testMakeHTMLDocument() throws Exception {
    final String html = "<html>\n"
            + "<body>\n"
            + "\n"
            + "<h1>My First Heading</h1>\n"
            + "\n"
            + "<p>My first paragraph.</p>\n"
            + "\n"
            + "</body>\n"
            + "</html>";
    final HTMLDocument htmlDocument =
            MyHTMLDocumentLoader.makeHTMLDocument(html);
    htmlDocument.dump(System.out);
}
–

It dumps this: <body name=body>

<content name=content> [0,1][ ] <bidi root bidiLevel=0> [0,1][ ] –

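I am a bit afraid this is due to the weakness of HTMLEditorKit's HTML support. According to the Javadoc, "The default support is provided by this class, which supports HTML version 3.2 (with some extensions), and is migrating toward version 4.0." I am afraid you would need to handle the tags manually in the callback. I am not sure whether that is any better than your original approach :( – mouser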
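One thing that may be worth checking before blaming the HTML 3.2 support: the callback returned by htmlDoc.getReader(0) buffers the parsed elements, and the ParserDelegator snippet in the answer never calls flush() on it, which could explain an empty dump. Below is a minimal sketch under that assumption. The class name FlushSketch and the helper parse are hypothetical names used only for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class FlushSketch {
    // Hypothetical helper (not from the thread): parses the given HTML
    // string into a fresh HTMLDocument using ParserDelegator directly.
    static HTMLDocument parse(String html) throws Exception {
        HTMLEditorKit htmlKit = new HTMLEditorKit();
        HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();

        // The callback collects parsed elements on behalf of the document.
        HTMLEditorKit.ParserCallback callback = htmlDoc.getReader(0);
        try (Reader reader = new StringReader(html)) {
            // 'true' tells the parser to ignore charset directives,
            // which is reasonable when the input is already a String.
            new ParserDelegator().parse(reader, callback, true);
        }

        // Push any pending changes from the callback into the document.
        callback.flush();
        return htmlDoc;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument doc = parse(
                "<html><body><h1>My First Heading</h1>"
                + "<p>My first paragraph.</p></body></html>");
        // Print the plain text content to confirm the document is populated.
        System.out.println(doc.getText(0, doc.getLength()));
    }
}
```

If the text prints as expected, the missing flush() rather than the HTML version support was the likely culprit for that snippet.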

You could try using the method HTMLDocument.setOuterHTML. Just add an arbitrary element and then replace it with your HTML string.


2011-07-14 18:33:07 nfechner

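Just don't forget: 'For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.' – mouser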

I agree with mouser, but with a small correction:

Reader stringReader = new StringReader(string); 
HTMLEditorKit htmlKit = new HTMLEditorKit(); 
HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument(); 
htmlKit.read(stringReader, htmlDoc, 0);


2011-07-15 07:17:44 StanislavL
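For completeness, here is a sketch of how the HTMLEditorKit.read approach might be wrapped into the method the question asks for. The names MyHTMLDocumentLoader and makeHTMLDocument are borrowed from the test case in the comments; the body shown here is an assumption, not the asker's actual implementation. Setting the "IgnoreCharsetDirective" property is an optional extra, added on the assumption that the input may contain a meta charset tag that would otherwise abort the read.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

class MyHTMLDocumentLoader {
    // Assumed implementation: builds an HTMLDocument from an HTML string
    // by letting HTMLEditorKit drive the parser.
    static HTMLDocument makeHTMLDocument(String html)
            throws IOException, BadLocationException {
        HTMLEditorKit htmlKit = new HTMLEditorKit();
        HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();

        // The input is already decoded text, so a charset declaration in
        // the markup should not interrupt parsing.
        htmlDoc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);

        try (Reader stringReader = new StringReader(html)) {
            // Parses the markup and inserts the result at offset 0.
            htmlKit.read(stringReader, htmlDoc, 0);
        }
        return htmlDoc;
    }

    public static void main(String[] args) throws Exception {
        String html = "<html>\n<body>\n"
                + "<h1>My First Heading</h1>\n"
                + "<p>My first paragraph.</p>\n"
                + "</body>\n</html>";
        HTMLDocument htmlDocument = makeHTMLDocument(html);
        // Same check as the test case in the comments: dump the element tree.
        htmlDocument.dump(System.out);
    }
}
```

Whether this is fast enough for the original use case would need to be measured against the SAX-based approach.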
