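Apache POI - converting *.doc to *.html with images
I have a .doc file that contains some images. How can I convert it to *.html so that the images are preserved?
I am using the example from this topic - Convert Word doc to HTML programmatically in Java
But the images are lost. Here is the converter I use -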
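import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import javax.swing.JEditorPane;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.w3c.dom.Document;

public class Converter {
    private final File docFile;
    private File file;

    public Converter(File docFile) {
        this.docFile = docFile;
    }

    public void convert(File file) {
        this.file = file;
        // try-with-resources closes the input stream even if conversion fails
        try (FileInputStream finStream = new FileInputStream(docFile)) {
            HWPFDocument doc = new HWPFDocument(finStream);

            // Build an empty DOM document and let POI fill it with HTML
            Document newDocument = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(newDocument);
            wordToHtmlConverter.processDocument(doc);

            // Serialize the DOM to an HTML string
            StringWriter stringWriter = new StringWriter();
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.transform(
                    new DOMSource(wordToHtmlConverter.getDocument()),
                    new StreamResult(stringWriter));
            String html = stringWriter.toString();

            // Write the HTML to the target file as UTF-8
            try (BufferedWriter out = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(file), "UTF-8"))) {
                out.write(html);
            } catch (IOException e) {
                e.printStackTrace();
            }

            // Show the result in a simple Swing viewer
            JEditorPane editorPane = new JEditorPane();
            editorPane.setContentType("text/html");
            editorPane.setEditable(false);
            editorPane.setPage(file.toURI().toURL());
            JScrollPane scrollPane = new JScrollPane(editorPane);
            JFrame f = new JFrame("Display Html File");
            f.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            f.getContentPane().add(scrollPane);
            f.setSize(512, 342);
            f.setVisible(true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}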
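It says here - http://poi.apache.org/apidocs/org/apache/poi/hwpf/converter/WordToHtmlConverter.html
"This implementation doesn't create images or links to them. This can be changed by overriding the AbstractWordConverter.processImage(Element, boolean, Picture) method."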
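What such an override (or a picture callback) has to do is write each picture's bytes somewhere next to the HTML file and hand back a relative path for the img tag. A minimal sketch of that step follows. ImageStore is a hypothetical helper, not part of POI, and the wiring shown in the trailing comment assumes a POI version (3.8 or later, as far as I know) where the converter has setPicturesManager and PicturesManager.savePicture takes five arguments; check the Javadoc for the version in use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not a POI class): stores picture bytes beside the HTML file. */
final class ImageStore {
    private final Path imageDir;      // directory the pictures are written to
    private final String linkPrefix;  // relative prefix used in the returned link

    ImageStore(Path htmlFile, String subDir) throws IOException {
        Path parent = htmlFile.toAbsolutePath().getParent();
        this.imageDir = Files.createDirectories(parent.resolve(subDir));
        this.linkPrefix = subDir + "/";
    }

    /** Writes the bytes and returns a link relative to the HTML file. */
    String save(byte[] content, String suggestedName) throws IOException {
        // Keep only the file-name part so the suggested name cannot leave imageDir.
        String name = Paths.get(suggestedName).getFileName().toString();
        Files.write(imageDir.resolve(name), content);
        return linkPrefix + name;
    }
}

// Assumed wiring inside Converter.convert, before processDocument(doc) is called:
//
//   ImageStore store = new ImageStore(file.toPath(), "images");
//   wordToHtmlConverter.setPicturesManager((content, type, name, widthIn, heightIn) -> {
//       try { return store.save(content, name); }
//       catch (IOException e) { throw new java.io.UncheckedIOException(e); }
//   });
```

Returning a relative link keeps the HTML file and its images folder movable as a pair, and JEditorPane.setPage resolves such links against the URL of the page it loaded.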
Are there any other approaches, or examples of converters, that support images?
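One alternative to writing separate image files is to embed each picture directly in the HTML as a Base64 data: URI, so the output is a single self-contained file. The sketch below only shows the encoding step. DataUri is a hypothetical helper, the MIME type is supplied by the caller (POI's PictureType.getMime() may provide it, depending on the version), and while browsers render data: URIs, the Swing JEditorPane used in the question may not.

```java
import java.util.Base64;

/** Hypothetical helper (not a POI class): turns picture bytes into a data: URI. */
final class DataUri {
    private DataUri() {}

    /** Returns "data:<mimeType>;base64,<encoded bytes>" for use as an img src value. */
    static String of(byte[] content, String mimeType) {
        return "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(content);
    }
}
```

For example, DataUri.of(new byte[] {1, 2, 3}, "image/png") returns data:image/png;base64,AQID. Large pictures make the HTML grow by roughly a third over the raw image size, which is the usual cost of Base64.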
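Thank you very much for your reply, Gagravarr. A new class for parsing documents! I will try it now.
@Alexey, could you please provide some details on how you solved this problem, or any useful links?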