Extracting hyperlinks from a Word document with Apache POI
Does anyone know how to extract links from a Word document using Apache POI? Or, even better, from a single paragraph?
For .docx files (the Word 2007 and later format, which is what XWPF reads):
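// Links extractor
import java.io.File;
import java.io.FileInputStream;
import java.util.Iterator;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFHyperlink;
import org.apache.poi.xwpf.usermodel.XWPFHyperlinkRun;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

StringBuilder text = new StringBuilder();
// try-with-resources closes both the stream and the document
try (FileInputStream fis = new FileInputStream(new File("YOUR_DOCX_FULL_PATH_HERE"));
     XWPFDocument document = new XWPFDocument(fis)) {
    // Walk every body paragraph
    Iterator<XWPFParagraph> i = document.getParagraphsIterator();
    while (i.hasNext()) {
        XWPFParagraph paragraph = i.next();
        // Only hyperlink runs are of interest here
        for (XWPFRun run : paragraph.getRuns()) {
            if (run instanceof XWPFHyperlinkRun) {
                // The link text as displayed in the document
                text.append(run.toString());
                // Resolve the run's relationship id to the actual target
                XWPFHyperlink link = ((XWPFHyperlinkRun) run).getHyperlink(document);
                if (link != null) {
                    text.append(" <" + link.getURL() + ">");
                }
            }
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}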
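If you have the file on disk, it's generally better not to open an XWPF/XSSF/XSLF instance from an InputStream, because that forces the whole file to be buffered in memory. Open it directly from a File instead. – Gagravarr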
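A rough sketch of what opening from a File could look like, assuming a POI version that provides `OPCPackage.open(File, PackageAccess)` and the `XWPFDocument(OPCPackage)` constructor. The class name `HyperlinkExtractor` and the method `extractLinks` are made up for illustration and are not part of POI. The package is opened read-only since nothing is written back.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFHyperlink;
import org.apache.poi.xwpf.usermodel.XWPFHyperlinkRun;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical helper class, not part of Apache POI.
public class HyperlinkExtractor {

    // Returns one "link text <url>" entry per hyperlink run found in the body paragraphs.
    public static List<String> extractLinks(File docx) throws Exception {
        List<String> links = new ArrayList<>();
        // Open straight from the File rather than wrapping it in an InputStream.
        // Closing the document also releases the underlying package.
        try (XWPFDocument document =
                 new XWPFDocument(OPCPackage.open(docx, PackageAccess.READ))) {
            for (XWPFParagraph paragraph : document.getParagraphs()) {
                for (XWPFRun run : paragraph.getRuns()) {
                    if (run instanceof XWPFHyperlinkRun) {
                        XWPFHyperlink link =
                            ((XWPFHyperlinkRun) run).getHyperlink(document);
                        // link is null when the run has no resolvable target (e.g. a bookmark anchor)
                        String url = (link != null) ? link.getURL() : null;
                        links.add(run.text() + (url != null ? " <" + url + ">" : ""));
                    }
                }
            }
        }
        return links;
    }
}
```

To get the links of a single paragraph only, the inner loop over `paragraph.getRuns()` can be applied to just that paragraph. Note that this only walks body paragraphs; links inside tables, headers or footers would need their own traversal.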
Is this for old-style .doc files, or new-style .docx? (The two are handled slightly differently.) – Gagravarr