2009-07-13 199 views
0

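How do I read the comments (annotations) in a Microsoft Word document? How can I read the comments in a Word document using Apache POI?

Please provide some sample code, if possible...

Thank you...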

+0

Word documents come in several formats. Can you clarify which file type you want to read? Word 97/2003 .doc, Word 2007 XML, etc. – 2009-07-13 14:59:03

+0

I want to read the comments in Word 97/2003/XP and Word 2007 files... – Garudadwajan 2009-07-14 03:47:44

Answers

2

Finally, I found the answer.

Here is the code snippet...

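import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Range;

// fileName is the path to a Word 97-2003 (.doc) file, supplied by the caller.
// try-with-resources closes the input stream even if reading fails.
try (FileInputStream fis = new FileInputStream(fileName)) {
    HWPFDocument document = new HWPFDocument(fis);
    // The comments range holds the text of the comments in the document.
    Range commentRange = document.getCommentsRange();
    int numComments = commentRange.numParagraphs();
    for (int i = 0; i < numComments; i++) {
        String comment = commentRange.getParagraph(i).text();
        // Remove line-break sequences; trim() drops leftover control characters.
        comment = comment.replaceAll("\\cM?\r?\n", "").trim();
        if (!comment.equals("")) {
            System.out.println("comment :- " + comment);
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}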
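The replaceAll pattern in the snippet is easy to misread, so here is a small stand-alone sketch of just the cleanup step, with no POI involved. The class name CommentCleanup and the sample strings are made up for illustration. In a Java regex, \cM is the control-M character, i.e. a carriage return, so the pattern removes an optional CR, another optional CR, and then a line feed. A lone trailing CR is not matched by the pattern, but trim() removes it because it treats every character up to the space character as whitespace.

```java
public class CommentCleanup {

    // Same cleanup as in the snippet: drop CR/LF line-break sequences, then trim.
    static String clean(String raw) {
        return raw.replaceAll("\\cM?\r?\n", "").trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample strings, not taken from a real document.
        System.out.println("[" + clean("First comment\r\n") + "]");
        System.out.println("[" + clean("Second comment\r") + "]");
    }
}
```

Running it prints [First comment] and [Second comment], which is why the empty-string check afterwards is enough to skip blank paragraphs.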

I am using poi-3.5-beta7-20090719.jar and poi-scratchpad-3.5-beta7-20090717.jar. The other archives, poi-ooxml-3.5-beta7-20090717.jar and poi-dependencies-3.5-beta7-20090717.zip, are needed only if you want to work with the OpenXML-based file formats.

I appreciate the help of Mark B, who actually found this solution...
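The snippet above only handles the binary .doc format. For Word 2007 (.docx) files, POI's poi-ooxml module has XWPFDocument with a getComments() method, which is probably the route to take if you already depend on POI. To show what is being read underneath, here is a dependency-free sketch using only the JDK (Java 9+ for readAllBytes). It rests on some assumptions: the comments live in a zip entry with the conventional name word/comments.xml, and the text of each comment is the concatenation of its w:t elements, so paragraph breaks inside a comment are lost. The class name DocxCommentReader and the "author: text" output format are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DocxCommentReader {

    // WordprocessingML main namespace, bound to the w: prefix in the XML.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns one "author: text" string per comment found in the .docx stream.
    static List<String> readComments(InputStream docx) throws Exception {
        List<String> result = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the comments part uses its conventional name.
                if (!entry.getName().equals("word/comments.xml")) {
                    continue;
                }
                // Copy the entry first so the XML parser cannot close the zip stream.
                byte[] xmlBytes = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document xml = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xmlBytes));
                NodeList comments = xml.getElementsByTagNameNS(W_NS, "comment");
                for (int i = 0; i < comments.getLength(); i++) {
                    Element comment = (Element) comments.item(i);
                    String author = comment.getAttributeNS(W_NS, "author");
                    // Join every text run (w:t) inside this comment.
                    StringBuilder text = new StringBuilder();
                    NodeList runs = comment.getElementsByTagNameNS(W_NS, "t");
                    for (int j = 0; j < runs.getLength(); j++) {
                        text.append(runs.item(j).getTextContent());
                    }
                    result.add(author + ": " + text.toString().trim());
                }
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a .docx file.
        try (InputStream in = new FileInputStream(args[0])) {
            for (String comment : readComments(in)) {
                System.out.println("comment :- " + comment);
            }
        }
    }
}
```

A document without comments usually has no comments part at all, in which case the method simply returns an empty list.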

0

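Get the HWPFDocument object (by passing a Word document in an input stream, say).

Then you can get the summary via getSummaryInformation(), which will give you a SummaryInformation object via getSummary().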

+0

Thanks a lot, Brian... – Garudadwajan 2009-07-15 04:09:56

0

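I am also new to Apache POI. Here is my program, which works fine: it extracts the text of a Word document into a plain-text file. I hope it helps you. Before you run it, add the corresponding library (JAR) files to your classpath.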

/*
 * RtfDocTextExtract.java
 *
 * Created on April 12, 2010, 9:46 AM
 */
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import org.apache.poi.hwpf.extractor.WordExtractor;

/**
 * Extracts the text of a Word 97-2003 (.doc) file into a plain-text file.
 *
 * @author ChandraMouil V
 */
public class RtfDocTextExtract {

    // This function is for .DOC files: it reads the file at filePath and
    // writes its paragraphs to the text file below.
    public static void meth(String filePath) {
        // try-with-resources closes both files, so the output is flushed too.
        try (FileInputStream fis = new FileInputStream(filePath);
             FileWriter fw = new FileWriter("E:/resume-template.txt")) {
            WordExtractor extractor = new WordExtractor(fis);
            for (String paragraph : extractor.getParagraphText()) {
                fw.write(paragraph);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // Paths are from the original example; adjust them for your machine.
        meth("D:/DummyRichTextFormat.doc");
    }
}