2016-02-16

Extracting Word content with Word interop causes an error

I want to extract the contents of a Word document using this code:

    using System;
    using Microsoft.Office.Interop.Word;

    // Open a doc file.
    Application application = new Application();
    Document document = application.Documents.Open("d:\\a.doc");

    // Loop through all words in the document (Words is 1-based).
    int count = document.Words.Count;
    for (int i = 1; i <= count; i++)
    {
        // Write the word.
        string text = document.Words[i].Text;
        Console.WriteLine("Word {0} = {1}", i, text);
    }

    // Close Word.
    application.Quit();

But when I run it, I get this error:

Unable to cast COM object of type 'Microsoft.Office.Interop.Word.ApplicationClass' to interface type 
'Microsoft.Office.Interop.Word._Application'. This operation failed 
because the QueryInterface call on the COM component for the 
interface with IID '{00020970-0000-0000-C000-000000000046}' 
failed due to the following error: 
No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)). 

I have Office 2013 installed.

*But when I run it, I get this error* On which line? –

@MattBurland Document document = application.Documents.Open("d:\\a.doc"); –

Try 'var application = new Word._Application();' and 'var document = application.Documents.Open(@"D:\a.doc");' – Equalsk

Answer


In the end I downloaded the Spire.Doc NuGet package and used it to extract the contents of the Word file, as shown here:

    using System;
    using System.Text;
    using Spire.Doc;
    using Spire.Doc.Documents;

    Document document = new Document();
    document.LoadFromFile(@"d:\a.docx");

    // Initialize the StringBuilder instance.
    StringBuilder sb = new StringBuilder();

    // Extract the text of every paragraph into the StringBuilder.
    foreach (Section section in document.Sections)
    {
        foreach (Paragraph paragraph in section.Paragraphs)
        {
            sb.AppendLine(paragraph.Text);
        }
    }

    // Print the extracted text to the console.
    Console.WriteLine(sb.ToString());
    Console.ReadLine();
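
If the extracted text should end up in a text file rather than on the console, something like the following might work. This is only a minimal sketch: the `TextExport` class, its `Save` helper, the sample paragraphs, and the output path are all hypothetical stand-ins, and the `StringBuilder` here is filled by hand instead of by the paragraph loop.

```csharp
using System;
using System.IO;
using System.Text;

static class TextExport
{
    // Hypothetical helper: writes the text to a UTF-8 file,
    // overwriting the file if it already exists.
    public static void Save(string path, string text)
    {
        File.WriteAllText(path, text, Encoding.UTF8);
    }

    static void Main()
    {
        // Stand-in for the StringBuilder filled by the paragraph loop.
        var sb = new StringBuilder();
        sb.AppendLine("First paragraph");
        sb.AppendLine("Second paragraph");

        // Assumed output location; adjust to wherever the file should go.
        string path = Path.Combine(Path.GetTempPath(), "extracted.txt");
        Save(path, sb.ToString());
        Console.WriteLine("Saved to " + path);
    }
}
```

`File.WriteAllText` creates the file or replaces an existing one, so no explicit stream handling is needed for a one-shot export.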