2013-07-22 19 views
4

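Parsing tables with Microsoft.Office.Interop.Word: how do I get text from the first column only?

I'm writing a program that parses text data from a Microsoft Word 2010 document. Specifically, I want to get the text from every cell in the first column of each table in the document.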

For reference, the document looks like this: [image]

I only need the text from the cells in the first column on each page. I will add this text to an internal DataTable.

My code so far looks like this:

private void button1_Click(object sender, EventArgs e) 
    { 
     // Create an instance of the Open File Dialog Box 
     var openFileDialog1 = new OpenFileDialog(); 

     // Set filter options and filter index 
     openFileDialog1.Filter = "Word Documents (.docx)|*.docx|All files (*.*)|*.*"; 
     openFileDialog1.FilterIndex = 1; 
     openFileDialog1.Multiselect = false; 

     // Show the dialog box; stop if the user cancels instead of opening an empty path. 
     if (openFileDialog1.ShowDialog() != DialogResult.OK) 
      return; 
     txtDocument.Text = openFileDialog1.FileName; 

     var word = new Microsoft.Office.Interop.Word.Application(); 
     object miss = System.Reflection.Missing.Value; 
     object path = openFileDialog1.FileName; 
     object readOnly = true; 
     var docs = word.Documents.Open(ref path, ref miss, ref readOnly, 
             ref miss, ref miss, ref miss, ref miss, 
             ref miss, ref miss, ref miss, ref miss, 
             ref miss, ref miss, ref miss, ref miss, 
             ref miss); 

     // Datatable to store text from Word doc 
     var dt = new System.Data.DataTable(); 
     dt.Columns.Add("Text"); 

     // Loop through each table in the document, 
     // grab only text from cells in the first column 
     // in each table. 
     foreach (Table tb in docs.Tables) 
     { 
      // insert code here to get text from cells in first column 
      // and insert into datatable. 
     } 

     ((_Document)docs).Close(); 
     ((_Application)word).Quit(); 
    } 

I'm stuck on the part where I grab the text from each cell and add it to my DataTable. Can someone give me some pointers? I'd really appreciate it.

Thanks!

Answers

12

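I don't know how you want to store it in your database, but to read the text I think you can loop over the rows and pick the first cell in each one: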

foreach (Table tb in docs.Tables) { 
    for (int row = 1; row <= tb.Rows.Count; row++) { 
     var cell = tb.Cell(row, 1); 
     var text = cell.Range.Text; 

     // text now contains the content of the cell. 
    } 
}
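
One detail worth handling before the text goes into the DataTable: as far as I know, Word ends each cell's Range.Text with an end-of-cell marker (a carriage return followed by a bell character, "\r\a"), so the raw string is not quite the visible text. Below is a minimal sketch of a helper that strips that marker and adds the result to the "Text" column from the question. The names CellText, Clean and AddCell are hypothetical and made up for illustration; they are not part of the Interop API.

```csharp
using System;
using System.Data;

// Hypothetical helper, not part of Microsoft.Office.Interop.Word.
public static class CellText
{
    // Removes the trailing end-of-cell marker ('\r' followed by '\a')
    // that Word appends to a cell's Range.Text. Line breaks inside the
    // cell are left untouched; only trailing characters are trimmed.
    public static string Clean(string raw)
    {
        if (raw == null) return string.Empty;
        return raw.TrimEnd('\r', '\a');
    }

    // Adds one cleaned cell value as a new row. Assumes the DataTable
    // has a single column, like the "Text" column in the question.
    public static void AddCell(DataTable dt, string raw)
    {
        dt.Rows.Add(Clean(raw));
    }
}
```

Inside the loop above, this would be called as CellText.AddCell(dt, cell.Range.Text); in place of the comment, assuming dt is the DataTable created in the question's code.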