2013-07-10
21

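Quickly importing Excel into a DataTable

I want to read an Excel file into a list of Data.DataTable objects, but my current approach can take a very long time. I go through the workbook one Worksheet at a time, reading it cell by cell, and that tends to be very slow. Is there a faster way to do this? Here is my code: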

List<DataTable> List = new List<DataTable>(); 

    // Counting sheets 
    // Worksheets is a 1-based collection, so use <= to include the last sheet 
    for (int count = 1; count <= WB.Worksheets.Count; ++count) 
    { 
     // Create a new DataTable for every Worksheet 
     DATA.DataTable DT = new DataTable(); 

     WS = (EXCEL.Worksheet)WB.Worksheets.get_Item(count); 

     textBox1.Text = count.ToString(); 

     // Get range of the worksheet 
     Range = WS.UsedRange; 


     // Create new Column in DataTable 
     for (cCnt = 1; cCnt <= Range.Columns.Count; cCnt++) 
     { 
      textBox3.Text = cCnt.ToString(); 


       Column = new DataColumn(); 
       Column.DataType = System.Type.GetType("System.String"); 
       Column.ColumnName = cCnt.ToString(); 
       DT.Columns.Add(Column); 

      // Create row for Data Table 
      // Excel cell indices are 1-based, so start at row 1 
      for (rCnt = 1; rCnt <= Range.Rows.Count; rCnt++) 
      { 
       textBox2.Text = rCnt.ToString(); 

       try 
       { 
        cellVal = (string)(Range.Cells[rCnt, cCnt] as EXCEL.Range).Value2; 
       } 
       catch (Microsoft.CSharp.RuntimeBinder.RuntimeBinderException) 
       { 
        ConvertVal = (double)(Range.Cells[rCnt, cCnt] as EXCEL.Range).Value2; 
        cellVal = ConvertVal.ToString(); 
       } 

       // Add to the DataTable 
       if (cCnt == 1) 
       { 

        Row = DT.NewRow(); 
        Row[cCnt.ToString()] = cellVal; 
        DT.Rows.Add(Row); 
       } 
       else 
       { 

        // DataTable rows are 0-based, Excel rows are 1-based 
        Row = DT.Rows[rCnt - 1]; 
        Row[cCnt.ToString()] = cellVal; 

       } 
      } 
     } 
     // Add DT to the list. Then go to the next sheet in the Excel Workbook 
     List.Add(DT); 
    } 
+0

Unfortunately, no. – gustavodidomenico

+0

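"Is there a faster way to do this?" "Unfortunately, no." Absolute rubbish. This code creates (and incorrectly fails to dispose of) a COM object for every Excel cell value it reads. That is the slowest possible way to do it! Reading the entire worksheet into an array in one go and then iterating over the items in that array is far faster.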

Answers

12

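Calling .Value2 is an expensive operation because it is a COM interop call. I would instead read the entire range into an array with a single call and then loop over the array: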

// Value2 on a multi-cell range returns a 1-based, two-dimensional array 
object[,] data = (object[,])Range.Value2; 
int rowCount = data.GetLength(0); 
int colCount = data.GetLength(1); 

// Create new Column in DataTable 
for (int cCnt = 1; cCnt <= colCount; cCnt++) 
{ 
    textBox3.Text = cCnt.ToString(); 

    var Column = new DataColumn(); 
    Column.DataType = typeof(string); 
    Column.ColumnName = cCnt.ToString(); 
    DT.Columns.Add(Column); 

    // Create row for Data Table (the array indices start at 1, not 0) 
    for (int rCnt = 1; rCnt <= rowCount; rCnt++) 
    { 
     textBox2.Text = rCnt.ToString(); 

     object value = data[rCnt, cCnt]; 
     string cellVal; 
     if (value is double) 
     { 
      // Numeric cells come back as double 
      double ConvertVal = (double)value; 
      cellVal = ConvertVal.ToString(); 
     } 
     else 
     { 
      // Strings pass through; empty cells (null) become "" 
      cellVal = Convert.ToString(value); 
     } 

     DataRow Row; 

     // Add to the DataTable 
     if (cCnt == 1) 
     { 
      Row = DT.NewRow(); 
      Row[cCnt.ToString()] = cellVal; 
      DT.Rows.Add(Row); 
     } 
     else 
     { 
      // DataTable rows are 0-based 
      Row = DT.Rows[rCnt - 1]; 
      Row[cCnt.ToString()] = cellVal; 
     } 
    } 
} 
+0

This still works perfectly. I had 40,000 records, and the processing time dropped from about 2 minutes to about 2 seconds.

+1

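I am quite confused by the variable usage in the answer. It does not seem very friendly. 1. I cannot use 'Range.Value2' here; it fails with the error "Cannot implicitly convert object[] to object[*,*]". 2. I am not sure about the ConvertVal variable. – parkourkarthik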

+0

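@parkourkarthik I cannot verify this right now, but if your range is a single row or a single column you might get a 1-D 'object[]', although I thought it was always a two-dimensional array. Feel free to post this as a separate question if you have not already.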
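
As far as I know, Value2 returns a 1-based two-dimensional array for any multi-cell range (a single row or column included) and a bare scalar for a single cell. A minimal sketch, assuming that behaviour, which normalises either shape and fills a DataTable row by row. `ExcelArrayHelper`, `Normalize` and `ToDataTable` are hypothetical names, not part of the interop API:

```csharp
using System;
using System.Data;

public static class ExcelArrayHelper
{
    // Wraps a single-cell scalar in a 1-based 1x1 array so callers always
    // see the same shape that Value2 returns for multi-cell ranges.
    public static object[,] Normalize(object value2)
    {
        var array = value2 as object[,];
        if (array != null) return array;

        var single = (object[,])Array.CreateInstance(
            typeof(object), new[] { 1, 1 }, new[] { 1, 1 });
        single[1, 1] = value2;
        return single;
    }

    // Builds a DataTable of string columns named "1", "2", ... from the array.
    public static DataTable ToDataTable(object[,] data, string tableName)
    {
        var table = new DataTable(tableName);
        int firstRow = data.GetLowerBound(0), lastRow = data.GetUpperBound(0);
        int firstCol = data.GetLowerBound(1), lastCol = data.GetUpperBound(1);

        for (int c = firstCol; c <= lastCol; c++)
            table.Columns.Add((c - firstCol + 1).ToString(), typeof(string));

        for (int r = firstRow; r <= lastRow; r++)
        {
            DataRow row = table.NewRow();
            for (int c = firstCol; c <= lastCol; c++)
                row[c - firstCol] = Convert.ToString(data[r, c]); // null -> ""
            table.Rows.Add(row);
        }
        return table;
    }
}
```

With the question's variables, each sheet would then take one interop read: `object raw = WS.UsedRange.Value2; List.Add(ExcelArrayHelper.ToDataTable(ExcelArrayHelper.Normalize(raw), WS.Name));`. Filling row by row also avoids the `cCnt == 1` special case.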

3

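MS Office interop is slow. Even Microsoft does not recommend using interop on the server side, and it is not suitable for importing large Excel files. For more details, see Microsoft's position on why not to use OLE Automation.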

Instead, you can use any Excel library, for example EasyXLS. Here is a code sample that shows how to read an Excel file:

ExcelDocument workbook = new ExcelDocument(); 
DataSet ds = workbook.easy_ReadXLSActiveSheet_AsDataSet("excel.xls"); 
DataTable dataTable = ds.Tables[0]; 

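If your Excel file has multiple sheets, or you want to import only a range of cells (for better performance), have a look at more code samples on how to import Excel to DataTable in C# using EasyXLS.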

+5

Ouch. A $195 library just to read an Excel worksheet?

2

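In case anyone else is using EPPlus: this implementation is pretty naive, but the comments call attention to the assumptions it makes. If you were to add one more method on top of it, GetWorkbookAsDataSet(), it would do what the OP is asking for.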

/// <summary> 
    /// Assumption: Worksheet is in table format with no weird padding or blank column headers. 
    /// 
    /// Assertion: Duplicate column names will be aliased by appending a sequence number (eg. Column, Column1, Column2) 
    /// </summary> 
    /// <param name="worksheet"></param> 
    /// <returns></returns> 
    public static DataTable GetWorksheetAsDataTable(ExcelWorksheet worksheet) 
    { 
     var dt = new DataTable(worksheet.Name); 
     dt.Columns.AddRange(GetDataColumns(worksheet).ToArray()); 
     var headerOffset = 1; //have to skip header row 
     var width = dt.Columns.Count; 
     var depth = GetTableDepth(worksheet, headerOffset); 
     for (var i = 1; i <= depth; i++) 
     { 
      var row = dt.NewRow(); 
      for (var j = 1; j <= width; j++) 
      { 
       var currentValue = worksheet.Cells[i + headerOffset, j].Value; 

       //have to decrement b/c excel is 1 based and datatable is 0 based. 
       row[j - 1] = currentValue == null ? null : currentValue.ToString(); 
      } 

      dt.Rows.Add(row); 
     } 

     return dt; 
    } 

    /// <summary> 
    /// Assumption: There are no null or empty cells in the first column 
    /// </summary> 
    /// <param name="worksheet"></param> 
    /// <returns></returns> 
    private static int GetTableDepth(ExcelWorksheet worksheet, int headerOffset) 
    { 
     var i = 1; 
     var j = 1; 
     var cellValue = worksheet.Cells[i + headerOffset, j].Value; 
     while (cellValue != null) 
     { 
      i++; 
      cellValue = worksheet.Cells[i + headerOffset, j].Value; 
     } 

     return i - 1; //subtract one because we're going from rownumber (1 based) to depth (0 based) 
    } 

    private static IEnumerable<DataColumn> GetDataColumns(ExcelWorksheet worksheet) 
    { 
     return GatherColumnNames(worksheet).Select(x => new DataColumn(x)); 
    } 

    private static IEnumerable<string> GatherColumnNames(ExcelWorksheet worksheet) 
    { 
     var columns = new List<string>(); 

     var i = 1; 
     var j = 1; 
     var columnName = worksheet.Cells[i, j].Value; 
     while (columnName != null) 
     { 
      columns.Add(GetUniqueColumnName(columns, columnName.ToString())); 
      j++; 
      columnName = worksheet.Cells[i, j].Value; 
     } 

     return columns; 
    } 

    private static string GetUniqueColumnName(IEnumerable<string> columnNames, string columnName) 
    { 
     var colName = columnName; 
     var i = 1; 
     while (columnNames.Contains(colName)) 
     { 
      colName = columnName + i.ToString(); 
      i++; 
     } 

     return colName; 
    } 
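
The answer mentions GetWorkbookAsDataSet() but does not show it. One possible sketch, assuming it simply converts each worksheet and collects the results. `WorkbookHelper` and `BuildDataSet` are hypothetical names, and the helper is kept generic so it does not depend on a particular EPPlus version:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class WorkbookHelper
{
    // Converts every sheet with the supplied function and collects the
    // resulting tables in one DataSet, in the order the sheets are enumerated.
    public static DataSet BuildDataSet<TSheet>(
        IEnumerable<TSheet> sheets, Func<TSheet, DataTable> toTable)
    {
        var dataSet = new DataSet();
        foreach (TSheet sheet in sheets)
        {
            // DataSet requires unique table names; worksheet names are
            // unique within a workbook, so this should hold for EPPlus.
            dataSet.Tables.Add(toTable(sheet));
        }
        return dataSet;
    }
}

// Possible EPPlus usage (assumes an ExcelPackage opened from a FileInfo):
//   using (var package = new ExcelPackage(new FileInfo(path)))
//   {
//       DataSet ds = WorkbookHelper.BuildDataSet(
//           package.Workbook.Worksheets, GetWorksheetAsDataTable);
//   }
```

Newer EPPlus releases may also require a license context to be configured before opening a package, so check the version you are using.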
+0

This code helped. It solved my problem. Thanks a lot. – Aditi

1
Dim sSheetName As String 
Dim sConnection As String 
Dim dtTablesList As DataTable 
Dim oleExcelCommand As OleDbCommand 
Dim oleExcelReader As OleDbDataReader 
Dim oleExcelConnection As OleDbConnection 

sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Test.xls;Extended Properties=""Excel 12.0;HDR=No;IMEX=1""" 

oleExcelConnection = New OleDbConnection(sConnection) 
oleExcelConnection.Open() 

dtTablesList = oleExcelConnection.GetSchema("Tables") 

If dtTablesList.Rows.Count > 0 Then 
    sSheetName = dtTablesList.Rows(0)("TABLE_NAME").ToString 
End If 

dtTablesList.Clear() 
dtTablesList.Dispose() 

If sSheetName <> "" Then 

    oleExcelCommand = oleExcelConnection.CreateCommand() 
    oleExcelCommand.CommandText = "Select * From [" & sSheetName & "]" 
    oleExcelCommand.CommandType = CommandType.Text 

    oleExcelReader = oleExcelCommand.ExecuteReader 

    ' Load every row returned by the reader into a DataTable 
    Dim dtSheet As New DataTable(sSheetName) 
    dtSheet.Load(oleExcelReader) 

    oleExcelReader.Close() 

End If 

oleExcelConnection.Close()
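
The connection string above pairs a .xls path with `Excel 12.0`. As far as I know, the commonly documented Extended Properties values for the ACE provider depend on the file type (for example `Excel 8.0` for .xls and `Excel 12.0 Xml` for .xlsx), so it is worth double-checking against your file. A small C# sketch of picking the value by extension; `ExcelConnection.Build` is a hypothetical helper, and the mapping is an assumption to verify against your provider version:

```csharp
using System;
using System.IO;

public static class ExcelConnection
{
    // Builds an ACE OLE DB connection string for the given Excel file.
    public static string Build(string path, bool hasHeaderRow)
    {
        string ext = Path.GetExtension(path).ToLowerInvariant();
        string format;
        switch (ext)
        {
            case ".xls":  format = "Excel 8.0"; break;
            case ".xlsx": format = "Excel 12.0 Xml"; break;
            case ".xlsm": format = "Excel 12.0 Macro"; break;
            case ".xlsb": format = "Excel 12.0"; break;
            default:
                throw new ArgumentException("Unsupported extension: " + ext, "path");
        }

        string hdr = hasHeaderRow ? "Yes" : "No";
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
               ";Extended Properties=\"" + format + ";HDR=" + hdr + ";IMEX=1\"";
    }
}
```

For the file used above, `ExcelConnection.Build(@"C:\Test.xls", false)` yields the same string as `sConnection` except that the format is `Excel 8.0`.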