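Optimizing reading an Excel file into a DataTable object

2017-07-27

I use the code below to read data from an Excel file into a DataTable object for further processing. Because the files contain between 100k and 500k entries, reading can be slow. Is there anything I can change in my code to speed this up? The code is below.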

public static DataTable ReadAsDataTable(string filePath)
{
    DataTable dataTable = new DataTable();
    using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(filePath, false))
    {
        WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
        IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
        string relationshipId = sheets.First().Id.Value;
        WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
        Worksheet workSheet = worksheetPart.Worksheet;
        SheetData sheetData = workSheet.GetFirstChild<SheetData>();
        IEnumerable<Row> rows = sheetData.Descendants<Row>();

        foreach (Cell cell in rows.ElementAt(0))
        {
            dataTable.Columns.Add(GetCellValue(spreadSheetDocument, cell));
        }
        foreach (Row row in rows)
        {
            DataRow dataRow = dataTable.NewRow();
            for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
            {
                dataRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));
            }
            dataTable.Rows.Add(dataRow);
        }
    }
    dataTable.Rows.RemoveAt(0);
    return dataTable;
}

private static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
    SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
    string value = cell.CellValue.InnerXml;

    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
    }
    else
    {
        return value;
    }
}

Have you considered using OleDb to fetch the whole sheet, as described at http://csharp.net-informations.com/excel/csharp-excel-oledb.htm? I think that might be faster. – Fruchtzwerg
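For reference, the OleDb approach suggested in this comment could look roughly like the sketch below. It is untested against the asker's files and rests on several assumptions: the Microsoft ACE OLE DB provider is installed with a bitness matching the process, the file is an .xlsx, and the worksheet is named Sheet1 (the class name ExcelOleDbReader and the sheetName parameter are made up for illustration). Note that the provider infers column types itself, so the resulting DataTable may not hold plain strings as the original code does.

```csharp
using System.Data;
using System.Data.OleDb;

public static class ExcelOleDbReader
{
    // sheetName is an assumption; pass the real worksheet name of your file.
    public static DataTable ReadAsDataTable(string filePath, string sheetName = "Sheet1")
    {
        // HDR=YES tells the provider to treat the first row as column headers.
        string connectionString =
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath +
            ";Extended Properties=\"Excel 12.0 Xml;HDR=YES\"";

        DataTable dataTable = new DataTable();
        using (OleDbConnection connection = new OleDbConnection(connectionString))
        using (OleDbDataAdapter adapter =
            new OleDbDataAdapter("SELECT * FROM [" + sheetName + "$]", connection))
        {
            // Fill opens and closes the connection on its own if it is not already open.
            adapter.Fill(dataTable);
        }
        return dataTable;
    }
}
```

Whether this is actually faster for 100k to 500k rows would need to be measured on the real data.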

Answer


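I don't know the performance characteristics of this API, or how the compiler treats these calls, but would it help to call row.Descendants<Cell>() only once per row? It looks like something the compiler could optimize, but the call might have side effects, so the compiler leaves it alone and it is re-evaluated on every loop iteration.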

foreach (Row row in rows)
{
    var cells = row.Descendants<Cell>().ToArray();
    DataRow dataRow = dataTable.NewRow();
    for (int i = 0; i < cells.Length; i++)
    {
        dataRow[i] = GetCellValue(spreadSheetDocument, cells[i]);
    }
    dataTable.Rows.Add(dataRow);
}
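
Taking the same idea a bit further, here is a sketch of the whole method with the repeated enumerations removed. It is not measured against the asker's data, so treat it as a starting point rather than a verified fix. It rests on two points: ElementAt on the sequence returned by Descendants has to enumerate from the start each time, and, as far as I know, indexing SharedStringTable.ChildElements also walks the child list, which gets expensive with a large shared string table. The sketch copies the shared strings into an array once and enumerates each row's cells a single time. The class name ExcelReader is made up, and returning an empty string for a cell without a value is my assumption (the original code would throw there).

```csharp
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class ExcelReader
{
    public static DataTable ReadAsDataTable(string filePath)
    {
        DataTable dataTable = new DataTable();
        using (SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, false))
        {
            WorkbookPart workbookPart = document.WorkbookPart;

            // Copy the shared strings into an array once, so each lookup is a plain array index.
            SharedStringTablePart stringTablePart = workbookPart.SharedStringTablePart;
            string[] sharedStrings = stringTablePart == null
                ? new string[0]
                : stringTablePart.SharedStringTable
                    .Elements<SharedStringItem>()
                    .Select(item => item.InnerText)
                    .ToArray();

            Sheet firstSheet = workbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().First();
            WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(firstSheet.Id.Value);
            SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

            bool isHeader = true;
            foreach (Row row in sheetData.Elements<Row>())
            {
                if (isHeader)
                {
                    // The first row supplies the column names and is not added as data.
                    foreach (Cell cell in row.Elements<Cell>())
                    {
                        dataTable.Columns.Add(GetCellValue(sharedStrings, cell));
                    }
                    isHeader = false;
                    continue;
                }

                // Enumerate the cells of this row exactly once.
                DataRow dataRow = dataTable.NewRow();
                int i = 0;
                foreach (Cell cell in row.Elements<Cell>())
                {
                    dataRow[i++] = GetCellValue(sharedStrings, cell);
                }
                dataTable.Rows.Add(dataRow);
            }
        }
        return dataTable;
    }

    private static string GetCellValue(string[] sharedStrings, Cell cell)
    {
        // Assumption: treat a cell without a value as an empty string.
        if (cell.CellValue == null)
        {
            return string.Empty;
        }

        string value = cell.CellValue.InnerText;
        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
        {
            return sharedStrings[int.Parse(value)];
        }
        return value;
    }
}
```

Like the original, this assumes every row has no more cells than the header row and that no cells are missing in the middle of a row. If memory rather than CPU time turns out to be the limit, the SDK's OpenXmlReader can stream the sheet instead of loading the whole DOM.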