2017-06-20 95 views
1

I want to "clean" a CSV file: remove empty rows and columns from a CSV file - C#

  • Remove empty rows
  • Remove empty columns

The rows or columns are not completely empty. They contain, for example:

    "","","","","","","","","","","","","","",

(in row form), or

    "",
    "",
    "",
    "",
    "",
    "",
    "",

(in column form).

These rows or columns can be located anywhere in the CSV file.

What I have so far:

private void button1_Click(object sender, EventArgs e) 
     { 

      string sourceFile = @"XXXXX.xlsx"; 
      string worksheetName = "Sample"; 
      string targetFile = @"C:\Users\xxxx\xls_test\XXXX.csv"; 

      // Creates the CSV file based on the XLS file 
      ExcelToCSVCoversion(sourceFile, worksheetName, targetFile); 

      // Manipulate the CSV: Clean empty rows 
      DeleteEmptyRoadFromCSV(targetFile); 
     } 

     static void ExcelToCSVCoversion(string sourceFile, string worksheetName, 
      string targetFile) 
     { 
      string connectionString = @"Provider =Microsoft.ACE.OLEDB.12.0;Data Source=" + sourceFile 
       + @";Extended Properties=""Excel 12.0 Xml;HDR=YES"""; 
      OleDbConnection connection = null; 
      StreamWriter writer = null; 
      OleDbCommand command = null; 
      OleDbDataAdapter dataAdapter = null; 

      try 
      { 
       // Represents an open connection to a data source. 
       connection = new OleDbConnection(connectionString); 
       connection.Open(); 

       // Represents a SQL statement or stored procedure to execute 
       // against a data source. 
       command = new OleDbCommand("SELECT * FROM [" + worksheetName + "$]", 
              connection); 
       // Specifies how a command string is interpreted. 
       command.CommandType = CommandType.Text; 
       // Implements a TextWriter for writing characters to the output stream 
       // in a particular encoding. 
       writer = new StreamWriter(targetFile); 
       // Represents a set of data commands and a database connection that are 
       // used to fill the DataSet and update the data source. 
       dataAdapter = new OleDbDataAdapter(command); 

       DataTable dataTable = new DataTable(); 
       dataAdapter.Fill(dataTable); 

       for (int row = 0; row < dataTable.Rows.Count; row++) 
       { 
        string rowString = ""; 
        for (int column = 0; column < dataTable.Columns.Count; column++) 
        { 
         rowString += "\"" + dataTable.Rows[row][column].ToString() + "\","; 
        } 
        writer.WriteLine(rowString); 
       } 

       Console.WriteLine(); 
       Console.WriteLine("The excel file " + sourceFile + " has been converted " + 
            "into " + targetFile + " (CSV format)."); 
       Console.WriteLine(); 
      } 
      catch (Exception exception) 
      { 
       Console.WriteLine(exception.ToString()); 
       Console.ReadLine(); 
      } 
      finally 
      { 
       if (connection.State == ConnectionState.Open) 
       { 
        connection.Close(); 
       } 
       connection.Dispose(); 
       command.Dispose(); 
       dataAdapter.Dispose(); 
       writer.Close(); 
       writer.Dispose(); 
      } 
     } 

     static void DeleteEmptyRoadFromCSV(string fileName) 
     { 
      //string nonEmptyLines = @"XXXX.csv"; 
      var nonEmptyLines = File.ReadAllLines(fileName) 
         .Where(x => !x.Split(',') 
            .Take(2) 
            .Any(cell => string.IsNullOrWhiteSpace(cell)) 
         // use `All` if you want to ignore only if both columns are empty. 
         ).ToList(); 

     File.WriteAllLines(fileName, nonEmptyLines); 
     } 

Finally, I tried to use the idea from: Remove Blank rows from csv c#. But my output does not change at all.

Any help is welcome!

Thanks.

+2

Why reinvent the wheel? This is a lot of work when you could use a text file parser, and a parser would also be more robust.
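
As a rough illustration of the parser suggestion, the sketch below uses `TextFieldParser` from `Microsoft.VisualBasic.FileIO`. It assumes that assembly is available to the project (on .NET Framework this means adding a reference to Microsoft.VisualBasic). The names `CsvReaderSketch` and `ReadNonEmptyRows` are hypothetical and do not come from the thread.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

static class CsvReaderSketch
{
    // Reads a comma-delimited file and returns only the rows that contain
    // at least one non-blank field. The parser unwraps quoted fields, so an
    // empty quoted cell ("") arrives here as an empty string.
    public static List<string[]> ReadNonEmptyRows(string path)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(path))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (fields != null && fields.Any(f => !string.IsNullOrWhiteSpace(f)))
                {
                    rows.Add(fields);
                }
            }
        }
        return rows;
    }
}
```

Because the parser handles the quoting, fields that contain commas are not split incorrectly, which is the main robustness gain over `string.Split(',')`.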

+2

Also, 'File.ReadAllLines' can be dangerous unless you are sure you are dealing with small files. – GibralterTop
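
One way to address the memory concern is to stream the file with `File.ReadLines`, which enumerates lines lazily, and write the surviving lines to a temporary file. This is a minimal sketch under assumptions: the names `CsvStreamingSketch` and `RemoveEmptyRows` and the `.tmp` suffix are hypothetical, and the naive split assumes no field contains a comma.

```csharp
using System.IO;
using System.Linq;

static class CsvStreamingSketch
{
    // Copies rows that contain data to a temporary file, one line at a time,
    // then replaces the original file with the temporary one.
    public static void RemoveEmptyRows(string path)
    {
        string tempPath = path + ".tmp"; // hypothetical temp-file name

        using (var writer = new StreamWriter(tempPath))
        {
            foreach (string line in File.ReadLines(path))
            {
                // A cell counts as empty if nothing but whitespace remains
                // after stripping the surrounding double quotes.
                bool hasData = line.Split(',')
                    .Any(cell => !string.IsNullOrWhiteSpace(cell.Trim().Trim('"')));
                if (hasData)
                {
                    writer.WriteLine(line);
                }
            }
        }

        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

Writing to a separate file is needed here because the source file is still open for reading while the loop runs.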

+0

Linq might help here; you can skip the empty rows/columns. It might also help clean up your code.
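
The LINQ idea can be sketched on rows that have already been parsed into unquoted fields. The names `CsvCleanSketch`, `IsBlank`, and `RemoveEmptyRowsAndColumns` are hypothetical, and how the rows are parsed is left open.

```csharp
using System.Collections.Generic;
using System.Linq;

static class CsvCleanSketch
{
    static bool IsBlank(string cell) => string.IsNullOrWhiteSpace(cell);

    // Takes parsed rows (one string[] of unquoted fields per row) and returns
    // a copy without all-blank rows and without all-blank columns.
    public static List<string[]> RemoveEmptyRowsAndColumns(IEnumerable<string[]> rows)
    {
        // Keep rows that have at least one non-blank cell.
        List<string[]> kept = rows.Where(r => r.Any(c => !IsBlank(c))).ToList();
        if (kept.Count == 0) return kept;

        int width = kept.Max(r => r.Length);

        // Indexes of the columns that hold data in at least one kept row.
        int[] keptColumns = Enumerable.Range(0, width)
            .Where(i => kept.Any(r => i < r.Length && !IsBlank(r[i])))
            .ToArray();

        // Project every kept row onto the kept columns, padding short rows.
        return kept
            .Select(r => keptColumns.Select(i => i < r.Length ? r[i] : "").ToArray())
            .ToList();
    }
}
```

For example, the rows `{"a","","b"}`, `{"","",""}`, `{"c","",""}` come back as `{"a","b"}` and `{"c",""}`: the second row and the middle column are dropped.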

Answers

1

You can remove the columns/rows from the table before saving the CSV. The method is untested, but you should get the idea.

static void ExcelToCSVCoversion(string sourceFile, string worksheetName, 
     string targetFile) 
    { 
     string connectionString = @"Provider =Microsoft.ACE.OLEDB.12.0;Data Source=" + sourceFile 
      + @";Extended Properties=""Excel 12.0 Xml;HDR=YES"""; 
     OleDbConnection connection = null; 
     StreamWriter writer = null; 
     OleDbCommand command = null; 
     OleDbDataAdapter dataAdapter = null; 

     try 
     { 
      // Represents an open connection to a data source. 
      connection = new OleDbConnection(connectionString); 
      connection.Open(); 

      // Represents a SQL statement or stored procedure to execute 
      // against a data source. 
      command = new OleDbCommand("SELECT * FROM [" + worksheetName + "$]", 
             connection); 
      // Specifies how a command string is interpreted. 
      command.CommandType = CommandType.Text; 
      // Implements a TextWriter for writing characters to the output stream 
      // in a particular encoding. 
      writer = new StreamWriter(targetFile); 
      // Represents a set of data commands and a database connection that are 
      // used to fill the DataSet and update the data source. 
      dataAdapter = new OleDbDataAdapter(command); 

      DataTable dataTable = new DataTable(); 
      dataAdapter.Fill(dataTable); 
      var emptyRows = 
       dataTable.Select() 
        .Where(
         row => 
          dataTable.Columns.Cast<DataColumn>() 
           .All(column => string.IsNullOrEmpty(row[column].ToString()))).ToArray(); 
      Array.ForEach(emptyRows, x => x.Delete()); 

      var emptyColumns = 
       dataTable.Columns.Cast<DataColumn>() 
        .Where(column => dataTable.Select().All(row => string.IsNullOrEmpty(row[column].ToString()))) 
        .ToArray(); 
      Array.ForEach(emptyColumns, column => dataTable.Columns.Remove(column)); 
      dataTable.AcceptChanges(); 

      for (int row = 0; row < dataTable.Rows.Count; row++) 
      { 
       string rowString = ""; 
       for (int column = 0; column < dataTable.Columns.Count; column++) 
       { 
        rowString += "\"" + dataTable.Rows[row][column].ToString() + "\","; 
       } 
       writer.WriteLine(rowString); 
      } 

      Console.WriteLine(); 
      Console.WriteLine("The excel file " + sourceFile + " has been converted " + 
           "into " + targetFile + " (CSV format)."); 
      Console.WriteLine(); 
     } 
     catch (Exception exception) 
     { 
      Console.WriteLine(exception.ToString()); 
      Console.ReadLine(); 
     } 
     finally 
     { 
      if (connection.State == ConnectionState.Open) 
      { 
       connection.Close(); 
      } 
      connection.Dispose(); 
      command.Dispose(); 
      dataAdapter.Dispose(); 
      writer.Close(); 
      writer.Dispose(); 
     } 
    } 
+0

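Hey, this sounds good, but I can't get it to run. Regarding these parts: emptyRows.ForEach(x => x.Delete()); and emptyColumns.ForEach(column => dataTable.Columns.Remove(column)); I keep getting the error: "There is no argument given that corresponds to the required formal parameter 'action' of 'Array.ForEach<T>(T[], Action<T>)'" -> Exceptions: ArgumentNullException. I tried this, without success: foreach(var etRows in emptyRows) { (x => x.Delete()); }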

+0

I have updated the ForEach statements. Please check now.
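
For context on the compile error quoted above: arrays have no instance `ForEach` method, so the call has to go through the static `Array.ForEach(array, action)`, or through an ordinary `foreach` loop whose body calls the method directly. A lambda on its own inside a loop body is not a statement. The sketch below shows both forms; the names `ForEachSketch` and `DeleteEmptyRows` are hypothetical.

```csharp
using System;
using System.Data;
using System.Linq;

static class ForEachSketch
{
    public static void DeleteEmptyRows(DataTable table)
    {
        // Materialize the matches first so rows are not deleted while the
        // table is still being enumerated.
        DataRow[] emptyRows = table.Select()
            .Where(row => row.ItemArray.All(v => string.IsNullOrEmpty(Convert.ToString(v))))
            .ToArray();

        // Option 1: the static helper takes the array and the action.
        // Array.ForEach(emptyRows, row => row.Delete());

        // Option 2: a plain foreach loop does the same thing.
        foreach (DataRow row in emptyRows)
        {
            row.Delete();
        }

        // Commit the deletions so the rows disappear from table.Rows.
        table.AcceptChanges();
    }
}
```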

+0

It's working! I thought that working with the XLS instead of the CSV would be more complicated, but with your solution it works! I really appreciate it!

0

Please check whether the query below is working. I get all the rows with it:

var nonEmptyLines = File.ReadAllLines(FileName) 
         .Where(x => !x.Split(',') 
            .Take(2) 
            .Any(cell => string.IsNullOrWhiteSpace(cell)) 
         // use `All` if you want to ignore only if both columns are empty. 
         ).ToList(); 

I think you could use something like:

var nonEmptyLines = File.ReadAllLines(FileName)
         // Note: SkipWhile only drops matching lines at the start of the file.
         .SkipWhile(line => string.IsNullOrWhiteSpace(line));
+0

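Hey, I tried them; they execute fine, but the output (CSV) stays the same. You can see it in the debugger: https://drive.google.com/drive/folders/0B98UpTa2n4XbeHZtOHU2cVotUms?usp=sharing (sorry, but image attachments have never worked for me here)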
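
A likely reason the output stays the same, inferred from the code in the question rather than confirmed in the thread: the converter writes every cell wrapped in double quotes, so an empty cell appears in the file as `""`. That is a two-character string, and `string.IsNullOrWhiteSpace` returns false for it, so no line is ever filtered out. A minimal sketch that strips the quotes before testing follows; the names `QuotedCsvSketch`, `IsEmptyRow`, and `DeleteEmptyRows` are hypothetical, and the naive split assumes no field contains a comma or an embedded quote.

```csharp
using System.IO;
using System.Linq;

static class QuotedCsvSketch
{
    // True when every cell in the line is blank once the surrounding
    // double quotes are stripped.
    public static bool IsEmptyRow(string line)
    {
        return line.Split(',')
            .All(cell => string.IsNullOrWhiteSpace(cell.Trim().Trim('"')));
    }

    // Rewrites the file in place, keeping only rows that contain data.
    public static void DeleteEmptyRows(string fileName)
    {
        var nonEmptyLines = File.ReadAllLines(fileName)
            .Where(line => !IsEmptyRow(line))
            .ToList();

        File.WriteAllLines(fileName, nonEmptyLines);
    }
}
```

`All` is used instead of `Take(2)` with `Any`, so a row is dropped only when every cell is empty, including the empty token produced by the trailing comma.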