2016-09-14 153 views
0

I am using this library: CSV Reader, but the problem is that the .csv file is malformed: it has a different number of fields in each row.

Example:

,UDEQPT,,PROMIS,,,,,,,,,,,,,,,,,,,,,,,,,10:20:15,27-Dec-2015, 
,UDEQPT,,DELAY,,,,,,,,,am24134_1_drift,am24134.1_drift,229,19,,,3176.00,164.78,,,,,,5, 1.00,1,06:16:16,15-Jun-2016,,,,,,, 
,UDEQPT,,DELAY,,,,,,,,,am24134_1_drift,am24134.1_drift,345,25,,,131.68,216.71,,,,,,6, 1.00,1,06:28:23,15-Jun-2016,,,,,,, 
,UDEQPT,,DELAY,,,,,,,,,am24134_1_drift,am24134.1_drift,346,25,,,170.18,210.93,,,,,,7, 1.00,1,06:31:18,15-Jun-2016,,,,,,, 
,UDEQPT,,DELAY,,,,,,,,,am24134_1_drift,am24134.1_drift,376,27,,,295.83,212.99,,,,,,8, 1.00,1,06:38:47,15-Jun-2016,,,,,,, 
,UDEQPT,,ENDLOT,,,,def,def,def,def,,am24134_1_drift,am24134.1_drift,385,27,,,1214.13,213.82, 3.48, 3.11, 1.64, 25.96,1,8, 1.00,1,06:59:46,15-Jun-2016,,4395.91,1465945186,,def,0,1,385, 3.48,357,385, 92.9,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, 

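The number of columns is 54, so if a row contains fewer fields than that fixed number of columns, the reader raises an error. In the example above, the first row only goes up to index 30. How do you handle this properly?

Here is my code: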

using (var path = File.OpenRead(e.FullPath)) 
      { 
       using (var csv = new CachedCsvReader(new StreamReader(path), false)) 
       { 
        csv.Columns = new List<Column> 
        { 
         new Column { Name = "Delay_Code", Type = typeof(string) }, 
         new Column { Name = "PROMIS_Code", Type = typeof(string) }, 
         new Column { Name = "Tester_Mode", Type = typeof(string) }, 
         new Column { Name = "Event_Name", Type = typeof(string) }, 
         new Column { Name = "Test_Program", Type = typeof(string) }, 
         new Column { Name = "Temperature", Type = typeof(int?) }, 
         new Column { Name = "Lot_Size", Type = typeof(int?) }, 
         new Column { Name = "Part_Name", Type = typeof(string) }, 
         new Column { Name = "Procedure_Name", Type = typeof(string) }, 
         new Column { Name = "Handler_Id", Type = typeof(string) }, 
         new Column { Name = "Perf_Board", Type = typeof(string) }, 
         new Column { Name = "Sys_Part_Type", Type = typeof(string) }, 
         new Column { Name = "Lot_Id", Type = typeof(string) }, 
         new Column { Name = "Stage", Type = typeof(string) }, 
         new Column { Name = "Parts_Tested", Type = typeof(int?) }, 
         new Column { Name = "Parts_Failed", Type = typeof(int?) }, 
         new Column { Name = "Reprobes", Type = typeof(int?) }, 
         new Column { Name = "Successful_Reprobes", Type = typeof(int?) }, 
         new Column { Name = "Delay_Time", Type = typeof(float?) }, 
         new Column { Name = "UPH", Type = typeof(float?) }, 
         new Column { Name = "Test_Time_Pass", Type = typeof(float?) }, 
         new Column { Name = "Test_Time_Fail", Type = typeof(float?) }, 
         new Column { Name = "Avg_Index_Time", Type = typeof(float?) }, 
         new Column { Name = "Delays_30Sec_Avg", Type = typeof(float?) }, 
         new Column { Name = "Delays_30Sec_Count", Type = typeof(int?) }, 
         new Column { Name = "Delays_Count", Type = typeof(int?) }, 
         new Column { Name = "Avg_Num_Sites", Type = typeof(float?) }, 
         new Column { Name = "Active_Sites", Type = typeof(float?) }, 
         new Column { Name = "Hour_Min_Sec", Type = typeof(string) }, 
         new Column { Name = "Day_Month_Year", Type = typeof(string) }, 
         new Column { Name = "User_Name", Type = typeof(string) }, 
         new Column { Name = "Delays_Total_Duration", Type = typeof(float?) }, 
         new Column { Name = "Duration_Since_Last_End_Lot", Type = typeof(float?) }, 
         new Column { Name = "Start_Lot_Time_Data_Entry", Type = typeof(float?) }, 
         new Column { Name = "Employee_Id", Type = typeof(string) }, 
         new Column { Name = "Valid_Flag", Type = typeof(int?) }, 
         new Column { Name = "Sample_Rate", Type = typeof(int?) }, 
         new Column { Name = "Handler_Cycles", Type = typeof(int?) }, 
         new Column { Name = "Site_1_Only_Pass_Only_Avg_Test_Time", Type = typeof(float?) }, 
         new Column { Name = "Site_1_Only_Pass_Only_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_1_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_1_Yield", Type = typeof(float?) }, 
         new Column { Name = "Site_2_Only_Pass_Only_Avg_Test_Time", Type = typeof(float?) }, 
         new Column { Name = "Site_2_Only_Pass_Only_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_2_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_2_Yield", Type = typeof(float?) }, 
         new Column { Name = "Site_3_Only_Pass_Only_Avg_Test_Time", Type = typeof(float?) }, 
         new Column { Name = "Site_3_Only_Pass_Only_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_3_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_3_Yield", Type = typeof(float?) }, 
         new Column { Name = "Site_4_Only_Pass_Only_Avg_Test_Time", Type = typeof(float?) }, 
         new Column { Name = "Site_4_Only_Pass_Only_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_4_Count", Type = typeof(int?) }, 
         new Column { Name = "Site_4_Yield", Type = typeof(float?) }, // float? to match Site_1..3_Yield (sample value: 92.9)
        }; 

        csv.MissingFieldAction = MissingFieldAction.ReplaceByNull; 
        csv.SkipEmptyLines = false; 
        csv.DefaultParseErrorAction = ParseErrorAction.RaiseEvent; 
        csv.ParseError += Csv_ParseError; 

        while (csv.ReadNextRecord()) 
        { 
         // Print every one of the 54 configured columns; empty or null fields show as MISSING.
         for (int i = 0; i < 54; i++) 
          Console.Write("{0}. {1} |", i, string.IsNullOrEmpty(csv[i]) ? "MISSING" : csv[i]); 
         Console.WriteLine(); 
        } 
       } 
      } 

Handling missing fields:

private static void Csv_ParseError(object sender, ParseErrorEventArgs e) 
     { 
      if (e.Error is MissingFieldCsvException) 
      { 
       e.Action = ParseErrorAction.AdvanceToNextLine; 
      } 
     } 
+0

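How to handle this is a matter of business logic, which means it depends on the case. Some people would ignore the whole line; others might reject the whole file. Perhaps you can tell us how you want to handle it, and we can see what help we can offer – Prisoner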

+0

If you want to have a custom file format like this, you need to read and parse it yourself. –

+0

So what is the actual problem with your current approach? – grek40

Answers

0

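In the end, I did not use any CSV library. I simply followed this: Variable Column CSV file processing C#, and it works like a charm. I also created a DataTable and then used SqlBulkCopy to write it to the server.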
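
For illustration, a minimal sketch of what such library-free parsing might look like. It is not the code from the linked post: the names `VariableColumnCsv`, `NormalizeLine`, and `LoadTable` are hypothetical, all columns are loaded as strings, and it assumes fields never contain quoted commas (which holds for the sample rows in the question). Short rows are padded with nulls and extra trailing fields are dropped, which is one possible policy, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

public static class VariableColumnCsv
{
    // Splits one line on commas and pads it with nulls (or truncates it)
    // so the result always has exactly expectedColumns entries.
    // Assumption: no quoted fields, so a plain Split(',') is sufficient.
    public static string[] NormalizeLine(string line, int expectedColumns)
    {
        string[] raw = line.Split(',');
        var fields = new string[expectedColumns];
        for (int i = 0; i < expectedColumns; i++)
        {
            // Fields past the end of a short row are treated as missing.
            string value = i < raw.Length ? raw[i].Trim() : null;
            fields[i] = string.IsNullOrEmpty(value) ? null : value;
        }
        return fields;
    }

    // Reads every non-blank line into a DataTable with one string column
    // per name; missing or empty fields are stored as DBNull.
    public static DataTable LoadTable(TextReader reader, IList<string> columnNames)
    {
        var table = new DataTable();
        foreach (string name in columnNames)
            table.Columns.Add(name, typeof(string));

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0) continue; // skip blank lines

            string[] fields = NormalizeLine(line, columnNames.Count);
            DataRow row = table.NewRow();
            for (int i = 0; i < fields.Length; i++)
                row[i] = (object)fields[i] ?? DBNull.Value;
            table.Rows.Add(row);
        }
        return table;
    }
}
```

The resulting DataTable could then be passed to `SqlBulkCopy.WriteToServer`, as described above; typed columns (int, float) would need an explicit conversion step that this sketch leaves out.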

0

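You should wrap your for loop in an if (csv.count == 54) check to detect whether the row is valid before entering the loop. After that, you can flag errors for individual fields, such as Delay_Code, each with its own dedicated if. It all depends on the logic you want.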
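
A rough sketch of that kind of per-row check, written against the raw line rather than the CSV Reader API, since it is not confirmed here that the library exposes a per-row field count. `RowShape`, `RowValidator`, `CountFields`, and `Classify` are hypothetical names, and the code assumes fields contain no quoted commas.

```csharp
using System;

public enum RowShape { TooShort, Exact, TooLong }

public static class RowValidator
{
    // Counts comma-separated fields in a raw line. Empty fields count too,
    // because Split keeps empty entries by default.
    public static int CountFields(string line)
    {
        return line.Split(',').Length;
    }

    // Compares a row's field count with the expected column count so the
    // caller can decide whether to pad, skip, or reject the row.
    public static RowShape Classify(string line, int expectedColumns)
    {
        int count = CountFields(line);
        if (count < expectedColumns) return RowShape.TooShort;
        if (count > expectedColumns) return RowShape.TooLong;
        return RowShape.Exact;
    }
}
```

A caller would read each line, call `RowValidator.Classify(line, 54)`, and only run the per-field checks when the result is `RowShape.Exact`; what to do with the other two cases is the business decision raised in the comments on the question.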

+0

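I wish there were a way to get the number of fields in the row the reader is currently reading. For example, after finishing the first row, there would be a way to count the fields in the next row. –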

+0

I think you can use csv.Count and csv.FieldCount, as in this example: https://social.msdn.microsoft.com/Forums/windows/en-US/b29d3f03-06c1-48cf-a011-9ef66ba386e6/parsed-csv-into-datagridview-but-i-no-idea-how-to-save-into-mysql?forum=winformsdatacontrols –

+0

That doesn't work, because csv.Count returns 1. –