2017-10-11 154 views
1

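Compare two CSV files in C#

I want to compare two CSV files and print the differences to a file. I currently use the code below to delete a row. Can I change this code so that it compares two CSV files, or is there a better way to compare CSV files in C#?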

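    // Rewrites the file at `path`, dropping every row whose selected column equals `value`.
    List<string> lines = new List<string>();
    using (StreamReader reader = new StreamReader(System.IO.File.OpenRead(path)))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Lines without a separator are skipped, so they are also dropped from the output.
            if (line.Contains(csvseperator))
            {
                // scheidingsteken (Dutch for "separator") is presumably the same character as csvseperator.
                string[] split = line.Split(Convert.ToChar(scheidingsteken));

                // Keep only the rows whose selected column differs from the value to delete.
                if (split[selectedRow] != value)
                {
                    line = string.Join(csvseperator, split);
                    lines.Add(line);
                }
            }
        }
    }

    // Overwrite the original file with the rows that were kept.
    using (StreamWriter writer = new StreamWriter(path, false))
    {
        foreach (string line in lines)
            writer.WriteLine(line);
    }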
+3

If you want to find *added*, *deleted*, and *changed* lines, have a look at *edit distance*: https://en.wikipedia.org/wiki/Edit_distance

+0

I can't use that. – Mylan

+2

Why so sad? Why can't you use it? The simplest edit distance (the *Levenshtein* one) is easy to implement: https://en.wikipedia.org/wiki/Levenshtein_distance
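
To illustrate the comment above, here is a minimal sketch of the Levenshtein distance between two strings in C#. The class and method names (`EditDistance`, `Levenshtein`) are made up for this example and are not part of any library. It uses the standard two-row dynamic-programming formulation; applying the same idea to whole lines instead of characters would be one way to detect added, deleted, and changed rows.

```csharp
using System;

public static class EditDistance
{
    // Hypothetical helper: number of single-character insertions, deletions,
    // and substitutions needed to turn `a` into `b`.
    public static int Levenshtein(string a, string b)
    {
        // previous[j] = distance between the first (i - 1) chars of a and the first j chars of b.
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];

        // Turning an empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.Length; j++)
            previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            // Turning the first i chars of a into an empty string takes i deletions.
            current[0] = i;

            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1,   // insertion
                             previous[j] + 1),     // deletion
                    previous[j - 1] + cost);       // substitution (or match)
            }

            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = previous;
            previous = current;
            current = tmp;
        }

        return previous[b.Length];
    }
}
```

Calling `EditDistance.Levenshtein("kitten", "sitting")` returns 3, the usual textbook example.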

Answers

0

If you only want to compare a single column, you can use this code:

    List<string> lines = new List<string>();
    List<string> lines2 = new List<string>();

    try
    {
        // `using` closes both readers even if an exception is thrown.
        using (StreamReader reader = new StreamReader(System.IO.File.OpenRead(pad)))
        using (StreamReader read = new StreamReader(System.IO.File.OpenRead(pad2)))
        {
            string line;
            string line2;

            // Zero-based indexes of the cells you want to compare in each file
            int comp1 = 1;
            int comp2 = 1;

            // The loop stops as soon as either file runs out of lines.
            while ((line = reader.ReadLine()) != null && (line2 = read.ReadLine()) != null)
            {
                // Only split lines that actually contain the separator.
                if (line.Contains(seperator) && line2.Contains(seperator))
                {
                    string[] split = line.Split(Convert.ToChar(seperator));
                    string[] split2 = line2.Split(Convert.ToChar(seperator));

                    if (split[comp1] != split2[comp2])
                    {
                        // It is not the same
                    }
                    else
                    {
                        // It is the same
                    }
                }
            }
        }
    }
    catch
    {
        // Errors are silently swallowed here; log or rethrow them in real code.
    }
+0

Thank you very much, this works perfectly :) – Mylan

+0

This only checks column 2 of each row, and it ignores the extra rows if one CSV contains more lines than the other.

+0

How can I fix that? – Mylan
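
One possible way to address both limitations raised in the comments above, sketched under a few assumptions: the files are small enough to load into memory, rows are matched by line number, and fields never contain quoted separators. `CsvLineDiff` and `Compare` are names invented for this example, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvLineDiff
{
    // Hypothetical helper: compares two CSV files line by line and column by column,
    // returning one human-readable message per difference.
    public static List<string> Compare(string pathA, string pathB, char separator)
    {
        var differences = new List<string>();
        string[] a = File.ReadAllLines(pathA);
        string[] b = File.ReadAllLines(pathB);

        // Iterate up to the length of the longer file so extra rows are not ignored.
        int maxLines = Math.Max(a.Length, b.Length);
        for (int i = 0; i < maxLines; i++)
        {
            if (i >= a.Length)
            {
                differences.Add($"line {i + 1}: only in second file: {b[i]}");
                continue;
            }
            if (i >= b.Length)
            {
                differences.Add($"line {i + 1}: only in first file: {a[i]}");
                continue;
            }

            string[] colsA = a[i].Split(separator);
            string[] colsB = b[i].Split(separator);

            // Compare every column, not just one; a missing cell counts as a difference.
            int maxCols = Math.Max(colsA.Length, colsB.Length);
            for (int c = 0; c < maxCols; c++)
            {
                string left = c < colsA.Length ? colsA[c] : "<missing>";
                string right = c < colsB.Length ? colsB[c] : "<missing>";
                if (left != right)
                    differences.Add($"line {i + 1}, column {c + 1}: '{left}' != '{right}'");
            }
        }

        return differences;
    }
}
```

The returned messages can be written to a file with `File.WriteAllLines`. Note that a single inserted row shifts every following line, so this positional comparison reports all of them as changed; an edit-distance or key-based diff avoids that.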

0

Here is another way to find the differences between CSV files, using Cinchoo ETL, an open-source library.

Given the following sample CSV files:

sample1.csv

id,name 
1,Tom 
2,Mark 
3,Angie 

sample2.csv

id,name 
1,Tom 
2,Mark 
4,Lu 

Using Cinchoo ETL, the code below shows how to find the rows that differ, comparing all columns:

var input1 = new ChoCSVReader("sample1.csv").WithFirstLineHeader(); 
var input2 = new ChoCSVReader("sample2.csv").WithFirstLineHeader(); 

using (var output = new ChoCSVWriter("sampleDiff.csv").WithFirstLineHeader()) 
{ 
    output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(), ChoDynamicObjectEqualityComparer.Default)); 
    output.Write(input2.OfType<ChoDynamicObject>().Except(input1.OfType<ChoDynamicObject>(), ChoDynamicObjectEqualityComparer.Default)); 
} 

The differing rows end up in sampleDiff.csv:

id,name 
3,Angie 
4,Lu 
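
For comparison, a similar whole-row difference can be sketched without any third-party library, using only `File.ReadAllLines` and LINQ's `Except`. This is not how Cinchoo ETL works internally; it is an assumption-laden sketch that treats each line as an opaque string, assumes both files share the same header line, and trims surrounding whitespace. `CsvSetDiff` and `WriteDiff` are names made up for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvSetDiff
{
    // Hypothetical helper: writes the rows that appear in only one of the two files.
    public static void WriteDiff(string pathA, string pathB, string outputPath)
    {
        string[] a = File.ReadAllLines(pathA);
        string[] b = File.ReadAllLines(pathB);

        // Assumption: the first line of each file is the same header.
        string header = a.Length > 0 ? a[0].Trim() : string.Empty;

        // Skip the header, trim stray whitespace, and drop blank lines.
        List<string> rowsA = a.Skip(1).Select(l => l.Trim()).Where(l => l.Length > 0).ToList();
        List<string> rowsB = b.Skip(1).Select(l => l.Trim()).Where(l => l.Length > 0).ToList();

        // Except has set semantics: duplicates within one file are reported once.
        IEnumerable<string> onlyInA = rowsA.Except(rowsB);
        IEnumerable<string> onlyInB = rowsB.Except(rowsA);

        File.WriteAllLines(outputPath, new[] { header }.Concat(onlyInA).Concat(onlyInB));
    }
}
```

Because rows are compared as sets, row order and line position do not matter, which is the main difference from a line-by-line comparison.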

If you want to do the diff by the 'id' column only:

var input1 = new ChoCSVReader("sample1.csv").WithFirstLineHeader(); 
var input2 = new ChoCSVReader("sample2.csv").WithFirstLineHeader(); 

using (var output = new ChoCSVWriter("sampleDiff.csv").WithFirstLineHeader()) 
{ 
    output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(), new ChoDynamicObjectEqualityComparer(new string[] { "id" }))); 
    output.Write(input2.OfType<ChoDynamicObject>().Except(input1.OfType<ChoDynamicObject>(), new ChoDynamicObjectEqualityComparer(new string[] { "id" }))); 
} 
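
The key-based idea can also be sketched with the base class library alone: collect the key values of each file into a `HashSet<string>` and keep the rows whose key is missing from the other file. `CsvKeyDiff`, `DiffByColumn`, and the `keyIndex` parameter are hypothetical names for this illustration; the sketch assumes header rows have already been removed and that fields contain no quoted separators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvKeyDiff
{
    // Hypothetical helper: returns the rows whose key column value
    // occurs in only one of the two inputs.
    public static List<string> DiffByColumn(
        IEnumerable<string> rowsA, IEnumerable<string> rowsB, int keyIndex, char separator)
    {
        // Extracts the trimmed key cell from a row.
        Func<string, string> key = row => row.Split(separator)[keyIndex].Trim();

        List<string> listA = rowsA.ToList();
        List<string> listB = rowsB.ToList();

        var keysA = new HashSet<string>(listA.Select(key));
        var keysB = new HashSet<string>(listB.Select(key));

        // Rows of A with a key unknown to B, followed by rows of B with a key unknown to A.
        List<string> result = listA.Where(r => !keysB.Contains(key(r))).ToList();
        result.AddRange(listB.Where(r => !keysA.Contains(key(r))));
        return result;
    }
}
```

With this approach, two rows that share an id but differ in other columns (say `1,Tom` and `1,Thomas`) are treated as the same record and are not reported. Unlike `Except`, rows with a repeated key inside one file are all kept.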

Hope this helps.