2011-04-11 23 views

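Showing differences between 2+ plain-text strings

I'm looking for a good way to show the differences between two or more texts (side by side). I don't need to be able to create patches or anything like that. Showing the differences line by line is enough.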

Is there an existing open-source C# library that does this? If not, is there a diff algorithm that works with more than two strings?

Answers

Here are two implementations of the Levenshtein distance algorithm in C#:

Link 1 Link 2

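The larger the result, the greater the difference between the two strings.

Edit: I've copied the code here for future reference in case the links go dead.

Example 1: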

using System;

/// <summary>
/// Contains approximate string matching
/// </summary>
static class LevenshteinDistance
{
    /// <summary>
    /// Compute the distance between two strings.
    /// </summary>
    public static int Compute(string s, string t)
    {
        int n = s.Length;
        int m = t.Length;
        int[,] d = new int[n + 1, m + 1];

        // Step 1: if either string is empty, the distance is the other's length
        if (n == 0)
        {
            return m;
        }

        if (m == 0)
        {
            return n;
        }

        // Step 2: fill the first column and first row (distance from an empty prefix)
        for (int i = 0; i <= n; i++)
        {
            d[i, 0] = i;
        }

        for (int j = 0; j <= m; j++)
        {
            d[0, j] = j;
        }

        // Step 3: walk over every character of s
        for (int i = 1; i <= n; i++)
        {
            // Step 4: walk over every character of t
            for (int j = 1; j <= m; j++)
            {
                // Step 5: substitution is free when the characters match
                int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;

                // Step 6: cheapest of deletion, insertion, substitution
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }

        // Step 7: the bottom-right cell holds the distance
        return d[n, m];
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(LevenshteinDistance.Compute("aunt", "ant"));      // 1
        Console.WriteLine(LevenshteinDistance.Compute("Sam", "Samantha"));  // 5
        Console.WriteLine(LevenshteinDistance.Compute("flomax", "volmax")); // 3
    }
}

Example 2:

public class Distance {

    /// <summary>
    /// Compute Levenshtein distance
    /// </summary>
    /// <param name="s">String 1</param>
    /// <param name="t">String 2</param>
    /// <returns>Distance between the two strings.
    /// The larger the number, the bigger the difference.
    /// </returns>
    public int LD (string s, string t) {

        int n = s.Length; // length of s
        int m = t.Length; // length of t
        int[,] d = new int[n + 1, m + 1]; // matrix
        int cost; // cost of substituting one character

        // Step 1: an empty string is n (or m) insertions away from the other
        if (n == 0) return m;
        if (m == 0) return n;

        // Step 2: initialise the first column and the first row
        for (int i = 0; i <= n; i++) d[i, 0] = i;
        for (int j = 0; j <= m; j++) d[0, j] = j;

        // Step 3
        for (int i = 1; i <= n; i++) {

            // Step 4
            for (int j = 1; j <= m; j++) {

                // Step 5: compare characters directly instead of allocating substrings
                cost = (t[j - 1] == s[i - 1]) ? 0 : 1;

                // Step 6: cheapest of deletion, insertion, substitution
                d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }

        // Step 7
        return d[n, m];
    }
}
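
The distance alone is only a number, while the question asks for the differing lines. One possible way to get there, sketched below and not taken from either linked implementation, is to run the same recurrence over lines instead of characters and then walk the matrix backwards to recover which lines were kept, removed, added, or replaced. The names LineDiff and Compute and the "  ", "- ", "+ ", "~ " prefixes are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the linked implementations.
static class LineDiff
{
    // Returns one entry per aligned row:
    // "  " unchanged, "- " only in a, "+ " only in b, "~ " replaced.
    public static List<string> Compute(string[] a, string[] b)
    {
        int n = a.Length, m = b.Length;
        int[,] d = new int[n + 1, m + 1];

        for (int i = 0; i <= n; i++) d[i, 0] = i;
        for (int j = 0; j <= m; j++) d[0, j] = j;

        // Same recurrence as the examples above, with lines as the units.
        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= m; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }

        // Walk back from the bottom-right corner to recover the edits.
        var result = new List<string>();
        int x = n, y = m;
        while (x > 0 || y > 0)
        {
            if (x > 0 && y > 0 && a[x - 1] == b[y - 1] && d[x, y] == d[x - 1, y - 1])
            {
                result.Add("  " + a[x - 1]);                     // unchanged line
                x--; y--;
            }
            else if (x > 0 && y > 0 && d[x, y] == d[x - 1, y - 1] + 1)
            {
                result.Add("~ " + a[x - 1] + " -> " + b[y - 1]); // replaced line
                x--; y--;
            }
            else if (x > 0 && d[x, y] == d[x - 1, y] + 1)
            {
                result.Add("- " + a[x - 1]);                     // line only in a
                x--;
            }
            else
            {
                result.Add("+ " + b[y - 1]);                     // line only in b
                y--;
            }
        }

        result.Reverse(); // entries were collected from the end backwards
        return result;
    }
}
```

Splitting each text with text.Split('\n') and passing the two arrays to LineDiff.Compute gives a list that can be printed directly or laid out in two columns. The matrix needs (n + 1) * (m + 1) cells, so this is only practical for moderately sized inputs.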

This only works for two strings, though, doesn't it? – climbage 2011-04-11 18:13:31

Yes, but it can easily be modified. What do you want to compare? Sentences? You can still use these implementations to compare sentences. – kd7 2011-04-11 18:16:16

Yes, which is why the method's signature requires you to pass it two strings. – 2011-04-11 18:16:46
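
As the comments point out, the method takes exactly two strings. One simple way to cover more than two, sketched here as an assumption rather than as what the commenters had in mind, is to compare every pair and collect the results in a matrix. MultiDistance, Levenshtein and Pairwise are hypothetical names.

```csharp
using System;

// Hypothetical helper: compares every pair among several strings.
static class MultiDistance
{
    // Same dynamic-programming recurrence as in the answer's examples.
    public static int Levenshtein(string s, string t)
    {
        int n = s.Length, m = t.Length;
        int[,] d = new int[n + 1, m + 1];

        for (int i = 0; i <= n; i++) d[i, 0] = i;
        for (int j = 0; j <= m; j++) d[0, j] = j;

        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= m; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[n, m];
    }

    // result[i, j] is the distance between texts[i] and texts[j].
    // The matrix is symmetric and its diagonal is zero.
    public static int[,] Pairwise(string[] texts)
    {
        int k = texts.Length;
        var result = new int[k, k];
        for (int i = 0; i < k; i++)
        {
            for (int j = i + 1; j < k; j++)
            {
                int dist = Levenshtein(texts[i], texts[j]);
                result[i, j] = dist;
                result[j, i] = dist;
            }
        }
        return result;
    }
}
```

For k strings this runs the two-string algorithm k * (k - 1) / 2 times. It ranks how far apart the texts are from each other but does not produce a single merged alignment of all of them.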