2011-06-23 17 views
6

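Grading user responses with JavaScript (comparing two arrays)

I'm working on a script that grades user responses by comparing two arrays. (It's a quiz to see how well they know the information.) I already have some of the code I need, such as lowercasing the user's response and splitting it into words. What I still need is a way to count the differences/errors. For example: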

var correctanswer = ["The","quick","brown","fox","jumped","over","the","lazy","dog"]; 
var useranswer = ["The","brown","fox","jumped","up","and","over","the","really","lazy","cat"]; 
alert(counterrors(correctanswer, useranswer)); 

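In this particular example, the function I'm looking for would report that the user made 5 errors (they omitted "quick", added "up", "and", and "really", and changed "dog" to "cat"). As you can see, the two arrays may have different lengths.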

Does anyone know how to approach this? I imagine it would be a loop, something like:

for (x in correctanswer) { 
    // compare correctanswer[x] to useranswer[x]... not sure how exactly. Seems tricky... 
} 

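Thanks for looking at this! I've seen John Resig's diff solution (http://ejohn.org/projects/javascript-diff-algorithm/) and other similar things, and even a few array comparisons, but none of them seemed to fit, because the ones I found return all the differences, whereas I only want to know how many differences there are. Thanks again for your attention, and please let me know if you have any questions.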

Update: Many thanks to Magnar for the answer! It works perfectly.

Answers

6

What you are after is the Levenshtein distance between the two arrays.

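It is an algorithm that calculates the number of additions, deletions, and substitutions needed to transform one sequence into another.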

The Wikipedia page I linked has a pseudocode implementation. I have done a line-for-line conversion to JavaScript for you:

var correctanswer = ["The","quick","brown","fox","jumped","over","the","lazy","dog"]; 
var useranswer = ["The","brown","fox","jumped","up","and","over","the","really","lazy","cat"]; 

console.log(calculate_levenshtein_distance(correctanswer, useranswer)); 

function calculate_levenshtein_distance(s, t) {
    var m = s.length + 1, n = t.length + 1;
    var i, j;

    // for all i and j, d[i][j] will hold the Levenshtein distance between
    // the first i words of s and the first j words of t;
    // note that d has m x n values, i.e. (s.length + 1) x (t.length + 1)
    var d = [];

    for (i = 0; i < m; i++) {
        d[i] = [i]; // the distance of any first array to an empty second array
    }
    for (j = 0; j < n; j++) {
        d[0][j] = j; // the distance of any second array to an empty first array
    }

    for (j = 1; j < n; j++) {
        for (i = 1; i < m; i++) {
            if (s[i - 1] === t[j - 1]) {
                d[i][j] = d[i - 1][j - 1]; // no operation required
            } else {
                d[i][j] = Math.min(
                    d[i - 1][j] + 1,    // a deletion
                    d[i][j - 1] + 1,    // an insertion
                    d[i - 1][j - 1] + 1 // a substitution
                );
            }
        }
    }

    return d[m - 1][n - 1];
}

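This will log 5 to the console, which is the correct distance between the arrays. The student did not add "lazy", so it is 1 deletion, 3 insertions, and 1 substitution.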
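To check that breakdown rather than take it on trust, the distance table can be walked backwards from its last cell to tally each kind of edit. The following is only a sketch under assumptions: `countEditOperations` is a hypothetical helper name that does not appear in the answer, and it rebuilds the table itself so that it can run on its own.

```javascript
// Hypothetical helper (not part of the original answer): builds the same
// distance table, then walks it backwards to tally each kind of edit.
function countEditOperations(s, t) {
    var m = s.length, n = t.length;
    var d = [];
    var i, j;

    for (i = 0; i <= m; i++) {
        d[i] = [i]; // i deletions turn the first i words of s into nothing
    }
    for (j = 0; j <= n; j++) {
        d[0][j] = j; // j insertions turn nothing into the first j words of t
    }
    for (i = 1; i <= m; i++) {
        for (j = 1; j <= n; j++) {
            var cost = s[i - 1] === t[j - 1] ? 0 : 1;
            d[i][j] = Math.min(
                d[i - 1][j] + 1,       // a deletion
                d[i][j - 1] + 1,       // an insertion
                d[i - 1][j - 1] + cost // a match or a substitution
            );
        }
    }

    // Walk from the bottom-right cell back to the top-left cell,
    // counting the operation that produced each step.
    var ops = { deletions: 0, insertions: 0, substitutions: 0 };
    i = m;
    j = n;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 && s[i - 1] === t[j - 1] && d[i][j] === d[i - 1][j - 1]) {
            i--; j--; // words match, nothing to count
        } else if (i > 0 && j > 0 && d[i][j] === d[i - 1][j - 1] + 1) {
            ops.substitutions++; i--; j--;
        } else if (i > 0 && d[i][j] === d[i - 1][j] + 1) {
            ops.deletions++; i--;
        } else {
            ops.insertions++; j--;
        }
    }
    return ops;
}

var correctanswer = ["The","quick","brown","fox","jumped","over","the","lazy","dog"];
var useranswer = ["The","brown","fox","jumped","up","and","over","the","really","lazy","cat"];

console.log(countEditOperations(correctanswer, useranswer));
// { deletions: 1, insertions: 3, substitutions: 1 }
```

When several edit sequences share the same minimal total, the walk reports just one of them, so the per-kind counts are best treated as one valid explanation of the distance.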

Magnar - Thanks! I think this is it... So I would pass in array values instead of strings, like they do in the article? – Alex

Yes, instead of characters you would use the array values, i.e. the words. – Magnar

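Thank you so much for your help! I'm not very good at JavaScript yet, but I'll try to take it from here. If you have any suggestions on how to implement it, please let me know. I'll mark this as the accepted answer. – Alex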
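On the implementation side, the word arrays have to be built the same way for both the correct answer and the user's response before they are compared. A minimal sketch of that preprocessing, assuming responses arrive as plain strings and that words are separated by whitespace (`toWords` is a hypothetical name, not something from the question):

```javascript
// Hypothetical preprocessing helper: lowercases a response and splits it
// into words, dropping the empty strings that leading, trailing, or
// repeated whitespace would otherwise leave behind.
function toWords(response) {
    return response
        .toLowerCase()
        .split(/\s+/)
        .filter(function (word) {
            return word.length > 0;
        });
}

console.log(toWords("  The quick  brown FOX "));
// [ 'the', 'quick', 'brown', 'fox' ]
```

The two resulting arrays can then be passed to a word-level distance function such as `calculate_levenshtein_distance`. Punctuation is left attached to words here; whether to strip it is a grading decision this sketch does not make.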

0

I'm not sure I fully understand what you want, but I think this is the solution.

function counterrors(a, b) {
    var max = Math.max(a.length, b.length);
    var min = Math.min(a.length, b.length);
    var count = 0;
    for (var i = 0; i < min; i += 1) {
        if (a[i] !== b[i]) {
            count += 1;
        }
    }
    return count + max - min; // max - min for any extra things that don't match
}
var correctanswer = ["The", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"]; 
var useranswer = ["The", "brown", "fox", "jumped", "up", "and", "over", "the", "really", "lazy", "cat"]; 
alert(counterrors(correctanswer, useranswer)); 

John - Thanks for sharing. I just tried it, and it counts 10 errors instead of 6... – Alex

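If the words have to be typed in exactly that order, then 10 is the correct value. Do you want to know how many words the user failed to type? I think Magnar has the solution you're looking for. –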

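I'm trying to find the number of additions plus deletions plus word changes (like swapping "dog" for "cat"). So not just what they didn't type, but also what they typed incorrectly... Any other ideas? Edit: Just saw Magnar's answer. – Alex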
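To see where the 10 comes from, it may help to list the positions at which the two arrays disagree when compared index by index. This is an illustrative sketch only; `mismatchedPositions` is a hypothetical helper, not code from either answer.

```javascript
// Hypothetical helper: returns every index at which the two arrays differ.
// Indices past the end of the shorter array count as mismatches, because
// a[i] (or b[i]) is undefined there.
function mismatchedPositions(a, b) {
    var positions = [];
    var longest = Math.max(a.length, b.length);
    for (var i = 0; i < longest; i++) {
        if (a[i] !== b[i]) {
            positions.push(i);
        }
    }
    return positions;
}

var correctanswer = ["The", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"];
var useranswer = ["The", "brown", "fox", "jumped", "up", "and", "over", "the", "really", "lazy", "cat"];

console.log(mismatchedPositions(correctanswer, useranswer));
// [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
```

Omitting "quick" shifts every later word one position to the left, so everything after index 0 is reported as wrong. An edit-distance approach avoids this by allowing insertions and deletions to realign the sequences.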