2015-12-21 81 views
1
import difflib 

test1 = ")\n)" 
test2 = "#)\n #)" 

d = difflib.Differ() 
diff = d.compare(test1.splitlines(), test2.splitlines()) 
print("\n".join(diff))

Python's difflib: change not detected

OUTPUT:

- )
+ #) 
- ) 
+ #) 
? + 

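As you can see, it did not detect the change on the first line (there is no ? line), but it did on the second line.

Does anyone know why difflib treats this as a delete/add rather than a change?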

Answers

2

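Single-character strings are an edge case. For strings of two or more characters, inserting one character is always detected correctly. Here is a simple script that demonstrates this: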

import difflib 

def show_diffs(limit): 
    characters = 'abcdefghijklmnopqrstuvwxyz' 
    differ = difflib.Differ() 
    for length in range(1, limit + 1): 
        for pos in range(0, length + 1):
            line_a = characters[:length]
            line_b = line_a[:pos] + 'A' + line_a[pos:]
            diff = list(differ.compare([line_a], [line_b]))
            if len(diff) == 2 and diff[0][0] == '-' and diff[1][0] == '+':
                marker = 'N'  # Insertion not detected
            elif len(diff) == 3 and diff[0][0] == '-' and diff[1][0] == '+' and diff[2][0] == '?':
                marker = 'Y'  # Insertion detected
            else:
                print('ERROR: unexpected diff for %r -> %r:\n%r' % (line_a, line_b, diff))
                return
            print('%s %r -> %r' % (marker, line_a, line_b))

show_diffs(limit=3) 

It "fails" only for 1-character strings:

N 'a' -> 'Aa' 
N 'a' -> 'aA' 
Y 'ab' -> 'Aab' 
Y 'ab' -> 'aAb' 
Y 'ab' -> 'abA' 
Y 'abc' -> 'Aabc' 
Y 'abc' -> 'aAbc' 
Y 'abc' -> 'abAc' 
Y 'abc' -> 'abcA' 
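
A likely explanation, assuming CPython's difflib implementation: Differ only treats two lines as a "change" (and emits a ? line) when their SequenceMatcher similarity ratio reaches a cutoff of roughly 0.75. Otherwise it reports a plain delete/add. The ratio is 2*M/T, where M is the number of matching characters and T is the combined length of both lines. The helper insertion_ratio below is a made-up name for illustration; it is a rough sketch that shows the ratio for a one-character insertion into lines of increasing length:

```python
import difflib

def insertion_ratio(length):
    """Similarity ratio between a line of `length` chars and the same
    line with one extra character inserted at the front."""
    line_a = 'abcdefghijklmnopqrstuvwxyz'[:length]
    line_b = 'A' + line_a
    return difflib.SequenceMatcher(None, line_a, line_b).ratio()

for n in range(1, 5):
    # 2*n matching characters out of 2*n + 1 total
    print(n, round(insertion_ratio(n), 3))
```

For a 1-character line the ratio is 2/3, below the cutoff, so the two lines are not paired. From 2 characters on it is 4/5 or higher, which matches the N/Y pattern above.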
+0

Thanks, I'll just handle this edge case – ealeon