You can use itertools.groupby() for this. Here is an example:
from itertools import groupby
# this just sets up some byte strings to use, Python 2.x version is below
# instead of this you would use f1 = open('some_file', 'rb').read()
f1 = bytes(int(b, 16) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = bytes(int(b, 16) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())
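matches = []
# group consecutive indices by whether the bytes at that index are equal
for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i]):
    if k:
        # first index of the matching run
        pos = next(g)
        # remaining indices in the run, plus the one already consumed
        length = len(list(g)) + 1
        matches.append((pos, length))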
Or the same thing as above using a list comprehension:
matches = [(next(g), len(list(g)) + 1)
           for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i])
           if k]
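For reuse, the same idea can be wrapped in a function. This is only a minimal sketch for Python 3: the name `find_matching_runs` is made up for illustration, and the sample bytes are the same ones used above.

```python
from itertools import groupby


def find_matching_runs(a, b):
    """Return (position, length) for each run where a and b agree at the same offset."""
    n = min(len(a), len(b))  # only the overlapping part can be compared
    runs = []
    for equal, group in groupby(range(n), key=lambda i: a[i] == b[i]):
        if equal:
            indices = list(group)
            runs.append((indices[0], len(indices)))
    return runs


# same sample data as above; for real files use open('some_file', 'rb').read()
f1 = bytes.fromhex('FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF')
f2 = bytes.fromhex('41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37')
print(find_matching_runs(f1, f2))  # [(4, 4)]
```

With the sample data only the four zero bytes at offset 4 line up. The sequence 44 43 42 41 occurs in both inputs but at different offsets, so it is not reported.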
Here is the example setup if you are using Python 2.x:
f1 = ''.join(chr(int(b, 16)) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = ''.join(chr(int(b, 16)) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())
http://docs.python.org/2/library/difflib.html - first result on Google for "diff in python" – Andrey 2013-04-03 21:26:51
Possible duplicate of "Difference between two strings in Python/PHP" (http://stackoverflow.com/questions/1209800/difference-between-two-strings-in-python-php) – Andrey 2013-04-03 21:27:45
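@Andrey Thanks, I tried that, but it appears that 'get_matching_blocks()' does not check that the bytes are at the same position in each file, only that the sequence exists in each file. Otherwise, yes, that is exactly what I want. – omghai2u 2013-04-03 21:28:12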
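To illustrate the point in the last comment, here is a small sketch (Python 3, reusing the answer's sample bytes) of how `difflib.SequenceMatcher.get_matching_blocks()` can report a block whose offsets differ between the two inputs. The variable names are chosen for illustration only.

```python
from difflib import SequenceMatcher

f1 = bytes.fromhex('FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF')
f2 = bytes.fromhex('41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37')

# each block is Match(a, b, size): f1[a:a+size] == f2[b:b+size]
blocks = SequenceMatcher(None, f1, f2).get_matching_blocks()

# blocks whose offsets differ are shared sequences, not same-position matches
shifted = [m for m in blocks if m.size > 0 and m.a != m.b]
aligned = [m for m in blocks if m.size > 0 and m.a == m.b]
print("same position:", aligned)
print("different position:", shifted)
```

Keeping only blocks with `a == b` is not a general substitute for the groupby approach, because `SequenceMatcher` may prefer a longer shifted block over same-position matches.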