Image hash fingerprint collisions (dHash)

I am using dHash (http://www.hackerfactor.com/blog/index.php?url=archives/529-Kind-of-Like-That.html) on a large set of images. By default the image is shrunk using a hash_size of 8:
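import numpy
from PIL import Image
from imagehash import ImageHash  # assumed source of ImageHash (the imagehash package)

def dhash(image, hash_size=8):
    """
    Difference Hash computation.
    following http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
    @image must be a PIL instance.
    """
    # Grayscale, then shrink to (hash_size + 1) columns by hash_size rows.
    # Image.LANCZOS replaces Image.ANTIALIAS, which newer Pillow releases removed.
    image = image.convert("L").resize((hash_size + 1, hash_size), Image.LANCZOS)
    # getdata() is row-major, so the array has hash_size rows of hash_size + 1 pixels.
    # The builtin float replaces numpy.float, which newer NumPy releases removed.
    pixels = numpy.array(image.getdata(), dtype=float).reshape((hash_size, hash_size + 1))
    # compute differences between horizontally adjacent pixels
    diff = pixels[:, 1:] > pixels[:, :-1]
    return ImageHash(diff)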
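If we run this algorithm over a large number of images, won't we get collisions because the hash fingerprint is so short?
What is the best hash_size? Does a larger hash_size give higher accuracy? Is there a specific reason the default is 8?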
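
To put rough numbers on the collision question, here is a minimal sketch of a birthday-bound estimate. It assumes every hash value is equally likely, which real images do not satisfy (similar images are meant to produce similar hashes), so treat the result as an optimistic lower bound on accidental collisions. The helper names hash_bits and collision_probability are made up for illustration and are not part of any library.

```python
import math

def hash_bits(hash_size):
    """A dHash with this hash_size yields hash_size * hash_size bits."""
    return hash_size * hash_size

def collision_probability(n_images, hash_size):
    """Approximate chance that at least two of n_images share a hash.

    Birthday approximation: p ~= 1 - exp(-n * (n - 1) / (2 * 2**bits)),
    under the assumption that hashes are uniformly distributed.
    """
    space = 2.0 ** hash_bits(hash_size)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-n_images * (n_images - 1) / (2.0 * space))

# Compare a few hash sizes for a collection of one million images.
for size in (4, 8, 16):
    print(size, hash_bits(size), collision_probability(1_000_000, size))
```

Under that uniformity assumption, hash_size=8 gives 64 bits, and the estimate for a million images is tiny. In practice, collisions between visually similar images are far more common than this suggests, because the hash is designed to map them close together.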