是否存在Jenkins hash算法的本地Python實現?Jenkins哈希的Python實現?
我需要一個採用任意字符串並將其轉換爲32位整數的散列算法。對於給定的字符串,它必須保證跨平臺返回相同的整數。
我看過ELF哈希算法,其中我找到了一個Python實現。鑑於上述標準,這可能是一個合適的替代品嗎? (http://www.partow.net/programming/hashfunctions/#ELFHashFunction)
是否存在Jenkins hash算法的本地Python實現?Jenkins哈希的Python實現?
我需要一個採用任意字符串並將其轉換爲32位整數的散列算法。對於給定的字符串,它必須保證跨平臺返回相同的整數。
我看過ELF哈希算法,其中我找到了一個Python實現。鑑於上述標準,這可能是一個合適的替代品嗎? (http://www.partow.net/programming/hashfunctions/#ELFHashFunction)
PyPi上已經有"jenkins" package。但它不是「土」爲核心的實現是C.
你可以只使用了截至實現它在我自己的的md5sum
>>> from hashlib import md5
>>> from struct import unpack
>>> unpack("<IIII",md5("thestring").digest())[0]
1515985217
通常MD5是由於速度原因而避免,除非你真的需要加密哈希。但是,當然,MD5的C庫實現可能會遠遠超過上述算法的純Python實現! – 2010-09-07 21:47:01
的前32位。根據原始版權聲明發布(見下文)。使用您自己的風險,並有樂趣:-)
'''Implements a straight Jenkins lookup hash - http://burtleburtle.net/bob/hash/doobs.html
Usage:
from jhash import jhash
print jhash('My hovercraft is full of eels')
Returns: unsigned 32 bit integer value
Prereqs: None
Tested with Python 2.6.
Version 1.00 - [email protected] - 23.08.2010
Partly based on the Perl module Digest::JHash
http://search.cpan.org/~shlomif/Digest-JHash-0.06/lib/Digest/JHash.pm
Original copyright notice:
By Bob Jenkins, 1996. [email protected] You may use this
code any way you wish, private, educational, or commercial. It's free.
See http://burtleburtle.net/bob/hash/evahash.html
Use for hash table lookup, or anything where one collision in 2^^32 is
acceptable. Do NOT use for cryptographic purposes.
'''
def mix(a, b, c):
'''mix() -- mix 3 32-bit values reversibly.
For every delta with one or two bits set, and the deltas of all three
high bits or all three low bits, whether the original value of a,b,c
is almost all zero or is uniformly distributed,
* If mix() is run forward or backward, at least 32 bits in a,b,c
have at least 1/4 probability of changing.
* If mix() is run forward, every bit of c will change between 1/3 and
2/3 of the time. (Well, 22/100 and 78/100 for some 2-bit deltas.)'''
# Need to constrain U32 to only 32 bits using the & 0xffffffff
# since Python has no native notion of integers limited to 32 bit
# http://docs.python.org/library/stdtypes.html#numeric-types-int-float-long-complex
a &= 0xffffffff; b &= 0xffffffff; c &= 0xffffffff
a -= b; a -= c; a ^= (c>>13); a &= 0xffffffff
b -= c; b -= a; b ^= (a<<8); b &= 0xffffffff
c -= a; c -= b; c ^= (b>>13); c &= 0xffffffff
a -= b; a -= c; a ^= (c>>12); a &= 0xffffffff
b -= c; b -= a; b ^= (a<<16); b &= 0xffffffff
c -= a; c -= b; c ^= (b>>5); c &= 0xffffffff
a -= b; a -= c; a ^= (c>>3); a &= 0xffffffff
b -= c; b -= a; b ^= (a<<10); b &= 0xffffffff
c -= a; c -= b; c ^= (b>>15); c &= 0xffffffff
return a, b, c
def jhash(data, initval = 0):
'''hash() -- hash a variable-length key into a 32-bit value
data : the key (the unaligned variable-length array of bytes)
initval : can be any 4-byte value, defaults to 0
Returns a 32-bit value. Every bit of the key affects every bit of
the return value. Every 1-bit and 2-bit delta achieves avalanche.'''
length = lenpos = len(data)
# empty string returns 0
if length == 0:
return 0
# Set up the internal state
a = b = 0x9e3779b9 # the golden ratio; an arbitrary value
c = initval # the previous hash value
p = 0 # string offset
# ------------------------- handle most of the key in 12 byte chunks
while lenpos >= 12:
a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24))
b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16) + (ord(data[p+7])<<24))
c += (ord(data[p+8]) + (ord(data[p+9])<<8) + (ord(data[p+10])<<16) + (ord(data[p+11])<<24))
a, b, c = mix(a, b, c)
p += 12
lenpos -= 12
# ------------------------- handle the last 11 bytes
c += length
if lenpos >= 11: c += ord(data[p+10])<<24
if lenpos >= 10: c += ord(data[p+9])<<16
if lenpos >= 9: c += ord(data[p+8])<<8
# the first byte of c is reserved for the length
if lenpos >= 8: b += ord(data[p+7])<<24
if lenpos >= 7: b += ord(data[p+6])<<16
if lenpos >= 6: b += ord(data[p+5])<<8
if lenpos >= 5: b += ord(data[p+4])
if lenpos >= 4: a += ord(data[p+3])<<24
if lenpos >= 3: a += ord(data[p+2])<<16
if lenpos >= 2: a += ord(data[p+1])<<8
if lenpos >= 1: a += ord(data[p+0])
a, b, c = mix(a, b, c)
# ------------------------- report the result
return c
if __name__ == "__main__":
hashstr = 'My hovercraft is full of eels'
myhash = jhash(hashstr)
print 'jhash("%s"): %d' % (hashstr, myhash)
這種原生的Python代碼應該給你相同的哈希與原始lookup3.c
# Need to constrain U32 to only 32 bits using the & 0xffffffff
# since Python has no native notion of integers limited to 32 bit
# http://docs.python.org/library/stdtypes.html#numeric-types-int-float-long-complex
'''Original copyright notice:
By Bob Jenkins, 1996. [email protected] You may use this
code any way you wish, private, educational, or commercial. Its free.
'''
def rot(x,k):
return (((x)<<(k)) | ((x)>>(32-(k))))
def mix(a, b, c):
a &= 0xffffffff; b &= 0xffffffff; c &= 0xffffffff
a -= c; a &= 0xffffffff; a ^= rot(c,4); a &= 0xffffffff; c += b; c &= 0xffffffff
b -= a; b &= 0xffffffff; b ^= rot(a,6); b &= 0xffffffff; a += c; a &= 0xffffffff
c -= b; c &= 0xffffffff; c ^= rot(b,8); c &= 0xffffffff; b += a; b &= 0xffffffff
a -= c; a &= 0xffffffff; a ^= rot(c,16); a &= 0xffffffff; c += b; c &= 0xffffffff
b -= a; b &= 0xffffffff; b ^= rot(a,19); b &= 0xffffffff; a += c; a &= 0xffffffff
c -= b; c &= 0xffffffff; c ^= rot(b,4); c &= 0xffffffff; b += a; b &= 0xffffffff
return a, b, c
def final(a, b, c):
a &= 0xffffffff; b &= 0xffffffff; c &= 0xffffffff
c ^= b; c &= 0xffffffff; c -= rot(b,14); c &= 0xffffffff
a ^= c; a &= 0xffffffff; a -= rot(c,11); a &= 0xffffffff
b ^= a; b &= 0xffffffff; b -= rot(a,25); b &= 0xffffffff
c ^= b; c &= 0xffffffff; c -= rot(b,16); c &= 0xffffffff
a ^= c; a &= 0xffffffff; a -= rot(c,4); a &= 0xffffffff
b ^= a; b &= 0xffffffff; b -= rot(a,14); b &= 0xffffffff
c ^= b; c &= 0xffffffff; c -= rot(b,24); c &= 0xffffffff
return a, b, c
def hashlittle2(data, initval = 0, initval2 = 0):
length = lenpos = len(data)
a = b = c = (0xdeadbeef + (length) + initval)
c += initval2; c &= 0xffffffff
p = 0 # string offset
while lenpos > 12:
a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24)); a &= 0xffffffff
b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16) + (ord(data[p+7])<<24)); b &= 0xffffffff
c += (ord(data[p+8]) + (ord(data[p+9])<<8) + (ord(data[p+10])<<16) + (ord(data[p+11])<<24)); c &= 0xffffffff
a, b, c = mix(a, b, c)
p += 12
lenpos -= 12
if lenpos == 12: c += (ord(data[p+8]) + (ord(data[p+9])<<8) + (ord(data[p+10])<<16) + (ord(data[p+11])<<24)); b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16) + (ord(data[p+7])<<24)); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24));
if lenpos == 11: c += (ord(data[p+8]) + (ord(data[p+9])<<8) + (ord(data[p+10])<<16)); b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16) + (ord(data[p+7])<<24)); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24));
if lenpos == 10: c += (ord(data[p+8]) + (ord(data[p+9])<<8)); b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16) + (ord(data[p+7])<<24)); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24));
if lenpos == 9: c += (ord(data[p+8])); b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16) + (ord(data[p+7])<<24)); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24));
if lenpos == 8: b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16) + (ord(data[p+7])<<24)); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24));
if lenpos == 7: b += (ord(data[p+4]) + (ord(data[p+5])<<8) + (ord(data[p+6])<<16)); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24));
if lenpos == 6: b += ((ord(data[p+5])<<8) + ord(data[p+4])); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24))
if lenpos == 5: b += (ord(data[p+4])); a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24));
if lenpos == 4: a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16) + (ord(data[p+3])<<24))
if lenpos == 3: a += (ord(data[p+0]) + (ord(data[p+1])<<8) + (ord(data[p+2])<<16))
if lenpos == 2: a += (ord(data[p+0]) + (ord(data[p+1])<<8))
if lenpos == 1: a += ord(data[p+0])
a &= 0xffffffff; b &= 0xffffffff; c &= 0xffffffff
if lenpos == 0: return c, b
a, b, c = final(a, b, c)
return c, b
def hashlittle(data, initval=0):
c, b = hashlittle2(data, initval, 0)
return c
if __name__ == "__main__":
import sys
hashstr = 'Four score and seven years ago'
hash, hash2 = hashlittle2(hashstr, 0xdeadbeef, 0xdeadbeef)
print '"%s": %x %x' % (hashstr, hash, hash2)
hash = hashlittle(hashstr, 0)
print '"%s": %x' % (hashstr, hash)
令人驚歎!完美地工作,我測試了我的本地c實現相同的散列,非常感謝!,你爲我節省了大量工作:) – 2011-07-23 19:37:04
是的,我已經找到了,但當我嘗試在Windows上安裝時,收到有關缺少.bat文件的錯誤消息。我正在尋找一個解決方案,只需最少量的先決條件。 – DavidR 2010-07-19 17:42:13
Jenkins 1.0.2 PyPi軟件包在Linux上也無法生成「OSError:/usr/lib/python2.6/dist-packages/lookup3.so:無法打開共享目標文件:沒有這樣的文件或目錄」 – 2013-05-07 20:44:59