SuperFastHash returns different values for the same string

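I want to use SuperFastHash in a small project, and I can't figure out why it gives different hashes for the same string. It only produces the same hash when both the pointer and the string are identical. Any ideas? The code below demonstrates the problem.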

// SuperFastHash, taken from http://www.azillionmonkeys.com/qed/hash.html 
#include <stdint.h> 
#include <stdio.h> 
#include <stdlib.h> 

#undef get16bits 
#if (defined(__GNUC__) && defined(__i386__)) || defined(__WATCOMC__) \
    || defined(_MSC_VER) || defined (__BORLANDC__) || defined (__TURBOC__)
#define get16bits(d) (*((const uint16_t *) (d))) 
#endif 

#if !defined (get16bits) 
#define get16bits(d) ((((uint32_t)(((const uint8_t *)(d))[1])) << 8)\
         +(uint32_t)(((const uint8_t *)(d))[0]))
#endif 

uint32_t SuperFastHash (const char * data, int len); 

int main(void) 
{ 
    char* str = "a\0a"; 
    printf("%s\n", &str[0]); // a 
    printf("%s\n", &str[2]); // a 
    printf("%i\n", SuperFastHash(&str[0], 25)); // -1120168156 
    printf("%i\n", SuperFastHash(&str[2], 25)); // -280310739 
} 

uint32_t SuperFastHash (const char * data, int len) { 
uint32_t hash = len, tmp; 
int rem; 

    if (len <= 0 || data == NULL) return 0; 

    rem = len & 3; 
    len >>= 2; 

    /* Main loop */ 
    for (;len > 0; len--) { 
     hash += get16bits (data); 
     tmp = (get16bits (data+2) << 11)^hash; 
     hash = (hash << 16)^tmp; 
     data += 2*sizeof (uint16_t); 
     hash += hash >> 11; 
    } 

    /* Handle end cases */ 
    switch (rem) { 
     case 3: hash += get16bits (data); 
       hash ^= hash << 16; 
       hash ^= ((signed char)data[sizeof (uint16_t)]) << 18; 
       hash += hash >> 11; 
       break; 
     case 2: hash += get16bits (data); 
       hash ^= hash << 11; 
       hash += hash >> 17; 
       break; 
     case 1: hash += (signed char)*data; 
       hash ^= hash << 10; 
       hash += hash >> 1; 
    } 

    /* Force "avalanching" of final 127 bits */ 
    hash ^= hash << 3; 
    hash += hash >> 5; 
    hash ^= hash << 4; 
    hash += hash >> 17; 
    hash ^= hash << 25; 
    hash += hash >> 6; 

    return hash; 
}


2014-02-12 user1637451

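Your `len` argument (25) exceeds the size of the string. The memory layout of `char* str = "a\0a"` is `{ 'a', 0, 'a', 0 }`, four chars in total. What follows those four chars is undefined, and it is very unlikely to be the same value repeated 23 times (which would, of course, produce the same hash). SuperFastHash ignores string terminators; it processes exactly as many chars as you specify through the `len` parameter.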

To see the function used properly, try for example:

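#include <assert.h>

int main(void)
{
    const char *buf = "abc\0abc"; /* layout: 'a','b','c',0,'a','b','c',0 */
    /* Both calls hash exactly 3 bytes ("abc"), so the results match. */
    assert(SuperFastHash(&buf[0], 3) == SuperFastHash(&buf[4], 3));
    // etc.
}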
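
If the inputs are always NUL-terminated C strings, one way to avoid passing a wrong length is to derive it with `strlen`. The following is only a sketch under that assumption: `hash_cstring` is a hypothetical helper that is not part of the original code, and the SuperFastHash body is copied from the question using only the portable `get16bits` fallback, with the `signed char` operands cast to unsigned before shifting or adding.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable byte-wise read of a 16-bit value (the fallback macro from the
   question, without the unaligned-pointer fast path). */
#define get16bits(d) ((((uint32_t)(((const uint8_t *)(d))[1])) << 8) \
                      + (uint32_t)(((const uint8_t *)(d))[0]))

/* SuperFastHash as posted in the question: it hashes exactly `len` bytes
   and never looks for a terminating '\0'. */
uint32_t SuperFastHash(const char *data, int len) {
    uint32_t hash = (uint32_t)len, tmp;
    int rem;

    if (len <= 0 || data == NULL) return 0;

    rem = len & 3;   /* bytes left over after the 4-byte blocks */
    len >>= 2;       /* number of full 4-byte blocks */

    /* Main loop: consume 4 bytes per iteration */
    for (; len > 0; len--) {
        hash += get16bits(data);
        tmp = (get16bits(data + 2) << 11) ^ hash;
        hash = (hash << 16) ^ tmp;
        data += 2 * sizeof(uint16_t);
        hash += hash >> 11;
    }

    /* Handle the remaining 1 to 3 bytes */
    switch (rem) {
    case 3:
        hash += get16bits(data);
        hash ^= hash << 16;
        /* convert to unsigned first so a negative char is never shifted */
        hash ^= (uint32_t)(signed char)data[sizeof(uint16_t)] << 18;
        hash += hash >> 11;
        break;
    case 2:
        hash += get16bits(data);
        hash ^= hash << 11;
        hash += hash >> 17;
        break;
    case 1:
        hash += (uint32_t)(signed char)*data;
        hash ^= hash << 10;
        hash += hash >> 1;
    }

    /* Final mixing steps, as in the question */
    hash ^= hash << 3;
    hash += hash >> 5;
    hash ^= hash << 4;
    hash += hash >> 17;
    hash ^= hash << 25;
    hash += hash >> 6;

    return hash;
}

/* Hypothetical wrapper (an assumption, not from the original post):
   hashes a NUL-terminated string using its real length. */
uint32_t hash_cstring(const char *s) {
    if (s == NULL) return 0;
    size_t n = strlen(s);
    assert(n <= INT_MAX);   /* SuperFastHash takes an int length */
    return SuperFastHash(s, (int)n);
}
```

With the string from the question, `hash_cstring(&str[0])` and `hash_cstring(&str[2])` both hash the single byte 'a' and therefore agree. The wrapper is not suitable for binary data that may contain embedded zero bytes; there the explicit `len` has to be tracked by the caller.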


2014-02-12 22:10:46 Wolf

I feel so stupid. Thank you! – user1637451

Four eyes see more than two. You're welcome! – Wolf

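If I understand correctly, you have a one-character string and you are hashing 25 characters of it. So it reads your character and the NUL byte, and then reads another 23 characters.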
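
The arithmetic can be made explicit. The literal `"a\0a"` occupies 4 bytes (`sizeof "a\0a"`), so a 25-byte read that starts at offset 0 runs 21 bytes past the end of the array, and one that starts at offset 2 runs 23 bytes past it. A small sketch of that calculation, where `bytes_past_end` is a hypothetical helper introduced only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many of `len` requested bytes lie beyond an
   object of `size` bytes when reading starts at byte `offset`. */
size_t bytes_past_end(size_t size, size_t offset, size_t len) {
    assert(offset <= size);            /* start must be inside the object */
    size_t available = size - offset;  /* bytes that may legally be read */
    return len > available ? len - available : 0;
}
```

For the call from the accepted answer, `bytes_past_end(sizeof "abc\0abc", 4, 3)` is 0, which is why hashing 3 bytes there is safe and reproducible.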


2014-02-12 22:12:23 jameswilddev
