I have a hash table whose keys are 64 bit values. The Table size can be of different lengths of power 2, such as 2, 4, 8 etc... I want a hash table function that works well for such cases, that is, it has minimum collisions. As an example, If I want a table size of 32, the hash function should produce values from 0 to 31 with minimum collision for 64 bit inputs.
I have found good solutions for 32 bit inputs but none for 64 bits inputs yet.
For 32 b开发者_Python百科it keys, I'm using this function
#define hash32(x) ( (x) * 2654435761 )
unsigned int getHashKey( unsigned long x )
{
return hash32(x) >> ( 32 - h_bits );
}
Would to be interesting to have the hash32(x) equivalent of 64 bit.
The search for a perfect hash function is like the search for the Holy Grail. Anyway it depends on the value.
If you need a general-purpose hashing functions on x86, Murmur2, Meiyan, SBox, and CRC32 provide good performance for all kinds of keys. For 64bit values you can also try CityHash .
This page (and this) has a few hash functions suitable for integers. Here's one for 64 bit integers:
public long hash64shift(long key)
{
key = (~key) + (key << 21); // key = (key << 21) - key - 1;
key = key ^ (key >>> 24);
key = (key + (key << 3)) + (key << 8); // key * 265
key = key ^ (key >>> 14);
key = (key + (key << 2)) + (key << 4); // key * 21
key = key ^ (key >>> 28);
key = key + (key << 31);
return key;
}
This seems to be working pretty fine. It uses the FVN hash constant for 64 bit, http://isthe.com/chongo/tech/comp/fnv/.
#define hash64(x) ( (unsigned long)(x) * 14695981039346656037 )
#define H_BITS 4 // Hashtable size = 2 ^ 4 = 16
#define H_SHIFT_64 ( 64 - H_BITS )
unsigned int getHashKey( unsigned long x )
{
return hash64(x) >> H_SHIFT_64;
}
Your 32-bit hash is a multiplicative hash using a prime near to the golden ratio as suggested by Knuth in TAOCP.
phi = 0.5 * (sqrt(5) - 1) = 0.618...
2^32 * phi = 2654435769.497...
2^64 * phi = 11400714819323198485.951...
2654435761 is the nearest prime in the 32-bit case. With 64 bits, it's 11400714819323198549. So the algorithm becomes:
unsigned int getHashKey(unsigned long x) {
return (x * 11400714819323198549ul) >> (64 - h_bits);
}
精彩评论