Hash table with 64-bit values ​​as a key

I have a hash table whose keys have 64-bit values. The size of the table may have a different length of 2, for example 2, 4, 8, etc. I want the hash table function to work well for such cases, that is, it has minimal collisions. As an example, if I need a table size of 32, the hash function should create values ​​from 0 to 31 with minimal collision for 64-bit inputs.

I found good solutions for 32-bit inputs, but none of the 64-bit inputs have been.

For 32-bit keys, I use this function

#define hash32(x) ( (x) * 2654435761 ) unsigned int getHashKey( unsigned long x ) { return hash32(x) >> ( 32 - h_bits ); } 

It would be interesting to have hash32 (x) the equivalent of 64 bits.

+6
source share
4 answers

It seems to work very well. It uses the FVN hash constant for 64 bits, http://isthe.com/chongo/tech/comp/fnv/ .

 #define hash64(x) ( (unsigned long)(x) * 14695981039346656037 ) #define H_BITS 4 // Hashtable size = 2 ^ 4 = 16 #define H_SHIFT_64 ( 64 - H_BITS ) unsigned int getHashKey( unsigned long x ) { return hash64(x) >> H_SHIFT_64; } 
+1
source

Finding the perfect hash function is similar to finding the Holy Grail. In any case, it depends on the value.

If you need universal hash functions on x86, Murmur2, Meiyan, SBox and CRC32 provide good performance for all types of keys. For 64-bit values, you can also try CityHash.

+1
source

This page (and this ) contains several hash functions suitable for integers. Here is one for 64 bit integers:

 public long hash64shift(long key) { key = (~key) + (key << 21); // key = (key << 21) - key - 1; key = key ^ (key >>> 24); key = (key + (key << 3)) + (key << 8); // key * 265 key = key ^ (key >>> 14); key = (key + (key << 2)) + (key << 4); // key * 21 key = key ^ (key >>> 28); key = key + (key << 31); return key; } 
+1
source

Your 32-bit hash is a multiplicative hash using a prime close to gold, as suggested by Knut in TAOCP.

 phi = 0.5 * (sqrt(5) - 1) = 0.618... 2^32 * phi = 2654435769.497... 2^64 * phi = 11400714819323198485.951... 

2654435761 is the closest in the 32-bit case. With 64 bits, this is 11400714819323198549. Thus, the algorithm becomes:

 unsigned int getHashKey(unsigned long x) { return (x * 11400714819323198549ul) >> (64 - h_bits); } 
0
source

Source: https://habr.com/ru/post/894292/


All Articles