What is the best hash function for uint64_t keys from 0 to its maximum value?

Question

What is the best hash function for uint64_t keys from 0 to its maximum value?

Assuming we have a set of elements and we want to store them in a hash map (for example, std::unoredered_set), and each element has a type key uint64_t, the value of which can vary from 0 to its maximum possible value, this is the best choice for using a trivial hash function where the key hash is the key itself? Does it depend on the container used (i.e. a rare Google hash against an unordered map from STL)? The probability of occurrence of key values is unknown.

+3

c ++ hashtable algorithm hash

user405725 Feb 23 '11 at 0:57

source share

3 answers

64- , . MurmerHash3 :

key ^= key >> 33;
key *= 0xff51afd7ed558ccd;
key ^= key >> 33;
key *= 0xc4ceb9fe1a85ec53;
key ^= key >> 33;

Numericical Recipes, 3rd Edition, :

public static UInt64 Next( UInt64 u )
  {
  UInt64 v = u * 3935559000370003845 + 2691343689449507681;

  v ^= v >> 21;
  v ^= v << 37;
  v ^= v >>  4;

  v *= 4768777513237032717;

  v ^= v << 20;
  v ^= v >> 41;
  v ^= v <<  5;

  return v;
  }

+4

David Stafford 29 . '11 1:13

HashMaps . - O(1), , , .

uint64_t , , GHASHLISH

The library GLIBis thread safe and is used by several open source projects. It does not support uint64_tas a key initially, so you must provide your own hash_function function.

As an example, you can use the FNV hash

Here is an example using a uint64to hash uint32using FNV:

#define FNV_offset_basis 2166136261
#define FNV_prime        16777619
guint c_uint64_t_hash(gpointer data)
{
  uint8_t* v =(uint8_t*)data;
  guint hash = FNV_offset_basis;
  for(int i=0;i<8;i++)
  {
    hash = hash ^ v[i];
    hash = hash * FNV_prime;
  }
return hash;
}

0

salaheddine Nov 19 '17 at 23:16

source share

Thomas edleson · Accepted Answer · 2011-02-23T01:19:03+0000

If all you need for the hash is uint64_t of any possible value with unknown probabilities, and your output should be uint64_t, then you will not get any benefits by changing the value. Just use the key itself.

If you knew something about the distribution of your values, or your values were limited to a smaller range (which is actually the same as knowledge about the distribution), then it would be useful to apply the transformation to the key, but it depends on the implementation of the container. You would benefit from reduced collisions when the table converts the hash to the bucket index, but this depends on both the algorithm of the table and the current / average state of the table (how often each bucket is used).

What is the best hash function for uint64_t keys from 0 to its maximum value?

More articles: