What is the best hash function for uint64_t keys from 0 to its maximum value?

Assuming we have a set of elements and we want to store them in a hash map (for example, std::unoredered_set), and each element has a type key uint64_t, the value of which can vary from 0 to its maximum possible value, this is the best choice for using a trivial hash function where the key hash is the key itself? Does it depend on the container used (i.e. a rare Google hash against an unordered map from STL)? The probability of occurrence of key values ​​is unknown.

+3
source share
3 answers

If all you need for the hash is uint64_t of any possible value with unknown probabilities, and your output should be uint64_t, then you will not get any benefits by changing the value. Just use the key itself.

If you knew something about the distribution of your values, or your values ​​were limited to a smaller range (which is actually the same as knowledge about the distribution), then it would be useful to apply the transformation to the key, but it depends on the implementation of the container. You would benefit from reduced collisions when the table converts the hash to the bucket index, but this depends on both the algorithm of the table and the current / average state of the table (how often each bucket is used).

+12
source

64- , . MurmerHash3 :

key ^= key >> 33;
key *= 0xff51afd7ed558ccd;
key ^= key >> 33;
key *= 0xc4ceb9fe1a85ec53;
key ^= key >> 33;

Numericical Recipes, 3rd Edition, :

public static UInt64 Next( UInt64 u )
  {
  UInt64 v = u * 3935559000370003845 + 2691343689449507681;

  v ^= v >> 21;
  v ^= v << 37;
  v ^= v >>  4;

  v *= 4768777513237032717;

  v ^= v << 20;
  v ^= v >> 41;
  v ^= v <<  5;

  return v;
  }
+4

HashMaps . - O(1), , , .

uint64_t , , GHASHLISH

The library GLIBis thread safe and is used by several open source projects. It does not support uint64_tas a key initially, so you must provide your own hash_function function.

As an example, you can use the FNV hash

Here is an example using a uint64to hash uint32using FNV:

#define FNV_offset_basis 2166136261
#define FNV_prime        16777619
guint c_uint64_t_hash(gpointer data)
{
  uint8_t* v =(uint8_t*)data;
  guint hash = FNV_offset_basis;
  for(int i=0;i<8;i++)
  {
    hash = hash ^ v[i];
    hash = hash * FNV_prime;
  }
return hash;
}
0
source

Source: https://habr.com/ru/post/1794017/


All Articles