C++: quick way to get an integer within a range

I need to generate hash keys for approximately N = 100 million keys. From my research, it appears that murmur3 (MurmurHash3_x86_32, see murmur3 hash) will be the fastest hash function, with the best latency and a low enough collision rate. The problem I am facing is that the function returns the key as void *. More specifically, the prototype is:

void MurmurHash3_x86_32(const void *key, int len, uint32_t seed, void *out);

Since the size of my hash table will be smaller than the largest hash it can generate, I need to bring the hash into the table's range [0, N-1]. The easiest solution is to use the % operator. But, as we all know, that is a slow operator, so I wonder if there is a faster way to do this.
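To be explicit, the reduction I mean is something like this (a minimal sketch; the function name is just for illustration):

#include <cstdint>

// Reduce a 32-bit hash to a table index in [0, n-1] with the % operator.
uint32_t table_index(uint32_t h, uint32_t n) {
    return h % n;  // n is the table size
}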

One interesting suggestion I found is in "Is there an alternative to using % (modulus) in C/C++?" on StackOverflow itself. For "power of two" table sizes, it proposes that the following works (subject to two's complement representation):

return i & (n-1);

My problem is that on newer processors, performance sometimes (or is it most of the time?) degrades at 2^n sizes, IIRC due to multiway cache lines. (This link contains an illustration regarding inserts: Big Memory, Part 3.5: Google sparsehash!)

For now, the advantages of murmur3 seem to be nullified by the hardware issue and the well-known inefficiency of the % operator. Since performance is a constraint, I am asking for low-latency, faster solutions for my requirement, even if that means something other than MurmurHash3_x86_32.

1 answer

The problem I am facing is that the function returns the key as void *.

That is not true. The function does not return anything (void). The hash result is written into the buffer you pass as the last argument. For MurmurHash3_x86_32(), it is most appropriate for that argument to point to a uint32_t.
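For illustration, a minimal sketch of such a call, assuming the declaration from the question (the wrapper name hash32 is hypothetical):

#include <cstdint>

// Declaration from MurmurHash3.h (SMHasher project), as in the question.
void MurmurHash3_x86_32(const void *key, int len, uint32_t seed, void *out);

uint32_t hash32(const void *key, int len, uint32_t seed) {
    uint32_t h;                              // the 32-bit result lands here
    MurmurHash3_x86_32(key, len, seed, &h);  // out points at a uint32_t
    return h;
}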

Since the size of my hash table will be smaller than the largest hash it can generate, I need to bring the hash into the table's range [0, N-1]. The easiest solution is to use the % operator. But, as we all know, that is a slow operator, so I wonder if there is a faster way to do this.

% is not only the easiest solution, it is also the most usual one. "Slow" is relative: % is slower than +, but much cheaper than a single call to MurmurHash3_x86_32().

One interesting suggestion that I found [...] proposes [using a "power of two" table size and computing the modulus with the & operator]

Note that, contrary to the claim in that SO answer, this does not in fact depend on two's complement representation at all.
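Here is a sketch of that reduction with the power-of-two requirement made explicit (bucket_pow2 is just an illustrative name):

#include <cassert>
#include <cstdint>

// When n is a power of two, n - 1 is a mask of the low bits. On
// unsigned operands this is pure bit masking, so it does not depend on
// how signed integers are represented.
uint32_t bucket_pow2(uint32_t hash, uint32_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);  // n must be a power of two
    return hash & (n - 1);
}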

My problem is that on newer processors, performance sometimes (or is it most of the time?) degrades at 2^n sizes, IIRC due to multiway cache lines. (This link contains an illustration regarding inserts: Big Memory, Part 3.5: Google sparsehash!)

The performance degradation described in the article you link is attributed there to rehashing, which seems plausible enough. It has nothing to do with the operations you are asking about. It is conceivable that cache (non-)associativity could affect performance for large hash tables, but probably not more than having large hash tables does in the first place. The memory access patterns inherent in using a hash table naturally produce poor cache locality. That is practically the point.

For now, the advantages of murmur3 seem to be nullified by the hardware issue and the well-known inefficiency of the % operator. Since performance is a constraint, I am asking for low-latency, faster solutions for my requirement, even if that means something other than MurmurHash3_x86_32.

You are overthinking this. Failing to use your CPU's cache efficiently is simply the price you pay for using a large hash table. It is not related to the hash function (as long as the hash function does its job well). The cost of a single arithmetic operation, whether % or &, will not be noticeable compared with the cost of computing the hash it operates on, so it hardly matters which one you choose. If you want a tiny advantage for that operation, use a power-of-two table size and the & operator. On the other hand, that throws away some of the hash bits you went to so much trouble to compute. Consider instead choosing a prime table size and the % operator: then all of the hash bits contribute to choosing a bucket, which may improve your spread.
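To make the two options concrete, here is a sketch with illustrative sizes for N around 100 million (the constants are assumptions for the example, not recommendations; 100000007 is a prime just above 10^8):

#include <cstdint>

constexpr uint32_t kPow2Size  = 1u << 27;    // 134217728, smallest power of two >= 10^8
constexpr uint32_t kPrimeSize = 100000007u;  // a prime just above 10^8

// Power-of-two table: a cheap mask, but the high 5 hash bits are discarded.
uint32_t bucket_mask(uint32_t hash) { return hash & (kPow2Size - 1); }

// Prime-sized table: one % per reduction, but every hash bit influences
// the chosen bucket.
uint32_t bucket_mod(uint32_t hash) { return hash % kPrimeSize; }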


Source: https://habr.com/ru/post/989360/

