How do I "reduce" a hash?

Suppose I have a "long" hash, for example the 16 bytes of an MD5 or the 20 bytes of a SHA1. I want to reduce this hash to 4 bytes for use in GetHashCode().

First, I understand very well that I will get more collisions. In my case that is perfectly acceptable, but I would still like collisions to be as unlikely as possible.

There are several solutions to my problem:

  • I could take the first 4 bytes of the hash.
  • I could take the last 4 bytes of the hash.
  • I could take 4 random bytes of a hash.
  • I could compute a hash of the hash, using the classic prime-number multiplications.
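
The fourth option can be sketched roughly as follows. This is only an illustration, written in Python rather than .NET for brevity; the function name `fold_digest` and the multiplier 31 are assumptions made for the example, not something the question specifies.

```python
import hashlib

def fold_digest(digest: bytes, multiplier: int = 31) -> int:
    """Fold every byte of a digest into one unsigned 32-bit value."""
    h = 0
    for b in digest:
        # Multiply-and-add, truncated to 32 bits at every step.
        h = (h * multiplier + b) & 0xFFFFFFFF
    return h

digest = hashlib.sha1(b"example input").digest()  # 20 bytes
code = fold_digest(digest)                        # fits in 4 bytes
```

Unlike the first three options, every byte of the long hash contributes to the result here, at the cost of a loop over the digest.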

Are there any other solutions I have not thought of? And more importantly, which method will give me the fewest collisions? I currently assume they are almost equivalent.

Microsoft chose to make the assembly public key token the last 8 bytes of the SHA1 hash of the public key, so I will probably go with that approach, but I would like to know why.

+3
5 answers

Any hash is already a reduction of its input.

Cryptographic hashes are designed so that no part of the data has a greater impact on any part of the hash than any other. Therefore, it doesn't matter which hash bits you choose.
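
One way to check this claim empirically is to count collisions for each fixed window of the digest. The sketch below is in Python rather than .NET, and it makes two assumptions for the sake of the demonstration: the inputs are simply the decimal strings of 0..4999, and the windows are 2 bytes wide instead of 4 so that collisions actually show up in such a small sample. The function name `collisions_per_window` is hypothetical.

```python
import hashlib

SHA1_LENGTH = 20  # bytes in a SHA-1 digest

def collisions_per_window(n_inputs: int = 5000, width: int = 2) -> dict:
    """Count collisions for each fixed `width`-byte window of a SHA-1 digest."""
    digests = [hashlib.sha1(str(i).encode()).digest() for i in range(n_inputs)]
    result = {}
    for offset in range(0, SHA1_LENGTH - width + 1, width):
        # Number of inputs minus number of distinct truncated values.
        distinct = {d[offset:offset + width] for d in digests}
        result[offset] = n_inputs - len(distinct)
    return result

counts = collisions_per_window()
for offset, collisions in counts.items():
    print(f"window at offset {offset}: {collisions} collisions")
```

If the answer is right, every window should report a similar number of collisions, with no offset standing out as clearly better or worse than the others.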

+8

Any option except the third one, which selects bytes randomly, works fine. If you select bytes randomly, the same input will produce a different hash code each time, which defeats the purpose of a hash code.
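
The difference is easy to see in a small experiment. This is a sketch in Python rather than .NET; both function names are hypothetical, and "random" is assumed to mean picking four byte positions afresh on every call.

```python
import hashlib
import random

def code_from_fixed_bytes(digest: bytes) -> int:
    """Always use the first 4 bytes: same input, same code."""
    return int.from_bytes(digest[:4], "big")

def code_from_random_bytes(digest: bytes, rng: random.Random) -> int:
    """Pick 4 byte positions at random on every call (the third option)."""
    positions = rng.sample(range(len(digest)), 4)
    return int.from_bytes(bytes(digest[p] for p in positions), "big")

digest = hashlib.sha1(b"same input").digest()
rng = random.Random()

# Hash the same digest 100 times with each strategy.
fixed = {code_from_fixed_bytes(digest) for _ in range(100)}
randomized = {code_from_random_bytes(digest, rng) for _ in range(100)}

print(len(fixed))       # 1: the code is stable
print(len(randomized))  # almost certainly many different codes
```

Choosing the four positions at random once and then hard-coding them would be fine; that is just another fixed selection.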

+5

4 , , SHA1, , GetHashCode.

4 - SHA1 , , .

+1

If you have a reasonably small number of hashes, enumerate them (for example, store them in a database) and use the sequence number:

1 - 987baf9gfd79b7979debe90085eadf5
2 - 9754gccgfd79s7979abbc90085eadf5
...
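
A minimal sketch of that idea, in Python rather than .NET, with an in-memory dictionary standing in for the database table. The class name `HashRegistry` and the method `id_for` are hypothetical names invented for this example.

```python
class HashRegistry:
    """Assign each distinct long hash a small sequential integer ID."""

    def __init__(self):
        self._ids = {}  # long hash -> sequence number

    def id_for(self, long_hash: str) -> int:
        # Reuse the existing ID, or hand out the next free one.
        if long_hash not in self._ids:
            self._ids[long_hash] = len(self._ids) + 1
        return self._ids[long_hash]

registry = HashRegistry()
a = registry.id_for("987baf9gfd79b7979debe90085eadf5")  # 1
b = registry.id_for("9754gccgfd79s7979abbc90085eadf5")  # 2
```

IDs assigned this way never collide, but every component that computes a hash code has to share the same registry, which is why this only suits a limited, known set of hashes.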
0

If your current hash is held as a string, just call GetHashCode() on that string; it returns an int, which is 4 bytes.

Any use?

0

Source: https://habr.com/ru/post/1749827/

