Suppose I have a "long" hash, for example the 16 bytes of an MD5 hash or the 20 bytes of a SHA1 hash. I want to reduce this hash to 4 bytes for use in GetHashCode().
First of all, I understand very well that I will get more collisions. In my case that is perfectly acceptable, but I would still like collisions to be as unlikely as possible.
There are several solutions to my problem:
- I could take the first 4 bytes of the hash.
- I could take the last 4 bytes of the hash.
- I could take 4 bytes of the hash at arbitrary (but fixed) positions.
- I could compute a hash of the hash, using the classic multiply-by-a-number approach.
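To make the options concrete, here is a minimal sketch of what I have in mind. The class and method names (`HashReducer`, `FirstFourBytes`, and so on) are made up for illustration, the constants 17 and 31 are just the usual choices and not anything special, and I am assuming the input is always at least 4 bytes long.

```csharp
using System;

// Hypothetical helper; names are illustrative only.
static class HashReducer
{
    // Option 1: read the first 4 bytes as a 32-bit integer.
    // Byte order follows the platform (BitConverter.IsLittleEndian).
    public static int FirstFourBytes(byte[] hash) =>
        BitConverter.ToInt32(hash, 0);

    // Option 2: read the last 4 bytes as a 32-bit integer.
    public static int LastFourBytes(byte[] hash) =>
        BitConverter.ToInt32(hash, hash.Length - 4);

    // Option 3: read 4 consecutive bytes at a fixed offset chosen once.
    // The offset must be the same for every call, otherwise equal
    // hashes would produce different hash codes.
    public static int FourBytesAt(byte[] hash, int offset) =>
        BitConverter.ToInt32(hash, offset);

    // Option 4: fold every byte into the result with the classic
    // multiply-and-add scheme; overflow wraps around on purpose.
    public static int MultiplyFold(byte[] hash)
    {
        unchecked
        {
            int result = 17;
            foreach (byte b in hash)
            {
                result = result * 31 + b;
            }
            return result;
        }
    }
}
```

Any of these could be called from GetHashCode(), for example `return HashReducer.LastFourBytes(md5Bytes);`, where `md5Bytes` is whatever field holds the long hash.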
Are there any other solutions that I have not thought of? And more importantly, which method will give me the fewest collisions? I am currently assuming they are all roughly equivalent.
Microsoft chose to define an assembly's public key token as the last 8 bytes of the SHA1 hash of its public key, so I will probably go with taking the last bytes, but I would like to know why that choice was made.