I am looking for a hashing algorithm that generates a 31- or 32-bit unsigned integer digest of a UTF-8 string, to use as the seed for a PRNG such as a Park-Miller-Carta LCG or a Mersenne Twister.
I looked at FNV-1 and FNV-1a, but they give very close values for similar strings that differ only in their last character. I would like a low-collision hash whose output changes dramatically with minimal modifications to the input string. Performance is not a concern.
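To make the FNV-1a behaviour concrete, here is a minimal sketch of the 32-bit variant with the standard offset basis and prime, followed by one possible extra mixing step borrowed from MurmurHash3's 32-bit finalizer. The names `fnv1a32`, `mix32` and `seedFromString` are made up for illustration, and the sketch hashes UTF-16 code units rather than true UTF-8 bytes, which is an assumption made for brevity.

```javascript
// 32-bit FNV-1a over UTF-16 code units (assumption: no UTF-8 encoding step).
function fnv1a32(str) {
  let h = 0x811c9dc5; // standard 32-bit FNV offset basis
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193); // standard 32-bit FNV prime, wraps mod 2^32
  }
  return h >>> 0; // reinterpret as unsigned 32-bit
}

// Hypothetical helper: the MurmurHash3 32-bit finalizer, used here only to
// spread a change in the last character across all output bits.
function mix32(h) {
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Hypothetical helper: map the mixed hash into [1, 2147483646], the range a
// Park-Miller style generator accepts as a seed.
function seedFromString(str) {
  return (mix32(fnv1a32(str)) % 2147483646) + 1;
}
```

Because the last FNV-1a step is a single XOR followed by one multiplication, two strings that differ only in the final character produce hashes whose difference is a small multiple of the prime. The finalizer is one way to hide that structure; it is not the only option.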
My current approach is a dirty LCG that uses the character codes and primes as factors:
var a = 524287;
for ( var i = 0, n = string.length; i < n; i ++ )
    a = ( a * string.charCodeAt ( i ) * 16807 + 524287 ) % 2147483647;
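One caveat with the loop above in JavaScript: the intermediate product `a * charCode * 16807` can exceed 2^53 once a character code is above roughly 249, so the double-precision result may no longer be exact. Below is a minimal sketch of the same recurrence using `BigInt` so that every step is computed exactly; the function name `lcgHashExact` is made up for illustration.

```javascript
// Same recurrence as the loop above, but with exact integer arithmetic.
// Hypothetical name; constants are the ones from the original snippet.
function lcgHashExact(str) {
  const M = 2147483647n; // 2^31 - 1
  let a = 524287n;
  for (let i = 0; i < str.length; i++) {
    // BigInt avoids rounding when the product is larger than 2^53.
    a = (a * BigInt(str.charCodeAt(i)) * 16807n + 524287n) % M;
  }
  return Number(a); // always below 2^31, so safe to convert back
}
```

For plain ASCII input this should agree with the floating-point loop, since the product stays below 2^53 there; the two can diverge for strings containing higher code units.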
Please let me know about the best alternatives.