Method comment for the hash() method of the HashMap class in Java 8

    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower. Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.) So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

Below is the earlier version, from JDK 1.6:

    /**
     * Applies a supplemental hash function to a given hashCode, which
     * defends against poor quality hash functions. This is critical
     * because HashMap uses power-of-two length hash tables, that
     * otherwise encounter collisions for hashCodes that do not differ
     * in lower bits. Note: Null keys always map to hash 0, thus index 0.
     */
    static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

Can someone explain what benefits the new hash function has over the one used in earlier versions of Java? How does the hash function implemented in JDK 8 affect the speed and quality of key distribution, and how does it reduce collisions?

2 answers

In situations where the hashCode method behaves badly, the performance of HashMap can degrade dramatically. For example, suppose your hashCode method only produced values that differ in bit 16 and above.

The new function addresses this by XORing the hash code with itself shifted right by 16 bits. If the hash code was already well distributed, it should remain so. If it was poorly distributed, this should improve it.
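The Javadoc quoted in the question mentions Float keys holding consecutive whole numbers as a known bad case. A minimal sketch of that case follows; the class name, the `distinctBuckets` helper, the table size of 256, and the key range 1 to 100 are all assumptions chosen for illustration, not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class FloatKeySpreadDemo {
    // Same transform as the Java 8 HashMap.hash() body, applied to an int hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Hypothetical helper: counts how many distinct buckets the Float keys
    // 1.0f .. keyCount occupy in a table of the given power-of-two size.
    static int distinctBuckets(int tableSize, int keyCount, boolean useSpread) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 1; i <= keyCount; i++) {
            int h = Float.valueOf((float) i).hashCode();
            int hash = useSpread ? spread(h) : h;
            buckets.add((tableSize - 1) & hash);
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("buckets used without spreading: "
                + distinctBuckets(256, 100, false));
        System.out.println("buckets used with spreading:    "
                + distinctBuckets(256, 100, true));
    }
}
```

Small whole-number floats have all-zero low 16 bits in their bit pattern, so without spreading every key lands in bucket 0. With the XOR shift the keys occupy more than one bucket, although some collisions remain.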


Here is a good explanation of how HashMap works in Java 8. The following is a snippet from the same blog.

To understand this, we first need to understand how the index is calculated:

Map the hash code to an index in the array. The easiest way to do this is to take the hash code modulo the length of the array, for example hash(key) % n. For a non-negative hash, the modulo ensures that the index i is always between 0 and n - 1.

i = hash % n;

For a HashMap in Java, the index is calculated by the following expression:

i = (n - 1) & hash;

In this expression, the variable n refers to the length of the table, and hash refers to the key hash.
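A minimal sketch of why the mask can stand in for the modulo, assuming n is a power of two (HashMap table lengths always are). The class name, the `indexFor` helper, and the sample hash values are assumptions made for this illustration.

```java
class IndexForDemo {
    // The index expression quoted above; only valid when n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] hashes = {0, 1, 17, 123456789, -1, Integer.MIN_VALUE};
        for (int hash : hashes) {
            // Math.floorMod always returns a value in [0, n) for positive n,
            // while the % operator can return a negative result.
            System.out.println(hash
                    + ": mask=" + indexFor(hash, n)
                    + " floorMod=" + Math.floorMod(hash, n)
                    + " %=" + (hash % n));
        }
    }
}
```

The mask agrees with `Math.floorMod` for every int, including negative hash codes, where the plain `%` operator would produce a negative and therefore unusable index.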

Since we compute the modulus with a bitmask ((n - 1) & hash), any bit above the highest bit of n - 1 is ignored. For example, take n = 32 and four hash codes that differ only in their high-order bits. If the mask is applied directly, without transforming the hash codes, all four indexes come out as 1, a 100% collision rate. This is because the mask 31 (n - 1), 0000 0000 0000 0000 0000 0000 0001 1111, leaves every bit of h above position 5 unused. To make use of these high-order bits, HashMap shifts h 16 positions to the right (h >>> 16) and XORs the result with the low-order bits (h ^ (h >>> 16)). As a result, the masked index has fewer collisions.
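The n = 32 case can be worked through in code. The four hash codes below are hypothetical values picked so that their low five bits are all 00001 and only the bits from position 16 upward differ; the class and helper names are also assumptions.

```java
class SpreadWorkedExample {
    // Same transform as the Java 8 HashMap.hash() body.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Index expression used by HashMap for a power-of-two table length n.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 32;
        // Hypothetical hash codes: identical low 16 bits, different high bits.
        int[] hashCodes = {0x00010001, 0x00020001, 0x00030001, 0x00040001};
        for (int h : hashCodes) {
            System.out.printf("h=0x%08X  direct=%d  spread=%d%n",
                    h, indexFor(h, n), indexFor(spread(h), n));
        }
    }
}
```

Masking directly gives index 1 for all four values. After spreading, the indexes are 0, 3, 2 and 5, so each key gets its own bucket.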


Source: https://habr.com/ru/post/1246882/

