Some doubts about HashMap

HashMap is implemented in a very nice way, but understanding how it is implemented is not easy. So I read about HashMap in the Java docs, and I have a few small questions about it:

  • I know that the default HashMap capacity is 16. The Java source notes that the default initial capacity MUST be a power of two. Is there a specific reason for this?
  • I have a rough idea of how HashMap works using hash codes, buckets, and linked lists, if I'm not mistaken. But how does the HashMap grow? That is, how does it manage the number of buckets and the length of each linked list?
  • This may be a dumb question. When we add a new entry to the HashMap, it uses the hash code to go directly to the right bucket, without traversing the way a LinkedList would. Am I right? And another thing: the new element is added at the head of the bucket's list rather than at the tail. Is the reason that adding at the head of the linked list inside the bucket avoids walking to the tail? Am I correct?
3 answers
  • Using powers of two simplifies the implementation and improves its performance.
    For instance, to find the bucket for a hash code it can use hash & (SIZE - 1) instead of abs(hash) % SIZE (see the sketch after this list).

  • To answer this you need to know how HashMap resizes. If the map's size exceeds a threshold determined by the load factor — for example, with a load factor of 0.75 it resizes once the map is 75% full — then, like other collection classes such as ArrayList, HashMap resizes itself by creating a new bucket array twice the size of the previous one and then putting every old entry into that new array. This process is called rehashing, because the hash function is applied again to find each entry's new bucket.

  • Each new element is kept at the head of the linked list to avoid walking to the tail. As a consequence, resizing changes the order of the entries in each chain, and under unsynchronized concurrent access that reordering is what can lead to infinite loops.
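
To make the first bullet concrete, here is a minimal sketch of the masking trick (the class and method names are mine, not the JDK's; Math.floorMod is used rather than abs() so that negative hash codes are handled correctly):

    public class PowerOfTwoIndexDemo {

        // With a power-of-two table length, (length - 1) is an all-ones bit
        // mask, so "hash & (length - 1)" keeps just the low bits of the hash:
        // the same result as a non-negative modulo, but without a division.
        static int indexFor(int hash, int tableLength) {
            return hash & (tableLength - 1);
        }

        public static void main(String[] args) {
            int length = 16;                        // default HashMap capacity
            int hash = "someKey".hashCode();

            int byMask = indexFor(hash, length);
            int byMod = Math.floorMod(hash, length);

            // Always equal when length is a power of two.
            System.out.println(byMask + " == " + byMod);
        }
    }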


  • The reason for requiring the capacity to be a power of two is (I think) mainly to simplify the code. There is a slight performance advantage, but it is close to negligible.

  • This happens as follows:

    • HashMap expands when you try to add a new entry. This happens (roughly) when map.size() > array.length * load_factor. (Refer to the code for details; a simplified sketch also follows after this answer.)

    • When the HashMap expands, the bucket array doubles in size. There is a hard limit ... imposed by the maximum size of arrays in Java. Beyond that, the HashMap array does not expand any further. (Instead, you just get longer and longer hash chains ...)

    • Nothing is done to control the lengths of individual hash chains. Instead, when the HashMap expands, the entries in each old chain are moved to the corresponding chains in the expanded table. (At least in recent implementations, each chain node caches the entry's hash value, so there is no need to recompute the hash function during table expansion.)

  • In principle, yes and yes. New entries are added at the head of each hash chain because that is the most efficient place (in time and space) to add them. Since the order of elements within a hash chain does not mean anything, there is no point in inserting new entries at the tail of the chain. This also means that in a typical HashMap implementation, a table expansion reverses the order of entries in each hash chain (see the sketch below).
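
A tiny sketch of why head insertion is O(1) (an illustrative node type, not the JDK's own entry class):

    // Illustrative singly linked bucket node, similar in spirit to the entry
    // nodes older HashMap versions used -- not the JDK's implementation.
    class BucketNode<K, V> {
        final K key;
        V value;
        BucketNode<K, V> next;

        BucketNode(K key, V value, BucketNode<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }

        // Inserting at the head just links the new node in front of the
        // current head: O(1). Appending at the tail would mean walking the
        // whole chain first, and the chain's order carries no meaning anyway.
        static <K, V> BucketNode<K, V> addAtHead(BucketNode<K, V> head, K key, V value) {
            return new BucketNode<>(key, value, head);
        }
    }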


Note that the actual behavior and implementation details of HashMap differ between Java releases. The only way to be sure is to read the source code of the Java version you are actually using.
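
A simplified model of the growth rule described in this answer (the field names and the exact check are illustrative; the real JDK code also keeps a precomputed threshold and, in newer releases, may turn long chains into trees):

    // Simplified growth rule: resize when size exceeds capacity * loadFactor,
    // doubling the capacity each time. A model of the described behaviour,
    // not the JDK implementation.
    class ResizeModel {
        static final float LOAD_FACTOR = 0.75f;

        int capacity = 16;   // bucket array length, always a power of two
        int size = 0;        // number of entries in the map

        void recordInsert() {
            size++;
            if (size > capacity * LOAD_FACTOR) {
                capacity <<= 1;  // double the bucket array
                // ...the real map would now rehash: move every entry from the
                // old buckets into the buckets of the new, larger array.
            }
        }

        public static void main(String[] args) {
            ResizeModel m = new ResizeModel();
            for (int i = 1; i <= 13; i++) {
                m.recordInsert();
            }
            // 16 * 0.75 = 12, so the 13th insert pushes the capacity to 32.
            System.out.println(m.capacity);   // prints 32
        }
    }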

  • I would guess that the power-of-two requirement is there to speed up bucket selection. If you have 16 buckets and a hash of 578123 or whatever, you can use a simple AND to select the bucket instead of computing 578123 mod 16, which is slower (see the short example after this answer).

  • HashMap has a load factor, which is 0.75 by default. If the number of entries > number of buckets * load factor, the HashMap's capacity is increased to maintain performance. I would guess it simply doubles the number of buckets and redistributes all the elements.

  • Sorry, I'm not sure if I understood this question correctly.
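
A quick check of the numbers used in this answer (plain arithmetic, nothing HashMap-specific):

    public class QuickChecks {
        public static void main(String[] args) {
            // Bucket selection: for a power-of-two bucket count, AND with
            // (count - 1) gives the same result as mod.
            System.out.println(578123 & 15);         // 11
            System.out.println(578123 % 16);         // 11

            // Resize threshold: 16 buckets * 0.75 load factor = 12 entries,
            // so the 13th entry triggers a capacity increase.
            System.out.println((int) (16 * 0.75f));  // 12
        }
    }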
