How are Trove collections more efficient than standard Java collections?

In a recent interview, I was asked how HashMap works in Java. I was able to explain it well, including that in the worst case a HashMap can degenerate into a linked list because of chaining. I was then asked how this could be improved, but I could not come up with an answer during the interview. The interviewer suggested I look at Trove.

I believe he was pointing to this page . I read the description there, but I still cannot understand how it overcomes the limitations of java.util.HashMap.

Even a hint would be appreciated. Thanks!!

+6
source
3 answers

The key phrase is open addressing. Instead of hashing into an array of buckets, each holding a chain of entries, all entries live in one large array. When you add an element and its slot is already in use, you probe along the array until you find a free slot.

As long as the array is kept sufficiently larger than the number of entries, and the hash function distributes keys well, the average lookup time stays small. And because everything is in one array, you get better cache locality.

However, it still has worst-case linear behavior if (say) all the keys hash to the same value, so it does not avoid that problem.
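To make the idea concrete, here is a minimal sketch of an open-addressing int-to-int map with linear probing. This is in the spirit of Trove's primitive maps, not Trove's actual code: the class name, the FREE sentinel, and the lack of resizing are all simplifications for illustration.

```java
import java.util.Arrays;

public class OpenAddressingIntMap {
    // Sentinel marking an unused slot. Assumption of this sketch:
    // Integer.MIN_VALUE is not allowed as a key.
    private static final int FREE = Integer.MIN_VALUE;

    private final int[] keys;    // all keys live in one flat array, no Entry objects
    private final int[] values;
    private int size;

    public OpenAddressingIntMap(int capacity) {
        keys = new int[capacity];
        values = new int[capacity];
        Arrays.fill(keys, FREE);
    }

    // Linear probing: start at the hashed index and walk forward
    // until we find this key or a free slot.
    private int slot(int key) {
        int i = (key & 0x7fffffff) % keys.length;
        while (keys[i] != FREE && keys[i] != key) {
            i = (i + 1) % keys.length;
        }
        return i;
    }

    public void put(int key, int value) {
        if (size >= keys.length - 1)
            throw new IllegalStateException("table full (this sketch does not resize)");
        int i = slot(key);
        if (keys[i] == FREE) size++;
        keys[i] = key;
        values[i] = value;
    }

    public int get(int key, int defaultValue) {
        int i = slot(key);
        return keys[i] == key ? values[i] : defaultValue;
    }

    public static void main(String[] args) {
        OpenAddressingIntMap m = new OpenAddressingIntMap(8);
        m.put(1, 10);
        m.put(9, 90);                      // 9 % 8 == 1: collides with key 1, probes to the next slot
        System.out.println(m.get(1, -1));  // 10
        System.out.println(m.get(9, -1));  // 90
        System.out.println(m.get(2, -1));  // -1 (absent)
    }
}
```

Note that both puts land near index 1 but still resolve in a couple of array reads, with no per-entry node allocation; this is also why a nearly full table (or many keys with the same hash) degrades toward the linear scan mentioned above.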

+6
source

It seems to me that there are two main differences described on the Trove page that improve performance.

The first is the use of open addressing ( http://en.wikipedia.org/wiki/Hash_table#Open_addressing ). This does not eliminate the collision problem, but it does mean there is no need to create an "Entry" object for each element in the map.

The second important difference is the ability to provide your own hash function, separate from the one defined by the key class. This lets you supply a much faster hash function when it makes sense to do so.
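Here is a sketch of that second idea: the map consults a strategy object for hash codes instead of calling the key's own hashCode(). The interface is modeled loosely on Trove's HashingStrategy, but the names here are illustrative assumptions, not Trove's API. The example strategy hashes only a String's first four characters, which is cheaper than String.hashCode() when keys are long and differ early, at the cost of collisions for keys sharing a prefix.

```java
public class HashStrategyDemo {
    // Illustrative strategy interface (names are assumptions, not Trove's API):
    // the map would call these instead of key.hashCode()/key.equals().
    interface HashStrategy<K> {
        int computeHash(K key);
        boolean keysEqual(K a, K b);
    }

    // Hashes only the first 4 characters. Faster than String.hashCode()
    // for long keys that differ early; keys sharing a prefix will collide.
    static final HashStrategy<String> PREFIX_HASH = new HashStrategy<String>() {
        public int computeHash(String s) {
            int h = 0;
            for (int i = 0, n = Math.min(4, s.length()); i < n; i++) {
                h = 31 * h + s.charAt(i);
            }
            return h;
        }
        public boolean keysEqual(String a, String b) {
            return a.equals(b);
        }
    };

    public static void main(String[] args) {
        // Same 4-char prefix -> same hash (a deliberate collision in this sketch)
        System.out.println(PREFIX_HASH.computeHash("alpha-one")
                        == PREFIX_HASH.computeHash("alpha-two"));  // true
        // Different prefixes -> different hashes here
        System.out.println(PREFIX_HASH.computeHash("beta")
                        == PREFIX_HASH.computeHash("gamma"));      // false
    }
}
```

Whether such a trade-off pays off depends entirely on the data set, which is exactly why the standard HashMap, tied to the key's own hashCode(), cannot make this choice for you.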

+4
source

One of the advantages of Trove is that it avoids object creation, especially for primitives. For large hash tables on an embedded Java device, this can be beneficial because of the lower memory consumption.
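The object creation being avoided is autoboxing: a standard HashMap cannot hold an int directly, so every put() and get() goes through Integer wrappers, plus one Entry node per mapping. A quick illustration with plain java.util.HashMap:

```java
import java.util.HashMap;
import java.util.Map;

public class BoxingDemo {
    public static void main(String[] args) {
        Map<Integer, Integer> m = new HashMap<>();
        for (int i = 0; i < 3; i++) {
            // Autoboxing: each int argument is wrapped in an Integer.
            // (Values in [-128, 127] come from the Integer cache; larger
            // values allocate a fresh object on each boxing.)
            m.put(i, i * i);
        }
        int v = m.get(2);      // unboxing on the way out
        System.out.println(v); // 4
    }
}
```

A Trove-style primitive map stores the raw ints in arrays instead, so none of these wrapper or entry objects exist.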

Another benefit I've seen is the use of custom hash codes/functions without needing to override hashCode(). For a particular data set, and with someone expert at writing hash functions, this can be an advantage.

+4
source

Source: https://habr.com/ru/post/958209/

