Overriding hashCode in Java for a specific case

Question

I know there are other questions about general best practices when overriding hashCode and equals, but I have a very specific question.

I have a class that has an instance variable, an array of the same class. To be more explicit, here is the code:

class Node { Node[] arr = new Node[5]; }

I need to override hashCode for the Node class, and the array is an important factor in deciding whether two Nodes are equal. How can I effectively include the array in the hashCode calculation?

- Edit -

I am trying to check whether two nodes are equal, meaning that they have the same number of children and that those children lead to the same states. So I am effectively trying to compare the subtrees of two nodes. I am wondering if I can use hashing to perform this equality check.

I think I really need to hash the entire subtree, but I'm not sure how I would do it, given the recursive nature of my class definition.

java hashcode overwrite

efficiencyIsBliss May 04 '11 at 17:32

5 answers

Joseph Ottinger · Answer 1 · 2011-05-04T17:35:01+0000

Include Arrays.hashCode(Object[]) (http://download.oracle.com/javase/6/docs/api/java/util/Arrays.html#hashCode(java.lang.Object[])) as part of your hashCode() implementation.
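A minimal sketch of what that could look like. The label field is a hypothetical extra attribute added only to show the array hash being one part of a larger hashCode(); the question's Node has just the array. It assumes the nodes form a tree or DAG (no cycles), and a real class would also override equals to match.

```java
import java.util.Arrays;
import java.util.Objects;

// Sketch only: "label" is a hypothetical attribute, "arr" follows the question.
class LabeledNode {
    final String label;
    final LabeledNode[] arr = new LabeledNode[5];

    LabeledNode(String label) {
        this.label = label;
    }

    @Override
    public int hashCode() {
        // Arrays.hashCode(Object[]) calls hashCode() on every non-null element
        // (null slots count as 0), so this recurses through the whole subtree.
        return 31 * Objects.hashCode(label) + Arrays.hashCode(arr);
    }

    public static void main(String[] args) {
        LabeledNode a = new LabeledNode("root");
        LabeledNode b = new LabeledNode("root");
        a.arr[0] = new LabeledNode("leaf");
        b.arr[0] = new LabeledNode("leaf");
        System.out.println(a.hashCode() == b.hashCode()); // true: same structure, same hash
    }
}
```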

Jverstry · Answer 2 · 2011-05-04T18:04:21+0000

I am trying to check whether two nodes are equal, meaning that they have the same number of children and that those children lead to the same states. So I am effectively trying to compare the subtrees of two nodes. I am wondering if I can use hashing to perform this equality check.

No, hashing should not be used to verify equality. That is not its purpose. At best it can tell you that two objects are different; it cannot tell you that they are equal.

Equal objects always produce the same hash value, but two objects that are not equal can also produce the same hash. In other words, if the hash values differ, you know for certain that the objects differ. That is all.
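A small illustration of that point, using two strings whose hash codes are well known to collide. The helper name collides is made up for this example.

```java
class HashCollisionDemo {
    // True when two objects share a hash code even though equals() says they differ.
    static boolean collides(Object a, Object b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        // String.hashCode() is specified as s[0]*31^(n-1) + ... + s[n-1]:
        // "Aa" -> 65*31 + 97 = 2112 and "BB" -> 66*31 + 66 = 2112.
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println(collides("Aa", "BB")); // true
    }
}
```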

If you want to test equality, you need to implement equals. In your case, there is a danger that the method will recurse and cause a stack overflow. What if your object contains a reference to itself?

If you want to generate a hash, you can take the size of the array into account (and whether it is null or not), but I would not use the hash values of the objects in the array, because of potential infinite loops. It is not perfect, but it is good enough.

There is another, more radical approach that can also give good results. Instead of computing hash values dynamically, assign a random int to each Node instance (that is, generate it once at creation and always return that value). That way you can use the hash values of the instances in your array without risking endless loops.

If the hashes are equal, then you will need to start comparing the instances in the arrays.

REM: If the nodes contain other attributes, compute the hash from those attributes and forget about the array. Start examining the contents or size of the array if and only if the hashes of the two objects are identical.

REM2: The comments mention that the graph is a DAG, which means we will not run into recursion problems. However, this condition alone is not enough to guarantee that deepHashCode() will succeed. Moreover, it would be overkill. There is a more efficient way to solve this problem.

If the hash method used by Node uses only the array to compute the hash value, then deepHashCode() may work, but it would be inefficient. If the hash method uses other Node attributes, then those attributes must also be equal.

There is a faster way to compare nodes for equality. Label each Node instance with a unique number. Then, to compare two nodes, first compare their array sizes. If those are equal, compare the nodes in each array using their unique numbers. If one array does not contain the corresponding node from the other, the two nodes are not equal. This is much faster than the recursive solution.
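One possible reading of that idea, as a sketch. The names IdNode, id and sameChildren are invented for illustration. Note the assumption it rests on: two nodes only compare as equal here when their arrays hold the very same child instances, so it fits the question only if identical subtrees are shared rather than duplicated.

```java
import java.util.concurrent.atomic.AtomicLong;

class IdNode {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    final long id = NEXT_ID.getAndIncrement(); // unique per instance
    final IdNode[] arr;

    IdNode(int size) {
        arr = new IdNode[size];
    }

    // Shallow comparison: same array length and the same child id in every slot.
    static boolean sameChildren(IdNode a, IdNode b) {
        if (a.arr.length != b.arr.length) {
            return false;
        }
        for (int i = 0; i < a.arr.length; i++) {
            IdNode x = a.arr[i];
            IdNode y = b.arr[i];
            if (x == null || y == null) {
                if (x != y) {
                    return false; // one slot is empty, the other is not
                }
            } else if (x.id != y.id) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IdNode shared = new IdNode(5);
        IdNode a = new IdNode(5);
        IdNode b = new IdNode(5);
        a.arr[0] = shared;
        b.arr[0] = shared;
        System.out.println(sameChildren(a, b)); // true: both point at the same child
    }
}
```

No recursion is involved, so cycles are harmless, but structurally identical subtrees built from separate instances are reported as different.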

Jon Skeet · Answer 3 · 2011-05-04T17:35:27+0000

It depends on what your equality criteria are. Is array order important? If so, you probably want the hash code to depend on the order of the nodes in the array. If not, you can do something like XOR-ing the hash codes of all the nodes in the array. Presumably some of the values may be null (so be careful).
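A rough sketch of the order-independent variant. The value field is a hypothetical payload added so that the hashes are not all identical; the question's Node has only the array. A matching equals would also have to ignore order, which is not shown here.

```java
class UnorderedNode {
    final int value; // hypothetical payload, not part of the question's class
    final UnorderedNode[] arr = new UnorderedNode[5];

    UnorderedNode(int value) {
        this.value = value;
    }

    @Override
    public int hashCode() {
        int h = 0;
        for (UnorderedNode child : arr) {
            // XOR is commutative, so the slot a child sits in does not matter.
            // Null slots are skipped, which is the same as XOR-ing in 0.
            if (child != null) {
                h ^= child.hashCode();
            }
        }
        return 31 * h + value;
    }

    public static void main(String[] args) {
        UnorderedNode a = new UnorderedNode(0);
        a.arr[0] = new UnorderedNode(1);
        a.arr[1] = new UnorderedNode(2);

        UnorderedNode b = new UnorderedNode(0);
        b.arr[3] = new UnorderedNode(2);
        b.arr[4] = new UnorderedNode(1);

        System.out.println(a.hashCode() == b.hashCode()); // true: same children, different slots
    }
}
```

One known weakness of plain XOR: two children with the same hash cancel each other out, so a node with two identical children hashes like a node with none.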

Basically, you need to override hashCode and equals consistently, so that if two objects are equal, they have the same hash code. That is the golden rule.

Eric Lippert has a great blog post about GetHashCode in .NET; the advice applies equally well to Java.

One potential problem to be aware of: if you end up with a cycle in your nodes (a reference to node A appearing in the array of node B and vice versa), you could end up with a cycle when computing the hash code too.

Yasin bahtiyar · Answer 4 · 2011-05-04T17:37:17+0000

You can use the Arrays.hashCode() and Arrays.equals() methods.
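As a sketch, the two calls pair up like this. The class name TreeNode is made up; the arr field follows the question. It assumes the nodes never form a cycle, since both methods recurse into the children.

```java
import java.util.Arrays;

class TreeNode {
    final TreeNode[] arr = new TreeNode[5];

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof TreeNode)) {
            return false;
        }
        // Arrays.equals checks the lengths, then compares each pair of elements
        // with equals() (two nulls count as equal), walking both subtrees in step.
        return Arrays.equals(arr, ((TreeNode) o).arr);
    }

    @Override
    public int hashCode() {
        // Built from the same data as equals, so equal nodes get equal hashes.
        return Arrays.hashCode(arr);
    }

    public static void main(String[] args) {
        TreeNode a = new TreeNode();
        TreeNode b = new TreeNode();
        a.arr[0] = new TreeNode();
        b.arr[0] = new TreeNode();
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```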

Dilum rananunga · Answer 5 · 2011-05-04T19:31:45+0000

A few words to add to the existing answers, in case performance is a concern.

First, you need to decide whether the order of the child nodes within a node matters. If it does not, you cannot use the array's hash code. Consider a hash code function like the one defined by java.util.Set. Also consider keeping the children in some internal order to improve equals performance. For example, if the subtree depths/heights differ, you could sort by depth.

Secondly, if your subtrees are deep, computing the hash code can become very expensive. Therefore, I would cache the hash code: either compute it during construction (if your node is immutable), or invalidate it on mutation and recompute it on demand.
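A sketch of the invalidate-on-mutation variant. The parent back-pointer, setChild and the computations counter are all hypothetical additions; the sketch assumes a strict tree in which each node has at most one parent, and leaves equals out to keep it short.

```java
import java.util.Arrays;

class CachingNode {
    private final CachingNode[] arr = new CachingNode[5];
    private CachingNode parent;  // hypothetical back-pointer, assumes a strict tree
    private int cachedHash;
    private boolean hashValid;   // false until the hash has been computed
    int computations;            // only here to make the caching observable

    void setChild(int i, CachingNode child) {
        if (arr[i] != null) {
            arr[i].parent = null; // detach the child being replaced
        }
        arr[i] = child;
        if (child != null) {
            child.parent = this;
        }
        // A change here alters the hash of this node and of every ancestor.
        for (CachingNode n = this; n != null; n = n.parent) {
            n.hashValid = false;
        }
    }

    @Override
    public int hashCode() {
        if (!hashValid) {
            cachedHash = Arrays.hashCode(arr); // recurses into the children
            hashValid = true;
            computations++;
        }
        return cachedHash;
    }

    public static void main(String[] args) {
        CachingNode root = new CachingNode();
        root.setChild(0, new CachingNode());
        root.hashCode();
        root.hashCode();
        System.out.println(root.computations); // 1: the second call used the cache
    }
}
```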

Third, if your subtrees are deep, check the hash code in equals() and return false early. Yes, Map implementations already check the hash code, but there are places where code simply compares two objects with equals(), and those can pay a heavy price.
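A sketch of the early exit, using an immutable node so the hash can be computed once in the constructor. The class name FrozenNode and the varargs constructor are assumptions made for this example.

```java
import java.util.Arrays;

final class FrozenNode {
    private final FrozenNode[] arr;
    private final int hash; // computed once; safe because the node never changes

    FrozenNode(FrozenNode... children) {
        arr = children.clone();      // defensive copy keeps the node immutable
        hash = Arrays.hashCode(arr); // cheap: each child already stores its own hash
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof FrozenNode)) {
            return false;
        }
        FrozenNode other = (FrozenNode) o;
        // Different hashes prove the subtrees differ, so skip the deep walk.
        if (hash != other.hash) {
            return false;
        }
        // Equal hashes prove nothing, so the full comparison is still needed.
        return Arrays.equals(arr, other.arr);
    }

    public static void main(String[] args) {
        FrozenNode a = new FrozenNode(new FrozenNode(), null);
        FrozenNode b = new FrozenNode(new FrozenNode(), null);
        FrozenNode c = new FrozenNode(null, new FrozenNode());
        System.out.println(a.equals(b)); // true
        System.out.println(a.equals(c)); // false, decided by the hash check alone
    }
}
```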

Finally, consider using Arrays.asList() (if the order matters) or a HashSet (if the order does not matter and no two children are equal) instead of a plain array. Then equals and hashCode come down to delegating the call to the container instance ... with the appropriate hash code caching, of course.
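A sketch of the ordered case with Arrays.asList(). The class and field names are invented, and the hash code caching mentioned above is left out to keep it short.

```java
import java.util.Arrays;
import java.util.List;

class DelegatingNode {
    // Fixed-size list view over a 5-slot array; set(i, x) works, add() does not.
    final List<DelegatingNode> children = Arrays.asList(new DelegatingNode[5]);

    @Override
    public boolean equals(Object o) {
        // List.equals compares sizes and then elements pairwise, nulls included.
        return o instanceof DelegatingNode
                && children.equals(((DelegatingNode) o).children);
    }

    @Override
    public int hashCode() {
        return children.hashCode();
    }

    public static void main(String[] args) {
        DelegatingNode a = new DelegatingNode();
        DelegatingNode b = new DelegatingNode();
        a.children.set(0, new DelegatingNode());
        b.children.set(0, new DelegatingNode());
        System.out.println(a.equals(b)); // true
    }
}
```

Swapping the list for a HashSet would give order-independent equals and hashCode in the same way, under the no-duplicate-children condition stated above.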
