A hash set works with "buckets". It stores values in these buckets according to their hash code. A bucket can hold several members, as long as those members are not equal to one another according to the equals(Object) method.
So, for the sake of argument, let's say we build a hash set with 10 buckets and add the integers 1, 2, 3, 5, 7, 11, and 13 to it. The hash code for an int is just the int itself. We end up with something like this:
- (empty)
- 1, 11
- 2
- 3, 13
- (empty)
- 5
- (empty)
- 7
- (empty)
- (empty)
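The layout above can be reproduced with a small sketch. It assumes the simplified model used here (bucket index = hash code modulo bucket count); the real java.util.HashSet sizes its table and picks buckets differently. The names BucketSketch, bucketFor, and layout are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BucketSketch {
    // Hypothetical helper: maps a hash code to a bucket index by plain modulo.
    // This is the simplified model from the text, not HashSet's actual rule.
    static int bucketFor(int hashCode, int bucketCount) {
        return Math.floorMod(hashCode, bucketCount);
    }

    // Distributes the values over bucketCount buckets, in insertion order.
    static List<List<Integer>> layout(int[] values, int bucketCount) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int v : values) {
            // Integer.hashCode(v) is simply v.
            buckets.get(bucketFor(Integer.hashCode(v), bucketCount)).add(v);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<List<Integer>> buckets = layout(new int[] {1, 2, 3, 5, 7, 11, 13}, 10);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println(i + ": " + buckets.get(i));
        }
    }
}
```

Bucket 1 ends up holding 1 and 11, and bucket 3 holds 3 and 13, matching the list above.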
The traditional way to use a set is to check whether a member is in it. So when we ask, "Is 11 in this set?", the hash set takes 11 modulo 10, gets 1, and looks in the second bucket (we number our buckets from 0, of course).
This makes it really quick to check whether a member belongs to the set or not. If we add another 11, the hash set looks to see whether it is already there, and if so, it does not add it again. It uses the equals(Object) method to decide that, and, of course, 11 equals 11.
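As a minimal sketch of that behavior with the standard java.util.HashSet (the class name DuplicateIntDemo and the helper build are made up for this example):

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateIntDemo {
    // Builds the example set of integers from the text.
    static Set<Integer> build() {
        Set<Integer> set = new HashSet<>();
        for (int v : new int[] {1, 2, 3, 5, 7, 11, 13}) {
            set.add(v);
        }
        return set;
    }

    public static void main(String[] args) {
        Set<Integer> set = build();
        System.out.println(set.contains(11)); // true: 11 is found in its bucket
        System.out.println(set.add(11));      // false: an equal member is already present
        System.out.println(set.size());       // 7: the duplicate was not stored
    }
}
```

Set.add returns false when the set already contains an equal member, which is a convenient way to observe the duplicate check.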
The hash code for a string like "abc" depends on the characters in that string. When you add a duplicate string "abc", the hash set looks in the right bucket and then uses the equals(Object) method to see whether the member is already there. The equals(Object) method for a string also depends on the characters, so "abc" equals "abc".
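A short sketch of the string case, using two separately created String objects so that only the characters are shared (StringSetDemo and build are illustrative names, not part of any API):

```java
import java.util.HashSet;
import java.util.Set;

class StringSetDemo {
    // Adds two distinct String objects with the same characters and returns the set.
    static Set<String> build() {
        Set<String> set = new HashSet<>();
        set.add("abc");
        set.add(new String("abc")); // a different object, same characters
        return set;
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = new String("abc");
        System.out.println(a == b);                       // false: two different objects
        System.out.println(a.equals(b));                  // true: same characters
        System.out.println(a.hashCode() == b.hashCode()); // true: hash depends on characters
        System.out.println(build().size());               // 1: the second add was rejected
    }
}
```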
When you use a StringBuffer, however, each StringBuffer has a hash code and equality based on its object identity. It does not override the basic equals(Object) and hashCode() methods, so every StringBuffer looks like a different object to the hash set. They are not actually duplicates.
When you print the StringBuffers, you call the toString() method on each of them. That makes them look like duplicate strings, which is why you see that output.
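The StringBuffer case can be sketched the same way. The class and method names below are hypothetical, and the contents "abc" are an assumed stand-in for whatever the original program stored:

```java
import java.util.HashSet;
import java.util.Set;

class StringBufferSetDemo {
    // Adds two StringBuffers with identical contents and returns the set.
    static Set<StringBuffer> build() {
        Set<StringBuffer> set = new HashSet<>();
        set.add(new StringBuffer("abc"));
        set.add(new StringBuffer("abc")); // same contents, but a different object
        return set;
    }

    public static void main(String[] args) {
        Set<StringBuffer> buffers = build();
        System.out.println(buffers.size()); // 2: identity-based equals, so both are kept
        System.out.println(buffers);        // [abc, abc]: toString() makes them look alike

        // Converting to String first restores content-based equality.
        Set<String> strings = new HashSet<>();
        for (StringBuffer sb : buffers) {
            strings.add(sb.toString());
        }
        System.out.println(strings.size()); // 1
    }
}
```

If the goal is a set of distinct texts, storing the String values rather than the StringBuffers themselves is one way to get the expected result.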
This is also why it is very important to override hashCode() whenever you override equals(Object); otherwise the Set may look in the wrong bucket, and you will get very strange and unpredictable behavior!
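For illustration, here is a minimal sketch of a class that overrides both methods consistently. Name is a made-up value class, not something from the question; the point is only that equals(Object) and hashCode() are derived from the same field.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class: two Names are equal when their text is equal.
final class Name {
    private final String value;

    Name(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Name)) return false;
        return Objects.equals(value, ((Name) o).value);
    }

    // Derived from the same field as equals, so equal Names share a hash code.
    @Override
    public int hashCode() {
        return Objects.hashCode(value);
    }
}

class NameDemo {
    // Adds two equal Names and returns the set.
    static Set<Name> build() {
        Set<Name> set = new HashSet<>();
        set.add(new Name("abc"));
        set.add(new Name("abc"));
        return set;
    }

    public static void main(String[] args) {
        Set<Name> set = build();
        System.out.println(set.size());                    // 1
        System.out.println(set.contains(new Name("abc"))); // true
    }
}
```

Without the hashCode() override, the two equal Name objects would normally get different identity-based hash codes, so the set would usually end up storing both.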