TreeSet shows incorrect output

While working with a set of trees, I found a very peculiar behavior. In my understanding, this program should print two identical lines:

public class TestSet { static void test(String... args) { Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER); s.addAll(Arrays.asList("a", "b")); s.removeAll(Arrays.asList(args)); System.out.println(s); } public static void main(String[] args) { test("A"); test("A", "C"); } } 

but strangely he prints:

 [b] [a, b] 

Why does a tree work like this?

+43
java treeset
May 11 '17 at 13:19
source share
4 answers

This is because Sort ComparSets Comparator is used for sorting, but removeAll relies on the equals method for each element. From the SortedSet Documentation :

Note that the order supported by the sorted set (be it an explicit comparator) must be matched if the sorted set must correctly implement the Set interface. (See the Comparable or Comparator interface for an exact definition of matching with equals.) This is because the Set interface is defined in terms of the equals operation, but the sorted set performs all element comparisons using the compareTo (or compare ) method, therefore, two elements that are considered equal this method are equal in terms of sorted set. The behavior of a sorted set is well defined, even if its ordering does not match the equal; it simply does not obey the general contract of the Set interface.

The explanation "consistent with equals" is defined in Comparable Documentation :

A natural ordering for a class C is called consistent with equalities if and only if e1.compareTo(e2) == 0 has the same Boolean value as e1.equals(e2) for each e1 and e2 class C Note: null not an instance of any class, and e.compareTo(null) should e.equals(null) NullPointerException , although e.equals(null) returns false .

It is strongly recommended (though not necessary) that natural orders correspond to equalities. This is because sorted sets (and sorted cards) without explicit comparators behave โ€œweirdlyโ€ when they are used with elements (or keys) whose natural ordering is incompatible with equal ones. In particular, such a sorted set (or sorted map) violates the general contract for the set (or map), which is defined in terms of the equals method.

In conclusion, your Sets Comparator behaves differently than the equals method, which causes unusual (albeit predictable) behavior.

+41
May 11 '17 at 13:35
source share
โ€” -

Well, that surprised me, I donโ€™t know if I am right, but look at this implementation in AbstractSet :

 public boolean removeAll(Collection<?> c) { Objects.requireNonNull(c); boolean modified = false; if (size() > c.size()) { for (Iterator<?> i = c.iterator(); i.hasNext(); ) modified |= remove(i.next()); } else { for (Iterator<?> i = iterator(); i.hasNext(); ) { if (c.contains(i.next())) { i.remove(); modified = true; } } } return modified; } 

Basically in your example, the size of the set is equal to the size of the arguments you want to remove, so the else condition is called. There is a check in this condition if your collection of arguments removes contains current element of the iterator, and this check is case sensitive, so it checks if c.contains("a") returns false because c contains "A" , not "A" , so the item is not deleted. Note that when you add an item to your set s.addAll(Arrays.asList("a", "b", "d")); , it works correctly because size() > c.size() now true, therefore there is no longer contains .

+15
May 11 '17 at 1:33 pm
source share

To add some information on why remove from TreeSet actually removes case insensitive in your example (and assuming you follow the path if (size() > c.size()) as described in @Shadov's answer):

This is the remove method in TreeSet :

 public boolean remove(Object o) { return m.remove(o)==PRESENT; } 

it calls remove from its internal TreeMap :

 public V remove(Object key) { Entry<K,V> p = getEntry(key); if (p == null) return null; V oldValue = p.value; deleteEntry(p); return oldValue; } 

which calls getEntry

  final Entry<K,V> getEntry(Object key) { // Offload comparator-based version for sake of performance if (comparator != null) return getEntryUsingComparator(key); if (key == null) throw new NullPointerException(); @SuppressWarnings("unchecked") Comparable<? super K> k = (Comparable<? super K>) key; Entry<K,V> p = root; while (p != null) { int cmp = k.compareTo(p.key); if (cmp < 0) p = p.left; else if (cmp > 0) p = p.right; else return p; } return null; } 

If there is a Comparator (as in your example), the search is performed based on this Comparator (this is done by getEntryUsingComparator ), so it is actually found (then deleted), despite the fact that the case is the difference.

+3
May 11 '17 at 14:26
source share

This is interesting, so here are a few tests with the output:

 static void test(String... args) { Set<String> s =new TreeSet<String>(String.CASE_INSENSITIVE_ORDER); s.addAll(Arrays.asList( "a","b","c")); s.removeAll(Arrays.asList(args)); System.out.println(s); } public static void main(String[] args) { test("C"); output: [a, b] test("C", "A"); output: [b] test("C", "A","B"); output: [a, b, c] test("B","C","A"); output: [a, b, c] test("K","C"); output: [a, b] test("C","K","M"); output: [a, b, c] !! test("C","K","A"); output: [a, b, c] !! } 

Now, without a comparator, it works just like a sorted HashSet<String>() :

  static void test(String... args) { Set<String> s = new TreeSet<String>();// s.addAll(Arrays.asList( "a","b","c")); s.removeAll(Arrays.asList(args)); System.out.println(s); } public static void main(String[] args) { test("c"); output: [a, b] test("c", "a"); output: [b] test("c", "a","b"); output: [] test("b","c","a"); output: [] test("k","c"); output: [a, b] test("c","k","m"); output: [a, b] test("c","k","m"); output: [a, b] } 

Now from the documentation:

public boolean removeAll (collection c)

Removes from this set all its elements that are contained in the specified collection (optional operation). If the specified collection is also a set, this operation effectively modifies this set so that its value is the asymmetric difference of the sets of two sets.

This implementation determines which one is smaller and the specified collection by calling the size method for each. If this set has fewer elements, then the implementation iterates over this, set it, checking each element returned by the iterator, in turn, to ensure that it is contained in the specified collection. If so, it is removed from this set using the iterator removal method. If the specified collection has fewer elements, then the implementation iterates over the specified collection, removing from this set every element returned by the iterator using this set deletion method.

Source

0
May 11 '17 at 14:19
source share



All Articles