Should I use distinct() with collect(toSet())?

When collecting the elements of a stream into a set, is there any advantage (or disadvantage) to also calling .distinct() on the stream? For example:

return items.stream().map(...).distinct().collect(toSet());

Given that the set will already remove duplicates, this seems redundant, but does it offer any performance advantage or disadvantage? Does the answer depend on whether the stream is parallel/sequential or ordered/unordered?

+4
3 answers

According to the javadoc, distinct is a stateful intermediate operation.

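If you have a .distinct immediately followed by a .collect, it adds no real benefit. Perhaps the .distinct implementation is more performant than the Set's own duplicate checks, but if you are collecting into a set, you end up with the same result either way.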

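If, on the other hand, the .distinct comes before your .map operation, and the mapping is particularly expensive, you may see some gain, because you process less data overall.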
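To make the point about placing distinct() ahead of an expensive map concrete, here is a minimal sketch that counts how often the mapper runs. The class name, the expensiveLength helper and the sample input are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DistinctBeforeMap {
    // Hypothetical stand-in for an expensive mapping step; it counts its own calls.
    static int expensiveLength(String s, AtomicInteger calls) {
        calls.incrementAndGet();
        return s.length();
    }

    // Runs a sequential pipeline and returns how many times the mapper was invoked.
    static int mapperCalls(List<String> items, boolean distinctFirst) {
        AtomicInteger calls = new AtomicInteger();
        Stream<String> stream = items.stream();
        if (distinctFirst) {
            stream = stream.distinct(); // drop duplicates before the costly step
        }
        stream.map(s -> expensiveLength(s, calls)).collect(Collectors.toSet());
        return calls.get();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "bb", "a", "bb", "ccc");
        System.out.println(mapperCalls(items, false)); // prints 5
        System.out.println(mapperCalls(items, true));  // prints 3
    }
}
```

With five inputs of which three are distinct, the mapper runs five times without distinct() and three times with it. Whether that saving outweighs the cost of distinct() itself depends on how expensive the mapping really is.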

+3

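While you get the same result, the two do not do the same thing: toSet() uses a HashSet, so you lose the original ordering, which distinct() can preserve if you need it: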

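From the javadoc:

Preserving stability for distinct() in parallel pipelines is relatively expensive (requires that the operation act as a full barrier, with substantial buffering overhead), and stability is often not needed. Using an unordered stream source (such as generate(Supplier)) or removing the ordering constraint with BaseStream.unordered() may result in significantly more efficient execution for distinct() in parallel pipelines, if the semantics of your situation permit. If consistency with encounter order is required, and you are experiencing poor performance or memory utilization with distinct() in parallel pipelines, switching to sequential execution with BaseStream.sequential() may improve performance.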

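So if you do not care about order, you do not need distinct(). toSet() already guarantees that there are no duplicates (that is part of the Set API).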
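If distinct() is kept in a parallel pipeline anyway, the javadoc's advice about dropping the ordering constraint can be sketched as follows. The class and method names are hypothetical; only unordered(), distinct() and Collectors.toSet() come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class UnorderedDistinct {
    // Removes the ordering constraint before distinct(), which the javadoc says
    // may make distinct() cheaper in parallel pipelines. Collecting into a Set
    // discards encounter order anyway, so nothing observable is lost here.
    static Set<Integer> distinctToSet(List<Integer> items) {
        return items.parallelStream()
                .unordered()
                .distinct()
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(distinctToSet(Arrays.asList(3, 1, 3, 2, 1)));
    }
}
```

The resulting set has the same contents as collecting without distinct(); any difference would be in running time only, and would need to be measured.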

It can matter if you have objects whose equals does not compare every field, for example:

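class F {
  int a;
  int b;

  F(int a, int b) { this.a = a; this.b = b; }

  // Only a takes part in hashCode and equals; b is ignored.
  @Override public int hashCode() { return Integer.hashCode(a); }

  @Override public boolean equals(Object other) {
    if (other == this) return true;
    if (!(other instanceof F)) return false;
    return a == ((F) other).a;
  }
}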

If you have a = F(10, 1) and b = F(10, 2), they are equal, but not all of their fields are equal.

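Suppose the stream yields them in the order (b, a):

  • toSet() makes no promise about which of the two equal elements is kept. For (b, a) the result could hold either one.
  • distinct() on an ordered stream keeps the first occurrence: for (b, a) that is b.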


Note: the same applies to a TreeSet and compareTo.
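The ordered case can be checked with a small sketch built on the F class from this answer. It relies on the documented guarantee that, for ordered streams, distinct() preserves the element appearing first in encounter order. The wrapper class and the keptB helper are hypothetical names introduced only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DistinctKeepsFirst {
    // Same shape as F above: equality looks at a only, b is ignored.
    public static final class F {
        final int a;
        final int b;
        public F(int a, int b) { this.a = a; this.b = b; }
        @Override public int hashCode() { return Integer.hashCode(a); }
        @Override public boolean equals(Object other) {
            if (other == this) return true;
            if (!(other instanceof F)) return false;
            return a == ((F) other).a;
        }
    }

    // Runs distinct() on an ordered, sequential stream and reports the b field
    // of the single surviving element.
    static int keptB(F... items) {
        List<F> kept = Arrays.stream(items).distinct().collect(Collectors.toList());
        return kept.get(0).b;
    }

    public static void main(String[] args) {
        F a = new F(10, 1);
        F b = new F(10, 2);
        System.out.println(keptB(b, a)); // prints 2: b came first, so b survives
        System.out.println(keptB(a, b)); // prints 1: a came first, so a survives
    }
}
```

The sketch collects into a list rather than a set so that the survivor can be inspected directly. It says nothing about which element toSet() alone would keep, since that is not specified.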

+1

distinct() will call equals/hashCode to weed out duplicate elements, and later toSet() will do the same (after distinct() this is no longer necessary, but toSet() cannot know that), so basically you are just duplicating the calls. This should perform worse, IMO. It is also quite easy to measure.
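One rough way to see the duplicated work, short of a proper benchmark, is to count hashCode() invocations. The sketch below uses a hypothetical Key type with a shared counter. The exact counts depend on JDK implementation details, so only the direction of the difference should be read from it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CountHashCalls {
    // Shared counter of hashCode() invocations (illustrative instrumentation only).
    static final AtomicInteger HASH_CALLS = new AtomicInteger();

    // Hypothetical key type that records every time it is hashed.
    static final class Key {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() {
            HASH_CALLS.incrementAndGet();
            return Integer.hashCode(id);
        }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    // Runs the pipeline once and returns how many hashCode() calls it caused.
    static int hashCalls(boolean useDistinct) {
        List<Key> items = Arrays.asList(
                new Key(1), new Key(2), new Key(1), new Key(2), new Key(3));
        HASH_CALLS.set(0);
        Stream<Key> stream = items.stream();
        if (useDistinct) {
            stream = stream.distinct();
        }
        stream.collect(Collectors.toSet());
        return HASH_CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println("toSet() only:         " + hashCalls(false));
        System.out.println("distinct() + toSet(): " + hashCalls(true));
    }
}
```

Counting calls shows the extra hashing but not its cost in time. For actual timings a benchmark harness would be the more reliable tool.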

0

Source: https://habr.com/ru/post/1666511/

