Performance for Java Stream.concat VS Collection.addAll

In order to combine two data sets into one, I can use either:

Stream.concat(stream1, stream2).collect(Collectors.toSet());

Or

stream1.collect(Collectors.toSet())
       .addAll(stream2.collect(Collectors.toSet()));

Which is more efficient and why?
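For concreteness, here is a minimal self-contained sketch of the two variants (the class and method names are made up for this example); both produce the same combined set:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: both methods combine two sets of strings.
public class ConcatVsAddAll {
    static Set<String> viaConcat(Set<String> a, Set<String> b) {
        // Single pass: concatenate lazily, collect once.
        return Stream.concat(a.stream(), b.stream()).collect(Collectors.toSet());
    }

    static Set<String> viaAddAll(Set<String> a, Set<String> b) {
        // Two collects, then merge the second set into the first.
        Set<String> result = a.stream().collect(Collectors.toSet());
        result.addAll(b.stream().collect(Collectors.toSet()));
        return result;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("x", "y"));
        Set<String> b = new HashSet<>(Arrays.asList("y", "z"));
        System.out.println(viaConcat(a, b).equals(viaAddAll(a, b))); // true
    }
}
```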

+4

5 answers

For readability and intention, Stream.concat(a, b).collect(toSet()) is far more understandable than the second option.

As for the question of which is more efficient, here is a JMH benchmark (a caveat: I don't use JMH that much, so there may be room to improve this test):

package stackoverflow;

import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@Warmup(iterations = 2)
@Fork(1)
@Measurement(iterations = 10)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode({ Mode.AverageTime})
public class StreamBenchmark {
  private Set<String> s1;
  private Set<String> s2;

  @Setup
  public void setUp() {
    final Set<String> valuesForA = new HashSet<>();
    final Set<String> valuesForB = new HashSet<>();
    for (int i = 0; i < 1000; ++i) {
      valuesForA.add(Integer.toString(i));
      valuesForB.add(Integer.toString(1000 + i));
    }
    s1 = valuesForA;
    s2 = valuesForB;
  }

  @Benchmark
  public void stream_concat_then_collect_using_toSet(final Blackhole blackhole) {
    final Set<String> set = Stream.concat(s1.stream(), s2.stream()).collect(Collectors.toSet());
    blackhole.consume(set);
  }

  @Benchmark
  public void s1_collect_using_toSet_then_addAll_using_toSet(final Blackhole blackhole) {
    final Set<String> set = s1.stream().collect(Collectors.toSet());
    set.addAll(s2.stream().collect(Collectors.toSet()));
    blackhole.consume(set);
  }
}

You get these results (I omitted part of the output for readability):

Result "s1_collect_using_toSet_then_addAll_using_toSet":
  156969,172 ±(99.9%) 4463,129 ns/op [Average]
  (min, avg, max) = (152842,561, 156969,172, 161444,532), stdev = 2952,084
  CI (99.9%): [152506,043, 161432,301] (assumes normal distribution)

Result "stream_concat_then_collect_using_toSet":
  104254,566 ±(99.9%) 4318,123 ns/op [Average]
  (min, avg, max) = (102086,234, 104254,566, 111731,085), stdev = 2856,171
  CI (99.9%): [99936,443, 108572,689] (assumes normal distribution)
# Run complete. Total time: 00:00:25

Benchmark                                                       Mode  Cnt       Score      Error  Units
StreamBenchmark.s1_collect_using_toSet_then_addAll_using_toSet  avgt   10  156969,172 ± 4463,129  ns/op
StreamBenchmark.stream_concat_then_collect_using_toSet          avgt   10  104254,566 ± 4318,123  ns/op

So Stream.concat(a, b).collect(toSet()) is also the faster variant (at least according to this JMH run).

This is not surprising: the addAll variant materializes a second, temporary set (another HashSet) and then re-inserts all of its elements into the first one, whereas the concat variant pushes every element through a single Stream into one collector.

If you know the expected size in advance, you can also try toCollection(() -> new HashSet<>(1000)) instead of toSet(): it pre-sizes the target set and thus avoids intermediate resizing/rehashing of the HashSet.
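That pre-sizing suggestion can be sketched like this (the capacity of 1000 mirrors the benchmark's element count and is otherwise an arbitrary choice for the example):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PreSizedCollect {
    public static void main(String[] args) {
        Set<String> s1 = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> s2 = new HashSet<>(Arrays.asList("c", "d"));
        // toCollection lets us supply a pre-sized HashSet, so collecting
        // into it never has to resize/rehash mid-way.
        Set<String> combined = Stream.concat(s1.stream(), s2.stream())
                .collect(Collectors.toCollection(() -> new HashSet<>(1000)));
        System.out.println(combined.size()); // 4
    }
}
```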

+6

First of all, it must be emphasized that the second variant is incorrect. The toSet() collector returns a Set with "no guarantees on the type, mutability, serializability, or thread-safety". If mutability is not guaranteed, it is not correct to invoke addAll on the resulting Set.

It happens to work with the current version of the reference implementation, where a HashSet will be created, but it might stop working in a future version or with alternative implementations. To fix this, you would have to replace toSet() with toCollection(HashSet::new) for the first Stream's collect operation.

So the second variant is not only less efficient with the current implementation, as the benchmarks here show; it might also prevent future optimizations of the toSet() collector, by insisting on the result being of the exact type HashSet. Also, unlike the toSet() collector, the toCollection(…) collector has no way of detecting that the target collection is unordered, which might have performance relevance in future implementations.
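A sketch of the second variant with that fix applied, i.e. collecting the first stream into an explicit HashSet so that addAll is guaranteed to be legal:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SafeAddAll {
    public static void main(String[] args) {
        Stream<String> stream1 = Stream.of("a", "b");
        Stream<String> stream2 = Stream.of("b", "c");
        // toCollection(HashSet::new) guarantees a mutable HashSet,
        // unlike toSet(), whose result type and mutability are unspecified.
        Set<String> set = stream1.collect(Collectors.toCollection(HashSet::new));
        set.addAll(stream2.collect(Collectors.toSet()));
        System.out.println(set.size()); // 3
    }
}
```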

+4

Never choose one syntax over the other just because you assume it is faster; that is premature optimization. Always use the syntax that best expresses your intent and supports the understanding of your logic. If performance turns out to matter, measure first and optimize the spot that is actually hot.


You know nothing about the task I'm working on. - alan7678

That's true.

But I don't need to.

The advice is generic:

  • Choose readability first, then measure. A "faster" result from one micro-benchmark does not automatically transfer to your data, your sizes, or your machine.

In practice this means: make it correct and readable, then measure under realistic conditions (note: an isolated statement like this one is rarely the actual hot spot of an application).
Only then optimize.

The readable variant also stays easy to change when the real bottleneck turns out to be somewhere else.
A micro-optimization picked on gut feeling, by contrast, can turn out to be useless or even counterproductive.

Performance may matter for my task as well, and then this question matters. - alan7678

It may; then measure it in your application.

I am also interested in the answer for future cases, not only for this task. - alan7678

But how long would such an answer stay valid?

Would it even hold tomorrow?

More concretely: will it still hold with Java9 and Java10?

Java code runs on a JVM, and the JVM (and the Java class libraries) are free to change how such code is executed and optimized, so a number measured on today's Java does not automatically carry over to future Java versions. ...

A measurement today is still better than no knowledge at all. - alan7678

Is a benchmark repeated 5 times on one machine knowledge? Or rather a snapshot of one configuration?

Such results age quickly.
In a codebase that lives for 10+ years, the readable variant keeps paying off, while yesterday's micro-optimization may no longer be one.

We will have to agree to disagree then. - alan7678

So it seems.
+3

It is hard to give an answer a priori: keep in mind that Stream.concat(stream1, stream2) only builds a lazily concatenated stream, so nothing is actually computed until calling .collect().

The .collect(Collectors.toSet()) step then has to insert every element into one set, checking for duplicates along the way, without knowing the final size up front.

The variant stream1.collect(Collectors.toSet()) followed by .addAll(stream2.collect(Collectors.toSet())), by contrast, builds two intermediate sets and then has to merge them, which is an extra pass over the second set.

In the end, only measuring can tell.
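The laziness of concat can be observed directly; a small sketch (the peek counter is just for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyConcat {
    public static void main(String[] args) {
        AtomicInteger visited = new AtomicInteger();
        // Build the concatenated stream, but do not consume it yet.
        Stream<String> lazy = Stream.concat(
                Stream.of("a", "b").peek(e -> visited.incrementAndGet()),
                Stream.of("c").peek(e -> visited.incrementAndGet()));
        System.out.println(visited.get()); // 0 - nothing traversed yet
        lazy.collect(Collectors.toSet());
        System.out.println(visited.get()); // 3 - all work happened at collect()
    }
}
```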

Edit:

Inspired by @NoDataFound's answer, I tried to compare Stream.concat against Collection.addAll myself, varying both the number of elements and the share of distinct elements. This is a plain hand-rolled benchmark, not JMH, so take the numbers with a grain of salt (a first, discarded run serves as warm-up). Each reported time is the total, in nanoseconds, over 5 runs.

Concat-collect   10000 elements, all distinct: 7205462 nanos
Collect-addAll   10000 elements, all distinct: 12130107 nanos

Concat-collect  100000 elements, all distinct: 78184055 nanos
Collect-addAll  100000 elements, all distinct: 115191392 nanos

Concat-collect 1000000 elements, all distinct: 555265307 nanos
Collect-addAll 1000000 elements, all distinct: 1370210449 nanos

Concat-collect 5000000 elements, all distinct: 9905958478 nanos
Collect-addAll 5000000 elements, all distinct: 27658964935 nanos

Concat-collect   10000 elements, 50% distinct: 3242675 nanos
Collect-addAll   10000 elements, 50% distinct: 5088973 nanos

Concat-collect  100000 elements, 50% distinct: 389537724 nanos
Collect-addAll  100000 elements, 50% distinct: 48777589 nanos

Concat-collect 1000000 elements, 50% distinct: 427842288 nanos
Collect-addAll 1000000 elements, 50% distinct: 1009179744 nanos

Concat-collect 5000000 elements, 50% distinct: 3317183292 nanos
Collect-addAll 5000000 elements, 50% distinct: 4306235069 nanos

Concat-collect   10000 elements, 10% distinct: 2310440 nanos
Collect-addAll   10000 elements, 10% distinct: 2915999 nanos

Concat-collect  100000 elements, 10% distinct: 68601002 nanos
Collect-addAll  100000 elements, 10% distinct: 40163898 nanos

Concat-collect 1000000 elements, 10% distinct: 315481571 nanos
Collect-addAll 1000000 elements, 10% distinct: 494875870 nanos

Concat-collect 5000000 elements, 10% distinct: 1766480800 nanos
Collect-addAll 5000000 elements, 10% distinct: 2721430964 nanos

Concat-collect   10000 elements,  1% distinct: 2097922 nanos
Collect-addAll   10000 elements,  1% distinct: 2086072 nanos

Concat-collect  100000 elements,  1% distinct: 32300739 nanos
Collect-addAll  100000 elements,  1% distinct: 32773570 nanos

Concat-collect 1000000 elements,  1% distinct: 382380451 nanos
Collect-addAll 1000000 elements,  1% distinct: 514534562 nanos

Concat-collect 5000000 elements,  1% distinct: 2468393302 nanos
Collect-addAll 5000000 elements,  1% distinct: 6619280189 nanos

import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamBenchmark {
    private Set<String> s1;
    private Set<String> s2;

    private long createStreamsTime;
    private long concatCollectTime;
    private long collectAddAllTime;

    public void setUp(final int howMany, final int distinct) {
        final Set<String> valuesForA = new HashSet<>(howMany);
        final Set<String> valuesForB = new HashSet<>(howMany);
        if (-1 == distinct) {
            for (int i = 0; i < howMany; ++i) {
                valuesForA.add(Integer.toString(i));
                valuesForB.add(Integer.toString(howMany + i));
            }
        } else {
            Random r = new Random();
            for (int i = 0; i < howMany; ++i) {
                int j = r.nextInt(distinct);
                valuesForA.add(Integer.toString(i));
                valuesForB.add(Integer.toString(distinct + j));
            }
        }
        s1 = valuesForA;
        s2 = valuesForB;
    }

    public void run(final int streamLength, final int distinctElements, final int times, boolean discard) {
        long startTime;
        setUp(streamLength, distinctElements);
        createStreamsTime = 0L;
        concatCollectTime = 0L;
        collectAddAllTime = 0L;
        for (int r = 0; r < times; r++) {
            startTime = System.nanoTime();
            Stream<String> st1 = s1.stream();
            Stream<String> st2 = s2.stream();
            createStreamsTime += System.nanoTime() - startTime;
            startTime = System.nanoTime();
            Set<String> set1 = Stream.concat(st1, st2).collect(Collectors.toSet());
            concatCollectTime += System.nanoTime() - startTime;
            st1 = s1.stream();
            st2 = s2.stream();
            startTime = System.nanoTime();
            Set<String> set2 = st1.collect(Collectors.toSet());
            set2.addAll(st2.collect(Collectors.toSet()));
            collectAddAllTime += System.nanoTime() - startTime;
        }
        if (!discard) {
            // System.out.println("Create streams "+streamLength+" elements,
            // "+distinctElements+" distinct: "+createStreamsTime+" nanos");
            System.out.println("Concat-collect " + streamLength + " elements, " + (distinctElements == -1 ? "all" : String.valueOf(100 * distinctElements / streamLength) + "%") + " distinct: " + concatCollectTime + " nanos");
            System.out.println("Collect-addAll " + streamLength + " elements, " + (distinctElements == -1 ? "all" : String.valueOf(100 * distinctElements / streamLength) + "%") + " distinct: " + collectAddAllTime + " nanos");
            System.out.println("");
        }
    }

    public static void main(String[] args) {
        StreamBenchmark test = new StreamBenchmark();
        final int times = 5;
        test.run(100000, -1, 1, true);
        test.run(10000, -1, times, false);
        test.run(100000, -1, times, false);
        test.run(1000000, -1, times, false);
        test.run(5000000, -1, times, false);
        test.run(10000, 5000, times, false);
        test.run(100000, 50000, times, false);
        test.run(1000000, 500000, times, false);
        test.run(5000000, 2500000, times, false);
        test.run(10000, 1000, times, false);
        test.run(100000, 10000, times, false);
        test.run(1000000, 100000, times, false);
        test.run(5000000, 500000, times, false);
        test.run(10000, 100, times, false);
        test.run(100000, 1000, times, false);
        test.run(1000000, 10000, times, false);
        test.run(5000000, 50000, times, false);
    }
}
+1


Source: https://habr.com/ru/post/1666657/

