Collect from a parallel stream in Java 8

I want to take an input and apply a parallel stream to it, and then I want the output as a list. The input can be any List or any other collection on which we can apply streams.

My problem is this: if we want the output as a map, Java gives us an option like

list.parallelStream().collect(Collectors.toConcurrentMap(args)) 

But I can see no option for collecting from a parallel stream in a thread-safe way that produces a list as output. I see one more way to use collect:

list.parallelStream().collect(Collectors.toCollection(<Concurrent Implementation>))

In this way we can pass various concurrent implementations to the collect method. But I think the only List implementation in java.util.concurrent is CopyOnWriteArrayList. We could use one of the concurrent queue implementations here, but those are not lists; we could only get a list out of them through a workaround.

Could you tell me the best way to do this if I want the output as a list?

Note: I could not find any other post related to this; any reference would be helpful.

2 answers

The Collection object used to receive the collected data does not have to be concurrent. You can give it a simple ArrayList.

This is because the values of a parallel stream are not actually collected into a single Collection object. Each thread collects its own data into its own container, and then all the partial results are merged into one final Collection object.
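As a minimal sketch of that merge-based collection (the class and method names here are made up for illustration, not taken from the question), the three-argument form of collect() makes the pieces visible: a supplier for each partial container, an accumulator that adds one element, and a combiner that merges two partial containers. Passing a plain ArrayList to Collectors.toCollection does the same thing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class; the names are illustrative only.
class ParallelToListSketch {

    // Explicit supplier / accumulator / combiner. Each partial result gets
    // its own ArrayList, so the list itself needs no synchronization.
    static List<Integer> collectExplicit(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * 2)
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // The same idea through a Collector backed by a plain ArrayList.
    static List<Integer> collectWithToCollection(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * 2)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 1000)
                .boxed()
                .collect(Collectors.toList());

        List<Integer> a = collectExplicit(input);
        List<Integer> b = collectWithToCollection(input);

        System.out.println("size: " + a.size());
        System.out.println("same contents and order: " + a.equals(b));
    }
}
```

Because the source list is ordered, the partial lists are merged in encounter order, so both methods return the elements in the same order as the input.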

All of this is well documented in the Collector javadoc, and a Collector is the parameter that you pass to the collect() method:

 <R,A> R collect(Collector<? super T,A,R> collector) 

"But there is no option that I can see to collect from a parallel stream in a thread-safe way to provide a list as output." This is entirely wrong.

The whole point of streams is that you can use a non-thread-safe collection and still get perfectly valid, thread-safe results. This follows from how streams are implemented (and it was a key part of the design of streams). You can see that a Collector defines a supplier method, which creates a new container instance for each part of the stream that is processed separately. Those instances are then merged with one another.

So this is perfectly thread-safe:

  Stream.of(1,2,3,4).parallel() .collect(Collectors.toList()); 

Since there are 4 elements in this stream, up to 4 ArrayList instances can be created, and they are merged at the end into a single result (assuming at least 4 CPU cores; the exact number of instances depends on how the stream is split).
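One way to observe this is to wrap the supplier in a counter. The sketch below builds a list collector with Collector.of; the class name, the countingToList helper and the counter are hypothetical and exist only for illustration. The number it prints depends on how the runtime splits the stream, so no particular value is assumed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical example class; the names are illustrative only.
class SupplierCountSketch {

    // A list collector whose supplier bumps a counter every time the
    // stream framework asks for a new intermediate container.
    static <T> Collector<T, List<T>, List<T>> countingToList(AtomicInteger counter) {
        return Collector.of(
                () -> {
                    counter.incrementAndGet();
                    return new ArrayList<T>();
                },
                List::add,
                (left, right) -> {
                    left.addAll(right); // merge two partial results
                    return left;
                });
    }

    static List<Integer> collect(AtomicInteger counter) {
        return Stream.of(1, 2, 3, 4)
                .parallel()
                .collect(countingToList(counter));
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        List<Integer> result = collect(counter);
        System.out.println("result: " + result);
        System.out.println("containers created: " + counter.get());
    }
}
```

The collector declares no CONCURRENT characteristic, so each ArrayList is only ever filled by one task at a time before being merged.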

On the other hand, methods like toConcurrentMap create a single result container, and all threads put their results into it.

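The difference shows up in the characteristics a collector reports. The following sketch (class and method names are illustrative assumptions) checks whether a collector declares Collector.Characteristics.CONCURRENT, which is the flag that allows the framework to let all threads accumulate into one shared container.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; the names are illustrative only.
class ConcurrentVsMergingSketch {

    // True if the collector allows concurrent accumulation into a single container.
    static boolean isConcurrent(Collector<?, ?, ?> collector) {
        return collector.characteristics()
                .contains(Collector.Characteristics.CONCURRENT);
    }

    static Collector<Integer, ?, List<Integer>> listCollector() {
        return Collectors.toList();
    }

    static Collector<Integer, ?, ConcurrentMap<Integer, String>> mapCollector() {
        return Collectors.toConcurrentMap(i -> i, i -> "v" + i);
    }

    public static void main(String[] args) {
        System.out.println("toList concurrent? " + isConcurrent(listCollector()));
        System.out.println("toConcurrentMap concurrent? " + isConcurrent(mapCollector()));

        // Both are safe to use with a parallel stream; they just work differently.
        List<Integer> list = Stream.of(1, 2, 3, 4).parallel().collect(listCollector());
        ConcurrentMap<Integer, String> map = Stream.of(1, 2, 3, 4).parallel().collect(mapCollector());
        System.out.println(list + " " + map.size());
    }
}
```

With toList() each task fills its own list and the lists are merged; with toConcurrentMap the shared map handles the synchronization itself.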

Source: https://habr.com/ru/post/1268043/
