Perform multiple unrelated operations on elements of a single stream in Java

How can I perform several unrelated operations on the elements of the same stream?

Say I have a List<String> consisting of text. Each line in the list may or may not contain a specific word, which represents an action to be performed. Say that:

  • if the line contains "of", all words in that line should be counted
  • if the line contains "for", the part after the first occurrence of "for" should be returned, yielding a List<String> with all the substrings

Of course, I could do something like this:

    List<String> strs = ...;

    List<Integer> wordsInStr = strs.stream()
            .filter(t -> t.contains("of"))
            .map(t -> t.split(" ").length)
            .collect(Collectors.toList());

    List<String> linePortionAfterFor = strs.stream()
            .filter(t -> t.contains("for"))
            .map(t -> t.substring(t.indexOf("for")))
            .collect(Collectors.toList());

but then the list is traversed twice, which can lead to performance problems if strs contains many elements.

Is it possible to somehow perform these two operations without going twice over the list?

+5
4 answers

If you need a single pass over the Stream, you should use a custom Collector (which can also be parallelized).

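    import java.util.ArrayList;
    import java.util.List;
    import java.util.stream.Collector;

    class Splitter {
        public List<String> words = new ArrayList<>();
        public List<Integer> counts = new ArrayList<>();

        // Accumulator: called once per stream element.
        public void accept(String s) {
            if (s.contains("of")) {
                counts.add(s.split(" ").length);
            }
            // Independent check (not else-if), so a line containing both
            // "of" and "for" is handled by both operations, as in the question.
            if (s.contains("for")) {
                words.add(s.substring(s.indexOf("for")));
            }
        }

        // Combiner: only used when the stream runs in parallel.
        public Splitter merge(Splitter other) {
            words.addAll(other.words);
            counts.addAll(other.counts);
            return this;
        }
    }

    Splitter collect = strs.stream().collect(
            Collector.of(Splitter::new, Splitter::accept, Splitter::merge)
    );
    System.out.println(collect.counts);
    System.out.println(collect.words);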
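As a rough sketch of the parallelization remark: the same collector can be handed to a parallel stream, in which case merge is what joins the partial results. The class name ParallelSplitterDemo, the split helper and the sample strings are made up for illustration. The Splitter is restated so the block compiles on its own, with both checks written as independent ifs to match the two pipelines in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;

final class ParallelSplitterDemo {

    // Restated Splitter (illustrative copy) so this sketch is self-contained.
    static final class Splitter {
        final List<String> words = new ArrayList<>();
        final List<Integer> counts = new ArrayList<>();

        void accept(String s) {
            if (s.contains("of")) {
                counts.add(s.split(" ").length);
            }
            if (s.contains("for")) {
                words.add(s.substring(s.indexOf("for")));
            }
        }

        // Combines two partial results; the right-hand side is appended
        // after the left-hand side, which keeps encounter order.
        Splitter merge(Splitter other) {
            words.addAll(other.words);
            counts.addAll(other.counts);
            return this;
        }
    }

    // Hypothetical helper: runs the collector sequentially or in parallel.
    static Splitter split(List<String> strs, boolean parallel) {
        return (parallel ? strs.parallelStream() : strs.stream())
                .collect(Collector.of(Splitter::new, Splitter::accept, Splitter::merge));
    }

    public static void main(String[] args) {
        List<String> strs = List.of("a of b", "wait for it", "one of us", "go for a walk");
        Splitter sequential = split(strs, false);
        Splitter parallel = split(strs, true);
        System.out.println(parallel.counts);
        System.out.println(parallel.words);
        // Both runs should agree, since the collector is not marked UNORDERED.
        System.out.println(sequential.counts.equals(parallel.counts)
                && sequential.words.equals(parallel.words));
    }
}
```

Whether the parallel run is actually faster depends on the list size and the cost per element, so it is worth measuring before relying on it.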
+5

Here is an answer that looks at the question from another angle. First of all, let's see how fast or slow it actually is to iterate over a list/collection. Here are the test results on my machine, using the performance test below:

When: string list length = 100, thread number = 1, loops = 1000, unit = milliseconds

OP: 0.013
Accepted answer: 0.020
By the counter function: 0.010

When: string list length = 1000_000, thread number = 1, loops = 100, unit = milliseconds

OP: 99.387
Accepted answer: 89.848
By the counter function: 59.183


Conclusion: the performance improvement from a single pass is rather small, and for a short list of strings the single-pass version is even slower. It is usually a mistake to try to reduce the number of iterations over a list/collection that is already loaded in memory by using a more complex collector. You won't get big performance improvements. If there is a performance problem, we should look elsewhere.

Here is my performance test code, using the Profiler tool. (I won't discuss how to do performance testing here. If you doubt the test results, you can repeat the test with any tool you trust.)

    @Test
    public void test_46539786() {
        final int strsLength = 1000_000;
        final int threadNum = 1;
        final int loops = 100;
        final int rounds = 3;

        final List<String> strs = IntStream.range(0, strsLength)
                .mapToObj(i -> i % 2 == 0 ? i + " of " + i : i + " for " + i)
                .toList();

        Profiler.run(threadNum, loops, rounds, "OP", () -> {
            List<Integer> wordsInStr = strs.stream()
                    .filter(t -> t.contains("of"))
                    .map(t -> t.split(" ").length)
                    .collect(Collectors.toList());
            List<String> linePortionAfterFor = strs.stream()
                    .filter(t -> t.contains("for"))
                    .map(t -> t.substring(t.indexOf("for")))
                    .collect(Collectors.toList());
            assertTrue(wordsInStr.size() == linePortionAfterFor.size());
        }).printResult();

        Profiler.run(threadNum, loops, rounds, "Accepted answer", () -> {
            Splitter collect = strs.stream()
                    .collect(Collector.of(Splitter::new, Splitter::accept, Splitter::merge));
            assertTrue(collect.counts.size() == collect.words.size());
        }).printResult();

        final Function<String, Integer> counter = s -> {
            int count = 0;
            for (int i = 0, len = s.length(); i < len; i++) {
                if (s.charAt(i) == ' ') {
                    count++;
                }
            }
            return count;
        };

        Profiler.run(threadNum, loops, rounds, "By the counter function", () -> {
            List<Integer> wordsInStr = strs.stream()
                    .filter(t -> t.contains("of"))
                    .map(counter)
                    .collect(Collectors.toList());
            List<String> linePortionAfterFor = strs.stream()
                    .filter(t -> t.contains("for"))
                    .map(t -> t.substring(t.indexOf("for")))
                    .collect(Collectors.toList());
            assertTrue(wordsInStr.size() == linePortionAfterFor.size());
        }).printResult();
    }
+3

To do this, you can use a custom collector and iterate only once:

    // Note: Pair is not a JDK type; any pair class with a static of(...) factory will do.
    private static Collector<String, ?, Pair<List<String>, List<Long>>> multiple() {

        // Mutable accumulator holding both result lists.
        class Acc {
            List<String> strings = new ArrayList<>();
            List<Long> longs = new ArrayList<>();

            void add(String elem) {
                if (elem.contains("of")) {
                    long howMany = Arrays.stream(elem.split(" ")).count();
                    longs.add(howMany);
                }
                if (elem.contains("for")) {
                    String result = elem.substring(elem.indexOf("for"));
                    strings.add(result);
                }
            }

            Acc merge(Acc right) {
                longs.addAll(right.longs);
                strings.addAll(right.strings);
                return this;
            }

            public Pair<List<String>, List<Long>> finisher() {
                return Pair.of(strings, longs);
            }
        }

        return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finisher);
    }

Usage:

    Pair<List<String>, List<Long>> pair =
            Stream.of("t of rm", "t of rm", "nice for nice nice again")
                    .collect(multiple());
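For comparison, on Java 12 or later the JDK's own Collectors.teeing can combine two downstream collectors in one pass, so a hand-written accumulator class may not be needed. The following is only a sketch of that alternative, not the collector shown above: the class name TeeingDemo and the method splitOnce are invented for illustration, and Map.Entry is used merely as a pair type that ships with the JDK.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TeeingDemo {

    // Key: word counts of lines containing "of".
    // Value: substrings starting at the first "for".
    static Map.Entry<List<Integer>, List<String>> splitOnce(List<String> strs) {
        return strs.stream().collect(Collectors.teeing(
                // First downstream collector: filter, then count words.
                Collectors.filtering((String s) -> s.contains("of"),
                        Collectors.mapping((String s) -> s.split(" ").length,
                                Collectors.toList())),
                // Second downstream collector: filter, then cut the substring.
                Collectors.filtering((String s) -> s.contains("for"),
                        Collectors.mapping((String s) -> s.substring(s.indexOf("for")),
                                Collectors.toList())),
                // Merger: bundle both result lists into one value.
                (counts, parts) -> Map.entry(counts, parts)));
    }

    public static void main(String[] args) {
        Map.Entry<List<Integer>, List<String>> result =
                splitOnce(List.of("t of rm", "t of rm", "nice for nice nice again"));
        System.out.println(result.getKey());
        System.out.println(result.getValue());
    }
}
```

Each element is still offered to both downstream collectors, so this saves the second traversal but not the per-element work.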
+1

If you want a single stream over the list, you need a way to manage two different states. You can do this by implementing Consumer in a new class.

    class WordsInStr implements Consumer<String> {
        ArrayList<Integer> list = new ArrayList<>();

        @Override
        public void accept(String s) {
            Stream.of(s)
                    .filter(t -> t.contains("of")) // probably would be faster without stream here
                    .map(t -> t.split(" ").length)
                    .forEach(list::add);
        }
    }

    class LinePortionAfterFor implements Consumer<String> {
        ArrayList<String> list = new ArrayList<>();

        @Override
        public void accept(String s) {
            Stream.of(s) // probably would be faster without stream here
                    .filter(t -> t.contains("for"))
                    .map(t -> t.substring(t.indexOf("for")))
                    .forEach(list::add);
        }
    }

    WordsInStr w = new WordsInStr();
    LinePortionAfterFor l = new LinePortionAfterFor();
    strs.stream() // stream not needed here
            .forEach(w.andThen(l));
    System.out.println(w.list);
    System.out.println(l.list);
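The code comments above suggest the inner streams are unnecessary. A minimal sketch of the same two consumers with plain if checks, driven by List.forEach instead of a stream, could look like this; the wrapper class ConsumerDemo and the sample strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ConsumerDemo {

    // Counts the words of every line that contains "of".
    static final class WordsInStr implements Consumer<String> {
        final List<Integer> list = new ArrayList<>();

        @Override
        public void accept(String s) {
            if (s.contains("of")) {
                list.add(s.split(" ").length);
            }
        }
    }

    // Keeps the part of the line starting at the first "for".
    static final class LinePortionAfterFor implements Consumer<String> {
        final List<String> list = new ArrayList<>();

        @Override
        public void accept(String s) {
            int idx = s.indexOf("for");
            if (idx >= 0) {
                list.add(s.substring(idx));
            }
        }
    }

    public static void main(String[] args) {
        List<String> strs = List.of("a of b c", "wait for it", "of for");
        WordsInStr w = new WordsInStr();
        LinePortionAfterFor l = new LinePortionAfterFor();
        // andThen chains the consumers, so each element is visited once
        // and passed to w first, then to l.
        strs.forEach(w.andThen(l));
        System.out.println(w.list);
        System.out.println(l.list);
    }
}
```

Because both consumers mutate their own lists, this variant is only suitable for sequential iteration.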
0

Source: https://habr.com/ru/post/1272299/

