Read X lines at a time from a text file using Java streams?

I have a "plain old text file" where lines end with a new line character. For arbitrary reasons, I need to read and parse this text file 4 (X for generality) lines at a time.

I would like to use Java streams for this task, and I know that I can turn this file into a stream as follows:

try (Stream<String> stream = Files.lines(Paths.get("file.txt"))) {
    stream.forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}

But how can I use the Java Stream API to “group” a file into groups of four consecutive lines?

+4
5 answers

This is a job for java.util.Scanner. In Java 9 you can just use

try(Scanner s = new Scanner(PATH)) {
    s.findAll("(.*\\R){1,4}")
     .map(mr -> Arrays.asList(mr.group().split("\\R")))
     .forEach(System.out::println);
}
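Since Scanner can also read from an in-memory string, the Java 9 approach can be tried without a file. A minimal runnable sketch (class and method names are illustrative, not from the original answer):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

public class ScannerChunks {

    // Groups the lines of `text` into runs of at most `size` lines,
    // using Scanner.findAll (Java 9+) with the same pattern as above.
    static List<List<String>> chunks(String text, int size) {
        try (Scanner s = new Scanner(text)) {
            return s.findAll("(.*\\R){1," + size + "}")
                    .map(mr -> Arrays.asList(mr.group().split("\\R")))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        // Six lines -> one chunk of four lines and one chunk of two
        System.out.println(chunks("a\nb\nc\nd\ne\nf\n", 4));
        // prints [[a, b, c, d], [e, f]]
    }
}
```

One caveat: a final line without a trailing line terminator is not matched by this pattern, so it only works cleanly on input that ends with a line break.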

For Java 8 there is a back-port of findAll. With a static import of that method, the same approach works:

try(Scanner s = new Scanner(PATH)) {
    findAll(s, Pattern.compile("(.*\\R){1,4}"))
        .map(mr -> Arrays.asList(mr.group().split("\\R")))
        .forEach(System.out::println);
}

Note that each match result is a single string containing up to four lines (fewer for the last chunk), so if your downstream processing can work on that string directly, you can skip splitting it into a list of lines.

You can even process the lines as individual groups of the MatchResult, if you change the regex so that each line gets its own capture group:

try(Scanner s = new Scanner(PATH)) {
    findAll(s, Pattern.compile("(.*)\\R(?:(.*)\\R)?(?:(.*)\\R)?(?:(.*)\\R)?"))
        .flatMap(mr -> IntStream.rangeClosed(1, 4)
                           .mapToObj(ix -> mr.group(ix)==null? null: ix+": "+mr.group(ix)))
        .filter(Objects::nonNull)
        .forEach(System.out::println);
}
+4

You could use Guava's Iterators.partition:

try (Stream<String> stream = Files.lines(Paths.get("file.txt"))) {

    Iterator<List<String>> iterator = Iterators.partition(stream.iterator(), 4);

    // iterator.next() returns each chunk as a List<String>

} catch (IOException e) {
    // handle exception properly
}

It is not a pure Stream-pipeline solution, but it is simple and gets the job done.


EDIT: if you need a Stream of chunks back, you can convert the iterator into one:

Stream<List<String>> targetStream = StreamSupport.stream(
      Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED),
      false);
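If Guava is not on the classpath, the same idea can be hand-rolled over any iterator. The sketch below (names are made up) mimics Iterators.partition with plain JDK code and then wraps the result exactly as in the EDIT above:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class PartitionDemo {

    // JDK-only stand-in for Guava's Iterators.partition: each next() call
    // drains up to `size` elements from the underlying iterator into a list.
    static <T> Iterator<List<T>> partition(Iterator<T> it, int size) {
        return new Iterator<List<T>>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public List<T> next() {
                List<T> chunk = new ArrayList<>(size);
                while (it.hasNext() && chunk.size() < size) {
                    chunk.add(it.next());
                }
                return chunk;
            }
        };
    }

    public static void main(String[] args) {
        Stream<String> lines = Stream.of("a", "b", "c", "d", "e");
        Stream<List<String>> chunks = StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(partition(lines.iterator(), 4), Spliterator.ORDERED),
                false);
        chunks.forEach(System.out::println); // [a, b, c, d] then [e]
    }
}
```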
+3

If you don't want to pull in a third-party library, you can write a custom collector that accumulates lines and emits a group every four lines. Note that it deliberately rejects parallel streams:

private static final class CustomCollector {

    private List<String> list = new ArrayList<>();

    private List<String> accumulateList = new ArrayList<>();

    public void accept(String str) {
        accumulateList.add(str);
        if (accumulateList.size() == 4) { // accumulate 4 strings
            String collect = String.join("", accumulateList);
            // here they are simply joined into one string; do whatever you need
            list.add(collect);
            accumulateList = new ArrayList<>();
        }
    }

    public CustomCollector combine(CustomCollector other) {
        throw new UnsupportedOperationException("Parallel Stream not supported");
    }

    public List<String> finish() {
        if (!accumulateList.isEmpty()) {
            list.add(String.join("", accumulateList));
        }
        return list;
    }

    public static Collector<String, ?, List<String>> collector() {
        return Collector.of(CustomCollector::new, CustomCollector::accept,
                CustomCollector::combine, CustomCollector::finish);
    }
}

Usage:

stream.collect(CustomCollector.collector());
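The same collector can also be written as a single self-contained Collector.of call. This sketch (all names are illustrative) keeps the behavior above, joining each group of four lines into one string:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class ChunkCollectorDemo {

    // Accumulator state: finished chunks plus the chunk currently being filled.
    static Collector<String, ?, List<String>> chunking(int size) {
        class Acc {
            List<String> done = new ArrayList<>();
            List<String> cur = new ArrayList<>();
        }
        return Collector.of(
                Acc::new,
                (Acc acc, String s) -> {
                    acc.cur.add(s);
                    if (acc.cur.size() == size) {
                        acc.done.add(String.join("", acc.cur));
                        acc.cur = new ArrayList<>();
                    }
                },
                (a, b) -> {
                    throw new UnsupportedOperationException("Parallel Stream not supported");
                },
                (Acc acc) -> {
                    // Flush the last, possibly shorter, chunk.
                    if (!acc.cur.isEmpty()) {
                        acc.done.add(String.join("", acc.cur));
                    }
                    return acc.done;
                });
    }

    public static void main(String[] args) {
        List<String> out = Stream.of("a", "b", "c", "d", "e").collect(chunking(4));
        System.out.println(out); // [abcd, e]
    }
}
```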
+2

With RxJava you can use buffer:

Stream<String> stream = Files.lines(Paths.get("file.txt"));

Observable.fromIterable(stream::iterator)
          .buffer(4)                      // Observable<List<String>>
          .map(x -> String.join(", ", x)) // Observable<String>
          .forEach(System.out::println);

buffer converts the Observable of lines into an Observable that emits a List<String> of four lines at a time. The map step then transforms each list; here the lines are simply joined, but you can apply any function. For example, if you have a method processChunk that turns a List<String> into a String, it becomes:

Observable<String> fileObs =
    Observable.fromIterable(stream::iterator)
              .buffer(4)
              .map(x -> processChunk(x));
+2

Partitioning a stream into n-sized chunks is not supported out of the box by the Java 8 Stream API. One workaround is Collectors.groupingBy() with an incrementing counter, which produces a Collection<List<String>> (note that, unlike the Scanner-based solutions, this collects the whole file into memory and does not process it lazily).

For example:

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ReadFileWithStream {

    public static void main(String[] args) throws Exception {
        // Path to a file to read
        final Path path = Paths.get(ReadFileWithStream.class.getResource("/input.txt").toURI());
        final AtomicInteger counter = new AtomicInteger(0);
        // Size of a chunk
        final int size = 4;

        // Close the stream when done; Files.lines holds an open file handle
        try (Stream<String> lines = Files.lines(path)) {
            final Collection<List<String>> partitioned = lines
                    .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / size))
                    .values();

            partitioned.forEach(System.out::println);
        }
    }
}
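A caveat worth noting: Collectors.groupingBy() defaults to a HashMap, so the iteration order of values() is not guaranteed. A variant that feeds in-memory lines and passes a TreeMap factory to keep the chunks in encounter order (a sketch with made-up names):

```java
import java.util.Collection;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class GroupingDemo {

    // Chunks the given lines into groups of `size` using groupingBy with a counter.
    // TreeMap keeps the groups sorted by chunk index; the default HashMap
    // makes no ordering guarantee.
    static Collection<List<String>> chunks(List<String> lines, int size) {
        AtomicInteger counter = new AtomicInteger(0);
        return lines.stream()
                .collect(Collectors.groupingBy(
                        it -> counter.getAndIncrement() / size,
                        TreeMap::new,
                        Collectors.toList()))
                .values();
    }

    public static void main(String[] args) {
        System.out.println(chunks(List.of("a", "b", "c", "d", "e", "f"), 4));
        // prints [[a, b, c, d], [e, f]]
    }
}
```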

For an input file that contains one number per line, running this code produces chunks like:

[0, 0, 0, 2]
[0, -3, 2, 0]
[1, -3, -8, 0]
[2, -12, -11, -11]
[-8, -1, -8, 0]
[2, -1, 2, -1]
... and so on

Collectors.groupingBy() also accepts a second, downstream collector. The default is Collectors.toList(), which is why each group is a List<String> and the final result is a Collection<List<String>>.

Let's say I want to read 4-line chunks and sum all the numbers in each chunk. In this case I use Collectors.summingInt() as the downstream collector, and the result becomes a Collection<Integer>:

final Collection<Integer> partitioned = Files.lines(path)
        .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / size, Collectors.summingInt(Integer::valueOf)))
        .values();
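The summing variant can likewise be tried on an in-memory list. A small sketch (names are illustrative), again with a TreeMap factory so the chunk order is deterministic:

```java
import java.util.Collection;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class SummingDemo {

    // Sums each chunk of `size` numbers; summingInt replaces the default
    // toList() as the downstream collector.
    static Collection<Integer> chunkSums(List<Integer> numbers, int size) {
        AtomicInteger counter = new AtomicInteger(0);
        return numbers.stream()
                .collect(Collectors.groupingBy(
                        it -> counter.getAndIncrement() / size,
                        TreeMap::new,
                        Collectors.summingInt(Integer::intValue)))
                .values();
    }

    public static void main(String[] args) {
        System.out.println(chunkSums(List.of(1, 2, 3, 4, 5), 4));
        // prints [10, 5]
    }
}
```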

Output:

2
-1
-10
-32
-17
2
-11
-49
... and so on

And last but not least: Collectors.groupingBy() returns a map whose values are grouped under the computed keys. That is why we call Map.values() at the end, to get the collection of values contained in that map.

Hope this helps.

+2

Source: https://habr.com/ru/post/1692063/

