Stream processing based on previous and next elements

I need to process a fixed-width file with a predefined record layout. There are several record types, and the first character of each line determines its type. Because a whole record does not always fit on one fixed-width line, a record can be split over several lines, and the second character of such a line is the sequence number of the part. For example:

0This is the header record------------------------------------
1This is another record always existing out of one lin--------
21This is a record that can be composed out of multiple parts.
22This is the second part of record type 2--------------------
21This is a new record of type 2, first part.-----------------
22This is the second part of record type 2--------------------
23This is the third part of record type 2---------------------
...

Using the Stream API, I would like to parse this file:

Stream<String> lines = Files.lines(Paths.get(args[1]));

lines.map(line -> RecordFactory.createRecord(line)).collect(Collectors.toList());

But because the stream delivers the file line by line, the mapping of a type 2 record is incomplete when only its first line (record type 2, sequence 1) has been parsed. The next line (record type 2, sequence 2) should be added to the result of the previous mapping.

How can I do this with the Stream API?

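This is not easy to do with the standard Stream API.

One option is the StreamEx library and its groupRuns method, which returns a stream of lists in which adjacent elements are grouped according to a supplied predicate.

The code below groups consecutive lines whenever the part number of the next line is strictly greater than that of the previous line. The part number is extracted with a regular expression that captures the digits following the first character.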

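import java.io.IOException;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import one.util.streamex.StreamEx;

// A leading type digit followed by the part number. The pattern is anchored at the
// start of the line so that digits later in the text are not taken for a part number.
private static final Pattern PATTERN = Pattern.compile("^\\d(\\d+)");

public static void main(String[] args) throws IOException {
    // try-with-resources closes the underlying file once the stream has been consumed
    try (StreamEx<String> stream = StreamEx.ofLines(Paths.get("..."))) {
        List<Record> records =
            // a line joins the current run as long as its part number keeps increasing
            stream.groupRuns((s1, s2) -> getRecordPart(s2) > getRecordPart(s1))
                  .map(RecordFactory::createRecord)
                  .toList();
    }
}

private static int getRecordPart(String str) {
    Matcher matcher = PATTERN.matcher(str);
    if (matcher.find()) {
        return Integer.parseInt(matcher.group(1));
    }
    return 1; // if the pattern didn't find anything, it means the record is on a single line
}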

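This assumes that RecordFactory.createRecord builds a Record from a List<String> rather than from a String. A record that occupies a single line arrives as a List with one element (the line itself).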
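For readers who cannot add StreamEx, the adjacent-grouping idea can be sketched with the plain JDK. This is only an illustration of the technique, not how StreamEx implements groupRuns. The class name AdjacentGrouping and the helpers groupAdjacent and partNumber are hypothetical, and the part-number rule assumes that only type 2 records are multi-part, with a single-digit sequence number in the second column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

final class AdjacentGrouping {

    // Groups adjacent elements: an element joins the current group when
    // sameGroup.test(previous, current) is true, otherwise it starts a new group.
    static <T> List<List<T>> groupAdjacent(List<T> items, BiPredicate<T, T> sameGroup) {
        List<List<T>> groups = new ArrayList<>();
        T previous = null;
        for (T item : items) {
            if (groups.isEmpty() || !sameGroup.test(previous, item)) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(item);
            previous = item;
        }
        return groups;
    }

    // Assumption: only type 2 lines carry a part number, as one digit in the
    // second column. Every other line counts as part 1.
    static int partNumber(String line) {
        boolean multiPart = line.length() >= 2
                && line.charAt(0) == '2'
                && Character.isDigit(line.charAt(1));
        return multiPart ? line.charAt(1) - '0' : 1;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "0header", "1single", "21first", "22second", "21new", "22second", "23third");
        // Same predicate as in the answer: keep grouping while the part number increases.
        List<List<String>> groups =
                groupAdjacent(lines, (a, b) -> partNumber(b) > partNumber(a));
        groups.forEach(System.out::println); // four groups, one per logical record
    }
}
```

Each inner list corresponds to what a createRecord method taking a List<String> would receive.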

I think you need a custom Collector here, something like Collector<String,List<String>,List<String>>.

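The accumulator has to merge each continuation line of a type 2 record into the preceding entry, and the combiner has to handle the same case at the boundary between two partial results. A third-party library may also help, for example https://github.com/jOOQ/jOOL.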
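A minimal sketch of such a Collector<String,List<String>,List<String>>, using only the JDK, might look as follows. It is not the answerer's code: the class name RecordLineCollector and its methods are hypothetical, and it assumes that a continuation line is a type 2 line whose second character is a digit from 2 to 9. Continuation lines are appended to the previous entry without their two-character prefix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

final class RecordLineCollector {

    // Assumption: "22...", "23..." and so on continue the preceding "21..." line.
    static boolean isContinuation(String line) {
        return line.length() >= 2
                && line.charAt(0) == '2'
                && line.charAt(1) >= '2' && line.charAt(1) <= '9';
    }

    // Merges a continuation into the previous entry. If there is no previous
    // entry yet (possible at the start of a parallel chunk), the line is kept
    // with its prefix so that the combiner can still recognize it.
    static void accumulate(List<String> records, String line) {
        if (isContinuation(line) && !records.isEmpty()) {
            int last = records.size() - 1;
            records.set(last, records.get(last) + line.substring(2));
        } else {
            records.add(line);
        }
    }

    // Replays the right-hand partial result into the left-hand one, so a
    // leading continuation on the right is attached to the last left entry.
    static List<String> combine(List<String> left, List<String> right) {
        right.forEach(entry -> accumulate(left, entry));
        return left;
    }

    static Collector<String, List<String>, List<String>> toMergedRecords() {
        return Collector.of(ArrayList::new,
                RecordLineCollector::accumulate,
                RecordLineCollector::combine);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "0Header", "1Single", "21First ", "22second", "21Next ", "22part ", "23end");
        // One entry per logical record; the merged strings can then be parsed.
        System.out.println(lines.stream().collect(toMergedRecords()));
    }
}
```

Because the combiner reuses the accumulator logic, the sketch should give the same result for an ordered parallel stream as for a sequential one, though it has only been reasoned through on small inputs like the one in main.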

Source: https://habr.com/ru/post/1626974/

