Reading a list of files as a Java 8 stream

I have a (possibly long) list of binary files that I want to read lazily. There will be too many files to hold in memory at once. I am currently reading them as a MappedByteBuffer with FileChannel.map(), but that is probably not required. I want the method readBinaryFiles(...) to return a Java 8 Stream, so that the files are loaded lazily as I access them.

public List<FileDataMetaData> readBinaryFiles(
        List<File> files,
        int numDataPoints,
        int dataPacketSize)
        throws IOException {

    List<FileDataMetaData> fmdList = new ArrayList<FileDataMetaData>();

    IOException lastException = null;
    for (File f: files) {

        try {
            FileDataMetaData fmd = readRawFile(f, numDataPoints, dataPacketSize);
            fmdList.add(fmd);
        } catch (IOException e) {
            logger.error("", e);
            lastException = e;
        }
    }

    if (null != lastException)
        throw lastException;

    return fmdList;
}


//  The List<DataPacket> returned will be in the same order as in the file.
public FileDataMetaData readRawFile(File file, int numDataPoints, int dataPacketSize) throws IOException {

    FileDataMetaData fmd;
    FileChannel fileChannel = null;
    try {
        fileChannel = new RandomAccessFile(file, "r").getChannel();
        long fileSz = fileChannel.size();
        ByteBuffer bbRead = ByteBuffer.allocate((int) fileSz);
        MappedByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileSz);

        buffer.get(bbRead.array());
        List<DataPacket> dataPacketList = new ArrayList<DataPacket>();

        while (bbRead.hasRemaining()) {

            int channelId = bbRead.getInt();
            long timestamp = bbRead.getLong();
            int[] data = new int[numDataPoints];
            for (int i=0; i<numDataPoints; i++) 
                data[i] = bbRead.getInt();

            DataPacket dp = new DataPacket(channelId, timestamp, data);
            dataPacketList.add(dp);
        }

        fmd = new FileDataMetaData(file.getCanonicalPath(), fileSz, dataPacketList);

    } catch (IOException e) {
        logger.error("", e);
        throw e;
    } finally {
        if (null != fileChannel) {
            try {
                fileChannel.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    return fmd;
}

Simply returning fmdList.stream() from readBinaryFiles(...) will not accomplish this, because by then the contents of every file will already have been read into memory, which I will not be able to do.

Other approaches to reading the contents of multiple files as a stream rely on Files.lines(), but I need to read binary files.

Answers in Scala or golang are welcome too, but I would prefer Java.



This should be sufficient:

return files.stream().map(f -> readRawFile(f, numDataPoints, dataPacketSize));

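... provided, that is, you are willing to remove throws IOException from the signature of readRawFile. That method can catch the IOException internally and wrap it in an UncheckedIOException.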


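Laziness here applies per file: a file is read only when the stream produces its FileDataMetaData. Once produced, that FileDataMetaData holds all of the file's packets in memory.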

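Also, the code uses neither Java 7 nor Java 8 features. There is no need to go through RandomAccessFile to get a channel, and with try-with-resources the channel is closed for you, even on error. Copying the mapped buffer into a heap ByteBuffer is redundant as well, since the packets can be decoded straight from the mapping. If you do want a heap buffer, read into the ByteBuffer directly with read and leave it to the JRE how that read is carried out.

With those changes, the method looks like this: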

public FileDataMetaData readRawFile(
    File file, int numDataPoints, int dataPacketSize) throws IOException {

    try(FileChannel fileChannel=FileChannel.open(file.toPath(), StandardOpenOption.READ)) {
        long fileSz = fileChannel.size();
        MappedByteBuffer bbRead=fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileSz);
        List<DataPacket> dataPacketList = new ArrayList<>();
        while(bbRead.hasRemaining()) {
            int channelId = bbRead.getInt();
            long timestamp = bbRead.getLong();
            int[] data = new int[numDataPoints];
            for (int i=0; i<numDataPoints; i++) 
                data[i] = bbRead.getInt();
            dataPacketList.add(new DataPacket(channelId, timestamp, data));
        }
        return new FileDataMetaData(file.getCanonicalPath(), fileSz, dataPacketList);
    } catch (IOException e) {
        logger.error("", e);
        throw e;
    }
}

To get a lazy stream, the outer method then becomes:

public Stream<FileDataMetaData> readBinaryFiles(
    List<File> files, int numDataPoints, int dataPacketSize) throws IOException {
    return files.stream().map(f -> {
        try {
            return readRawFile(f, numDataPoints, dataPacketSize);
        } catch (IOException e) {
            logger.error("", e);
            throw new UncheckedIOException(e);
        }
    });
}
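
To see that nothing is read until the stream is consumed, here is a minimal, self-contained sketch. It is not the code above: readWholeFile is a hypothetical stand-in for readRawFile that just returns the file's bytes, and the opened list exists only to count which files were touched.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class LazyReadDemo {

    // Records every file that was actually opened, to make the laziness visible.
    static final List<Path> opened = new ArrayList<>();

    // Hypothetical stand-in for readRawFile: reads one whole file and wraps IOException.
    static byte[] readWholeFile(Path file) {
        opened.add(file);
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // No file is touched here; map() only records what to do per element.
    static Stream<byte[]> readBinaryFiles(List<Path> files) {
        return files.stream().map(LazyReadDemo::readWholeFile);
    }

    // Creates three one-byte temp files, consumes only the first stream element,
    // and reports how many files were opened along the way.
    static int filesOpenedWhenTakingFirst() {
        opened.clear();
        List<Path> files = new ArrayList<>();
        try {
            for (int i = 0; i < 3; i++) {
                Path p = Files.createTempFile("packets", ".bin");
                p.toFile().deleteOnExit();
                Files.write(p, new byte[] {(byte) i});
                files.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        readBinaryFiles(files).findFirst();
        return opened.size();
    }

    // A missing file causes no error when the stream is built, only when it is consumed.
    static boolean failureIsDeferred() {
        Path missing = Paths.get("no-such-directory", "no-such-file.bin");
        Stream<byte[]> stream = readBinaryFiles(Collections.singletonList(missing));
        try {
            stream.forEach(bytes -> { });
            return false;
        } catch (UncheckedIOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("files opened: " + filesOpenedWhenTakingFirst());
        System.out.println("failure deferred: " + failureIsDeferred());
    }
}
```

The practical consequence is that whoever consumes the stream, not whoever creates it, has to be prepared to catch UncheckedIOException.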

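Alternatively, you could chain the files together with java.io.SequenceInputStream and wrap the result in a DataInputStream. Adding a BufferedInputStream should help performance.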
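
As a rough sketch of the SequenceInputStream idea, assuming the record layout from the question (an int channel id, a long timestamp, then numDataPoints ints). Packet, openLazily and forEachPacket are hypothetical names introduced here for illustration. Packets are handed to a callback one at a time rather than collected, so only one packet needs to be in memory at once.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class SequenceReadDemo {

    // Hypothetical stand-in for the question's DataPacket.
    public static final class Packet {
        public final int channelId;
        public final long timestamp;
        public final int[] data;
        Packet(int channelId, long timestamp, int[] data) {
            this.channelId = channelId;
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    // Each file is opened only when SequenceInputStream asks for the next stream.
    static Enumeration<InputStream> openLazily(List<File> files) {
        Iterator<File> it = files.iterator();
        return new Enumeration<InputStream>() {
            public boolean hasMoreElements() { return it.hasNext(); }
            public InputStream nextElement() {
                try {
                    return new FileInputStream(it.next());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
    }

    // Decodes packets from all files, treated as one continuous byte sequence.
    static void forEachPacket(List<File> files, int numDataPoints, Consumer<Packet> action)
            throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new SequenceInputStream(openLazily(files))))) {
            while (true) {
                int first = in.read();
                if (first < 0) break; // clean end after the last file
                // Remaining three bytes of the big-endian channel id.
                int channelId = (first << 24) | (in.readUnsignedByte() << 16)
                        | (in.readUnsignedByte() << 8) | in.readUnsignedByte();
                long timestamp = in.readLong();
                int[] data = new int[numDataPoints];
                for (int i = 0; i < numDataPoints; i++)
                    data[i] = in.readInt();
                action.accept(new Packet(channelId, timestamp, data));
            }
        }
    }

    // Writes two temp files with one packet each and reads them back.
    static List<Packet> demoPackets() {
        try {
            List<File> files = new ArrayList<>();
            for (int f = 0; f < 2; f++) {
                File file = File.createTempFile("packets", ".bin");
                file.deleteOnExit();
                try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
                    out.writeInt(f);
                    out.writeLong(1000L + f);
                    out.writeInt(7);
                    out.writeInt(8);
                }
                files.add(file);
            }
            List<Packet> seen = new ArrayList<>();
            forEachPacket(files, 2, seen::add);
            return seen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("packets read: " + demoPackets().size());
    }
}
```

One trade-off of this approach: the file boundaries disappear, so per-file details such as the path and size kept in FileDataMetaData would have to be tracked separately.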


Building on VGR's answer, this should be enough:

return files.stream().map(f -> readRawFile(f, numDataPoints, dataPacketSize))

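provided readRawFile throws only unchecked exceptions (a checked exception cannot escape the lambda passed to map()). Here is a version of readRawFile based on InputStream instead: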

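public FileDataMetaData readRawFile(File file, int numDataPoints, int dataPacketSize)
  throws DataPacketReadException { // <- Custom unchecked exception, nested for class

  // Both streams are closed automatically, even if reading fails.
  try (FileInputStream fileInput = new FileInputStream(file);
       DataInputStream dataInput = new DataInputStream(new BufferedInputStream(fileInput))) {
    String filePath = file.getCanonicalPath();
    long fileSize = fileInput.getChannel().size();

    return new FileDataMetaData(
      filePath,
      fileSize,
      dataPacketsFrom(dataInput, numDataPoints, dataPacketSize, filePath));
  }
  catch (IOException e) { // opening, sizing, or closing the file failed
    throw new DataPacketReadException("Unexpected I/O exception on file: " + file, e);
  }
}

private List<DataPacket> dataPacketsFrom(DataInputStream dataInput, int numDataPoints, int dataPacketSize, String filePath)
    throws DataPacketReadException {

  List<DataPacket> packets = new ArrayList<>();
  try {
    // available() can throw IOException too, so it sits inside the try.
    while (dataInput.available() > 0) {
      // Same record layout as in the question: int, long, then numDataPoints ints.
      int channelId = dataInput.readInt();
      long timestamp = dataInput.readLong();
      int[] data = new int[numDataPoints];
      for (int i = 0; i < numDataPoints; i++)
        data[i] = dataInput.readInt();
      packets.add(new DataPacket(channelId, timestamp, data));
    }
  }
  catch (EOFException e) {
    throw new DataPacketReadException("Unexpected EOF on file: " + filePath, e);
  }
  catch (IOException e) {
    throw new DataPacketReadException("Unexpected I/O exception on file: " + filePath, e);
  }

  return packets;
}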

This should reduce the amount of code and make sure your files are closed on error.
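
DataPacketReadException is not defined in this answer. One possible shape for it, as a sketch only: the assumption here is that it simply extends RuntimeException and keeps the original IOException as its cause. The wrapper class name is made up for the example.

```java
import java.io.EOFException;
import java.io.IOException;

public class DataPacketReadExceptionDemo {

    // Hypothetical definition. Being unchecked, it can propagate out of a
    // lambda passed to Stream.map() without being declared anywhere.
    public static class DataPacketReadException extends RuntimeException {
        public DataPacketReadException(String message, IOException cause) {
            super(message, cause);
        }
    }

    public static void main(String[] args) {
        IOException cause = new EOFException("stream ended early");
        DataPacketReadException e =
            new DataPacketReadException("Unexpected EOF on file: example.bin", cause);
        // The original I/O failure stays reachable for logging or inspection.
        System.out.println(e.getMessage() + " / cause: " + e.getCause());
    }
}
```

Keeping the cause means a caller that consumes the stream can still tell a truncated file (EOFException) from other I/O failures.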


Source: https://habr.com/ru/post/1654141/

