Parsing huge binary files in Node.js

I want to create a Node.js module that can parse huge binary files (some over 200 GB). Each file is divided into chunks, and a single chunk can be more than 10 GB. I tried both the flowing and the non-flowing (paused) modes of reading the file, but the problem is that the end of the read buffer is reached before the chunk is fully parsed, so parsing of that chunk has to stop and resume at the next 'data' event. This is what I tried:

var s = getStream();

s.on('data', function(a){
    parseChunk(a);
});

function parseChunk(a){
    /*
        There is a lot of code here.
        One chunk is larger than the buffer passed to this function,
        so when the end of the buffer is reached, parseChunk
        has to return before the parsing process is finished.
        Also, the next buffer passed in does not start at a chunk
        boundary, because the previous chunk was not parsed to the end.
    */
}

Loading an entire chunk into process memory is not an option because I have only 8 GB of RAM. How can I read data from a stream synchronously, or how can I pause the parseChunk function when the end of the buffer is reached and wait for new data to arrive?


You can switch the stream to paused mode and pull a fixed number of bytes at a time using the 'readable' event and stream.read(size):

let chunk;
let Nbytes; // # of bytes to read into a chunk
stream.on('readable', () => {
  while ((chunk = stream.read(Nbytes)) !== null) {
    // call whatever you like on the chunk of data of size Nbytes
  }
});

If fewer than Nbytes are currently buffered, stream.read(Nbytes) returns null, and you simply wait for the next 'readable' event. Note that once the stream has ended, the final "tail" of data smaller than Nbytes is returned as well.


Source: https://habr.com/ru/post/1649669/
