I have an ASCII file where each line contains a variable-length record. For instance:

Record-1: 15 characters
Record-2: 200 characters
Record-3: 500 characters
...
Record-n: X characters

Since the file size is about 10 GB, I would like to read the records in chunks. After reading, I need to convert them and write them to another file in binary format.
So, for reading, my first reaction was to create a char array, such as
FILE *stream;
char buffer[104857600];
- Is it correct to assume that Linux issues one system call and retrieves all 100 MB?
- Since the records are separated by newlines, I search the buffer character by character for a newline character and restore each record.
My question is: should I read in chunks like this, or is there a better alternative for reading data in chunks and restoring each record? Is there an alternative way to read x variable-sized lines from an ASCII file in one call? A sketch of the chunked approach I have in mind follows below.
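Here is a minimal sketch of what I mean, not a final implementation: read a large chunk with fread(), scan it for newline-terminated records, and carry any partial record at the end of the chunk over to the next read. The file name, chunk size, and handle_record() are placeholders, and it assumes no single record is longer than the chunk.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE (100 * 1024 * 1024)   /* 100 MB read buffer */

static void handle_record(const char *rec, size_t len)
{
    /* placeholder: convert the record and write it out in binary form */
    (void)rec; (void)len;
}

int main(void)
{
    FILE *in = fopen("input.txt", "r");          /* hypothetical file name */
    if (!in) { perror("fopen"); return 1; }

    char *buf = malloc(CHUNK_SIZE);
    if (!buf) { fclose(in); return 1; }

    size_t carry = 0;                            /* bytes of a partial record */
    size_t n;
    while ((n = fread(buf + carry, 1, CHUNK_SIZE - carry, in)) > 0) {
        size_t total = carry + n;
        size_t start = 0;
        for (size_t i = 0; i < total; i++) {
            if (buf[i] == '\n') {                /* end of one record */
                handle_record(buf + start, i - start);
                start = i + 1;
            }
        }
        carry = total - start;                   /* partial record at the end */
        memmove(buf, buf + start, carry);        /* move it to the front */
    }
    if (carry > 0)
        handle_record(buf, carry);               /* last record without '\n' */

    free(buf);
    fclose(in);
    return 0;
}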
Further, when writing I do the same: I have a char write buffer that I pass to fwrite to write the entire set of records in one call.

fwrite(buffer, 1, sizeof(buffer), stream);
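For context, a small sketch of the write side as I picture it (the output file name and the `filled` counter are my own placeholders): write only the bytes actually filled with converted records and check for short writes.

#include <stdio.h>
#include <stdlib.h>

#define BUF_SIZE (100 * 1024 * 1024)

int main(void)
{
    FILE *out = fopen("output.bin", "wb");   /* hypothetical output file */
    if (!out) { perror("fopen"); return 1; }

    char *buffer = malloc(BUF_SIZE);
    if (!buffer) { fclose(out); return 1; }

    size_t filled = 0;   /* bytes of converted records currently in the buffer */
    /* ... convert records into `buffer`, advancing `filled` ... */

    /* Write only the filled portion, byte by byte, and check the result. */
    if (fwrite(buffer, 1, filled, out) != filled)
        perror("fwrite");

    free(buffer);
    fclose(out);
    return 0;
}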
UPDATE: if I call setbuf(stream, buffer), where buffer is my 100 MB char buffer, will fgets be served from that buffer or will it still go down to the I/O driver?
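For reference, this is the kind of setup I am considering; note that setbuf() only ever uses BUFSIZ bytes of the array you pass, so setvbuf() is needed to tell stdio the buffer's real size. Names and sizes below are placeholders.

#include <stdio.h>
#include <stdlib.h>

#define BIG_BUF (100 * 1024 * 1024)

int main(void)
{
    FILE *stream = fopen("input.txt", "r");   /* hypothetical file name */
    if (!stream) { perror("fopen"); return 1; }

    char *big = malloc(BIG_BUF);
    if (!big) { fclose(stream); return 1; }

    /* Fully buffered with a 100 MB buffer: stdio refills `big` with large
     * reads, and fgets() is then served from it until it is drained. */
    if (setvbuf(stream, big, _IOFBF, BIG_BUF) != 0)
        perror("setvbuf");

    char line[1024];
    while (fgets(line, sizeof line, stream) != NULL) {
        /* process one record */
    }

    fclose(stream);
    free(big);
    return 0;
}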