I have large files containing a small number of large data sets. Each data set has a header with its name and its size in bytes, which lets a reader skip ahead to the next data set.
I want to quickly build an index of the data set names. An example file is about 21 MB and contains 88 data sets. Reading the 88 names with std::ifstream, using seekg() to skip between data sets, takes about 1300 ms, which I would like to reduce.

In other words, I read 88 fragments of about 30 bytes each, at known positions in a 21 MB file, and it takes 1300 ms.
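For context, here is a minimal sketch of the approach described above. The exact header layout is an assumption (the question does not specify it): a 4-byte name length, the name bytes, then an 8-byte payload size; the struct and function names are also hypothetical.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// One index entry: the data set's name and its offset in the file.
struct IndexEntry {
    std::string name;
    std::streamoff offset;
};

// Build a name index by hopping from header to header with seekg().
// Assumed (hypothetical) header layout: a 4-byte name length, the
// name bytes, then an 8-byte payload size, followed by the payload.
std::vector<IndexEntry> buildIndex(const std::string& path) {
    std::vector<IndexEntry> index;
    std::ifstream in(path, std::ios::binary);
    while (in) {
        std::streamoff pos = in.tellg();
        std::uint32_t nameLen = 0;
        if (!in.read(reinterpret_cast<char*>(&nameLen), sizeof nameLen))
            break;  // clean end of file
        std::string name(nameLen, '\0');
        in.read(&name[0], nameLen);
        std::uint64_t payloadSize = 0;
        in.read(reinterpret_cast<char*>(&payloadSize), sizeof payloadSize);
        if (!in)
            break;
        index.push_back({name, pos});
        // Skip the payload to land on the next data set's header.
        in.seekg(static_cast<std::streamoff>(payloadSize), std::ios::cur);
    }
    return index;
}
```

With 88 data sets this performs 88 short reads separated by seeks, which is the access pattern whose 1300 ms cost the question asks about.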
Is there a way to improve this, or is it a limitation of the OS and file system? I am testing on 64-bit Windows 7.
I know that a full index at the beginning of the file would be better, but the file format does not include one, and we cannot change the format.