I am processing several old database files, and so far everything has gone well. All the files I have worked with until now have fixed-width records, with the width defined in the header. Pretty straightforward: I know the length of the header, so I can start reading immediately after it, and I know each record ends X bytes later. If each record is 30 bytes and the header is 100, I can do something like this:
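# IO.binread takes (name, length, offset); a nil length reads to the end of the file
file = IO.binread(path + file_name, nil, end_of_header)
# the first 30-byte record after the header
read_file(file[0, 30])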
However, several tables have variable-width records: one record can be 100 bytes and the next 20. Each record is exactly as long as the text the user saved, and nothing in the record seems to mark its length.
Records are separated by a delimiter (the bytes FE FE). I can look for the next delimiter and pull out the record that way, but that means reading the entire file byte by byte, looking for matches. Is there a better way than scanning for the next match, or a way to get a list of the indices of every occurrence of a byte sequence?
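To make the question concrete, here is a minimal sketch of the kind of result I am after, assuming the whole table fits in memory as a binary string and that the bytes FE FE never occur inside a record's text. The names `DELIMITER`, `delimiter_offsets`, and `split_records` are hypothetical, made up for this illustration. It relies on `String#index` with a start offset, which on a binary (ASCII-8BIT) string works in byte positions, and on `String#split`.

```ruby
# Hypothetical delimiter constant: the two bytes FE FE as a binary string.
DELIMITER = "\xFE\xFE".b

# Returns the byte offset of every occurrence of `delimiter` in `data`.
# Assumes `data` is a binary string, e.g. the result of IO.binread.
def delimiter_offsets(data, delimiter = DELIMITER)
  offsets = []
  pos = 0
  # String#index searches from `pos` and returns nil when nothing is left.
  while (found = data.index(delimiter, pos))
    offsets << found
    pos = found + delimiter.bytesize
  end
  offsets
end

# Returns the records themselves. Assumes the delimiter never appears
# inside a record's text.
def split_records(data, delimiter = DELIMITER)
  data.split(delimiter)
end

sample = "abc\xFE\xFEde\xFE\xFEfghi".b
p delimiter_offsets(sample) # => [3, 7]
p split_records(sample)     # => ["abc", "de", "fghi"]
```

Both helpers still scan the data once, but the scan happens inside Ruby's built-in string search rather than in a hand-written byte-by-byte loop.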
RUBY...