So, the scenario looks like this: I have 2 GB files of binary-serialized objects, plus an index file that contains the identifier of each object and its offset in the data file.
I need to write a method that, given a set of ids, deserializes the corresponding objects into memory. Performance is the most important criterion; keeping memory requirements reasonable is the second.
Using a MemoryMappedFile seems to be the way to go, but I'm a little unsure how to handle such a large file. I cannot create a MemoryMappedViewAccessor for the whole file, since it is so big. Is it possible to have several MemoryMappedViewAccessor instances open simultaneously on different segments without using too much memory? If so, how large should these segments be?
Views could be kept in memory for a while if their data is accessed a lot, and then disposed.
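For reference, this is roughly what I mean by opening several views over different segments at once. It is only a sketch: `SegmentViews`, `OpenSegment`, and the segment size are names and values I made up for illustration, and I am assuming the file length is known up front so a view near the end of the file can be clamped to the bytes that actually exist.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class SegmentViews
{
    // Opens a read-only view over [offset, offset + segmentSize), clamped to
    // the end of the file. Hypothetical helper, not part of the framework.
    public static MemoryMappedViewAccessor OpenSegment(
        MemoryMappedFile mmf, long fileLength, long offset, long segmentSize)
    {
        long size = Math.Min(segmentSize, fileLength - offset);
        return mmf.CreateViewAccessor(offset, size, MemoryMappedFileAccess.Read);
    }
}

// Possible usage (path, offsets, and sizes are placeholders):
//
// long length = new FileInfo(path).Length;
// using (var mmf = MemoryMappedFile.CreateFromFile(
//            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
// using (var a = SegmentViews.OpenSegment(mmf, length, 0, 64 * 1024))
// using (var b = SegmentViews.OpenSegment(mmf, length, 1L << 20, 64 * 1024))
// {
//     byte first = a.ReadByte(0);   // positions are relative to each view
//     byte second = b.ReadByte(0);
// }
```

Both accessors are alive at the same time here, which is exactly the situation I am unsure about in terms of memory cost.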
Perhaps a naive approach would be to sort the objects to be fetched by offset, and simply call CreateViewAccessor for each offset with a small buffer. Another would be to try to work out the smallest number of separate MemoryMappedViewAccessor instances and their sizes. But I'm not sure about the overhead of calling CreateViewAccessor, or how much space you can safely map at one time. I can do some testing, but if anyone has a better idea ... :)
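To make the second idea concrete, here is a rough sketch of grouping the requested objects into as few views as possible under a size cap. It assumes each index entry also gives me the object's length (my index only stores offsets, so the length would have to be derived, for example from the next entry's offset). `ViewPlanner`, `PlanWindows`, and `maxWindowBytes` are hypothetical names, and the greedy strategy is just one simple option, not necessarily the best one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ViewPlanner
{
    // Sorts the requested entries by offset and greedily merges neighbours
    // into windows whose span does not exceed maxWindowBytes.
    // A single entry larger than the cap still gets its own (oversized) window.
    public static List<(long Start, long Length)> PlanWindows(
        IEnumerable<(long Offset, int Length)> entries, long maxWindowBytes)
    {
        var windows = new List<(long Start, long Length)>();
        long start = -1, end = -1;

        foreach (var e in entries.OrderBy(x => x.Offset))
        {
            long entryEnd = e.Offset + e.Length;

            // The entry still fits in the current window: extend it.
            if (start >= 0 && entryEnd - start <= maxWindowBytes)
            {
                end = Math.Max(end, entryEnd);
                continue;
            }

            // Otherwise close the current window (if any) and start a new one.
            if (start >= 0) windows.Add((start, end - start));
            start = e.Offset;
            end = entryEnd;
        }

        if (start >= 0) windows.Add((start, end - start));
        return windows;
    }
}
```

Each resulting window would then become one CreateViewAccessor(Start, Length) call, with every object read at position `Offset - Start` inside its view. A larger cap means fewer views but more mapped bytes that nobody asked for, which is the trade-off I would have to measure.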
I suppose another way would be to split the large data file into several smaller ones, but I'm not sure that would help in this case ...
Homde