I work with a system that has lists and dictionaries of over five million elements, where each element is typically a flat DTO with up to 90 primitive properties. The collections are persisted to disk with protobuf-net for durability and subsequent processing.
Unsurprisingly, we hit the Large Object Heap (LOH) during both processing and serialization.
We can avoid LOH allocations during processing by using ConcurrentBag and similar structures, but serialization is still a problem.
Currently, items in a collection are grouped into batches of 1,000 and serialized sequentially into MemoryStreams; each resulting byte array is placed in a concurrent queue for subsequent writing to the file stream.
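Roughly, the pipeline looks like the sketch below (simplified: `ItemDto`, `BatchSerializer`, and `Chunk` are placeholder names, and the real code drains the queue on a separate writer thread rather than sequentially):

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

[ProtoContract]
public class ItemDto
{
    [ProtoMember(1)]
    public int Id { get; set; }
    // ... the real DTO has up to ~90 primitive properties
}

public static class BatchSerializer
{
    private const int BatchSize = 1000;

    public static void Save(IEnumerable<ItemDto> items, string path)
    {
        var queue = new ConcurrentQueue<byte[]>();

        // Producer: serialize each batch of 1,000 items into a MemoryStream
        // and queue the resulting byte[] for the writer.
        foreach (var batch in Chunk(items, BatchSize))
        {
            using (var ms = new MemoryStream())
            {
                Serializer.Serialize(ms, batch);
                queue.Enqueue(ms.ToArray());
            }
        }

        // Consumer (sequential here; a separate writer thread in the real code):
        // append every queued byte[] to the output file.
        using (var fs = File.Create(path))
        {
            byte[] buffer;
            while (queue.TryDequeue(out buffer))
            {
                fs.Write(buffer, 0, buffer.Length);
            }
        }
    }

    private static IEnumerable<List<ItemDto>> Chunk(IEnumerable<ItemDto> source, int size)
    {
        var bucket = new List<ItemDto>(size);
        foreach (var item in source)
        {
            bucket.Add(item);
            if (bucket.Count == size)
            {
                yield return bucket;
                bucket = new List<ItemDto>(size);
            }
        }
        if (bucket.Count > 0)
        {
            yield return bucket;
        }
    }
}
```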
Although I understand what this code is trying to do, it feels overcomplicated. It seems like protobuf-net itself should have something for handling huge collections without touching the LOH, perhaps per-item streaming along the lines of the sketch below.
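Something like this, using the length-prefix APIs, is what I have in mind (same hypothetical `ItemDto` as above; I haven't verified that this actually keeps us off the LOH for our 90-property DTOs):

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

public static class StreamingSerializer
{
    // Write each item straight to the FileStream with a length prefix,
    // so no intermediate MemoryStream / byte[] batch is ever created.
    public static void Save(IEnumerable<ItemDto> items, string path)
    {
        using (var fs = File.Create(path))
        {
            foreach (var item in items)
            {
                Serializer.SerializeWithLengthPrefix(fs, item, PrefixStyle.Base128, 1);
            }
        }
    }

    // Read the items back lazily, one at a time, without materializing
    // the whole collection in memory.
    public static IEnumerable<ItemDto> Load(string path)
    {
        using (var fs = File.OpenRead(path))
        {
            foreach (var item in Serializer.DeserializeItems<ItemDto>(fs, PrefixStyle.Base128, 1))
            {
                yield return item;
            }
        }
    }
}
```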
I hope I have simply made a beginner's mistake and there is a setting I've overlooked; otherwise I will have to write my own binary reader/writer.
I should point out that we are on .NET 4.0 and hope to move to 4.5 soon, but I don't expect that alone to solve this problem, despite the improvements to the GC.
Any help would be appreciated.