I created this .NET application many years ago without thinking much about the file format: it uses SOAP formatting to serialize our large hierarchy of objects. It was quick and dirty, so I didn't give it much thought.
Now I'm trying to design a more efficient file format because of the following problem: when a file is saved, it is eventually converted to a byte array and posted to the database for storage. This becomes a big problem because all the objects are already in memory, then the serializer allocates more memory, and then the byte array takes even more. Even modest object graphs end up using large amounts of memory just to save the file.
I'm not sure how to improve this, both in terms of the file format and potentially the algorithm (objects → stream → byte array).
UPDATE: I have always compressed the byte array before sending it over the wire, so while this is good advice, it has already been implemented in my application.
I converted from SOAP to binary serialization, and that made a huge difference: our files are about 7x smaller than before. (Of course, your mileage may vary.)
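For reference, here is a minimal sketch of one way the objects → stream → byte array pipeline could be shortened, assuming the serializer can write to any `Stream`. The names `GraphSaver`, `SaveCompressed`, and `writeGraph` are hypothetical and not from my application. The idea is to serialize straight into a `GZipStream` that wraps the final destination, so the full uncompressed copy never has to sit in memory:

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GraphSaver
{
    // Hypothetical helper. writeGraph is whatever serializer call is already in use,
    // for example: s => formatter.Serialize(s, rootObject)
    public static void SaveCompressed(Stream destination, Action<Stream> writeGraph)
    {
        if (destination == null) throw new ArgumentNullException(nameof(destination));
        if (writeGraph == null) throw new ArgumentNullException(nameof(writeGraph));

        // leaveOpen: true keeps the destination usable by the caller afterwards.
        // Disposing the GZipStream flushes the remaining compressed bytes.
        using (var gzip = new GZipStream(destination, CompressionLevel.Optimal, leaveOpen: true))
        {
            // The serializer writes directly into the compressor,
            // so no uncompressed byte array is ever materialized.
            writeGraph(gzip);
        }
    }
}
```

The destination could be a temporary `FileStream`, or a `MemoryStream` that then only ever holds the compressed bytes. Whether the result can be streamed to the database without a final `ToArray()` depends on the database driver, which I have not verified here.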