Using streams to create a BSON byte array through Json.NET (for file format)

We need the BSON equivalent for

    {
        "Header": {
            "SubHeader1": { "Name": "Bond", "License": 7 },
            "SubHeader2": { "IsActive": true }
        },
        "Payload": /* This will be a 40GB byte stream! */
    }

But in the file we actually get (shown as a screenshot in the original post), the payload appears FIRST, followed by the rest of the header!

We use the Json.NET BSON writer (Bson.BsonWriter.WriteValue(byte[] value)), but it only accepts an actual byte[], not a Stream. Since our payload will be around 40 GB, we have to use streams, so we tried a workaround (the code below), but it gives the wrong result described above.

    // Needs: using System.IO; using System.Text; using Newtonsoft.Json; using Newtonsoft.Json.Bson;
    public void Expt()
    {
        // Just some structure classes, defined below
        var fileStruct = new FileStructure();

        using (Stream outputSt = new FileStream("TestBinary.bson", FileMode.Create))
        {
            var serializer = new JsonSerializer();
            var bw = new BsonWriter(outputSt);

            // Start
            bw.WriteStartObject();

            // Write header
            bw.WritePropertyName("Header");
            serializer.Serialize(bw, fileStruct.Header);

            // Write payload
            bw.WritePropertyName("Payload");
            bw.Flush(); // <== flush !

            // In reality we write 40GB into the stream, dummy example for now
            byte[] dummyPayload = Encoding.UTF8.GetBytes("This will be a 40GB byte stream!");
            outputSt.Write(dummyPayload, 0, dummyPayload.Length);

            // End
            bw.WriteEndObject();
        }
    }

This looks like a classic case of missing synchronization / unflushed buffers, even though we do call Flush on the Json.NET writer before writing the payload to the underlying stream.

Question: Is there any other way to do this? We would rather not modify the Json.NET source (and dig through its internal plumbing) or otherwise reinvent the wheel...


Details: supporting structure classes (if you want to reproduce this)

    public class FileStructure
    {
        public TopHeader Header { get; set; }
        public byte[] Payload { get; set; }

        public FileStructure()
        {
            Header = new TopHeader
            {
                SubHeader1 = new SubHeader1 { Name = "Bond", License = 007 },
                SubHeader2 = new SubHeader2 { IsActive = true }
            };
        }
    }

    public class TopHeader
    {
        public SubHeader1 SubHeader1 { get; set; }
        public SubHeader2 SubHeader2 { get; set; }
    }

    public class SubHeader1
    {
        public string Name { get; set; }
        public int License { get; set; }
    }

    public class SubHeader2
    {
        public bool IsActive { get; set; }
    }
1 answer

Well, we have settled on a compromise here, because we don't have time (at the moment) to fix the excellent Json.NET library. Since we are fortunate that the Stream comes only at the end, we now use BSON for the header (small enough for a byte[]) and then write the payload to the plain stream right after it, i.e. the representation is:

    {
        "SubHeader1": { "Name": "Bond", "License": 7 },
        "SubHeader2": { "IsActive": true }
    } /* End of valid BSON */
    // <= Our Stream is written here, raw byte stream, no BSON

It would be more aesthetically pleasing to have a single BSON layout, but in its absence this works well too. It is probably a little faster as well! If someone finds a better answer in the future, we are all ears.
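A minimal sketch of the layout described above, not the exact code we run: the header goes out as one complete BSON document and the payload stream is copied behind it as raw bytes. The class and method names (HeaderThenRawPayload, Write, ReadHeader, ReadExactly) are made up for illustration. The reading side is an assumption on my part: it relies only on the BSON rule that every document begins with its total size as a little-endian int32, so the reader knows where the header ends and the raw payload begins. Note that newer Json.NET releases mark BsonWriter/BsonReader as obsolete in favor of the separate Newtonsoft.Json.Bson package.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;

public static class HeaderThenRawPayload
{
    // Writes the header as one complete BSON document, then appends the
    // payload stream as raw bytes (no BSON framing around the payload).
    public static void Write(Stream output, object header, Stream payload)
    {
        using (var headerBuffer = new MemoryStream())
        {
            // The header is small, so buffering it in memory is fine.
            var bsonWriter = new BsonWriter(headerBuffer);
            new JsonSerializer().Serialize(bsonWriter, header);
            bsonWriter.Flush();
            headerBuffer.WriteTo(output);
        }

        // Stream.CopyTo copies in chunks, so the payload is never fully in memory.
        payload.CopyTo(output);
    }

    // Reads the header back and leaves 'input' positioned at the first payload byte.
    public static T ReadHeader<T>(Stream input)
    {
        // A BSON document starts with its total size (little-endian int32),
        // and that size includes the four size bytes themselves.
        var prefix = new byte[4];
        ReadExactly(input, prefix, 0, 4);
        int size = prefix[0] | (prefix[1] << 8) | (prefix[2] << 16) | (prefix[3] << 24);
        if (size < 5)
            throw new InvalidDataException("Not a valid BSON document size: " + size);

        var document = new byte[size];
        Array.Copy(prefix, document, 4);
        ReadExactly(input, document, 4, size - 4);

        using (var reader = new BsonReader(new MemoryStream(document)))
        {
            return new JsonSerializer().Deserialize<T>(reader);
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private static void ReadExactly(Stream source, byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            int read = source.Read(buffer, offset, count);
            if (read == 0)
                throw new EndOfStreamException();
            offset += read;
            count -= read;
        }
    }
}
```

With the classes from the question, writing would be HeaderThenRawPayload.Write(outputSt, fileStruct.Header, payloadStream). On the reading side, after ReadHeader<TopHeader>(input) returns, everything left in input is the payload and can be streamed onward with input.CopyTo(...).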


Source: https://habr.com/ru/post/1485866/

