Partial and asynchronous deserialization in C# with protobuf-net

Question

Context

I have a file with the following structure:

    [ProtoContract]
    public class Data
    {
        [ProtoMember(1)]
        public string Header { get; set; }

        [ProtoMember(2)]
        public byte[] Body { get; set; }
    }

The code that reads and writes the file runs in the context of ASP.NET MVC Web API. I am trying to make every blocking IO call async in order to minimize blocking and achieve the best scalability. Reading from and writing to files does support ReadAsync, WriteAsync and CopyToAsync.

The body can be fairly large (much larger than the header), and I only need to read the body if the header meets some specific criteria.

I can already read and deserialize the header on its own synchronously, and then read and deserialize the body in the same way, using the approach described for deserializing part of a binary file.

Problem

How can I do the same using asynchronous file IO: read and deserialize the header asynchronously, and then read and deserialize the body in the same way?

I have read "Asynchronous protobuf serialization"; that is not an option.

c# protobuf-net

tozevv Aug 16 '13 at 7:52

1 answer

Marc Gravell · Accepted Answer · 2013-08-16T09:26:16+0000

Technically, protobuf fields are not guaranteed to be in order, but in most cases (including the one you are showing) we can reasonably assume that they are. The only way to get out-of-order fields here is to serialize two half-classes separately and concatenate the results, which is technically valid under the protobuf specification.

So what we will have is:

a varint denoting field 1, string (length-delimited): always decimal 10
a varint denoting "a", the length of the header
"a" bytes, the UTF-8 encoded header
a varint denoting field 2, string (length-delimited): always decimal 18
a varint denoting "b", the length of the body
"b" bytes, the body
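
To make the layout above concrete, here is a minimal sketch that writes the six parts by hand. It is not protobuf-net code: the class WireLayoutSketch and its methods are made-up names for illustration, and it assumes both lengths fit in a non-negative int.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, for illustration only (not part of protobuf-net).
static class WireLayoutSketch
{
    // Field tag: (fieldNumber << 3) | wireType.
    // Wire type 2 means "length-delimited" (strings, byte arrays).
    public static int Tag(int fieldNumber, int wireType)
    {
        return (fieldNumber << 3) | wireType;
    }

    // Base-128 varint: 7 payload bits per byte, least significant group first;
    // the high bit is set on every byte except the last.
    public static byte[] EncodeVarint(uint value)
    {
        var bytes = new List<byte>();
        while (value >= 0x80)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        bytes.Add((byte)value);
        return bytes.ToArray();
    }

    // Writes: tag 10, header length, header bytes, tag 18, body length, body bytes.
    public static byte[] Encode(string header, byte[] body)
    {
        byte[] headerBytes = Encoding.UTF8.GetBytes(header);
        var output = new List<byte>();
        output.AddRange(EncodeVarint((uint)Tag(1, 2)));          // always 10
        output.AddRange(EncodeVarint((uint)headerBytes.Length)); // "a"
        output.AddRange(headerBytes);
        output.AddRange(EncodeVarint((uint)Tag(2, 2)));          // always 18
        output.AddRange(EncodeVarint((uint)body.Length));        // "b"
        output.AddRange(body);
        return output.ToArray();
    }
}
```

For example, Encode("hello", new byte[] { 1, 2 }) starts with the bytes 0x0A 0x05, followed by the five UTF-8 bytes of "hello", then 0x12 0x02 and the two body bytes.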

We can assume that "a" is >= 0 and < int.MaxValue, which means its varint encoding takes at most 5 bytes; so if you buffer at least 6 bytes (1 tag byte plus up to 5 length bytes), you have enough information to know how large the header is. Of course, that buffer may technically also contain part of the body, so you will need to hold on to it! But if you had a sync-over-async Stream, you could read just that part of the stream with something like:

    int protoHeader = ProtoReader.DirectReadVarintInt32(stream); // 10
    int headerLength = ProtoReader.DirectReadVarintInt32(stream);
    string header = ProtoReader.DirectReadString(stream, headerLength);

Or, if "sync over async" is tricky, read explicitly:

    using System.IO;
    using ProtoBuf;

    static class Program
    {
        // Stand-ins for async reads: tag 10, length 11, then "hello"...
        static byte[] ReadAtLeast6() { return new byte[] { 0x0A, 0x0B, 0x68, 0x65, 0x6C, 0x6C, 0x6F }; }
        // ...and the remaining " world"
        static byte[] ReadMore(int bytes) { return new byte[] { 0x20, 0x77, 0x6F, 0x72, 0x6C, 0x64 }; }

        static void Main()
        {
            // pretend we read 7 bytes async
            var data = ReadAtLeast6();
            using (var ms = new MemoryStream())
            {
                ms.Write(data, 0, data.Length);
                ms.Position = 0;
                int protoHeader = ProtoReader.DirectReadVarintInt32(ms); // 10
                int headerLength = ProtoReader.DirectReadVarintInt32(ms); // 11
                int needed = (headerLength + (int)ms.Position) - data.Length; // 6 more
                var pos = ms.Position;
                if (needed > 0) // a short header may already be fully buffered
                {
                    ms.Seek(0, SeekOrigin.End);
                    data = ReadMore(needed);
                    ms.Write(data, 0, needed);
                }
                ms.Position = pos;
                string header = ProtoReader.DirectReadString(ms, headerLength); // "hello world"
            }
        }
    }
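
The same idea can be sketched with real async reads. This is only a rough sketch under a few assumptions: it uses plain Stream.ReadAsync and decodes the two varints by hand instead of calling ProtoReader, it assumes field 1 comes first (as discussed above), and AsyncHeaderSketch, FillAsync, ReadVarint and ReadHeaderAsync are made-up names, not protobuf-net API.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only (not part of protobuf-net).
static class AsyncHeaderSketch
{
    // Calls ReadAsync until `target` bytes are in the buffer or the stream ends.
    static async Task<int> FillAsync(Stream stream, byte[] buffer, int filled, int target)
    {
        while (filled < target)
        {
            int read = await stream.ReadAsync(buffer, filled, target - filled);
            if (read == 0) break; // end of stream
            filled += read;
        }
        return filled;
    }

    // Decodes one base-128 varint (at most 5 bytes) from an in-memory buffer.
    static int ReadVarint(byte[] buffer, ref int offset, int available)
    {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7)
        {
            if (offset >= available) throw new EndOfStreamException();
            byte b = buffer[offset++];
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return value;
        }
        throw new InvalidDataException("Varint is longer than 5 bytes.");
    }

    // Reads only the header. Leftover holds any bytes that were buffered past
    // the header (the start of the body field) and must not be thrown away.
    public static async Task<(string Header, byte[] Leftover)> ReadHeaderAsync(Stream stream)
    {
        var prefix = new byte[6]; // 1 tag byte + up to 5 length bytes
        int filled = await FillAsync(stream, prefix, 0, prefix.Length);

        int offset = 0;
        int tag = ReadVarint(prefix, ref offset, filled);
        if (tag != 10) throw new InvalidDataException("Expected field 1, length-delimited.");
        int headerLength = ReadVarint(prefix, ref offset, filled);
        if (headerLength < 0) throw new InvalidDataException("Negative header length.");

        // Part of the header may already sit in the prefix buffer.
        var headerBytes = new byte[headerLength];
        int fromPrefix = Math.Min(filled - offset, headerLength);
        Buffer.BlockCopy(prefix, offset, headerBytes, 0, fromPrefix);
        int total = await FillAsync(stream, headerBytes, fromPrefix, headerLength);
        if (total < headerLength) throw new EndOfStreamException();

        int leftoverCount = filled - offset - fromPrefix;
        var leftover = new byte[leftoverCount];
        Buffer.BlockCopy(prefix, offset + fromPrefix, leftover, 0, leftoverCount);

        return (Encoding.UTF8.GetString(headerBytes), leftover);
    }
}
```

After this returns, the caller can inspect the header and decide whether to read the body at all. If it does, any Leftover bytes belong in front of whatever is read from the stream next.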
