First of all: JSON and XML are not an option in this particular case, so please do not suggest them. If it makes this easier to accept, imagine that I intend to reinvent the wheel for self-education.
Back to the point:
I need to create a binary-safe data format for encoding some datagrams that I send to a particular dumb server that I am writing (in C, if that matters).
To simplify the matter, let's say that I send only numbers, strings and arrays.
An important fact: the server does not (and should not) know anything about Unicode, etc. It treats all strings as binary blobs (and never looks inside them).
The format that I originally developed is as follows:
- Datagram:
  <Number:size>\n<Value1>...<ValueN>
- Value:
  - Number:
    N\n<Value>\n
  - String:
    S\n<Number:size-in-bytes>\n<bytes>\n
  - Array:
    A\n<Number:size>\n<Value0>...<ValueN>
Example:
[ 1, "foo", [] ]
Serialized as follows:
1     ; number of items in datagram
A     ; - array -
3     ; number of items in array
N     ; - number -
1     ; number value
S     ; - string -
3     ; string size in bytes
foo   ; string bytes
A     ; - array -
0     ; number of items in array
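To show why the byte size matters, here is a minimal sketch of how a C reader for the String value might look. It assumes the whole datagram is already in memory; the `reader` struct and the `read_number` / `read_string` names are made up for illustration and are not the actual server code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cursor over a datagram that is already fully in memory. */
typedef struct {
    const unsigned char *p;   /* next unread byte */
    const unsigned char *end; /* one past the last byte */
} reader;

/* Reads "<decimal digits>\n" into *out. Returns 0 on success, -1 on error. */
static int read_number(reader *r, size_t *out)
{
    size_t n = 0;
    int digits = 0;

    assert(r->p <= r->end);
    while (r->p < r->end && *r->p >= '0' && *r->p <= '9') {
        size_t d = (size_t)(*r->p - '0');
        if (n > (SIZE_MAX - d) / 10)
            return -1; /* value would overflow size_t */
        n = n * 10 + d;
        r->p++;
        digits++;
    }
    if (digits == 0 || r->p >= r->end || *r->p != '\n')
        return -1;
    r->p++; /* skip the terminating '\n' */
    *out = n;
    return 0;
}

/* Reads "S\n<size>\n<bytes>\n". On success *bytes points into the datagram
 * and *size is the byte count. The payload itself is never inspected. */
static int read_string(reader *r, const unsigned char **bytes, size_t *size)
{
    size_t n, left;

    if (r->end - r->p < 2 || r->p[0] != 'S' || r->p[1] != '\n')
        return -1;
    r->p += 2;
    if (read_number(r, &n) != 0)
        return -1;

    left = (size_t)(r->end - r->p);
    /* Need n payload bytes plus the trailing '\n'. */
    if (left < n || left - n < 1 || r->p[n] != '\n')
        return -1;

    *bytes = r->p;
    *size = n;
    r->p += n + 1;
    return 0;
}
```

With a reader like this, no copying or dynamic buffer is needed, but the size prefix has to be the exact number of bytes: if the sender counts characters instead of bytes for a non-ASCII string, the check for the trailing `\n` fails.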
The problem is that I cannot reliably get the size of the string in bytes in JavaScript.
So the question is: how can I change the format so that strings can be written in JS and read in C correctly?
I do not want to add Unicode support to the server.
And I don't really want to decode the strings on the server (say, from base64, or by unescaping \xNN sequences), since that would require working with dynamic string buffers, which, given how dumb the server is, is not very desirable...
Any clues?