Trying to figure out a portable approach to saving data

I have a program that runs on an Intel Edison (32-bit Yocto Linux). It reads sensor data and writes it to a file. Data arrives in packets of 1 int and 13 doubles, with 100 packets arriving every second. After some time, I will pull the files off the device and read them using a tool that runs on a Windows x64 machine.

I am currently writing the data as a raw text file (since strings are nice and portable). However, because of the amount of data that will be written, I am looking for ways to save space, while making sure that no data is lost when it is interpreted on the other side.

My initial idea was to go ahead and create a structure that looks like this:

struct dataStruct{
  char front;
  int a;
  double b, c, d, e, f, g, h, i, j, l, m, n, o;
  char end;
};

and then make a union of it as follows:

union dataUnion{
  dataStruct d;
  char c[110];
};
//110 was chosen because an int = 4 char, and a double = 8 char,
//so 13*8 = 104, and therefore d = 1 + 4 + 13*8 + 1 = 110

and then write the char array to the file. However, a little reading around tells me that such an implementation is not necessarily compatible between operating systems (worse, it may work some of the time and fail at other times).

So I'm wondering: is there a portable way to save this data, other than saving it as raw text?

+4
5 answers

As others have said: serialization is probably the best solution to your problem.

One option is MsgPack. It has a C++ library (C++11). Here is an example of how it can be used:

// adapted from https://github.com/msgpack/msgpack-c/blob/master/QUICKSTART-CPP.md

#include <msgpack.hpp>
#include <vector>
#include <string>

struct dataStruct {
    int a;
    double b, c, d, e, f, g, h, i, j, l, m, n, oo;  // yes "oo", because "o" clashes with msgpack :/

    MSGPACK_DEFINE(a, b, c, d, e, f, g, h, i, j, l, m, n, oo);
};

int main(void) {
    std::vector<dataStruct> vec;
    // add some elements into vec...

    // you can serialize dataStruct directly
    msgpack::sbuffer sbuf;
    msgpack::pack(sbuf, vec);

    msgpack::unpacked msg;
    msgpack::unpack(&msg, sbuf.data(), sbuf.size());

    msgpack::object obj = msg.get();

    // you can convert object to dataStruct directly
    std::vector<dataStruct> rvec;
    obj.convert(&rvec);
}

Another option is Google FlatBuffers.

EDIT: Here is a complete example that writes to a file and reads it back:

// adapted from:
// https://github.com/msgpack/msgpack-c/blob/master/QUICKSTART-CPP.md
// https://github.com/msgpack/msgpack-c/wiki/v1_1_cpp_unpacker#msgpack-controls-a-buffer

#include <msgpack.hpp>
#include <fstream>
#include <iostream>

using std::cout;
using std::endl;

struct dataStruct {
    int a;
    double b, c, d, e, f, g, h, i, j, l, m, n, oo;  // yes "oo", because "o" clashes with msgpack :/

    MSGPACK_DEFINE(a, b, c, d, e, f, g, h, i, j, l, m, n, oo);
};

std::ostream& operator<<(std::ostream& out, const dataStruct& ds)
{
    out << "[a:" << ds.a << " b:" << ds.b << " ... oo:" << ds.oo << "]";
    return out;
}

int main(void) {

    // serialize
    {
        // prepare the (buffered) output file
        std::ofstream ofs("log.bin", std::ios::binary);

        // prepare a data structure (value-initialized, so unset fields are 0)
        dataStruct ds{};

        // fill in sample data
        ds.a  = 1;
        ds.b  = 1.11;
        ds.oo = 101;
        msgpack::pack(ofs, ds);
        cout << "serialized: " << ds << endl;

        ds.a  = 2;
        ds.b  = 2.22;
        ds.oo = 202;
        msgpack::pack(ofs, ds);
        cout << "serialized: " << ds << endl;

        // continuously receiving data
        //while ( /* data is being received... */ ) {
        //
        //    // initialize ds...
        //
        //    // serialize ds
        //    // You can use any classes that have the following member function:
        //    // https://github.com/msgpack/msgpack-c/wiki/v1_1_cpp_packer#buffer
        //    msgpack::pack(ofs, ds);
        //}
    }

    // deserialize
    {
        // The read size may be decided by receive performance, transport-layer protocol and so on.

        // prepare the input file
        std::ifstream ifs("log.bin", std::ios::binary);
        std::streambuf* pbuf = ifs.rdbuf();

        const std::size_t try_read_size = 100;  // arbitrary number...
        msgpack::unpacker unp;
        dataStruct ds;

        // read data while there are still unprocessed bytes...
        while (pbuf->in_avail() > 0) {
            unp.reserve_buffer(try_read_size);
            // unp has at least try_read_size buffer on this point.

            // input is a kind of I/O library object.
            // read message to msgpack::unpacker internal buffer directly.
            std::size_t actual_read_size = ifs.readsome(unp.buffer(), try_read_size);

            // tell msgpack::unpacker actual consumed size.
            unp.buffer_consumed(actual_read_size);

            msgpack::unpacked result;
            // Message pack data loop
            while(unp.next(result)) {
                msgpack::object obj(result.get());
                obj.convert(&ds);

                // use ds
                cout << "deserialized: " << ds << endl;
            }
            // All complete msgpack messages are processed at this point,
            // then continue to read additional messages.
        }
    }
}

Output:

serialized: [a:1 b:1.11 ... oo:101]
serialized: [a:2 b:2.22 ... oo:202]
deserialized: [a:1 b:1.11 ... oo:101]
deserialized: [a:2 b:2.22 ... oo:202]
+1

. , , .

( ) - . , , (, IEE754 ), , , .

.

, -, - .

+1

. , Google " " - , . (EG JSON XML)

skool ASN.1

, .

+1

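Use a serialization library (ProtoBuf, Thrift, etc.). If you want to write the "raw" bytes yourself instead, use fixed-width types, for example: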

struct dataStruct{
  uint32_t a;  // see <cstdint> or boost
  /// ...
};

Settle on little-endian (or big-endian) for the file, and convert on any machine whose "native" byte order differs.

, , )) .

#pragma pack(1)

__attribute__((packed))

, .

0

, . int double / . , .

"" / , /... , int , , 5 , 'e' 2/3 .

-1

Source: https://habr.com/ru/post/1620632/

