Double/float binary serialization portability in C++

Question

The C++ standard does not specify the underlying layout of the float and double types, only the range of values they should represent. (The same is true for signed integer types: is it two's complement or something else?)

My question is: what techniques are used to serialize/deserialize POD types such as double and float in a portable manner? At the moment, it seems that the only way to do this is to represent the value literally (as in "123.456"), since the IEEE 754 layout for double is not standard on all architectures.

+44

c++ double portability serialization ieee-754

Matthieu N. Jan 19 '11 at 8:27

9 answers

Sylvain Defresne · Answer 1 · 2011-01-19 09:28

Brian "Beej Jorgensen" Hall gives, in his Guide to Network Programming, some code to pack a float (resp. double) into a uint32_t (resp. uint64_t) so that it can be transmitted safely over the network between two machines that may not agree on its representation. It has some limitations; mainly, it does not support NaN and infinity.

Here is his packing function:

    #include <stdint.h>

    #define pack754_32(f) (pack754((f), 32, 8))
    #define pack754_64(f) (pack754((f), 64, 11))

    uint64_t pack754(long double f, unsigned bits, unsigned expbits)
    {
        long double fnorm;
        int shift;
        long long sign, exp, significand;
        unsigned significandbits = bits - expbits - 1; // -1 for sign bit

        if (f == 0.0) return 0; // get this special case out of the way

        // check sign and begin normalization
        if (f < 0) { sign = 1; fnorm = -f; }
        else { sign = 0; fnorm = f; }

        // get the normalized form of f and track the exponent
        shift = 0;
        while(fnorm >= 2.0) { fnorm /= 2.0; shift++; }
        while(fnorm < 1.0) { fnorm *= 2.0; shift--; }
        fnorm = fnorm - 1.0;

        // calculate the binary form (non-float) of the significand data
        significand = fnorm * ((1LL<<significandbits) + 0.5f);

        // get the biased exponent
        exp = shift + ((1<<(expbits-1)) - 1); // shift + bias

        // return the final answer
        return (sign<<(bits-1)) | (exp<<(bits-expbits-1)) | significand;
    }
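
The receiving side needs the inverse operation. The following is only a sketch of what that could look like, written to mirror the packing function above; the name unpack754 and the details are assumptions for illustration rather than a quotation from the guide. Like the packer, it ignores NaN, infinity and subnormal values.

```cpp
#include <cstdint>

// Hypothetical inverse of pack754(): rebuilds a value from the packed bits.
// Like the packer, it does not handle NaN, infinity, or subnormals.
long double unpack754(std::uint64_t i, unsigned bits, unsigned expbits)
{
    const unsigned significandbits = bits - expbits - 1;  // -1 for the sign bit

    if (i == 0) return 0.0L;  // mirror the packer's special case for zero

    // Recover the fraction and restore the implicit leading 1.
    long double result =
        static_cast<long double>(i & ((1ULL << significandbits) - 1));
    result /= static_cast<long double>(1ULL << significandbits);
    result += 1.0L;

    // Remove the bias from the stored exponent, then scale by powers of two.
    const long long bias = (1LL << (expbits - 1)) - 1;
    long long shift =
        static_cast<long long>((i >> significandbits) & ((1ULL << expbits) - 1)) - bias;
    while (shift > 0) { result *= 2.0L; --shift; }
    while (shift < 0) { result /= 2.0L; ++shift; }

    // Apply the sign bit.
    return ((i >> (bits - 1)) & 1) ? -result : result;
}
```

For example, unpack754(0x3F000000, 32, 8) reads a sign of 0, a biased exponent of 126 and an empty fraction, which gives 0.5.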

Martin York · Answer 2 · 2011-01-19 08:43

What is wrong with a human-readable format?

It has several advantages over binary:

It is readable
It is portable
It makes support really easy (you can ask the user to look at it in their favorite editor, even a word processor)
It is easy to fix (or to adjust files manually in error situations)

Disadvantages:

It is not compact. If this is a real problem, you can always compress it.
It may be slightly slower to extract/generate. Note that a binary format probably needs to be normalized as well (see htonl())

To output a double at full precision:

    double v = 2.20;
    std::cout << std::setprecision(std::numeric_limits<double>::digits) << v;

OK, I am not convinced that this is exactly right. It may lose precision.
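
On the precision doubt: std::numeric_limits<double>::digits counts binary digits (53 for an IEEE 754 double), whereas max_digits10 (C++11) is the number of decimal digits needed for a double to survive a round trip through text. A minimal sketch of a text round trip, assuming C++11 and a standard library whose conversions are correctly rounded; the helper names are made up for this example:

```cpp
#include <iomanip>
#include <limits>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format a double with enough digits to round-trip.
std::string double_to_text(double v)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());  // always use "." as decimal separator
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << v;
    return out.str();
}

// Hypothetical helper: parse text produced by double_to_text().
double text_to_double(const std::string& s)
{
    std::istringstream in(s);
    in.imbue(std::locale::classic());
    double v = 0.0;
    in >> v;
    return v;
}
```

Pinning the classic locale matters for portability: a reader running under a locale that uses "," as the decimal separator would otherwise misparse the text.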

TonyK · Answer 3 · 2011-01-19 09:36

Just write the binary IEEE 754 representation to disk, and document this as your storage format (along with its endianness). Then it is up to the implementation to convert it to its internal representation, if necessary.
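
As a rough illustration of this approach, here is a sketch that fixes the storage format as IEEE 754 binary64 in little-endian byte order. It assumes the host double already is binary64 (checked at compile time) and that the host uses the same byte order for integers and floating-point values; the function names are invented for the example. A platform with a different internal format would need a real conversion instead of the memcpy.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "this sketch assumes double is IEEE 754 binary64");

// Hypothetical helper: store the binary64 bit pattern as 8 little-endian bytes.
std::array<unsigned char, 8> encode_double_le(double v)
{
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // well-defined way to read the bits
    std::array<unsigned char, 8> out{};
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>((bits >> (8 * i)) & 0xFF);
    return out;
}

// Hypothetical helper: rebuild the double from 8 little-endian bytes.
double decode_double_le(const std::array<unsigned char, 8>& in)
{
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    double v;
    std::memcpy(&v, &bits, sizeof v);
    return v;
}
```

Because the bytes are produced with shifts rather than by dumping memory, the output is the same on little-endian and big-endian hosts.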

user1016736 · Answer 4 · 2011-11-07 15:26

Take a look at the implementation of the old gtypes.h file in glib 2 - it includes the following:

    #if G_BYTE_ORDER == G_LITTLE_ENDIAN
    union _GFloatIEEE754
    {
      gfloat v_float;
      struct {
        guint mantissa : 23;
        guint biased_exponent : 8;
        guint sign : 1;
      } mpn;
    };
    union _GDoubleIEEE754
    {
      gdouble v_double;
      struct {
        guint mantissa_low : 32;
        guint mantissa_high : 20;
        guint biased_exponent : 11;
        guint sign : 1;
      } mpn;
    };
    #elif G_BYTE_ORDER == G_BIG_ENDIAN
    union _GFloatIEEE754
    {
      gfloat v_float;
      struct {
        guint sign : 1;
        guint biased_exponent : 8;
        guint mantissa : 23;
      } mpn;
    };
    union _GDoubleIEEE754
    {
      gdouble v_double;
      struct {
        guint sign : 1;
        guint biased_exponent : 11;
        guint mantissa_high : 20;
        guint mantissa_low : 32;
      } mpn;
    };
    #else /* !G_LITTLE_ENDIAN && !G_BIG_ENDIAN */
    #error unknown ENDIAN type
    #endif /* !G_LITTLE_ENDIAN && !G_BIG_ENDIAN */

glib link
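
In C++, reading a union member other than the one last written is undefined behavior, and bit-field layout is implementation-defined, which is why the header above needs one variant per byte order. The same three fields can be pulled out with memcpy and shifts instead. This is only a sketch, assuming double is IEEE 754 binary64; DoubleFields and split_fields are names made up here, not part of glib.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "this sketch assumes double is IEEE 754 binary64");

// Hypothetical struct mirroring the fields of the double union above.
struct DoubleFields {
    std::uint64_t mantissa;         // 52 fraction bits
    unsigned      biased_exponent;  // 11 bits
    unsigned      sign;             // 1 bit
};

// Hypothetical helper: extract the fields without unions or bit-fields.
DoubleFields split_fields(double v)
{
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // copy out the bit pattern
    DoubleFields f;
    f.mantissa        = bits & ((std::uint64_t{1} << 52) - 1);
    f.biased_exponent = static_cast<unsigned>((bits >> 52) & 0x7FF);
    f.sign            = static_cast<unsigned>(bits >> 63);
    return f;
}
```

For 1.0 this yields sign 0, biased exponent 1023 and mantissa 0.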

Tobias Langner · Answer 5 · 2011-01-19 09:26

Create an appropriate serializer/deserializer interface for writing and reading these values.

The interface can then have several implementations, and you can test your options.

As said before, the obvious options would be:

IEEE754, which writes/reads the binary chunk directly if the architecture supports it, or parses it if the architecture does not
Text: always needs to be parsed.
Whatever else you can think of.

Just remember - once you have this layer, you can always start with IEEE754 if you only support platforms that use this format internally. That way, you will have the additional effort only when you need to support a different platform! Don't do work you don't have to.
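
To make the layering concrete, here is one possible shape for such an interface. Everything in it (DoubleCodec, TextCodec, Ieee754Codec, the choice of std::string as the byte container, big-endian order) is an assumption made for illustration; the binary implementation only covers platforms whose double is already IEEE 754 binary64.

```cpp
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical interface: every name here is illustrative.
class DoubleCodec {
public:
    virtual ~DoubleCodec() = default;
    virtual std::string encode(double v) const = 0;
    virtual double decode(const std::string& data) const = 0;
};

// Text option: always formats and parses.
class TextCodec : public DoubleCodec {
public:
    std::string encode(double v) const override {
        std::ostringstream out;
        out << std::setprecision(std::numeric_limits<double>::max_digits10) << v;
        return out.str();
    }
    double decode(const std::string& data) const override {
        return std::stod(data);
    }
};

// IEEE 754 option for platforms whose double is already binary64:
// copies the bit pattern out as 8 big-endian bytes.
class Ieee754Codec : public DoubleCodec {
    static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
                  "a non-IEEE platform needs a converting implementation");
public:
    std::string encode(double v) const override {
        std::uint64_t bits;
        std::memcpy(&bits, &v, sizeof bits);
        std::string out(8, '\0');
        for (int i = 0; i < 8; ++i)
            out[i] = static_cast<char>((bits >> (56 - 8 * i)) & 0xFF);
        return out;
    }
    double decode(const std::string& data) const override {
        if (data.size() != 8)
            throw std::invalid_argument("expected exactly 8 bytes");
        std::uint64_t bits = 0;
        for (int i = 0; i < 8; ++i)
            bits = (bits << 8) | static_cast<unsigned char>(data[i]);
        double v;
        std::memcpy(&v, &bits, sizeof v);
        return v;
    }
};
```

Calling code only depends on DoubleCodec, so supporting a platform with a different internal format later means adding one more implementation rather than touching every call site.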

peoro · Answer 6 · 2011-01-19 08:33

You should convert them to a format that you will always be able to use to recreate your floats/doubles.

This could be a string representation or, if you need something that takes less space, you could represent your number in IEEE 754 (or any other format you choose) and then parse it just as you would a string.

Nim · Answer 7 · 2011-01-19 09:45

I think the answer is "it depends" on what your particular application and its performance profile are.

Suppose you have a low-latency market data environment; then using strings is frankly stupid. If the information you are conveying is prices, then doubles (and their binary representation) really are tricky to work with. Whereas if you don't really care about performance, and what you want is visibility (storage, transmission), then strings are an ideal candidate.

I would actually opt for an integral mantissa/exponent representation of floats/doubles; that is, at the earliest opportunity, convert the float/double to a pair of integers and then transmit that. You then only have to worry about the portability of integers and, well, the various routines (such as the hton() routines that handle conversions for you). Also, store everything in the endianness of your most prevalent platform (for example, if you only use Linux, what is the point of storing things in big endian?)
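
One way to get such a pair of integers is std::frexp, which splits a value into a fraction and a power-of-two exponent. The sketch below is an illustration under stated assumptions: a radix-2 double with at most 63 significand bits, finite inputs only, and made-up names (DoubleParts, to_parts, from_parts). Transmitting the two integers, including their byte order, is left to whatever integer serialization is already in place.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

static_assert(std::numeric_limits<double>::radix == 2 &&
              std::numeric_limits<double>::digits <= 63,
              "this sketch assumes a binary double whose significand fits int64");

// Hypothetical wire type: value == mantissa * 2^exponent.
struct DoubleParts {
    std::int64_t mantissa;
    std::int32_t exponent;
};

DoubleParts to_parts(double v)
{
    assert(std::isfinite(v));  // NaN and infinity need a separate encoding
    constexpr int digits = std::numeric_limits<double>::digits;  // 53 for binary64
    int e = 0;
    const double m = std::frexp(v, &e);  // v == m * 2^e, 0.5 <= |m| < 1 (or m == 0)
    // Scaling the fraction by 2^digits turns it into an exact integer.
    DoubleParts p;
    p.mantissa = static_cast<std::int64_t>(std::ldexp(m, digits));
    p.exponent = static_cast<std::int32_t>(e - digits);
    return p;
}

double from_parts(const DoubleParts& p)
{
    return std::ldexp(static_cast<double>(p.mantissa), p.exponent);
}
```

Because both steps only multiply by powers of two, the round trip is exact for finite values; 1.0, for instance, becomes the pair (4503599627370496, -52).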

Bernardo Ramos · Answer 8 · 2015-11-13 05:05

SQLite4 uses a new format for storing doubles and floats:

It works reliably and consistently even on platforms that lack IEEE 754 floating point support.
Currency computations can normally be done exactly and without rounding.
Any signed or unsigned 64-bit integer can be represented accurately.
The floating point range and precision exceed those of IEEE 754 floating point numbers.
Positive and negative infinity and NaN (Not-a-Number) have clearly defined representations.

Sources:

https://sqlite.org/src4/doc/trunk/www/design.wiki

https://sqlite.org/src4/doc/trunk/www/decimal.wiki

J Lind · Answer 9 · 2018-12-05 10:44

Found this old thread. One solution that handles a fair number of cases is missing: using fixed point, that is, passing integers with a known scaling factor and using built-in casts at both ends. This way you do not have to worry about the underlying floating point representation at all.

Of course, there are disadvantages. This solution assumes that you can have a fixed scaling factor and still get both the range and the resolution needed for the particular application. In addition, you convert from floating point to fixed point at the serialization end and convert back at deserialization, which introduces two rounding errors. However, over the years I have found that fixed point is sufficient for my needs in almost all cases, and it is also reasonably fast.

A typical use case for fixed point is communication protocols for embedded systems or other devices.
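
A minimal sketch of the idea, assuming a hypothetical protocol that sends values as 32-bit integers in units of 1/1000. The scale, the integer width and the function names are all illustrative, and the sketch rounds to nearest rather than using a bare cast so that the first rounding error stays within half a unit.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical protocol constant: values travel as multiples of 1/1000.
constexpr double kScale = 1000.0;

// Hypothetical helper: convert to the fixed-point wire value.
std::int32_t to_fixed(double v)
{
    const double scaled = std::round(v * kScale);
    // Written as a negated range check so that NaN is rejected as well.
    if (!(scaled >= std::numeric_limits<std::int32_t>::min() &&
          scaled <= std::numeric_limits<std::int32_t>::max()))
        throw std::out_of_range("value does not fit the fixed-point range");
    return static_cast<std::int32_t>(scaled);
}

// Hypothetical helper: convert the wire value back to floating point.
double from_fixed(std::int32_t raw)
{
    return static_cast<double>(raw) / kScale;
}
```

With this scale, 123.456 is sent as the integer 123456, and anything finer than 0.001 is lost, which is the range/resolution trade-off described above.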
