C/C++: packing signed chars into an int

I need to pack four signed bytes into a 32-bit integral type. This is what I came up with:

    int32_t byte(int8_t c) { return (unsigned char)c; }

    int pack(char c0, char c1, ...) {
        return byte(c0) | byte(c1) << 8 | ...;
    }

Is this a good solution? Is it portable (not in the communication sense)? Is there a ready-made solution, perhaps in Boost?

The issue I am mostly concerned about is the bit pattern that results when converting negative values from char to int. I do not know what the correct behavior should be.

Thanks

+4
6 answers

I liked Joey Adams's answer, except that it is written with macros (which cause real pain in many situations), and the compiler will not give you any warning if "char" is not 1 byte wide. This is my solution (based on Joey's).

    inline uint32_t PACK(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3)
    {
        return (c0 << 24) | (c1 << 16) | (c2 << 8) | c3;
    }

    inline uint32_t PACK(int8_t c0, int8_t c1, int8_t c2, int8_t c3)
    {
        return PACK((uint8_t)c0, (uint8_t)c1, (uint8_t)c2, (uint8_t)c3);
    }

I omitted the casts of c0 through c3 to uint32_t, since the compiler should handle this for you when shifting, and I used C-style casts, since they work in both C and C++ (the question is tagged as both).
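One caveat on omitting the casts: a uint8_t operand is promoted to int, not to uint32_t, before the shift, so `c0 << 24` can overflow a signed 32-bit int when c0 is 128 or more. A minimal sketch that widens each byte explicitly first follows; the name `pack_be` is hypothetical and not from this answer, and c0 is assumed to land in the most significant byte, as in PACK above.

```cpp
#include <cstdint>

// Hypothetical variant of PACK: each byte is widened to uint32_t *before*
// shifting, so the shift never happens on a (signed) int.
constexpr std::uint32_t pack_be(std::uint8_t c0, std::uint8_t c1,
                                std::uint8_t c2, std::uint8_t c3) {
    return (static_cast<std::uint32_t>(c0) << 24) |
           (static_cast<std::uint32_t>(c1) << 16) |
           (static_cast<std::uint32_t>(c2) << 8) |
            static_cast<std::uint32_t>(c3);
}

// Compile-time check with a top byte that has its high bit set.
static_assert(pack_be(0xFF, 0x00, 0x00, 0x01) == 0xFF000001u,
              "high bit of c0 must survive the shift");
```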

+7

char is not guaranteed to be signed or unsigned (on PowerPC Linux, char defaults to unsigned). Spread the word!

What you want is something like this macro:

    #include <stdint.h> /* Needed for uint32_t and uint8_t */

    #define PACK(c0, c1, c2, c3) \
        (((uint32_t)(uint8_t)(c0) << 24) | \
        ((uint32_t)(uint8_t)(c1) << 16) | \
        ((uint32_t)(uint8_t)(c2) << 8) | \
        ((uint32_t)(uint8_t)(c3)))

It is ugly mainly because it does not play well with C's order of operations, hence all the parentheses. Also, the backslash-newlines are there so that this macro does not have to be one big long line.

In addition, the reason we cast to uint8_t before casting to uint32_t is to prevent unwanted sign extension.
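To illustrate the sign-extension point, here is a small sketch comparing the two conversions. The function names `widen_direct` and `widen_via_u8` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Converting a negative signed byte straight to uint32_t sets the upper
// 24 bits (sign extension), which would clobber neighbouring bytes once
// the result is OR-ed into the packed value.
constexpr std::uint32_t widen_direct(std::int8_t c) {
    return static_cast<std::uint32_t>(c);
}

// Going through uint8_t first keeps only the low 8 bits.
constexpr std::uint32_t widen_via_u8(std::int8_t c) {
    return static_cast<std::uint32_t>(static_cast<std::uint8_t>(c));
}

static_assert(widen_direct(-1) == 0xFFFFFFFFu, "sign-extended");
static_assert(widen_via_u8(-1) == 0x000000FFu, "low byte only");
```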

+7

You can avoid casting with implicit conversions:

    uint32_t pack_helper(uint32_t c0, uint32_t c1, uint32_t c2, uint32_t c3) {
        return c0 | (c1 << 8) | (c2 << 16) | (c3 << 24);
    }

    uint32_t pack(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3) {
        return pack_helper(c0, c1, c2, c3);
    }

The idea is that you see "convert all the parameters correctly, then shift and combine them" rather than "for each parameter, convert it correctly, shift it and combine it". There is not much in it, though.

Then:

    #include <stdint.h>
    #include <iostream>

    // uses pack() from the previous snippet

    template <int N>
    uint8_t unpack_u(uint32_t packed) {
        // cast to avoid potential warnings for implicit narrowing conversion
        return static_cast<uint8_t>(packed >> (N*8));
    }

    template <int N>
    int8_t unpack_s(uint32_t packed) {
        uint8_t r = unpack_u<N>(packed);
        return (r <= 127 ? r : r - 256); // thanks to caf
    }

    int main() {
        uint32_t x = pack(4,5,6,-7);
        std::cout << (int)unpack_u<0>(x) << "\n";
        std::cout << (int)unpack_s<1>(x) << "\n";
        std::cout << (int)unpack_u<3>(x) << "\n";
        std::cout << (int)unpack_s<3>(x) << "\n";
    }

Output:

    4
    5
    249
    -7

This is as portable as the uint32_t, uint8_t and int8_t types. None of them is required in C99, and the stdint.h header is not defined in C89 or in C++ before C++11. If the types exist and meet the C99 requirements, though, the code will work. Of course, in C the unpack functions would need a function parameter instead of a template parameter. You might prefer that in C++ too if you want to write short loops for unpacking.
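As a rough sketch of the function-parameter variant mentioned here: the names `to_signed`, `unpack_u_at`, `unpack_s_at` and `sum_signed_bytes` are hypothetical, and the layout assumes byte n sits in bits 8n to 8n+7, matching pack_helper above.

```cpp
#include <cstdint>

// Map 0..255 to -128..127 arithmetically instead of relying on an
// out-of-range conversion to int8_t.
constexpr std::int8_t to_signed(std::uint8_t r) {
    return static_cast<std::int8_t>(r <= 127 ? r : r - 256);
}

// Runtime-index counterparts of unpack_u<N> and unpack_s<N>.
constexpr std::uint8_t unpack_u_at(std::uint32_t packed, unsigned n) {
    return static_cast<std::uint8_t>(packed >> (n * 8));
}

constexpr std::int8_t unpack_s_at(std::uint32_t packed, unsigned n) {
    return to_signed(unpack_u_at(packed, n));
}

// With the index as an ordinary parameter, unpacking fits in a short loop.
inline int sum_signed_bytes(std::uint32_t packed) {
    int total = 0;
    for (unsigned n = 0; n < 4; ++n)
        total += unpack_s_at(packed, n);
    return total;
}
```

The trade-off is that the index is no longer checked at compile time, so callers have to keep n below 4 themselves.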

To get around the fact that the types are optional, you could use uint_least32_t, which is required in C99. The same goes for uint_least8_t and int_least8_t. You would have to change the code of pack_helper and unpack_u:

    uint_least32_t mask(uint_least32_t x) { return x & 0xFF; }

    uint_least32_t pack_helper(uint_least32_t c0, uint_least32_t c1, uint_least32_t c2, uint_least32_t c3) {
        return mask(c0) | (mask(c1) << 8) | (mask(c2) << 16) | (mask(c3) << 24);
    }

    template <int N>
    uint_least8_t unpack_u(uint_least32_t packed) {
        // cast to avoid potential warnings for implicit narrowing conversion
        return static_cast<uint_least8_t>(mask(packed >> (N*8)));
    }

Honestly, this is unlikely to be worth it: chances are that the rest of your application is written on the assumption that int8_t etc. exist. It is a rare implementation that does not have 8-bit and 32-bit two's complement types.

+3

"Goodness":
IMHO, this is the best solution you are going to get for this. EDIT: although I would use static_cast<unsigned int> instead of the C-style cast, and I probably would not use a separate function to hide the cast.

Portability:
There is going to be no fully portable way to do this, because nothing says that char has to be eight bits, and nothing says that unsigned int has to be 4 bytes wide.
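If the code is only meant to support platforms where those assumptions hold, one option is to make the build fail elsewhere. This is a sketch of that idea, not something proposed in the answer itself:

```cpp
#include <climits>
#include <limits>

// Refuse to compile on platforms where the packing assumptions do not hold.
static_assert(CHAR_BIT == 8,
              "this packing code assumes 8-bit chars");
static_assert(std::numeric_limits<unsigned int>::digits >= 32,
              "this packing code assumes unsigned int has at least 32 bits");
```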

In addition, you are relying on endianness, and therefore data packed on one architecture will not be usable on one with the opposite endianness.
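A note on the endianness point: the value produced by shifting and OR-ing is the same on every host; byte order only comes into play when the packed integer's memory image is written out or sent as raw bytes. A sketch of serialising by value instead, assuming a little-endian wire order chosen purely for illustration (`WireBytes`, `to_wire` and `from_wire` are hypothetical names):

```cpp
#include <cstdint>

// Hypothetical wire format: four bytes, least significant first.
struct WireBytes {
    unsigned char b[4];
};

// Build the byte sequence from the value, not from its memory layout,
// so the result does not depend on the host's byte order.
constexpr WireBytes to_wire(std::uint32_t v) {
    return WireBytes{{static_cast<unsigned char>(v & 0xFFu),
                      static_cast<unsigned char>((v >> 8) & 0xFFu),
                      static_cast<unsigned char>((v >> 16) & 0xFFu),
                      static_cast<unsigned char>((v >> 24) & 0xFFu)}};
}

// Reassemble the value from the same fixed byte order.
constexpr std::uint32_t from_wire(const WireBytes& w) {
    return static_cast<std::uint32_t>(w.b[0]) |
           (static_cast<std::uint32_t>(w.b[1]) << 8) |
           (static_cast<std::uint32_t>(w.b[2]) << 16) |
           (static_cast<std::uint32_t>(w.b[3]) << 24);
}
```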

Is there a ready-made solution, perhaps in Boost?
Not that I know of.

+1

This is based on the answers of Grant Peters and Joey Adams, extended to show how to unpack the signed values (the unpack functions rely on the modulo rules for unsigned values in C):

(As Steve Jessop noted in the comments, there is no need for separate pack_s and pack_u functions.)

    inline uint32_t pack(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3)
    {
        return ((uint32_t)c0 << 24) | ((uint32_t)c1 << 16) |
            ((uint32_t)c2 << 8) | (uint32_t)c3;
    }

    /* Note: the unpack index counts from the least significant byte, so
       unpack_c3_* returns the byte that pack() received as c0. */
    inline uint8_t unpack_c3_u(uint32_t p) { return p >> 24; }
    inline uint8_t unpack_c2_u(uint32_t p) { return p >> 16; }
    inline uint8_t unpack_c1_u(uint32_t p) { return p >> 8; }
    inline uint8_t unpack_c0_u(uint32_t p) { return p; }

    /* The signed versions return int8_t so that negative results survive. */
    inline int8_t unpack_c3_s(uint32_t p) { int t = unpack_c3_u(p); return t <= 127 ? t : t - 256; }
    inline int8_t unpack_c2_s(uint32_t p) { int t = unpack_c2_u(p); return t <= 127 ? t : t - 256; }
    inline int8_t unpack_c1_s(uint32_t p) { int t = unpack_c1_u(p); return t <= 127 ? t : t - 256; }
    inline int8_t unpack_c0_s(uint32_t p) { int t = unpack_c0_u(p); return t <= 127 ? t : t - 256; }

(These are necessary, rather than simply casting back to int8_t, because the cast may raise an implementation-defined signal if the value is greater than 127, so it is not strictly portable.)

+1

You can also let the compiler do the work for you.

    union packedchars {
        struct {
            char v1, v2, v3, v4;
        };  /* anonymous struct: C11, or a common compiler extension in C++ */
        int data;
    };

    union packedchars value;
    value.data = 0;
    value.v1 = 'a';
    value.v2 = 'b';

Etc.

-1

Source: https://habr.com/ru/post/1304000/

