Representation of structure in memory on a 64-bit machine

Out of curiosity, I wrote a program that was supposed to show every byte of my structure. Here is the code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdint.h>
    #include <limits.h>

    #define MAX_INT 2147483647
    #define MAX_LONG 9223372036854775807

    typedef struct _serialize_test{
        char a;
        unsigned int b;
        char ab;
        unsigned long long int c;
    }serialize_test_t;

    int main(int argc, char**argv){
        serialize_test_t *t;
        t = malloc(sizeof(serialize_test_t));
        t->a = 'A';
        t->ab = 'N';
        t->b = MAX_INT;
        t->c = MAX_LONG;

        printf("%x %x %x %x %d %d\n", t->a, t->b, t->ab, t->c, sizeof(serialize_test_t), sizeof(unsigned long long int));

        char *ptr = (char *)t;
        int i;
        for (i=0; i < sizeof(serialize_test_t) - 1; i++){
            printf("%x = %x\n", ptr + i, *(ptr + i));
        }
        return 0;
    }

and here is the result:

    41 7fffffff 4e ffffffff 24 8
    26b2010 = 41
    26b2011 = 0
    26b2012 = 0
    26b2013 = 0
    26b2014 = ffffffff
    26b2015 = ffffffff
    26b2016 = ffffffff
    26b2017 = 7f
    26b2018 = 4e
    26b2019 = 0
    26b201a = 0
    26b201b = 0
    26b201c = 0
    26b201d = 0
    26b201e = 0
    26b201f = 0
    26b2020 = ffffffff
    26b2021 = ffffffff
    26b2022 = ffffffff
    26b2023 = ffffffff
    26b2024 = ffffffff
    26b2025 = ffffffff
    26b2026 = ffffffff

And here is the question: if sizeof(long long int) is 8, then why is sizeof(serialize_test_t) 24 instead of 32? I always thought that, by default (without pragma pack directives), the size of a structure is the size of its largest type multiplied by the number of fields, for example here: 8 (bytes) * 4 (fields) = 32 (bytes).

Also, when I cast this structure to char *, I see in the output that the offset between the values in memory is not 8 bytes. Could you give me a clue? Or maybe it's just a compiler optimization?

+6
5 answers

On modern 32-bit machines such as SPARC or Intel [34]86, or on any Motorola chip from the 68020 up, each data item must usually be "self-aligned", beginning at an address that is a multiple of its type size. Thus, 32-bit types must begin on a 32-bit boundary, 16-bit types on a 16-bit boundary, 8-bit types may begin anywhere, and struct/array/union types have the alignment of their most restrictive member.

The total size of the structure will depend on the packing. In your case, it is 8 bytes, so the final structure will look like this:

    typedef struct _serialize_test{
        char a;                    // size 1 byte, then 3 bytes of padding
        unsigned int b;            // size 4 bytes
        char ab;                   // size 1 byte, then 7 bytes of padding
        unsigned long long int c;  // size 8 bytes
    }serialize_test_t;

Thus, the first two and the last two members are aligned correctly, and the total size comes to 24.

+4

It depends on the alignment chosen by your compiler. However, you can reasonably expect the following defaults:

    typedef struct _serialize_test{
        char a;                    // Requires 1-byte alignment
        unsigned int b;            // Requires 4-byte alignment
        char ab;                   // Requires 1-byte alignment
        unsigned long long int c;  // Requires 4- or 8-byte alignment, depending on native register size
    }serialize_test_t;

Given the above requirements, the first field, a, will be at offset 0.

Field b starts at offset 4 (after 3 bytes of padding).

Field ab starts at offset 8 (no padding required).

Field c starts at offset 12 (32-bit) or 16 (64-bit) (after 3 or 7 bytes of padding).

This gives you a total size of 20 or 24, depending on the alignment requirements for long long on your platform.

The standard header <stddef.h> provides an offsetof macro (GCC implements it with a builtin) that you can use to determine the offset of any particular member, or you can define it yourself:

    // modulo errors in parentheses...
    #define offsetof(TYPE,MEMBER) (int)((char *)&((TYPE *)0)->MEMBER - (char *)((TYPE *)0))

This basically calculates the offset as the difference between the member's address and an imaginary base address (zero) for the aggregate type.

+2

Usually padding is added so that the size of the structure is a multiple of the word size (in this case, 8).

So the first two fields are in one 8-byte block, the third field is in another 8-byte block, and the last field is in a third 8-byte block: 24 bytes in total.

    char
    padding
    padding
    padding
    unsigned int
    unsigned int
    unsigned int
    unsigned int
    char                    // Word Boundary
    padding
    padding
    padding
    padding
    padding
    padding
    padding
    unsigned long long int  // Word Boundary
    unsigned long long int
    unsigned long long int
    unsigned long long int
    unsigned long long int
    unsigned long long int
    unsigned long long int
    unsigned long long int
0

It is related to alignment.

The size of the structure is not the size of the largest type multiplied by the number of fields. Members are aligned according to their respective types: http://en.wikipedia.org/wiki/Data_structure_alignment#Architectures

Alignment means that a type must be placed at a memory address that is a multiple of its size, therefore:

A char is 1-byte aligned, so it can be placed at any address that is a multiple of 1 (anywhere).

An unsigned int must begin at an address that is a multiple of 4.

The second char can be anywhere.

And then the long long must begin at an address that is a multiple of 8.

If you look at the addresses in your output, that is exactly what happens.

0

The compiler only cares about the individual alignment of the structure members, one by one. It does not think of the structure as a whole, since the structure does not exist at the binary level: it is just a group of individual variables allocated at specific offsets from an address. There is no such thing as a "struct round-up"; the compiler does not care how big the structure is as long as all of its members are correctly aligned.

The C standard says nothing about how padding is done, except that the compiler is not allowed to add padding bytes at the very beginning of the structure. Apart from that, the compiler can add any number of padding bytes anywhere in the structure. It could have 999 padding bytes and still conform to the standard.

So the compiler looks at the structure and sees: here is a char, and what follows it needs alignment. In this case, the CPU can probably handle 32-bit accesses, i.e. 4-byte alignment, so it adds only 3 bytes of padding.

Then it sees a 32-bit int, which needs no further alignment, so it stays where it is. Then comes another char, followed by 3 bytes of padding, and then a 64-bit int, for which no further alignment is required.

0

Source: https://habr.com/ru/post/945072/

