Explanation of the output of this program in C?

I have this program in C:

    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int i = 300;
        char *ptr = (char *)&i;   /* cast: char* and int* are incompatible pointer types */
        *++ptr = 2;
        printf("%d", i);
        return 0;
    }

Output: 556 on a little-endian machine.

I tried to work out why. Here is my explanation.

Question: what will the output be on a big-endian machine?

i = 300; => i = 1 0010 1100 in binary; as a full word this is B B Hb 0001 00101100, where B = an all-zero byte and Hb = an all-zero half-byte (nibble)

(A) => in memory (assuming little-endian):

    0x12345678 - 1100 - 0010   (is this correct for little-endian?)
    0x12345679 - 0001 - 0000
    0x1234567a - 0000 - 0000
    0x1234567b - 0000 - 0000

0x1234567c - the location of the next integer (i.e. where ptr++ or ptr + 1 would point if ptr were an int pointer: an int pointer advances by 4 bytes, the size of an int, when ++ptr is executed; a short sketch further down double-checks these step sizes)

(B) When we do char *ptr = &i;, ptr is a char pointer, so ++ptr advances by 1 byte (the size of a char). Executing ++ptr therefore moves it to location 0x12345679 (which holds 0001 - 0000). Now *++ptr = 2 overwrites that byte with 2, so 0x12345679 will hold 0010 - 0000 instead of 0001 - 0000.

therefore, the new memory contents will look like this:

(C)

    0x12345678 - 1100 - 0010
    0x12345679 - 0010 - 0000
    0x1234567a - 0000 - 0000
    0x1234567b - 0000 - 0000

which is equivalent to => B B Hb 0010 00101100 (= 0x22C = 556), where B = an all-zero byte and Hb = an all-zero half-byte
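A quick way to double-check the pointer step sizes from (A) and (B) is to print the addresses directly; this is only a minimal sketch, assuming a typical 4-byte int:

    #include <stdio.h>

    int main(void) {
        int i = 300;
        int  *ip = &i;
        char *cp = (char *)&i;

        /* an int pointer steps by sizeof(int) bytes, a char pointer by 1 byte */
        printf("ip: %p -> ip+1: %p (step %zu bytes)\n",
               (void *)ip, (void *)(ip + 1), sizeof *ip);
        printf("cp: %p -> cp+1: %p (step %zu bytes)\n",
               (void *)cp, (void *)(cp + 1), sizeof *cp);
        return 0;
    }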

Am I reasoning about this correctly? Is there a shorter method for working this out? Rgds, Softy

+6
4 answers

On a little-endian 32-bit system, the int 300 (0x012C) is usually(*) stored as 4 consecutive bytes, lowest first: 2C 01 00 00. When you increment the char pointer you made from &i, you end up pointing at the second byte of that sequence, and setting it to 2 turns the sequence into 2C 02 00 00, which, read back as an int, is 0x22C, or 556.
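A minimal sketch, assuming 8-bit bytes and a 4-byte int, that dumps those bytes before and after the store so you can see it on your own machine:

    #include <stdio.h>
    #include <string.h>

    static void dump(const void *p, size_t n) {
        const unsigned char *b = p;
        for (size_t k = 0; k < n; k++)
            printf("%02X ", b[k]);
        printf("\n");
    }

    int main(void) {
        int i = 300;
        dump(&i, sizeof i);              /* 2C 01 00 00 on a little-endian machine */

        ((unsigned char *)&i)[1] = 2;    /* same effect as *++ptr = 2 */
        dump(&i, sizeof i);              /* 2C 02 00 00 */
        printf("i = %d\n", i);           /* 556 little-endian, 131372 big-endian */
        return 0;
    }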

(As for your understanding of the bit sequence... it is a bit off. Endianness affects the byte order in memory, because the byte is the smallest addressable unit. The bits within a byte do not change order: the low byte will be 2C (00101100) whether the system is little-endian or big-endian. (Even if a system did reorder the bits within a byte, it would reorder them again to present them to you as a number, so you would never notice the difference.) The only difference is where that byte appears in the sequence. The only places where bit order matters are hardware, drivers, and the like, where you can receive less than a byte at a time.)

On a big-endian system, the int is usually(*) represented by the byte sequence 00 00 01 2C (differing from the little-endian representation only in byte order: the most significant byte comes first). You are still changing the second byte of the sequence, though... making it 00 02 01 2C, which as an int is 0x2012C, or 131372.
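To put numbers on both cases, here is a small sketch (host-independent, just arithmetic on the byte sequences) that reassembles each sequence into the int it represents:

    #include <stdio.h>

    /* combine 4 bytes, given in memory order, under each byte-order convention */
    static unsigned as_little(const unsigned char b[4]) {
        return (unsigned)b[0] | ((unsigned)b[1] << 8) |
               ((unsigned)b[2] << 16) | ((unsigned)b[3] << 24);
    }
    static unsigned as_big(const unsigned char b[4]) {
        return ((unsigned)b[0] << 24) | ((unsigned)b[1] << 16) |
               ((unsigned)b[2] << 8) | (unsigned)b[3];
    }

    int main(void) {
        const unsigned char le[4] = { 0x2C, 0x02, 0x00, 0x00 };  /* little-endian memory after the write */
        const unsigned char be[4] = { 0x00, 0x02, 0x01, 0x2C };  /* big-endian memory after the write */

        printf("little-endian machine: %u (0x%x)\n", as_little(le), as_little(le));  /* 556, 0x22c */
        printf("big-endian machine:    %u (0x%x)\n", as_big(be),    as_big(be));     /* 131372, 0x2012c */
        return 0;
    }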

(*) A lot of things factor in here, including two's complement (which almost all systems use these days... but C does not require it), the value of sizeof(int), alignment/padding, and whether the system is truly big- or little-endian or some half-baked variant of one. This is a big part of why poking at the bytes of larger types so often results in undefined or implementation-specific behavior.
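And if you are not sure which kind of machine you are on, here is a sketch of the usual run-time check (with the caveat from above that the result is implementation-specific):

    #include <stdio.h>

    int main(void) {
        unsigned one = 1;
        unsigned char first = *(unsigned char *)&one;  /* the lowest-addressed byte */

        /* little-endian stores the low-order byte first, so it is 1 there;
           big-endian stores the high-order byte first, so it is 0 there */
        printf("%s-endian\n", first ? "little" : "big");
        return 0;
    }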

+8

This is your int:

 int i = 300; 

And this is what the memory at &i contains on a little-endian machine: 2C 01 00 00. With the following statements you assign the address of i to ptr, then step to the next byte with ++ptr and change its value to 2:

    char *ptr = (char *)&i;
    *++ptr = 2;

So now the memory contains 2C 02 00 00 (i.e., 556). The difference is that on a big-endian system you would see 00 00 01 2C at the address of i, and after the change: 00 02 01 2C.

Even though the internal representation of int is implementation-defined, the standard does constrain it (C99 6.2.6.2):

For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M ≤ N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, the value shall be modified in one of the following ways:

- the corresponding value with sign bit 0 is negated (sign and magnitude);
- the sign bit has the value -(2^M) (two's complement);
- the sign bit has the value -(2^M - 1) (ones' complement).

Which of these applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones' complement), is a trap representation or a normal value. In the case of sign and magnitude and ones' complement, if this representation is a normal value it is called a negative zero.
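As a concrete illustration of the three allowed schemes (a sketch using an 8-bit pattern with M = 7 value bits, purely for readability):

    #include <stdio.h>

    int main(void) {
        unsigned pattern = 0x82;              /* bits 1000 0010: sign bit set, value bits 000 0010 */
        unsigned value   = pattern & 0x7F;    /* the 7 value bits contribute 2 */

        int sign_magnitude  = -(int)value;          /* negate the magnitude:          -2   */
        int twos_complement = (int)value - 128;     /* sign bit is worth -(2^7):      -126 */
        int ones_complement = (int)value - 127;     /* sign bit is worth -(2^7 - 1):  -125 */

        printf("sign+magnitude %d, two's complement %d, ones' complement %d\n",
               sign_magnitude, twos_complement, ones_complement);
        return 0;
    }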

+3

This is implementation-defined. The internal representation of an int is not specified by the standard, so what you are doing is not portable. See section 6.2.6.2 of the C standard.

However, since most implementations use a two's complement representation for signed ints, endianness will affect the result as described in cHao's answer.
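As an aside, if the intent is simply "set the second-lowest byte of the value to 2", a portable sketch uses shifts and masks on an unsigned value instead of punning through a char pointer; this yields 556 regardless of endianness:

    #include <stdio.h>

    int main(void) {
        unsigned u = 300;                    /* 0x0000012C */

        u = (u & ~0xFF00u) | (2u << 8);      /* clear bits 8..15, then set that byte to 2 */

        printf("%u\n", u);                   /* 556 on any conforming implementation */
        return 0;
    }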

+3

I like experiments, and that is the reason I keep a PowerPC G5 around.

stacktest.c:

    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int i = 300;
        char *ptr = (char *)&i;
        *++ptr = 2;
        /* added a hex dump of the result */
        printf("%d or %x\n", i, i);
        return 0;
    }

Build Command:

 powerpc-apple-darwin9-gcc-4.2.1 -o stacktest stacktest.c 

Output:

 131372 or 2012c 

Summary: cHao's answer is complete; in case of doubt, here is experimental evidence.

+1

Source: https://habr.com/ru/post/917424/

