What are the rules for casting pointers in C?

K&R doesn't explain it, but they use it. I tried to see how it works by writing an example program, but it didn't go so well:

    #include <stdio.h>

    int bleh (int *);

    int main(){
        char c = '5';
        char *d = &c;

        bleh((int *)d);
        return 0;
    }

    int bleh(int *n){
        printf("%d bleh\n", *n);
        return *n;
    }

It compiles, but my print statement prints garbage values (they are different each time the program is run). Any ideas?

+55
c casting pointers
Jun 23 '13 at 11:57
7 answers

When thinking about pointers, it helps to draw diagrams. A pointer is an arrow that points to an address in memory, with a label indicating the type of the value. The address tells you where to look, and the type tells you what to take. Casting the pointer changes the label on the arrow, but not where the arrow points.

d in main is a pointer to c, which is of type char. A char is one byte of memory, so when d is dereferenced, you get the value in that one byte of memory. In the diagram below, each cell represents one byte.

    -+----+----+----+----+----+----+-
     |    | c  |    |    |    |    |
    -+----+----+----+----+----+----+-
           ^~~~
           | char
           d

When you cast d to int*, you are saying that d really points to an int value. On most systems today, an int occupies 4 bytes.

    -+----+----+----+----+----+----+-
     |    | c  | ?₁ | ?₂ | ?₃ |    |
    -+----+----+----+----+----+----+-
           ^~~~~~~~~~~~~~~~~~~
           | int
           (int*)d

When you dereference (int*)d, you get a value that is determined from these four bytes of memory. The value you get depends on what is in the cells marked ?, and on how an int is represented in memory.

A PC is little-endian, which means that the value of an int is calculated this way (assuming it spans 4 bytes): *((int*)d) == c + ?₁ * 2⁸ + ?₂ * 2¹⁶ + ?₃ * 2²⁴. So although the value is garbage, if you print it in hexadecimal ( printf("%x\n", *n) ), the last two digits will always be 35 (the value of the character '5').

Some other systems are big-endian and arrange the bytes in the other direction: *((int*)d) == c * 2²⁴ + ?₁ * 2¹⁶ + ?₂ * 2⁸ + ?₃. On these systems, you would find that the value always starts with 35 when printed in hexadecimal. Some systems have an int size that differs from 4 bytes. A rare few systems arrange int in different ways, but you are unlikely to encounter them.

Depending on your compiler and operating system, you may find that the value is different every time you run the program, or that it is always the same but changes when you make even small changes to the source code.

On some systems, an int value must be stored at an address that is a multiple of 4 (or 2, or 8). This is called alignment. Depending on whether the address of c happens to be properly aligned or not, the program may crash.

In contrast with your program, here is what happens when you have an int value and take a pointer to it.

    int x = 42;
    int *p = &x;

    -+----+----+----+----+----+----+-
     |    |         x         |    |
    -+----+----+----+----+----+----+-
           ^~~~~~~~~~~~~~~~~~~
           | int
           p

Pointer p points to an int value. The label on the arrow correctly describes what is in the memory cell, so there are no surprises when dereferencing.

+105
Jun 23 '13 at 12:54
    char c = '5';

A char (1 byte) is allocated on the stack at address 0x12345678.

    char *d = &c;

You take the address of c and store it in d, so d = 0x12345678.

    int *e = (int*)d;

You force the compiler to assume that 0x12345678 points to an int, but an int is not just one byte ( sizeof(char) != sizeof(int) ). It may be 4 or 8 bytes depending on the architecture, or even some other size.

So when you print the value the pointer points to, the integer is read by taking the first byte (which was c) together with the consecutive bytes that follow it on the stack, and those are just garbage for your purposes.

+33
Jun 23 '13 at 12:03

Casting pointers is usually invalid in C. There are several reasons:

  • Alignment. It is possible that, due to alignment considerations, the destination pointer type cannot represent the value of the source pointer type. For example, if int * were inherently 4-byte aligned, casting char * to int * would lose the lower bits.

  • Aliasing. In general, it is forbidden to access an object except through an lvalue of the correct type for the object. There are some exceptions, but unless you understand them very well, you do not want to do it. Note that aliasing is only a problem if you actually dereference the pointer (apply the * or -> operators to it, or pass it to a function that will dereference it).

The main notable cases where casting pointers is okay are:

  • When the destination pointer type points to a character type. Pointers to character types are guaranteed to be able to represent any pointer to any type, and to round-trip it back to the original type if desired. A pointer to void ( void * ) is exactly the same as a pointer to a character type, except that you are not allowed to dereference it or do arithmetic on it, and it automatically converts to and from other pointer types without needing a cast, so pointers to void are usually preferable to pointers to character types for this purpose.

  • When the destination pointer type is a pointer to a structure type whose members exactly match the initial members of the originally pointed-to structure type. This is useful for various object-oriented programming techniques in C.

Some other obscure cases are technically acceptable in terms of language requirements, but are problematic and best avoided.

+12
Jun 23 '13 at 12:45

You have a pointer to a char. As far as your system knows, that memory address holds a char value occupying sizeof(char) bytes. When you cast it to int*, you work with sizeof(int) bytes of data, so you print your char plus some garbage from the memory after it as one integer.

+2
Jun 23 '13 at 12:02

I suspect you need a more general answer:

In C, the language allows you to cast any pointer to any other pointer type without complaint.

But the point is: no data conversion or anything else is performed! It is solely your own responsibility to make sure the system does not misinterpret the data after the cast, which it usually will, leading to a runtime error.

So when you cast, it is entirely up to you to make sure that, if the data is accessed through the cast pointer, the data is compatible!

C is optimized for performance, so it lacks run-time reflection on pointers and references. But this has a price: you, as the programmer, have to take more care about what you are doing. You have to know for yourself whether what you want to do is "legal".

+2
Jun 23 '13 at 12:22

The garbage value is actually due to the fact that you called the function bleh() before its declaration.

In C++ you would get a compilation error, but in C the compiler assumes the return type of the function is int, while your function returns a pointer to an integer.

See this for more information: http://www.geeksforgeeks.org/g-fact-95/

+2
Jan 05 '17 at 11:53

What happens if we cast the pointer to "c" to int*, and the widened range of memory it then covers overlaps memory already allocated to the integer "e"?

    -+----+----+----+----+----+----+-
     |    | c  |    |    | e  |
    -+----+----+----+----+----+----+-
           ^~~~           ^~~~
           | char         | int
           d

    -+----+----+----+----+----+----+-
     |    | c  | ?₁ | ?₂ | ?₃ |    | e
    -+----+----+----+----+----+----+-
           ^~~~           ^~~~
           | int          | int
           d

0
Jan 28 '19 at 14:03