realloc(), lifetime and UB

The recent CppCon 2016 talk “My Little Optimizer: Undefined Behavior is Magic” shows the following code (about 26 minutes into the talk). I tidied it up a bit:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int* p = malloc(sizeof(int));
        int* q = realloc(p, sizeof(int));
        *p = 1;
        *q = 2;
        if (p == q)
        {
            printf("%d %d\n", *p, *q);
        }
        return 0;
    }

The code has undefined behavior (p becomes invalid after realloc(), even if realloc() returns the same pointer), and the compiled program may print not only "2 2" but also "1 2".

How about a slightly modified version of the code?

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    int main(void)
    {
        int* p = malloc(sizeof(int));
        uintptr_t ap = (uintptr_t)p;
        int* q = realloc(p, sizeof(int));
        *(int*)ap = 1;
        *q = 2;
        if ((int*)ap == q)
        {
            printf("%d %d\n", *(int*)ap, *q);
        }
        return 0;
    }

Why can I get "1 2"? Is the integer variable ap also somehow invalid or "tainted"? If so, what is the logic here? Shouldn't ap be "decoupled" from p?

P.S. I added the C++ tag. This code can be trivially rewritten as C++, and the same question applies there. I am interested in both C and C++.

+5
4 answers

Recent C standards leave this question unsettled. N2090 states that the DR260 Committee Response

was not incorporated into the standard text, and it also leaves many specific questions unclear ...

So it is reasonable to assume that the behavior is in fact undefined, even though the standard does not document this explicitly.
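Whatever the standard ends up saying about ap, the problem disappears when nothing derived from the old pointer is used after realloc. A minimal sketch of that variant follows; the function name store_via_new_pointer and its out-parameter are hypothetical and not part of the question's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: reallocates a one-int block and stores `value`
   through the pointer returned by realloc, never through the old one.
   Writes the value read back to *out. Returns 0 on success and -1 if
   an allocation fails. */
int store_via_new_pointer(int value, int *out)
{
    assert(out != NULL);            /* precondition of this sketch */

    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;

    int *q = realloc(p, sizeof *q);
    if (q == NULL) {
        free(p);                    /* on failure the old block is still valid */
        return -1;
    }

    /* From here on p is indeterminate; only q is used. */
    *q = value;
    *out = *q;
    free(q);
    return 0;
}
```

Calling store_via_new_pointer(2, &v) leaves 2 in v on success, with no dependence on whether the block moved.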

0

As written, the code has undefined behavior in C, because realloc may return a different block of memory. In that case, *(int *)ap dereferences an invalid pointer.

A more interesting question is what happens if we change the code so that it only acts when realloc did not move the block:

    int* p = malloc(sizeof(int));
    uintptr_t ap = (uintptr_t)p;
    int* q = realloc(p, sizeof(int));
    if ( (uintptr_t)q == ap )
    {
        *(int*)ap = 1;
        // ...
    }

For C2X, there is a proposal, N2090, to specify pointer provenance when pointers pass through integer types.

The current C standard has some rules related to provenance, but it does not say what happens to provenance when a pointer is converted to an integer type and back.

Under this proposal, my code would still have undefined behavior: ap acquires the same provenance as p, and that provenance becomes invalid when the block is freed. (int *)ap then uses a pointer with invalid provenance.

The proposal aims to close loopholes in which a pointer's provenance is "laundered" through intermediate operations on uintptr_t and the like. In this case, it specifies that (int *)ap behaves exactly as p does. (That is undefined even if the block did not move, since p is an invalid pointer after realloc whether or not the block physically moved.) On the C abstract machine, the intent is that it is impossible to detect whether realloc moved a block.
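For contrast, here is a sketch of the uncontroversial case: the pointer is converted to uintptr_t and back while the block is still allocated, so no invalid provenance is involved. The function name roundtrip_live_pointer is hypothetical, and the sketch assumes the implementation provides uintptr_t, which is an optional type.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: round-trips a live pointer through uintptr_t and
   writes through the result while the block is still allocated.
   Returns the value read back through the original pointer, or -1 if
   malloc fails. */
int roundtrip_live_pointer(int value)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;

    uintptr_t ap = (uintptr_t)(void *)p;  /* p is valid here */
    int *r = (void *)ap;                  /* converted back before any free or realloc */
    assert(r == p);                       /* the void* round trip compares equal (C11 7.20.1.4) */

    *r = value;                           /* the block is still live */
    int result = *p;
    free(p);
    return result;
}
```

The difference from the question's code is only in timing: here the conversion back happens before the block's lifetime ends.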

Background on provenance

“Pointer provenance” means the association between pointer values and the block of memory they point to. If a pointer value points to an object, then other pointer values derived from it (for example, by pointer arithmetic) must stay within that object.

(Of course, a pointer variable can be reassigned to point to a different object, and thereby acquire a new provenance. That is not what we are talking about here.)

Provenance does not appear in the compiled executable, but compilers can track it at compile time in order to perform optimizations. Two pointers with different provenance can have the same memory representation (for example, p and q, if the implementation reused the same block of physical memory).

A simple example of why pointer provenance creates useful optimization opportunities is the following snippet:

    char p[8];
    int q = 5;
    *(p+10) = 123;
    printf("%d\n", q);

Provenance lets the optimizer recognize that p + 10 is undefined behavior, so it can translate this fragment to puts("5"), for example, even if q happens to sit immediately after p in memory. (I wonder whether D. J. Bernstein's boringcc compiler would actually be able to perform this optimization.)

The existing rules on pointer bounds (C11 6.5.6/8) already cover this case, but they are unclear in more complex cases, hence proposal N2090. For example, if ( p + 8 == (void *)&q ) *(char *)((uintptr_t)p + 10) = 123; would still be undefined under N2090.
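The rule that derived pointers must stay inside their object can also be enforced by hand. The sketch below is an illustration only: checked_store and provenance_demo are hypothetical names, and the demo mirrors the p and q snippet above with the out-of-range write refused instead of performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores v at buf[idx] only when idx lies inside
   the len-byte object. Returns 1 on success and 0 when the index is
   rejected, so no out-of-bounds pointer is ever formed. */
int checked_store(char *buf, size_t len, size_t idx, char v)
{
    assert(buf != NULL);    /* precondition of this sketch */
    if (idx >= len)
        return 0;
    buf[idx] = v;
    return 1;
}

/* Same objects as the snippet above. The write at index 10 is refused,
   so q keeps its value regardless of how p and q are laid out. */
int provenance_demo(void)
{
    char p[8] = {0};
    int q = 5;
    (void)checked_store(p, sizeof p, 10, 123);  /* rejected: 10 >= 8 */
    return q;
}
```

With the check in place the fragment has no undefined behavior, and the value of q no longer depends on any assumption about memory layout.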

+5

The code in the original question invokes Undefined Behavior, so the compiler is entitled to do whatever it likes. Some background on why it is Undefined Behavior follows the example below.

Clang will behave oddly, however, with code that is similar to yours but does not invoke Undefined Behavior. Unless one believes that some language in the Standard does not mean what it says, clang appears to be non-conforming in this regard. Some people would like to change the Standard so that such code invokes UB, thereby justifying clang's behavior, but I regard such proposals as fundamentally wrong-headed.

    #include <stddef.h>
    #include <stdlib.h>
    #include <stdint.h>

    uintptr_t gap,gaq;

    int test(void)
    {
      int x=0;
      uint8_t *p = calloc(4,1);
      uintptr_t ap = (uintptr_t)p;
      uint8_t *q = realloc(p,4);
      // p is no longer valid after this, but ap still holds some number.
      uintptr_t aq = (uintptr_t)q;
      *q=1;
      if (ap == aq)
      {
        x=256;
        // Nothing in the Standard would say that the result of casting a
        // uintptr_t to a pointer is affected by anything other than the
        // numerical value of the uintptr_t in question. If aq happened
        // to equal e.g. 8675309, then casting any expression equal to
        // 8675309 into an int* should yield the same value as casting aq;
        // since we're here, we'd know that ap was also equal to 8675309, and
        // thus that (int*)ap is equivalent to (int*)aq.
        *(uint8_t*)ap = 123;
      }
      gap=ap;
      gaq=aq;
      return *q+x;
    }

Clang 3.9.0, invoked on godbolt with -xc -O3 -pedantic, generates code that returns either 1 or 257 depending on whether ap and aq compare equal, even though nothing in the Standard allows ap to be treated differently from any other variable of type uintptr_t that holds the same value. As the code is written, since outside code has no way to observe p, it would be permissible for a compiler to generate code that sets ap to some arbitrary value not equal to aq and then ignores the comparison entirely. But nothing in the Standard allows an implementation to do anything other than one of two things: write the same value to gap and gaq and return 379 (123 + 256), or write different values to gap and gaq and return 1.
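For comparison, here is a sketch of a variant in which the store inside the branch goes through q itself. It keeps the integer comparison of ap and aq but never converts an integer back to a pointer, so it can only produce the two outcomes listed above. The name test_via_q and its out-parameters (standing in for gap and gaq) are hypothetical, and error handling for allocation failure is added.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variant of test(): returns 379 if the block stayed in
   place, 1 if it moved, and -1 if an allocation fails. The two
   addresses are reported through out_ap and out_aq. */
int test_via_q(uintptr_t *out_ap, uintptr_t *out_aq)
{
    assert(out_ap != NULL && out_aq != NULL);  /* precondition of this sketch */

    int x = 0;
    uint8_t *p = calloc(4, 1);
    if (p == NULL)
        return -1;
    uintptr_t ap = (uintptr_t)(void *)p;   /* taken while p is still valid */

    uint8_t *q = realloc(p, 4);
    if (q == NULL) {
        free(p);                           /* old block is still valid on failure */
        return -1;
    }
    uintptr_t aq = (uintptr_t)(void *)q;

    *q = 1;
    if (ap == aq) {                        /* integer comparison only */
        x = 256;
        *q = 123;                          /* store through the live pointer */
    }
    *out_ap = ap;
    *out_aq = aq;

    int result = *q + x;
    free(q);
    return result;
}
```

A return value of 379 always comes with equal addresses in the out-parameters, and a return value of 1 with different ones.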

Background on why comparing an invalid pointer with a valid one is UB

On some processors, loading a pointer into a register causes the processor to perform some validation. On the 80286, for example, each pointer consists of a segment selector and an offset, and loading the segment selector causes the processor to fetch information from a table of valid segments.

Some C implementations load pointers into registers whenever anything is done with them, whether or not they are used to access memory, and some C implementations for the 80286 may invalidate a segment descriptor if the only thing in the corresponding segment was a block of memory that has been freed. The authors of the C Standard did not want to require implementations to spend effort avoiding register loads in cases where pointers are not dereferenced, nor did they want to require implementations to keep segment descriptors alive for freed pointers. The easiest way to avoid imposing either requirement was to refrain from requiring anything in cases where code does something that could free a pointer and then does anything that could cause that pointer to be loaded into a register.

There are many implementations where loading a pointer into a register is safe even after the storage has been freed, or where it is easy to avoid "trappable" loads in cases where the pointer is not dereferenced (loading pointers into general-purpose registers for comparison would be cheaper than loading them into segment registers and then transferring them to general-purpose registers). I see no reason to believe the authors of the Standard intended that code written exclusively for such implementations should be unable to use techniques such as:

    /* old_ptr, fatal_error() and update_pointers() are assumed to be
       defined elsewhere in the program. */
    void do_realloc(int new_size)
    {
        void *new_ptr = realloc(old_ptr, new_size);
        if (!new_ptr) fatal_error();
        if (new_ptr != old_ptr) update_pointers();
    }

in situations where realloc is likely to succeed "in place" (for example, because the block is shrinking) and where it would be possible, but expensive, to regenerate pointers to things within the allocated storage if the object were moved. Nevertheless, since the Standard does not require any implementation to support such techniques, even in cases where doing so would cost nothing, some implementations (even those where such support would cost nothing) decline to support them.
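One way to avoid needing the comparison at all is to record positions inside the block as offsets rather than pointers, so nothing has to be regenerated when the block moves. This is a different technique from the one above, shown only as a sketch; struct int_buf and its functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable buffer that records positions as offsets. */
struct int_buf {
    int   *data;    /* start of the allocation */
    size_t len;     /* number of ints allocated */
    size_t cursor;  /* offset of the element of interest, not a pointer */
};

/* Resizes the buffer to new_len ints. Returns 0 on success and -1 on
   failure; on failure the old block is left untouched. */
int int_buf_resize(struct int_buf *b, size_t new_len)
{
    assert(b != NULL && new_len > 0);
    if (new_len > SIZE_MAX / sizeof *b->data)
        return -1;                      /* byte count would overflow */

    int *new_data = realloc(b->data, new_len * sizeof *b->data);
    if (new_data == NULL)
        return -1;

    b->data = new_data;                 /* the old pointer value is never read again */
    b->len = new_len;
    return 0;
}

/* Rebuilds the element pointer from the current base and the offset. */
int *int_buf_at_cursor(struct int_buf *b)
{
    assert(b != NULL && b->cursor < b->len);
    return &b->data[b->cursor];
}
```

After int_buf_resize succeeds, int_buf_at_cursor yields a valid pointer to the same element whether or not the block moved. The cost is one addition per access instead of a one-time pointer update.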

0

Why can I get "1 2"?

For the same reason as with the original code: the optimizer knows *p is invalid and acts on that assumption. Why is it invalid? Because...

J.2 Undefined behavior

1 The behavior is undefined in the following circumstances:

The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.20.3).

*p is not valid. The optimizer knows that *(int*)ap is really *p, so it is invalid too.

Looking at the IR for both the original code and yours with clang -S -O3 -emit-llvm, they are almost exactly the same. Both hardcode 1 and 2 in the printf call.

 %7 = tail call i32 (i8*, ...) @printf(i8* nonnull getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i64 0, i64 0), i32 1, i32 2) 

Here is your main in IR.

    define i32 @main() #0 {
      %1 = tail call i8* @malloc(i64 4)
      %2 = tail call i8* @realloc(i8* %1, i64 4)
      %3 = bitcast i8* %2 to i32*
      %4 = bitcast i8* %1 to i32*
      store i32 1, i32* %4, align 4, !tbaa !2
      store i32 2, i32* %3, align 4, !tbaa !2
      %5 = icmp eq i8* %1, %2
      br i1 %5, label %6, label %8

    ; <label>:6                                       ; preds = %0
      %7 = tail call i32 (i8*, ...) @printf(i8* nonnull getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i64 0, i64 0), i32 1, i32 2)
      br label %8

    ; <label>:8                                       ; preds = %6, %0
      ret i32 0
    }

The original is almost the same, apart from the ordering of the tail calls and bitcasts. This is yours.

    %1 = tail call i8* @malloc(i64 4)
    %2 = tail call i8* @realloc(i8* %1, i64 4)
    %3 = bitcast i8* %2 to i32*
    %4 = bitcast i8* %1 to i32*

This is theirs.

    %1 = tail call i8* @malloc(i64 4)
    %2 = bitcast i8* %1 to i32*
    %3 = tail call i8* @realloc(i8* %1, i64 4)
    %4 = bitcast i8* %3 to i32*

Everything else is the same.


As mentioned in the talk, the if is a red herring. It is there only to illustrate how absurd the behavior looks. It is still present in the IR.

    %5 = icmp eq i8* %1, %2
    br i1 %5, label %6, label %8

You can see the optimizer at work if you print p, q and (int*)ap. All three are the result of the first malloc.

    %1 = tail call i8* @malloc(i64 4)
    ...
    %8 = tail call i32 (i8*, ...) @printf(i8* nonnull getelementptr inbounds ([10 x i8], [10 x i8]* @.str.1, i64 0, i64 0), i8* %1, i8* %1, i8* %1)

The pointer value is fine. Dereferencing it is the problem.

-1

Source: https://habr.com/ru/post/1258270/

