std::bind and stack-use-after-scope

Today I ran some code built with AddressSanitizer and came across a strange stack-use-after-scope error. Here is a simplified example:

    #include <functional>

    class k {
    public:
        operator int() { return 5; }
    };

    const int& n(const int& a) { return a; }

    int main() {
        k l;
        return std::bind(n, l)();
    }

ASAN complains about the last line of code:

    ==27575==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffeab375210 at pc 0x000000400a01 bp 0x7ffeab3750e0 sp 0x7ffeab3750d8
    READ of size 4 at 0x7ffeab375210 thread T0
        #0 0x400a00 (/root/tstb.exe+0x400a00)
        #1 0x7f97ce699730 in __libc_start_main (/lib64/libc.so.6+0x20730)
        #2 0x400a99 (/root/tstb.exe+0x400a99)
    Address 0x7ffeab375210 is located in stack of thread T0 at offset 288 in frame
        #0 0x40080f (/root/tstb.exe+0x40080f)
      This frame has 6 object(s):
        [32, 33) '<unknown>'
        [96, 97) '<unknown>'
        [160, 161) '<unknown>'
        [224, 225) '<unknown>'
        [288, 292) '<unknown>' <== Memory access at offset 288 is inside this variable
        [352, 368) '<unknown>'
    HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
          (longjmp and C++ exceptions *are* supported)
    SUMMARY: AddressSanitizer: stack-use-after-scope (/root/tstb.exe+0x400a00)
    Shadow bytes around the buggy address:
      0x1000556669f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x100055666a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x100055666a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
      0x100055666a20: f1 f1 f8 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f2 f2
      0x100055666a30: f2 f2 f8 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f2 f2
    =>0x100055666a40: f2 f2[f8]f2 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f3 f3
      0x100055666a50: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x100055666a60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x100055666a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x100055666a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x100055666a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==27575==ABORTING

If I understand correctly, it says that we access a stack variable after it has already gone out of scope. Looking at the uninstrumented, unoptimized disassembly, I can indeed see this happening inside the instantiated __invoke_impl:

    Dump of assembler code for function std::__invoke_impl<int const&, int const& (*&)(int const&), k&>(std::__invoke_other, int const& (*&)(int const&), k&):
       0x0000000000400847 <+0>:  push %rbp
       0x0000000000400848 <+1>:  mov %rsp,%rbp
       0x000000000040084b <+4>:  push %rbx
       0x000000000040084c <+5>:  sub $0x28,%rsp
       0x0000000000400850 <+9>:  mov %rdi,-0x28(%rbp)
       0x0000000000400854 <+13>: mov %rsi,-0x30(%rbp)
       0x0000000000400858 <+17>: mov -0x28(%rbp),%rax
       0x000000000040085c <+21>: mov %rax,%rdi
       0x000000000040085f <+24>: callq 0x4007a2 <std::forward<int const& (*&)(int const&)>(std::remove_reference<int const& (*&)(int const&)>::type&)>
       0x0000000000400864 <+29>: mov (%rax),%rbx
       0x0000000000400867 <+32>: mov -0x30(%rbp),%rax
       0x000000000040086b <+36>: mov %rax,%rdi
       0x000000000040086e <+39>: callq 0x4005c4 <std::forward<k&>(std::remove_reference<k&>::type&)>
       0x0000000000400873 <+44>: mov %rax,%rdi
       0x0000000000400876 <+47>: callq 0x40056a <k::operator int()>
       0x000000000040087b <+52>: mov %eax,-0x14(%rbp)
       0x000000000040087e <+55>: lea -0x14(%rbp),%rax
       0x0000000000400882 <+59>: mov %rax,%rdi
       0x0000000000400885 <+62>: callq *%rbx
    => 0x0000000000400887 <+64>: add $0x28,%rsp
       0x000000000040088b <+68>: pop %rbx
       0x000000000040088c <+69>: pop %rbp
       0x000000000040088d <+70>: retq
    End of assembler dump.

After calling k::operator int(), it stores the return value in its own stack frame and passes the address of that slot to n(), which immediately returns the same address. That address is then returned from __invoke_impl itself (and propagates up to the return in main).

So it looks like ASAN is right, and we really do access the stack after the variable has gone out of scope.

Question: what is wrong with my code?

I tried building it with gcc, clang and icc, and they all produce similar assembly output.

+5
2 answers

std::bind essentially generates an implementation-defined function object that calls the bound function with the bound arguments. In your case, that function object is roughly equivalent to:

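    struct Impl {
        // Not const-qualified here: k::operator int() is non-const, so the
        // conversion below would not compile on a const k_.
        const int& operator()() {
            int tmp = k_;   // temporary int, local to this call
            return n(tmp);  // n returns a reference to tmp, which dies on return
        }
    private:
        k k_;               // copy of the bound argument
        Impl(/*unspecified*/);
    };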

Since n returns its argument as a reference to const, Impl's call operator returns a reference to a local variable, i.e. a dangling reference, which is then read in main. Hence the stack-use-after-scope error.

Your confusion may stem from the fact that return n(l); without bind works fine, as expected. In that case, however, the temporary int is created in main's stack frame and lives for the duration of the full-expression, which is the operand of the return statement, and that operand is evaluated to an int before the temporary is destroyed.

In other words, although a temporary lives until the end of the full-expression in which it was created, this does not carry over to temporaries created inside functions that are called within that full-expression. Those belong to a different full-expression and are destroyed once that expression has been evaluated.

P.S. For the same reason, binding any function (object) with signature R(Args...) to a std::function<const R&(Args...)> is guaranteed to return a dangling reference when called, a construct that, IMO, the library should reject at compile time.

+4

Well, this is quite tricky if you do not know the specifics of std::bind.

When an argument is bound to a callable with std::bind, a copy of the argument is stored (source):

The arguments to bind are copied or moved, and are never passed by reference unless wrapped in std::ref or std::cref.

std::bind(n, l) returns a callable object of unspecified type that has a member object of type k, constructed as a copy of l. Note that this callable object is a temporary (an rvalue); I will call it bindtmp.

When called, bindtmp() creates a temporary int (inttemp, with value 5) in order to apply bindtmp::ncopy to bindtmp::lcopy (these are the member objects constructed from ::n and main::l). ::n then returns a const reference to inttemp, inside the return statement of bindtmp().

This is where things get complicated (source):

Whenever a reference is bound to a temporary or to a subobject thereof, the lifetime of the temporary is extended to match the lifetime of the reference, with the following exceptions:
- a temporary bound to a return value of a function in a return statement is not extended: it is destroyed immediately at the end of the return expression. Such a function always returns a dangling reference.
- ...

This means that the temporary inttemp is destroyed as soon as the return statement that calls ::n has completed.

From this point on, everything falls apart. bindtmp() returns a reference to an object whose lifetime has ended, main tries to read its value (an lvalue-to-rvalue conversion), and this is where the undefined behavior occurs (reading an object on the stack after its lifetime has ended).

+3

Source: https://habr.com/ru/post/1275318/

