Two problems:
- The page needs exec permission: you used an array that goes in the .data section, which is read + write but noexec.
- Your machine code does not end with a ret instruction, so even if it did run, execution would fall through into whatever comes next in memory instead of returning.
And by the way, the REX prefix is completely redundant. "\x31\xc0" (xor eax,eax) has exactly the same effect as xor rax,rax, because writing a 32-bit register zero-extends into the full 64-bit register.
You need the page containing the machine code to have execute permission. x86-64 page tables have a separate bit for execute permission, distinct from read permission, unlike legacy 386 page tables.
The easiest way to get static arrays into read + exec memory is to compile with gcc -z execstack. (This makes the stack and other sections executable.)
Until recently (2018 or 2019), the standard toolchain (binutils ld) placed the .rodata section in the same ELF segment as .text, so they both had read + exec permission. Thus, using const char code[] = "..."; was enough to execute manually specified bytes as data.
But on my Arch Linux system with GNU ld (GNU Binutils) 2.31.1, this is no longer the case. readelf -a shows that the .rodata section moved into an ELF segment with .eh_frame_hdr and .eh_frame, and it has read permission only. .text goes in a segment with Read + Exec, and .data goes in a segment with Read + Write (along with .got and .got.plt). (What is the difference between section and segment in ELF file format)
- On older Linux systems: gcc -O3 shellcode.c && ./a.out (works because of const on global / static arrays)
- On old and current Linux: gcc -O3 -z execstack shellcode.c && ./a.out (works because of -zexecstack, no matter where your machine code is stored). Also works with clang -z execstack. gcc accepts -zexecstack without a space, but clang does not.
They also work on Windows, where read-only data goes in .rdata instead of .rodata.
The main generated by the compiler looks like this (from objdump -drwC -Mintel). You can run it inside gdb and set breakpoints on code and ret0_code. (I actually used gcc -no-pie -O3 -zexecstack shellcode.c, hence the addresses near 401000.)
0000000000401020 <main>:
  401020: 48 83 ec 08    sub rsp,0x8
Or use system calls to change page permissions
Instead of compiling with gcc -zexecstack, you can use mmap(PROT_EXEC) to allocate new executable pages, or mprotect(PROT_EXEC) to change existing pages to executable. (Including pages containing static data.) Typically, you also want at least PROT_READ and sometimes PROT_WRITE.
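A minimal sketch of the mmap route, assuming x86-64 Linux and a GNU C compiler. The names run_machine_code and ret0_bytes are made up for illustration; the __builtin___clear_cache call is there so the compiler keeps the memcpy stores.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* MAP_ANONYMOUS under strict -std=c11 on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* x86-64 machine code for: xor eax,eax ; ret  (returns 0) */
static const unsigned char ret0_bytes[] = { 0x31, 0xc0, 0xc3 };

/* Hypothetical helper: copy `len` bytes of machine code into a fresh
 * anonymous mapping, make it executable, call it as int(*)(void),
 * and return its result. Returns -1 if mmap or mprotect fails. */
int run_machine_code(const unsigned char *bytes, size_t len)
{
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, bytes, len);
    /* Tell the compiler these bytes will be executed, so the stores
     * above are not treated as dead. */
    __builtin___clear_cache((char *)buf, (char *)buf + len);

    /* Switch the page from read + write to read + exec. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();
    munmap(buf, len);
    return result;
}
```

Calling run_machine_code(ret0_bytes, sizeof ret0_bytes) from main should return 0 on x86-64. Mapping read + write first and then switching to read + exec is a choice made for this sketch; a single mmap with all three permissions would also work where the system allows it.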
Using mprotect on a static array means that you are still executing code from a known location, possibly making it easier to set a breakpoint on it.
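Here is a rough sketch of mprotect on a static array, assuming x86-64 Linux. mprotect needs a page-aligned start address, so the helper rounds the array's address down to a page boundary. The names make_executable and call_ret0 are hypothetical, chosen for this example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* x86-64 machine code for: xor eax,eax ; ret  (returns 0) */
static const unsigned char ret0_code[] = { 0x31, 0xc0, 0xc3 };

/* Make every page overlapping [addr, addr+len) readable, writable
 * and executable. Returns 0 on success, -1 on failure. */
int make_executable(const void *addr, size_t len)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)addr & ~(page_size - 1);  /* round down */
    uintptr_t end = (uintptr_t)addr + len;
    return mprotect((void *)start, end - start,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}

/* Flip the permissions, then call the array as a function.
 * Returns the machine code's result (0), or -1 if mprotect fails. */
int call_ret0(void)
{
    if (make_executable(ret0_code, sizeof ret0_code) != 0)
        return -1;
    int (*fn)(void) = (int (*)(void))(uintptr_t)ret0_code;
    return fn();
}
```

Because ret0_code keeps its link-time address, a gdb breakpoint on ret0_code still lands on the bytes that actually run.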
On Windows, you can use VirtualAlloc or VirtualProtect.
In GNU C, you also need to use __builtin___clear_cache(buf, buf + len) after writing machine-code bytes to a buffer, because the optimizer does not treat dereferencing a function pointer as reading bytes from that address. Dead-store elimination can remove the stores of machine-code bytes into the buffer if the compiler proves that the stores are never read as data. https://codegolf.stackexchange.com/questions/160100/the-repetitive-byte-counter/160236#160236 and https://godbolt.org/g/pGXn3B have an example where gcc really does do this optimization, because gcc "knows about" malloc.
(And on non-x86 architectures, where the I-cache is not coherent with the D-cache, it will actually perform any necessary cache synchronization. On x86, it's purely a compile-time optimization blocker.)
My edit to @AntoineMathys's answer added this. Currently, gcc does not know about mmap, so it does not optimize away stores through the pointer returned by mmap.
But it is not needed after mprotect on a page containing read-only C variables.
#include <stdio.h>
I used PROT_READ|PROT_EXEC|PROT_WRITE in this example so that it works no matter where your variable is located. If it was a local on the stack and you left out PROT_WRITE, the call would fail after making the stack read-only, when it tried to push a return address.
In addition, PROT_WRITE lets you test shellcode that modifies itself, for example to edit zeros into its own machine code, or other bytes that it was avoiding.
$ gcc -O3 shellcode.c
If I comment out the mprotect, it does segfault.
If I did something like ret0_code[2] = 0xc3;, I would need __builtin___clear_cache(ret0_code+2, ret0_code+3) after that to make sure the store was not optimized away, but if I do not modify the static arrays, then it is not needed after mprotect. It is needed after mmap + memcpy or manual stores, because we want to execute the bytes that were written in C (with memcpy).
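To make the store-then-execute case concrete, here is a small sketch, again assuming x86-64 Linux and GNU C. It patches one byte of a writable static array and calls __builtin___clear_cache on that byte before running the code. The array ret_imm_code and the function patch_and_call are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* x86-64 machine code for: mov eax, imm32 ; ret  (imm32 starts as 0).
 * Not const, because the C code below stores into it. */
static unsigned char ret_imm_code[] = { 0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3 };

/* Patch the low byte of the immediate, then run the code.
 * Returns the patched value, or -1 if mprotect fails. */
int patch_and_call(unsigned char value)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)ret_imm_code & ~(page_size - 1);
    uintptr_t end = (uintptr_t)ret_imm_code + sizeof ret_imm_code;

    /* PROT_WRITE stays set because the array lives in .data and is
     * written right below. */
    if (mprotect((void *)start, end - start,
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return -1;

    ret_imm_code[1] = value;
    /* Mark the modified byte (end pointer is exclusive) as code that
     * will be executed, so the store is kept. */
    __builtin___clear_cache((char *)ret_imm_code + 1,
                            (char *)ret_imm_code + 2);

    int (*fn)(void) = (int (*)(void))(uintptr_t)ret_imm_code;
    return fn();
}
```

For instance, patch_and_call(42) should return 42, since the patched instruction becomes mov eax, 42.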