Two problems:
- There is no exec permission on the page, because you used an array that goes in the noexec read + write .data section.
- Your machine code does not end with a ret instruction, so even if it did run, execution would fall through into whatever comes next in memory instead of returning.
And by the way, the REX prefix is completely redundant. "\x31\xc0" (xor eax,eax) has exactly the same effect as xor rax,rax, because writing a 32-bit register zero-extends into the full 64-bit register.
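As a small illustration of both points, here are the two encodings side by side with the missing ret appended. The array names xor_eax_ret and xor_rax_ret are made up for this sketch and do not come from the question.

```c
/* x86-64 machine code for:  xor eax,eax ; ret
 * 31 c0 = xor eax,eax    c3 = ret
 * Writing EAX zero-extends into RAX, so no REX prefix is needed. */
static const unsigned char xor_eax_ret[] = { 0x31, 0xc0, 0xc3 };

/* x86-64 machine code for:  xor rax,rax ; ret
 * 48 = REX.W prefix. Same effect as above, one byte longer. */
static const unsigned char xor_rax_ret[] = { 0x48, 0x31, 0xc0, 0xc3 };
```

Either array still has to live in an executable page before it can be called.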
You need the page containing the machine code to have execute permission. x86-64 page tables have a separate bit for execute, distinct from read permission, unlike legacy 386 page tables.
The easiest way to get static arrays into read + exec memory is to compile with gcc -z execstack . (This makes the stack and other sections executable.)
Until recently (2018 or 2019), the standard toolchain (binutils ld ) put the .rodata section in the same ELF segment as .text , so they both had read + exec permission. Thus, using const char code[] = "..."; was enough to execute manually specified bytes as data.
But on my Arch Linux system with GNU ld (GNU Binutils) 2.31.1 this is no longer the case. readelf -a shows that the .rodata section went into an ELF segment with .eh_frame_hdr and .eh_frame , and it has read permission only. .text goes in a segment with Read + Exec, and .data goes in a segment with Read + Write (along with .got and .got.plt ). ( What is the difference between section and segment in ELF file format )
On older Linux systems: gcc -O3 shellcode.c && ./a.out (works because of const on global / static arrays)
On old and current Linux: gcc -O3 -z execstack shellcode.c && ./a.out (works because of -zexecstack , no matter where your machine code is stored). This also works with clang -z execstack . gcc accepts -zexecstack written without a space, but clang does not.
These also work on Windows, where read-only data goes in .rdata instead of .rodata .
The main generated by the compiler looks like this (from objdump -drwC -Mintel ). You can run it inside gdb and set breakpoints on code and ret0_code .
(I actually used gcc -no-pie -O3 -zexecstack shellcode.c , hence the addresses near 401000 .)
0000000000401020 <main>:
  401020: 48 83 ec 08    sub rsp,0x8
Or use system calls to change page permissions
Instead of compiling with gcc -zexecstack , you can use mmap(PROT_EXEC) to allocate new executable pages, or mprotect(PROT_EXEC) to change existing pages to executable. (Including pages holding static data.) Typically, you also want at least PROT_READ and sometimes PROT_WRITE .
Using mprotect for a static array means that you are still executing code from a known place, possibly making it easier to set a breakpoint on it.
On Windows, you can use VirtualAlloc or VirtualProtect.
In GNU C, you also need to use __builtin___clear_cache(buf, buf + len) after writing machine code bytes to a buffer, because the optimizer does not treat dereferencing a function pointer as reading bytes from that address. Dead-store elimination can remove the stores of machine code bytes into the buffer if the compiler proves that the stores are never read as data. https://codegolf.stackexchange.com/questions/160100/the-repetitive-byte-counter/160236#160236 and https://godbolt.org/g/pGXn3B have an example where gcc really does this optimization, because gcc "knows about" malloc .
(And on non-x86 architectures, where the I-cache is not coherent with the D-cache, it actually performs any necessary cache synchronization. On x86, it is purely a compile-time optimization blocker.)
My edit to @AntoineMathys's answer added this. Currently, gcc does not know about mmap , so it does not optimize away stores through the pointer returned by mmap .
__builtin___clear_cache is not needed after mprotect on a page holding read-only C variables, though.
#include <stdio.h>
I used PROT_READ|PROT_EXEC|PROT_WRITE in this example so that it works no matter where your variable is located. If it were a local on the stack and you left out PROT_WRITE , the call would fail after the stack became read-only, as soon as it tried to push a return address.
In addition, PROT_WRITE lets you test self-modifying shellcode, for example code that edits zeros (or other bytes it had to avoid) into its own machine code.
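A minimal sketch of the mprotect approach on a static array, assuming x86-64 Linux. This is not the listing from the answer. The helpers make_rwx and call_ret0 are hypothetical names; only ret0_code is a name taken from the text. mprotect needs a page-aligned start address, so the sketch rounds the array address down to a page boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* x86-64 machine code for:  xor eax,eax ; ret  */
static const unsigned char ret0_code[] = { 0x31, 0xc0, 0xc3 };

/* Hypothetical helper: make every page overlapping [p, p+len) readable,
 * writable and executable. Returns 0 on success, -1 on failure. */
int make_rwx(const void *p, size_t len)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;

    /* Round the start down to a page boundary, as mprotect requires. */
    uintptr_t start = (uintptr_t)p & ~((uintptr_t)pagesize - 1);
    uintptr_t end = (uintptr_t)p + len;

    return mprotect((void *)start, end - start,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}

/* Hypothetical helper: change the permissions of the page(s) holding
 * ret0_code, then call the array as a function.
 * Returns the callee's result (0 here), or -1 if mprotect failed. */
int call_ret0(void)
{
    if (make_rwx(ret0_code, sizeof ret0_code) != 0)
        return -1;

    /* Data-to-function pointer conversion is a GNU C extension. */
    int (*fn)(void) = (int (*)(void))(uintptr_t)ret0_code;
    return fn();
}
```

Because the array is never written at run time, no __builtin___clear_cache call appears here. The array stays at a fixed, named address, so a breakpoint on ret0_code in gdb is straightforward.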
$ gcc -O3 shellcode.c
If I comment out the mprotect , it does segfault.
If I did something like ret0_code[2] = 0xc3; I would need __builtin___clear_cache(ret0_code+2, ret0_code+2) after that to make sure the store was not optimized away, but if I do not modify the static arrays, it is not needed after mprotect . It is needed after mmap + memcpy or manual stores, because we want to execute bytes that were written in C (with memcpy ).
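To illustrate the store-then-execute case, here is a sketch that patches one byte of a static array before calling it, again assuming x86-64 Linux. The names mov_ret_code and patch_and_call are hypothetical. The array is not const, because it is written at run time.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* x86-64 machine code for:  mov eax, imm32 ; ret
 * b8 = mov eax,imm32; bytes 1..4 are the little-endian immediate
 * (initially zero); c3 = ret. */
static unsigned char mov_ret_code[] = { 0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3 };

/* Hypothetical helper: write `value` into the low byte of the immediate,
 * then execute the array. Returns the callee's result, or -1 on failure. */
int patch_and_call(unsigned char value)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;

    /* mprotect needs a page-aligned start address. */
    uintptr_t start = (uintptr_t)mov_ret_code & ~((uintptr_t)pagesize - 1);
    uintptr_t end = (uintptr_t)mov_ret_code + sizeof mov_ret_code;
    if (mprotect((void *)start, end - start,
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return -1;

    mov_ret_code[1] = value;  /* store into the machine code itself */

    /* The compiler is not obliged to treat the indirect call below as a
     * reader of the array, so mark the stored bytes as about to execute. */
    __builtin___clear_cache((char *)mov_ret_code,
                            (char *)mov_ret_code + sizeof mov_ret_code);

    /* Data-to-function pointer conversion is a GNU C extension. */
    int (*fn)(void) = (int (*)(void))(uintptr_t)mov_ret_code;
    return fn();
}
```

With this sketch, patch_and_call(42) should return 42 on x86-64, because the patched instruction becomes mov eax,42.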