This will not work too well [as written]. However, it is possible, so read on ...
It helps to know what the actual stack layout is when the main function is called. It is a bit more complicated than most people realize.
Assuming a POSIX OS (e.g. Linux), the kernel sets the stack pointer to a fixed address [at least when address space randomization is not in effect].
The kernel performs the following actions:
It calculates how much space is needed for the environment variable strings (i.e. strlen("HOME=/home/me") + 1 for each environment variable) and "pushes" these strings onto the stack in a downward direction [towards lower memory]. It then counts how many there were (e.g. envcount), creates char *envp[envcount + 1] on the stack, and fills in the envp entries with pointers to those strings. It null-terminates this envp.
A similar process is performed for argv strings.
The kernel then loads the ELF interpreter and starts the process at the ELF interpreter's start address. The ELF interpreter [ultimately] invokes the "start" function (e.g. _start from crt0.o), which does some initialization and then calls main(argc,argv,envp).
This is [sort of] what the stack looks like when main is called:
"HOME=/home/me" "LOGNAME=me" "SHELL=/bin/sh" // alignment pad ... char *envp[4] = { // address of "HOME" string // address of "LOGNAME" string // address of "SHELL" string NULL }; // string for argv[0] ... // string for argv[1] ... // ... char *argv[] = { // pointer to argument string 0 // pointer to argument string 1 // pointer to argument string 2 NULL } // possibly more stuff put in by ELF interpreter ... // possibly more stuff put in by _start function ...
On x86-64, the values argc, argv, and envp are passed in the first three argument registers of the ABI (rdi, rsi, and rdx under the System V AMD64 calling convention).
Here's the problem [problems, plural, actually] ...
By the time all this is done, you have little idea what the address of the shellcode is. Thus, any code you write must use RIP-relative addressing and [probably] be built with -fPIC.
Also, the resulting code cannot have a zero byte in the middle, because it is passed [by the kernel] as an EOS-terminated string. Thus, a string containing a zero (e.g. <byte0>,<byte1>,<byte2>,0x00,<byte5>,<byte6>,...) will only transfer the first three bytes, not the entire shellcode program.
Nor do you have a good idea of what the stack pointer value is.
In addition, you need to find the memory word on the stack that holds the return address (i.e. the word that the call main asm instruction pushes).
This word containing the return address must be set to the address of the shellcode. But it is not always at a fixed offset relative to a variable in main's stack frame (e.g. buf). Thus, you cannot predict which word on the stack to modify to get the "return to shellcode" effect.
In addition, the x86 architecture has hardware support for mitigations. For example, a page can be marked NX [no execute]. This is usually done for certain segments, such as the stack. If RIP is changed to point into the stack, the hardware will fault.
Here's the [easy] solution ...
gcc has some built-in functions that can help: __builtin_return_address , __builtin_frame_address .
So, get the value of the real return address from the builtin [call it retadr]. Get the address of the stack frame [call it fp].
Starting at fp and incrementing (by sizeof(void*)) towards higher memory, find the word that matches retadr. This is the memory location you want to change to point to the shellcode. It will probably be at offset 0 or 8.
So then do: *fp = argv[1] and return.
Note: additional steps may be needed if the stack has the NX bit set, because the string pointed to by argv[1] is on the stack, as mentioned above.
Here is some example code that works:
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    void
    shellcode(void)
    {
        static char buf[] = "shellcode: hello\n";
        char *cp;

        for (cp = buf;  *cp != 0;  ++cp);

        // NOTE: in real shell code, we couldn't rely on using this function, so
        // these would need to be the CPP macro versions: _syscall3 and _syscall2
        // respectively or the syscall function would need to be _statically_
        // linked in
        syscall(SYS_write,1,buf,cp - buf);
        syscall(SYS_exit,0);
    }

    int
    main(int argc,char **argv)
    {
        void *retadr = __builtin_return_address(0);
        void **fp = __builtin_frame_address(0);
        int iter;

        printf("retadr=%p\n",retadr);
        printf("fp=%p\n",fp);

        // NOTE: for your example, replace:
        //   *fp = (void *) shellcode;
        // with:
        //   *fp = (void *) argv[1]

        for (iter = 20;  iter > 0;  --iter, fp += 1) {
            printf("fp=%p %p\n",fp,*fp);
            if (*fp == retadr) {
                *fp = (void *) shellcode;
                break;
            }
        }

        if (iter <= 0)
            printf("main: no match\n");

        return 0;
    }