C: overwrite another function byte by byte

Suppose I have a function:

    int f1(int x){
        // some more or less complicated operations on x
        return x;
    }

And that I have another function

    int f2(int x){
        // we simply return x
        return x;
    }

I would like to do something like the following:

    char* _f1 = (char*)f1;
    char* _f2 = (char*)f2;
    int i;
    for (i=0; i<FUN_LENGTH; ++i){
        _f1[i] = _f2[i]; // copy through the char* views, not the function names
    }

I.e., I would like to interpret f1 and f2 as raw byte arrays and overwrite f1 byte by byte, thus replacing it with f2.

I know that executable code is usually write-protected; however, in my particular situation I can simply overwrite the memory location where f1 is located. That is, I can copy the bytes to f1, but then, if I call f1, it will all work.

So is my approach possible in principle? Or are there machine-dependent, implementation-dependent, or other issues that I have to consider?

+6
4 answers

It would be easier to replace the first few bytes of f1 with a machine jump instruction to the beginning of f2. This way, you won't have to deal with any possible code relocation issues.

Also, information about how many bytes a function occupies (FUN_LENGTH in your question) is usually not available at runtime. Using a jump avoids this problem as well.

On x86, the opcode of the relative jump instruction you need is E9. This is a 32-bit relative jump, which means you need to calculate the relative offset between f2 and f1. This code can do that:

    int offset = (int)f2 - ((int)f1 + 5); // 5 bytes for size of instruction
    char *pf1 = (char *)f1;
    pf1[0] = 0xe9;
    pf1[1] = offset & 0xff;
    pf1[2] = (offset >> 8) & 0xff;
    pf1[3] = (offset >> 16) & 0xff;
    pf1[4] = (offset >> 24) & 0xff;

The offset is taken from the end of the JMP instruction, which is why 5 is added to the address of f1 in the offset calculation.

It is a good idea to step through the result with an assembly-level debugger to make sure you are poking the correct bytes. Of course, none of this is standards-compliant, so if it breaks, you get to keep both pieces.
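The offset arithmetic above can be checked without touching any live code by writing the five bytes into an ordinary buffer first. The following is only a sketch: `encode_jmp_rel32` and `JMP_REL32_SIZE` are hypothetical names that do not come from the answer, and the range check is an added assumption for x86-64, where two functions can be more than 2 GiB apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of an x86 `JMP rel32`: opcode E9 plus a 4-byte displacement. */
enum { JMP_REL32_SIZE = 5 };

/*
 * Hypothetical helper: fills out[0..4] with a `JMP rel32` that, if placed at
 * address `from`, would transfer control to address `to`.
 * Returns 0 on success, -1 if the distance does not fit in 32 bits.
 */
int encode_jmp_rel32(uint8_t out[JMP_REL32_SIZE], uintptr_t from, uintptr_t to)
{
    assert(out != NULL);

    /* The displacement is measured from the end of the JMP instruction.
       Assumes both addresses are small enough to be represented as int64_t. */
    int64_t disp = (int64_t)to - ((int64_t)from + JMP_REL32_SIZE);
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    /* Store the displacement in little-endian byte order. */
    uint32_t d = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;
    out[1] = (uint8_t)(d & 0xFF);
    out[2] = (uint8_t)((d >> 8) & 0xFF);
    out[3] = (uint8_t)((d >> 16) & 0xFF);
    out[4] = (uint8_t)((d >> 24) & 0xFF);
    return 0;
}
```

For example, a jump placed at 0x1000 that targets 0x2000 has displacement 0x2000 - 0x1005 = 0xFFB, so the buffer would hold E9 FB 0F 00 00. Copying those bytes over the start of f1 still requires the target memory to be writable.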

+8

Your approach is undefined behavior for standard C.

And on many operating systems (for example, Linux) your example will fail: the function code is inside the read-only .text segment (and section) of the ELF executable, and that segment is (sort of) mmap-ed read-only by execve (or by dlopen or the dynamic linker), so you cannot write inside it.
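For reference, the protection of a mapping can be inspected and changed per page with the POSIX call mprotect. The sketch below is an illustration only, assuming a Linux or other POSIX system; `set_protection` is a hypothetical helper name, and a hardened system may refuse to make code pages writable at all.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS in strict ISO C modes (glibc) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: applies `prot` to every page overlapping
 * [addr, addr + len). mprotect needs a page-aligned start address, so the
 * start is rounded down to a page boundary.
 * Returns 0 on success, -1 on failure (errno is set by mprotect).
 */
int set_protection(void *addr, size_t len, int prot)
{
    assert(addr != NULL && len > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    uintptr_t end = (uintptr_t)addr + len;
    return mprotect((void *)start, (size_t)(end - start), prot);
}
```

A caller that wanted to patch the first bytes of a function would request PROT_READ | PROT_WRITE | PROT_EXEC for that range and check the return value; if the call fails, any write to the code will fault exactly as described above.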

+1

Instead of rewriting a function (which you have already found fragile at best), I would consider using a function pointer:

    int complex_implementation(int x) {
        // do complex stuff with x
        return x;
    }

    int simple_implementation(int x) {
        return x;
    }

    int (*f1)(int) = complex_implementation;

You would use it like:

    for (int i=0; i<limit; i++) {
        a = f1(a);
        if (whatever_condition)
            f1 = simple_implementation;
    }

... and after the assignment, calling f1 will just return the input value.
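A self-contained version of the same idea might look like the sketch below. The body of `complex_implementation` (2 * x + 1), the `run` driver, and its `switch_after` parameter are placeholders invented for illustration; they stand in for the "complex stuff" and for `whatever_condition`.

```c
#include <assert.h>

/* Placeholder for the "complex stuff"; the real computation is not specified. */
int complex_implementation(int x) { return 2 * x + 1; }

int simple_implementation(int x) { return x; }

/* All callers go through this pointer, so reassigning it swaps the behavior. */
int (*f1)(int) = complex_implementation;

/*
 * Hypothetical driver: applies f1 `limit` times and switches to the simple
 * implementation once `switch_after` calls have been made.
 */
int run(int a, int limit, int switch_after)
{
    assert(limit >= 0);

    for (int i = 0; i < limit; i++) {
        a = f1(a);
        if (i + 1 == switch_after)
            f1 = simple_implementation;
    }
    return a;
}
```

With these placeholder bodies, run(1, 5, 2) applies the complex version twice (1 becomes 3, then 7) and the identity three times, so it returns 7. No code bytes are modified; only the pointer changes.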

Calling a function through a pointer imposes some overhead, but (thanks to the fact that this is common in OO languages) most compilers and processors do a pretty good job of minimizing this overhead.

+1

Most memory architectures will stop you from writing over function code; the program will crash. On some embedded devices you can do this kind of thing, but it is dangerous unless you know that there is enough space, that the call will still work, that the stack will be in order, and so on.

Most likely, there is a better way to solve the problem.

0

Source: https://habr.com/ru/post/906896/

