Reverse engineer optimized c code from assembly

Question

Reverse engineer optimized c code from assembly

The goal of this problem is to reverse engineer the c code that was created after the compiler started with level 2 optimization. The c source code looks like this (calculates the largest common factor):

int gcd(int a, int b){ int returnValue = 0; if (a != 0 && b != 0){ int r; int flag = 0; while (flag == 0){ r = a % b; if (r ==0){ flag = 1; } else { a = b; b = r; } } returnValue = b; } return(returnValue); }

when I started the optimized compiler, I started this from the command line:

 gcc -O2 -S Problem04b.c

to get the build file for this optimized code

 .gcd: .LFB12: .cfi_startproc testl %esi, %esi je .L2 testl %edi, %edi je .L2 .L7: movl %edi, %edx movl %edi, %eax movl %esi, %edi sarl $31, %edx idivl %esi testl %edx, %edx jne .L9 movl %esi, %eax ret .p2align 4,,10 .p2align 3 .L2: xorl %esi, %esi movl %esi, %eax ret .p2align 4,,10 .p2align 3 .L9: movl %edx, %esi jmp .L7 .cfi_endproc

I need to convert this assembly code back to c code here, where I am now:

 int gcd(int a int b){ /* testl %esi %esi sets zero flag if a is 0 (ZF) but doesn't store anything */ if (a == 0){ /* xorl %esi %esi sets the value of a variable to 0. More compact than movl */ int returnValue = 0; /* movl %esi %eax ret return the value just assigned */ return(returnValue); } /* testl %edi %edi sets zero flag if b is 0 (ZF) but doesn't store anything */ if (b == 0){ /* xorl %esi %esi sets the value of a variable to 0. More compact than movl */ int returnValue = 0; /* movl %esi %eax ret return the value just assigned */ return(returnValue); } do{ int r = b; int returnValue = b; }while(); }

Can someone help me write this back in c code? I almost got lost.

+6

c compiler-optimization assembly x86 reverse-engineering

scottybobby Feb 19 '15 at 16:18

source share

1 answer

Daniel Kleinstein · Accepted Answer · 2015-02-19T17:08:50+0000

First of all, you mixed the values in your code. %esi starts at b and %edi starts at a .

You can deduce from the line testl %edx, %edx that %edx used as a condition variable for a loop starting with .L7 (if %edx is other than 0, then control is passed to .L9 and then returns to .L7 ). We will refer to %edx as remainder in our reverse engineering code.

Let's begin to reconstruct the main loop:

 movl %edi, %edx

Since %edi stores a , this is equivalent to initializing the remainder value with a : int remainder = a; .

 movl %edi, %eax

Save int temp = a;

 movl %esi, %edi

Run int a = b; (remember that %edi is a and %esi is b ).

sarl $31, %edx

This arithmetic shift instruction shifts our remainder variable 31 bits to the right, keeping the sign of the number. By moving 31 bits, you set remainder to 0 if it is positive (or zero) and -1 if it is negative. So this is equivalent to remainder = (remainder < 0) ? -1 : 0 remainder = (remainder < 0) ? -1 : 0 .

idivl %esi

Divide %edx:%eax by %esi or in our case divide remainder * temp by b (variable). The rest will be saved in %edx or in our remainder code. When combining this with the previous statement: if remainder < 0 then remainder = -1 * temp % b and otherwise remainder = temp % b .

 testl %edx, %edx jne .L9

Make sure that if remainder is 0 - if not, go to .L9 . The code there just sets b = remainder; before returning to .L7 . To implement this in C, we will save the variable count , which will store the number of times the loop repeats. We do b = remainder at the beginning of the loop, but only after the first iteration, which means count != 0 .

Now we are ready to build our complete C cycle:

 int count = 0; do { if (count != 0) b = remainder; remainder = a; temp = a; a = b; if (remainder < 0){ remainder = -1 * temp % b; } else { remainder = temp % b; } count++; } while (remainder != 0)

And after completing the cycle,

 movl %esi, %eax ret

It will return the GCD that the program calculated (in our code it will be stored in variable b ).

Reverse engineer optimized c code from assembly

More articles: