How can I tell whether two EXEs contain the same code?

Is there a way to determine whether two EXEs (compiled with VS.NET 2008 for C++ / MFC) have any code-level changes between them, i.e. to confirm that the instructions have not changed?

The need arises when my vendor sends me an EXE that has supposedly had no code changes since the last test.

Is there a tool to verify that this is so?

Regards

+4
source share
7 answers

You can use a disassembler to disassemble each executable into assembly and then compare the two listings with a regular text diff tool.

But even that will not be 100% accurate. Compilation is not a lossless process: much of the information in the C++ source is lost or irreversibly transformed when it is compiled.

In particular, different compiler settings can generate significantly different machine code from the same source. Different compilers, and even different versions or service pack / patch levels of the same compiler, can produce completely different machine code from the same source files.
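As a rough illustration of the disassemble-and-compare idea, the sketch below diffs two disassembly listings that have already been exported as text by whatever disassembler you use. The listing format (a leading hexadecimal address followed by a colon) and the function names are assumptions made for the example; real listings will usually need more normalization.

```python
import difflib
import re

# Leading address column such as "00401000:" (an assumed listing format).
_ADDRESS = re.compile(r"^\s*[0-9A-Fa-f]+:\s*")


def normalize(listing: str) -> list[str]:
    """Drop the address column and blank lines so relocated code still matches."""
    lines = []
    for line in listing.splitlines():
        line = _ADDRESS.sub("", line).strip()
        if line:
            lines.append(line)
    return lines


def diff_listings(old: str, new: str) -> list[str]:
    """Return unified-diff lines between two normalized listings (empty if equal)."""
    return list(
        difflib.unified_diff(
            normalize(old),
            normalize(new),
            fromfile="old.asm",
            tofile="new.asm",
            lineterm="",
        )
    )
```

An empty result only means the normalized listings match. A non-empty result does not prove the source changed, for the reasons given above.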

Another question: why do they send you an exe at all if it is "supposedly without any changes"? If nothing changed, why not just keep using the one you originally had?

+4
source

Automate your testing so that tests can be run quickly.

Although this is a small statement to make, it is a big undertaking.

+2
source

For binary auditing, hands down one of the best tools you can have is the Interactive Disassembler, better known as IDA Pro. It is essential when you need to conduct an audit without access to the source code. Someone proficient with IDA Pro will be able to tell you with sufficient confidence whether there was anything more than superficial changes in the source code. In this context, superficial changes cover things like renaming variables in source files or reordering the declarations and definitions of variables, functions, or classes. They will be able to tell you whether the basic blocks of code that make up the executables differ significantly enough to be flagged as suspicious, in the sense that the differences very probably reflect a difference at the source level.

I say "sufficient confidence" because there are several ways in which two executables generated from the same source tree can still have subtle, and sometimes not so subtle, differences. Factors that may affect the generated executable include:

  • compiler optimization settings
  • different versions of the libraries the executables are linked against
  • modified header files, external to the source tree used to build the executables, that were included by the C++ preprocessor prior to the compilation stage
  • an executable that manipulates its own code at runtime, which may include decompressing or decrypting some part of itself "on the fly" into an area of memory that it can then jump to

And this list may go on for some time.

Is the type of binary audit you propose possible? Yes, a person with sufficient knowledge and skill could do it. Hackers do this all the time. And if the person doing the analysis is good enough, they can tell you exactly how confident they are in their assessment.

Ultimately, this becomes a matter of feasibility. How much are you willing to spend on this audit? Hiring or contracting someone who can do this may cost more than the budget allows for such an audit. How hard is it to test the software? What is the nature of your relationship with your vendor?

This last question is important because if passing this audit is in their interest and they are aware of it, they may be willing to help you to some extent. That help could come in the form of debugging symbols, a list of the compiler options used, or other artifacts of the build process that they are willing to disclose. Such artifacts can be very useful in any analysis where the source code is unavailable for whatever reason. And if the source code is available for this purpose, everything becomes an order of magnitude easier to analyze.

If this is something you would like to pursue on your own, two books I would recommend are The IDA Pro Book: The Unofficial Guide to the World's Most Popular Disassembler by Chris Eagle and The Shellcoder's Handbook: Discovering and Exploiting Security Holes by Chris Anley, John Heasman, Felix Lindner and Gerardo Richarte.

Finally, the methods and tools designed to help with this kind of analysis are still a very active area of research. Your question either runs deeper than you may realize, or perhaps I have misunderstood it. A thorough treatment of your question, even from a purely practical point of view and ignoring the theory that goes with it, could and does fill many books.

I hope you find at least some of this information useful. Good luck.

+2
source

You can always run md5sum on the executables. This will not tell you whether they are logically equivalent, only whether the files differ at all.

I am not sure this solves your problem, since you may be looking for a comparison tool.
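For reference, the same check can be scripted. This is a minimal sketch using Python's standard hashlib module; the function names and the choice of reading in 1 MiB chunks are assumptions for the example.

```python
import hashlib


def file_digest(path: str, algorithm: str = "md5") -> str:
    """Hash a file in chunks so large executables need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def same_bytes(path_a: str, path_b: str) -> bool:
    """True if both files have the same digest, i.e. are byte-for-byte identical
    barring a hash collision."""
    return file_digest(path_a) == file_digest(path_b)
```

Matching digests mean the vendor sent the same file. Differing digests say nothing about whether the code itself changed, since a rebuild of identical source normally produces a different file.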

+1
source

If you control the source, simply do not ship exes that lack relevant version information.

If for some reason they build their own exes, I would suggest creating a build step that they must use, which inserts the version control revision number into the version information.

If they do not use your build step (which you can detect), then you assume the exes are different.

Most version control systems (SVN, for example) will let you create a build step that tells you whether the code is in a modified state or not. You can include this information as a string in an embedded resource of the exe, and later extract that resource to check it.

In short, make sure all builds come from your custom build script.

+1
source

From now on, add a build step that generates an MD5 of the source files and adds it to the VERSION resource (so you can see it in the exe properties).
It will cost you 2 or 3 person-days.
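The hashing half of such a build step might look like the sketch below. The function name, the file patterns, and the decision to mix relative paths into the hash are all assumptions for illustration; writing the result into the VERSION resource is left to the build system.

```python
import hashlib
from pathlib import Path


def source_tree_md5(root: str, patterns=("*.cpp", "*.h", "*.rc")) -> str:
    """Return one MD5 over all matching source files under root.

    Files are visited in sorted relative-path order so the result does not
    depend on directory enumeration order.
    """
    root_path = Path(root)
    files = {p for pattern in patterns for p in root_path.rglob(pattern) if p.is_file()}
    h = hashlib.md5()
    for p in sorted(files, key=lambda f: f.relative_to(root_path).as_posix()):
        # Include the relative path so renames and moves change the hash too.
        h.update(p.relative_to(root_path).as_posix().encode("utf-8"))
        h.update(b"\0")
        h.update(p.read_bytes())
        h.update(b"\0")
    return h.hexdigest()
```

Two builds whose embedded hashes match were made from the same set of source files, which is a much cheaper check than comparing the binaries.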

+1
source

Load the exes into a hex comparison program (Beyond Compare rocks!).

If there are any non-trivial changes (provided that the compiler settings have not changed), they should be fairly easy to pick out. If the only differences are timestamps and the like, that tends to be pretty obvious.

This is definitely not reliable, but it will be my first step.
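A scripted version of that first step could look like the sketch below, which lists differing byte offsets while skipping two PE header fields that routinely change between otherwise identical builds: the COFF TimeDateStamp and the optional-header CheckSum. The function names are made up for the example, and other build-dependent data (debug directory entries, for instance) is not filtered, so treat the output as a starting point only.

```python
import struct


def volatile_ranges(data: bytes) -> list[range]:
    """Byte ranges of the COFF TimeDateStamp and optional-header CheckSum fields."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return []
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return []
    timestamp = pe_offset + 8       # signature (4) + Machine (2) + NumberOfSections (2)
    checksum = pe_offset + 24 + 64  # signature + COFF header (24), then offset 64
    return [range(timestamp, timestamp + 4), range(checksum, checksum + 4)]


def differing_offsets(a: bytes, b: bytes) -> list[int]:
    """Offsets where the two images differ, ignoring the volatile header fields."""
    skip = set()
    for r in volatile_ranges(a):
        skip.update(r)
    common = min(len(a), len(b))
    diffs = [i for i in range(common) if a[i] != b[i] and i not in skip]
    # Any bytes past the end of the shorter file count as differences.
    diffs.extend(range(common, max(len(a), len(b))))
    return diffs
```

Typical use would be `differing_offsets(open("old.exe", "rb").read(), open("new.exe", "rb").read())`, with both file names being placeholders.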

+1
source

Source: https://habr.com/ru/post/1306441/
