It will be almost impossible. Read on to see why.
The compiler wins this game every time ...
Compiling the same project twice in a row, even without making any changes to the source code or project settings, will always produce different executable files.
One reason for this is that the PE (Portable Executable) format that Windows uses for EXE files includes a timestamp indicating the date and time the EXE was built, and the VB6 compiler updates it whenever you build the project. Besides the "main" timestamp for the EXE as a whole, each resource directory in the EXE (where icons, bitmaps, strings, etc. are stored) also has a timestamp, which the compiler also updates when it builds a new EXE. On top of this, EXE files have a checksum field that the compiler recalculates from the raw binary contents of the EXE. Since the timestamps are updated to the current date/time, the checksum for the EXE will also change every time the project is recompiled.
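To see the two header fields described above for yourself, here is a minimal sketch in Python that reads the main timestamp and the checksum straight from the file bytes. The function name `read_pe_header_fields` is made up for illustration, the offsets follow the published PE/COFF header layout, and it deliberately ignores the per-resource-directory timestamps.

```python
import struct


def read_pe_header_fields(data: bytes) -> dict:
    """Return the main TimeDateStamp and CheckSum from the raw bytes of a PE file.

    Hypothetical helper for illustration only; it does not look at the
    timestamps stored in the resource directories.
    """
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")

    # The DOS header stores the file offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")

    # COFF header follows the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes), ...
    coff_offset = pe_offset + 4
    (timestamp,) = struct.unpack_from("<I", data, coff_offset + 4)

    # The optional header starts after the 20-byte COFF header.
    # CheckSum sits 64 bytes into it for both PE32 and PE32+ images.
    optional_offset = coff_offset + 20
    (checksum,) = struct.unpack_from("<I", data, optional_offset + 64)

    return {"timestamp": timestamp, "checksum": checksum}
```

Reading two builds of the same project with `open(path, "rb").read()` and passing the bytes to this function should show a different `timestamp` value for each build.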
But, but ... I found this really cool EXE editing tool that can undo this compiler trick!
There are EXE editing tools, such as PE Explorer, that claim to be able to set all the timestamps in an EXE file to a fixed time. At first glance, you might think you can simply set the timestamps in two copies of the EXE to the same date and end up with equivalent files (provided that they were built from the same source code), but it's more complicated than that: the compiler is free to write out the resources (strings, icons, file version information, etc.) in a different order each time you compile the code, and you cannot really prevent this. Resources are stored as independent "chunks" of data that can be reordered within the EXE without affecting the behavior of the program at runtime.
If that were not enough, the compiler may pad the EXE file with regions of uninitialized memory, so certain parts of the EXE may contain bits and pieces of whatever was in memory while the compiler was running, creating even more differences.
As for MD5 ...
You are not misunderstanding MD5 hashing: MD5 will always produce the same hash given the same input. The problem here is that the input in this case (the EXE files) keeps changing.
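A quick way to convince yourself of this is the sketch below, using Python's standard `hashlib`. The byte strings are invented stand-ins for two builds, not real EXE contents; the only difference between them is one byte in a pretend timestamp field.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


# Made-up stand-in for the bytes of a compiled EXE.
build_a = b"MZ" + b"\x00" * 16 + b"same compiled code"

# Second "build": identical except for one byte of a simulated timestamp.
build_b = bytearray(build_a)
build_b[4] = 0x01
build_b = bytes(build_b)

# Identical input always hashes to the same value ...
same_input_matches = md5_hex(build_a) == md5_hex(bytes(build_a))

# ... but a single changed byte gives a different hash.
changed_input_differs = md5_hex(build_a) != md5_hex(build_b)

print(same_input_matches, changed_input_differs)
```

So differing hashes only tell you that the files differ somewhere, not that the source code behind them differs.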
Conclusion: source control is your friend
As for solving your current dilemma, I'll leave you with this: associating a particular EXE with a specific version of the source code is more a matter of policy that has to be enforced somehow than anything else. Trying to figure out which EXE came from which version without any context is simply not going to be reliable. You need to track this with other tools. For example, make sure that each build produces a different version number for your EXE, and that this version can easily be paired with a specific revision/branch/tag/whatever in your version control system. To that end, a "free-for-all" situation where some developers use source control while others use "that copy of the source code from 1997 that I keep in my network folder, because it's my code and source control is for sissies anyway" won't make this any easier. I would get everyone drinking the source control Kool-Aid and adhering to a standard policy for creating builds right away.
Whenever we build projects, our build server (we use Hudson) ensures that the version of the compiled EXE is updated to include the current build number (we use the Version Number Plugin and a custom build script for this), and when we release a build, we create a tag in Subversion using the version number as the tag name. The build server archives the release builds, so we can always get hold of the specific EXE (and settings) that was provided to the client. For internal testing, we can choose to pull the archived EXE from the build server or simply tell the build server to rebuild the EXE from the tag we created in Subversion.
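The version-to-tag pairing could look something like the following sketch. This is not our actual build script: the helper names, the example repository URL, and the conventional `trunk`/`tags` layout are all assumptions made for illustration. The only real command used is `svn copy`, which is how tags are created in Subversion.

```python
def version_string(major: int, minor: int, build_number: int) -> str:
    """Compose an EXE version that embeds the build server's build number."""
    return f"{major}.{minor}.{build_number}"


def svn_tag_command(repo_url: str, version: str) -> list:
    """Build the argument list for tagging trunk with the release version.

    Assumes a standard trunk/tags repository layout (an assumption,
    not something every repository follows).
    """
    return [
        "svn", "copy",
        f"{repo_url}/trunk",
        f"{repo_url}/tags/{version}",
        "-m", f"Tag release {version}",
    ]


# Hypothetical values: build 142 of version 2.5, made-up repository URL.
version = version_string(2, 5, 142)
command = svn_tag_command("https://svn.example.com/repos/myapp", version)
print(" ".join(command[:4]))
```

The list returned by `svn_tag_command` could be handed to `subprocess.run` from the build script. Because the tag name is the same string that is stamped into the EXE, anyone holding a binary can look up the matching source.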
We also never, ever release any binaries to QA or to clients from any machine other than the build server. This prevents "works on my machine" errors and ensures that we always compile from a "known" copy of the source code (the build server only pulls and builds the code that is in our Subversion repository), and that we can always associate a given binary with the exact version of the code it was built from.