How do you confirm that 2 copies of the VB 6 executable came from the same code base?

I have a versioned program that has gone through several releases. Today someone somehow managed to point at an old copy of the program and, as a result, ran into bugs that have since been fixed. I would like to go back and simply delete all the old copies of the program (keeping them is a company policy that dates back to before version control was the norm, and it is no longer necessary), but I need a way to verify that I can regenerate the same executable, something better than saying, "The old one came from this commit, so it must be the same."

My initial thought was to simply MD5-hash the executable, store the hash file in source control, and be done with it, but I ran into a problem that I cannot even make sense of.

It seems that every time the executable is generated (method: Open Project, then File > Make X.exe), it hashes differently. I have noticed that Visual Basic seems to touch files at random every time a project is opened, but I did not think that would make its way into the executable, and I have no evidence that this is what is actually happening. To guard against it, I tried generating the executable several times within a single IDE session and comparing the hashes, but they were different every time.

So the steps are:

  • Create executable file
  • Create MD5 checksum: md5sum X.exe > X.md5
  • Check MD5 for the current executable: md5sum -c X.md5
  • Create a new executable file
  • Check MD5 for the new executable: md5sum -c X.md5
  • Failure, because the computed checksum does not match.
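The steps above can be sketched in Python for anyone without `md5sum` at hand. This is only an illustration: the function names are hypothetical, and the checksum file is assumed to hold the hash followed by the file name, as `md5sum` writes it.

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(exe_path, md5_path):
    """Roughly `md5sum X.exe > X.md5`: store '<hash>  <file name>'."""
    name = Path(exe_path).name
    Path(md5_path).write_text(f"{file_md5(exe_path)}  {name}\n")


def verify_checksum(exe_path, md5_path):
    """Roughly `md5sum -c X.md5`: recompute the hash and compare."""
    recorded = Path(md5_path).read_text().split()[0]
    return file_md5(exe_path) == recorded
```

Calling `verify_checksum("X.exe", "X.md5")` right after `write_checksum` returns `True`; after the executable is regenerated and its bytes change, it returns `False`, which is the failure described in the last step.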

I don't know whether I am misunderstanding something about MD5 or about how VB 6 generates an executable, but I am also not married to the idea of using MD5. If there is a better way to verify that two executables are really the same, I'm all ears.

Thanks in advance for your help!

1 answer

This will be almost impossible. Read on to see why.

The compiler wins this game every time ...

Compiling the same project twice in a row, even without making any changes to the source code or project parameters, will always create different executable files.

One reason for this is that the PE (Portable Executable) format that Windows uses for EXE files includes a timestamp recording the date and time the EXE was built, and the VB6 compiler updates it every time you build the project. Besides this "main" timestamp for the EXE as a whole, each resource directory in the EXE (where icons, bitmaps, strings, etc. are stored) also has a timestamp, which the compiler likewise updates when it creates a new EXE. On top of that, EXE files have a checksum field that the compiler recalculates from the raw binary contents of the EXE. Since the timestamps are updated to the current date and time, the checksum for the EXE also changes every time the project is recompiled.
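To look at the two header fields mentioned here, a minimal Python sketch follows. The offsets come from the published PE layout (the `e_lfanew` pointer at 0x3C, `TimeDateStamp` 8 bytes past the "PE" signature, `CheckSum` 64 bytes into the optional header); the function names are made up for this example, and the per-resource-directory timestamps are not read.

```python
import struct
from datetime import datetime, timezone


def pe_header_fields(data):
    """Return (TimeDateStamp, CheckSum) from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Signature (4) + Machine (2) + NumberOfSections (2) precede TimeDateStamp.
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    # Optional header starts after signature (4) + COFF header (20);
    # CheckSum sits 64 bytes into it.
    (checksum,) = struct.unpack_from("<I", data, pe_offset + 24 + 64)
    return timestamp, checksum


def format_timestamp(timestamp):
    """Render the header timestamp (seconds since 1970, UTC) as ISO 8601."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()
```

Reading two builds with `pe_header_fields(open("X.exe", "rb").read())` and comparing the tuples should show whether these fields differ between them.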

But, but ... I found this really cool EXE editing tool that can undo this compiler trick!

There are EXE editing tools, such as PE Explorer, that claim to be able to set all the timestamps in an EXE file to a fixed value. At first glance, you might think you can simply set the timestamps in two copies of an EXE to the same date and end up with identical files (provided they were built from the same source code), but it is more complicated than that: the compiler is free to write resources (strings, icons, file version information, etc.) in a different order each time you compile the code, and you cannot really prevent this. Resources are stored as independent "chunks" of data that can be reordered in the resulting EXE without affecting the behavior of the program at runtime.

As if that were not enough, the compiler may fill out the EXE file with regions of uninitialized memory, so certain parts of the EXE can contain bits and pieces of whatever happened to be in memory while the compiler was running, creating even more differences.
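To check how far masking the timestamps gets you, here is a rough Python sketch that zeroes only the main header `TimeDateStamp` and `CheckSum` before hashing, and lists the offsets where two builds differ. The function names are hypothetical. It deliberately does not handle resource directory timestamps, reordered resources, or leftover memory contents, so a mismatch here does not prove the source code was different.

```python
import hashlib
import struct


def masked_digest(data):
    """MD5 of a PE image with the header TimeDateStamp and CheckSum zeroed."""
    image = bytearray(data)
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    struct.pack_into("<I", image, pe_offset + 8, 0)   # TimeDateStamp
    struct.pack_into("<I", image, pe_offset + 88, 0)  # CheckSum
    return hashlib.md5(bytes(image)).hexdigest()


def differing_offsets(a, b):
    """Offsets at which two equally sized byte strings differ."""
    if len(a) != len(b):
        raise ValueError("files differ in length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

If `masked_digest` still differs between two builds of the same source, `differing_offsets` shows where the remaining differences sit, for example inside the resource section.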

As for MD5 ...

You are not misunderstanding MD5 hashing: MD5 always produces the same hash for the same input. The problem is that the input in this case (the EXE file) keeps changing.
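A quick way to convince yourself of this, as a small Python illustration with made-up input bytes:

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


same_a = md5_hex(b"identical bytes")
same_b = md5_hex(b"identical bytes")
changed = md5_hex(b"identical byteS")  # a single byte differs

print(same_a == same_b)   # True
print(same_a == changed)  # False
```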

Conclusion: source control is your friend

As for solving your current dilemma, I'll leave you with this: associating a particular EXE with a specific version of the source code is more of a policy issue that has to be enforced somehow than anything else. Trying to figure out which EXE came from which version without any context is simply not going to be reliable. You need to track this with other tools. For example, make sure that each build produces a different version number for your EXE, and that this version can easily be matched to a specific revision/branch/tag/whatever in your version control system. To that end, a "free-for-all" situation in which some developers use source control while others use "that copy of the source code from 1997 that I keep in my network folder, because it's my code and source control is for sissies anyway" won't make this any easier. I would get everyone to drink the source control Kool-Aid and stick to a standard policy for creating builds right away.

Whenever we build our projects, our build server (we use Hudson) makes sure that the version number of the compiled EXE is updated to include the current build number (we use the Version Number Plugin and a custom build script for this), and when we release a build, we create a tag in Subversion using the version number as the tag name. The build server archives release builds, so we can always get hold of the specific EXE (and setup program) that was given to a client. For internal testing, we can either pull an archived EXE off the build server or simply tell the build server to rebuild the EXE from the tag we created in Subversion.

We also never, ever release any binaries to QA or to clients from any machine other than the build server. This prevents "works on my machine" bugs and ensures that we always compile from a "known" copy of the source code (the server only pulls and builds code that is in our Subversion repository), and that we can always associate a given binary with the exact version of the code it was built from.
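One way to picture the bookkeeping side of this is a small manifest that maps the hash of each archived EXE to its version number and tag. The Python below is only a sketch with hypothetical names (`record_build`, `identify`), not how Hudson or the Version Number Plugin store anything. It assumes the tag name defaults to the version number, as described above.

```python
import hashlib


def record_build(manifest, exe_bytes, version, tag=None):
    """Add one archived build to the manifest, keyed by the EXE's MD5."""
    key = hashlib.md5(exe_bytes).hexdigest()
    # The tag defaults to the version number, mirroring the naming policy.
    manifest[key] = {"version": version, "tag": tag or version}


def identify(manifest, exe_bytes):
    """Return the recorded build info for an EXE, or None if it is unknown."""
    return manifest.get(hashlib.md5(exe_bytes).hexdigest())
```

The point of the design is that the hash identifies the exact archived binary rather than a rebuilt one, so the fact that recompiling produces different bytes stops mattering. An EXE that `identify` does not know was not produced by the build server.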


Source: https://habr.com/ru/post/1309775/

