Save CRC value in file without changing actual CRC checksum?

I save objects of my own classes to a file (as stream data).

This works, but I would like to store the CRC checksum of the file inside the file itself.

Then, when my application opens the file, it can read the internally stored CRC value.

It would then compute the CRC of the actual file. If that matches the stored value, I can handle the file normally; otherwise I display an error message saying that the file is invalid.

I need advice on how to do this. I thought I could do something like the following:

  • Save the file from my application.
  • Calculate the CRC of the saved file.
  • Edit the saved file to store the CRC value in it.
  • Whenever the file is opened, check that its CRC matches the stored CRC value.

The problem is that as soon as a single byte in the file changes, the CRC checksum comes out completely different, as expected. So writing the CRC into the file invalidates the CRC I just calculated.

+4
4 answers

Simply put, you need to exclude the bytes used to store the checksum from the checksum calculation.

Write the checksum as the last thing in the file, calculated over the contents of the file excluding the checksum itself. When you read the file back, calculate the checksum over the contents up to the stored checksum and compare the two. Alternatively, you can write the checksum as the first bytes of a random-access file. Any position works as long as you know where it is.
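
A minimal sketch of the "checksum as the last thing in the file" layout, assuming a plain CRC-32 (the polynomial used by zip and zlib) stored in the final 4 bytes in native byte order. The routine names `CrcOfPrefix`, `AppendCrc` and `CrcMatches` and the file name `crc_demo.dat` are made up for illustration; the byte-at-a-time reads keep the example short and would normally be replaced by buffered reads.

```pascal
program CrcTrailerDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Bitwise CRC-32 (reflected polynomial $EDB88320) over the first Count
// bytes of Stream. Simple rather than fast: it reads one byte at a time.
function CrcOfPrefix(Stream: TStream; Count: Int64): LongWord;
var
  B: Byte;
  Bit: Integer;
begin
  Result := $FFFFFFFF;
  Stream.Position := 0;
  while Count > 0 do
  begin
    Stream.ReadBuffer(B, 1);
    Result := Result xor B;
    for Bit := 1 to 8 do
      if (Result and 1) <> 0 then
        Result := (Result shr 1) xor $EDB88320
      else
        Result := Result shr 1;
    Dec(Count);
  end;
  Result := not Result;
end;

// Appends the CRC of the current contents as the last 4 bytes of the file.
procedure AppendCrc(const FileName: string);
var
  F: TFileStream;
  Crc: LongWord;
begin
  F := TFileStream.Create(FileName, fmOpenReadWrite);
  try
    Crc := CrcOfPrefix(F, F.Size);
    F.Position := F.Size;
    F.WriteBuffer(Crc, SizeOf(Crc));
  finally
    F.Free;
  end;
end;

// True when the trailing 4 bytes equal the CRC of everything before them.
function CrcMatches(const FileName: string): Boolean;
var
  F: TFileStream;
  Stored: LongWord;
begin
  Result := False;
  F := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    if F.Size < SizeOf(Stored) then
      Exit;
    F.Position := F.Size - SizeOf(Stored);
    F.ReadBuffer(Stored, SizeOf(Stored));
    Result := CrcOfPrefix(F, F.Size - SizeOf(Stored)) = Stored;
  finally
    F.Free;
  end;
end;

const
  DemoFile = 'crc_demo.dat';  // hypothetical file name
  Payload: AnsiString = 'object data written by the application';
var
  F: TFileStream;
  Junk: Byte;
begin
  F := TFileStream.Create(DemoFile, fmCreate);
  try
    F.WriteBuffer(Payload[1], Length(Payload));
  finally
    F.Free;
  end;
  AppendCrc(DemoFile);
  Writeln(CrcMatches(DemoFile));  // TRUE: the file is intact

  // Overwrite the first payload byte to simulate corruption.
  F := TFileStream.Create(DemoFile, fmOpenReadWrite);
  try
    Junk := Ord('X');
    F.WriteBuffer(Junk, 1);
  finally
    F.Free;
  end;
  Writeln(CrcMatches(DemoFile));  // FALSE: the stored CRC no longer matches
end.
```

Because the stored value is never fed into its own calculation, writing it does not change the result it records.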

+8

I usually prefer the approach where the stored CRC is excluded from the check. But if that is not possible for some reason, there is a workaround:

You need to reserve 8 bytes: 4 for the CRC and 4 for compensation data. First, fill the reserved bytes with a fixed dummy value (say 0x00). Then calculate the CRC of the file and store it in the first 4 bytes. Finally, change the remaining 4 bytes so that the CRC of the whole file is the same as it was before you stored it.

More on how to perform this calculation: Reversible CRC32
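
As a rough illustration of the compensation step (not necessarily the method the linked article uses), the sketch below runs a table-driven CRC-32 forwards up to the 4 compensation bytes and backwards from the desired result, then solves for the bytes in between. The layout (stored CRC at offset 0, compensation bytes at offset 4) and the names `Step`, `StepBack` and `ForceCrc` are assumptions made for this example.

```pascal
program CrcCompensationDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Table: array[Byte] of LongWord;  // forward CRC-32 table
  Rev: array[Byte] of Byte;        // top byte of Table[I] -> I

procedure InitTables;
var
  I, Bit: Integer;
  R: LongWord;
begin
  for I := 0 to 255 do
  begin
    R := I;
    for Bit := 1 to 8 do
      if (R and 1) <> 0 then
        R := (R shr 1) xor $EDB88320
      else
        R := R shr 1;
    Table[I] := R;
    Rev[R shr 24] := I;  // the top bytes of the table entries are all distinct
  end;
end;

// One forward CRC step: feeds byte B into State.
function Step(State: LongWord; B: Byte): LongWord;
begin
  Result := Table[(State xor B) and $FF] xor (State shr 8);
end;

// Inverse of Step: the state that existed before byte B was fed in.
function StepBack(State: LongWord; B: Byte): LongWord;
var
  I: Byte;
begin
  I := Rev[State shr 24];
  Result := (((State xor Table[I]) and $00FFFFFF) shl 8) or (I xor B);
end;

function Crc32(const Data: TBytes): LongWord;
var
  I: Integer;
begin
  Result := $FFFFFFFF;
  for I := 0 to High(Data) do
    Result := Step(Result, Data[I]);
  Result := not Result;
end;

// Overwrites Data[Pos..Pos+3] so that Crc32(Data) equals Target.
procedure ForceCrc(var Data: TBytes; Pos: Integer; Target: LongWord);
var
  A, B: LongWord;
  I: Integer;
begin
  A := $FFFFFFFF;                      // state after the bytes before Pos
  for I := 0 to Pos - 1 do
    A := Step(A, Data[I]);
  B := not Target;                     // state required just after Pos+3
  for I := High(Data) downto Pos + 4 do
    B := StepBack(B, Data[I]);
  for I := 1 to 4 do                   // undo four zero bytes ...
    B := StepBack(B, 0);
  B := B xor A;                        // ... and fold in the prefix state
  for I := 0 to 3 do
    Data[Pos + I] := (B shr (8 * I)) and $FF;
end;

var
  Data: TBytes;
  Stored: LongWord;
  I: Integer;
begin
  InitTables;
  SetLength(Data, 8 + 16);             // the 8 reserved bytes start as zeros
  for I := 8 to High(Data) do
    Data[I] := I * 7;                  // stand-in payload
  Stored := Crc32(Data);               // CRC with the dummy values in place
  Move(Stored, Data[0], 4);            // store it in the first 4 bytes
  ForceCrc(Data, 4, Stored);           // compensate with the next 4 bytes
  Writeln(Crc32(Data) = Stored);       // TRUE: the file now contains its own CRC
end.
```

This relies on CRC-32 being linear and on every table entry having a distinct top byte, which is what makes each step reversible; it would not carry over to a cryptographic hash.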


I actually used this in one of my projects:

I was developing a zip-based file format. The first file in the archive is stored uncompressed and serves as the header file, which also means that it sits at a fixed offset in the file. So far this is fairly standard; ePub, for example, does something similar.

Then I decided to include a SHA-1 hash in the header, giving each file a content-based unique identifier and an integrity check. Since the header, and thus the SHA-1 hash, is at a known offset in the file, masking it while hashing is trivial. So I write a dummy hash and create the zip file, then hash the file and fill in the real hash.

But now there is a problem: zip stores the CRC of every contained file. And it stores it not just in one place, which would be easy to mask during SHA-1 hashing, but also in a second place at a variable offset near the end of the file. So I decided to fake the CRC: I get my strong hash, and zip gets its valid CRC32.

And since I was already faking the CRC for the final file, I decided that faking it for the header file itself would not hurt either. Thus, every file in this format now starts with a header file whose CRC is 0xD1CE0DD5.

+9

Store the CRC as part of the file itself, but do not include its bytes in the CRC calculation. If you have some kind of fixed header, zero out the CRC field before passing the data to the CRC function. If not, append the CRC to the end of the file and pass everything except the last 4 bytes to the CRC function.
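
A minimal in-memory sketch of the fixed-header variant, assuming a hypothetical layout with a 4-byte magic value followed by a 4-byte CRC field. Instead of physically zeroing the field, the CRC routine reads those four bytes as zeros, which gives the same result. The names `CrcSkippingField`, `StampCrc` and `CrcIsValid` are invented for this example.

```pascal
program CrcHeaderFieldDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Hypothetical layout: 4-byte magic, 4-byte CRC field, then the payload.
  CrcOffset = 4;
  CrcSize = 4;

// Bitwise CRC-32 over Data, treating the CRC field as if it held zeros.
function CrcSkippingField(const Data: TBytes): LongWord;
var
  I, Bit: Integer;
  B: Byte;
begin
  Result := $FFFFFFFF;
  for I := 0 to High(Data) do
  begin
    if (I >= CrcOffset) and (I < CrcOffset + CrcSize) then
      B := 0  // mask the stored checksum
    else
      B := Data[I];
    Result := Result xor B;
    for Bit := 1 to 8 do
      if (Result and 1) <> 0 then
        Result := (Result shr 1) xor $EDB88320
      else
        Result := Result shr 1;
  end;
  Result := not Result;
end;

// Computes the masked CRC and writes it into the header field.
// Assumes Data is at least CrcOffset + CrcSize bytes long.
procedure StampCrc(var Data: TBytes);
var
  Crc: LongWord;
begin
  Crc := CrcSkippingField(Data);
  Move(Crc, Data[CrcOffset], CrcSize);
end;

// True when the header field matches the masked CRC of the data.
function CrcIsValid(const Data: TBytes): Boolean;
var
  Stored: LongWord;
begin
  Result := False;
  if Length(Data) < CrcOffset + CrcSize then
    Exit;
  Move(Data[CrcOffset], Stored, CrcSize);
  Result := Stored = CrcSkippingField(Data);
end;

var
  Data: TBytes;
  I: Integer;
begin
  SetLength(Data, 32);
  for I := 0 to High(Data) do
    Data[I] := I;                // stand-in for a real header plus payload
  StampCrc(Data);
  Writeln(CrcIsValid(Data));     // TRUE
  Data[20] := Data[20] xor $01;  // flip one payload bit
  Writeln(CrcIsValid(Data));     // FALSE
end.
```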


Alternatively, if the files are stored on an NTFS drive and you do not need to transfer them to another computer, you can use NTFS Alternate Data Streams (ADS) to store the CRC. Basically, you open the file with the stream name appended after a colon (for example, C:\file.txt:CRC). Windows handles the distinction internally, so you can use the ordinary TFileStream functions to work with it.

Alternate data streams are stored separately from the file's standard stream, so opening or changing C:\file.txt alone will not affect them.

So, the code will look like this:

    procedure UpdateCRC(const aFileName: string);
    var
      FileStream, ADSStream: TStream;
      CRC: LongWord;
    begin
      // CrcOf is not defined here; it stands for your own routine
      // that returns the CRC of a stream.
      FileStream := TFileStream.Create(aFileName, fmOpenRead);
      try
        CRC := CrcOf(FileStream);
      finally
        FileStream.Free;
      end;

      // Write the CRC to the alternate data stream named "CRC".
      ADSStream := TFileStream.Create(aFileName + ':CRC', fmCreate);
      try
        ADSStream.WriteBuffer(CRC, SizeOf(CRC));
      finally
        ADSStream.Free;
      end;
    end;

If you need to find all the alternate data streams attached to a file (there may be several), you can iterate through them using BackupRead. Internet Explorer uses ADS to implement the "This file has been downloaded from the Internet. Are you sure you want to open it?" prompt.

+6

I would recommend storing the checksum in a separate file, possibly a .ini file. Or, for a really strange idea, you could include the checksum as part of the file name,
e.g. MyFile_checksum_digits_here.dat

+1

Source: https://habr.com/ru/post/1387615/

