Built-in JPEG checksum / fingerprint?

I am putting together a script to find and remove duplicates in a large image library. At the moment I use a two-pass filter: first I find files of the same size, then I compute a SHA-256 over a 10,240-byte portion of each file to fingerprint the files that share a size (code here).

This works well, but I suspect the JPEG format already contains checksums that I could use instead of computing SHA-256.

Does anyone know whether JPEG files contain checksums or other components that could serve as checksums / fingerprints? If so, is there an efficient way to access them?
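For readers without access to the linked script, here is a minimal sketch of the two-pass filter described in the question. It is not the asker's code: the function names are hypothetical, and it assumes the hashed portion is the first 10,240 bytes of each file, which the question does not specify.

```python
import hashlib
import os
from collections import defaultdict

CHUNK_SIZE = 10_240  # bytes hashed per file, the figure given in the question


def partial_sha256(path, size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of the first `size` bytes of a file.

    Assumption: the question does not say which portion is hashed;
    this sketch uses the start of the file.
    """
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read(size)).hexdigest()


def find_duplicate_candidates(root):
    """Return lists of paths under `root` sharing a size and partial hash."""
    # Pass 1: bucket files by size. A file with a unique size
    # cannot be a byte-for-byte duplicate of anything else.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files whose size collides with another file.
    by_fingerprint = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_fingerprint[(size, partial_sha256(path))].append(path)

    return [sorted(group) for group in by_fingerprint.values() if len(group) > 1]
```

Each returned group is only a set of candidates: two files can agree in size and in their first 10,240 bytes and still differ later, so a full comparison is advisable before deleting anything.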

+3
6 answers

I do not think that the JPEG specification includes any checksum in the way you describe.

However, a JPEG may contain a thumbnail as part of its EXIF metadata. This is not an ideal fingerprint, since two different images can have the same thumbnail. There is at least one documented case where the thumbnail was not regenerated after the image had been heavily edited, and the thumbnail revealed much more than the publisher intended.
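To illustrate, here is a rough sketch that walks the marker segments at the start of a JPEG file and hashes the Exif block (the APP1 segment, which holds the embedded thumbnail when there is one). The function names are hypothetical, it uses only the Python standard library, and it hashes the whole Exif payload rather than isolating the thumbnail. Given the caveats above, treat the result as a hint at best.

```python
import hashlib
import struct

SOI = b"\xff\xd8"            # start-of-image marker
EXIF_HEADER = b"Exif\x00\x00"  # identifier at the start of an Exif APP1 payload


def iter_segments(data):
    """Yield (marker, payload) for each segment that precedes the scan data."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:
            # Optional fill byte before a marker; skip it.
            pos += 1
            continue
        if marker in (0xDA, 0xD9):
            # SOS (compressed data follows) or EOI: no more header segments.
            return
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            raise ValueError(f"invalid segment length at offset {pos}")
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def exif_fingerprint(data):
    """Return a SHA-256 hex digest of the Exif payload, or None if absent."""
    for marker, payload in iter_segments(data):
        if marker == 0xE1 and payload.startswith(EXIF_HEADER):
            return hashlib.sha256(payload).hexdigest()
    return None
```

The walk also shows the layout of each segment: a marker, a length, and a payload, with no per-segment checksum field to read.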

+5

, IJG, , , , - . EXIF, ...

+1

script. . , , . , . jhead , ( , , , ). jhead () , . ImageDescription . , , . : exiv2 , exiftool .

+1

JPEG (ITU-T.81) , field/syntax, , jpeg. โ€‹โ€‹ , . , , . , , utlitiy (, windows fc/b) u .

-AD

0

, , - . - , ( ) "".

0

XMP , .

( ) , , jpeg, .

0

Source: https://habr.com/ru/post/1698461/

