Fastest way to calculate a "visual" image checksum

I am looking to create an ID system for cataloging images. I cannot use md5(), because its result changes whenever I change the image's EXIF tags.

I am currently using the SHA1 checksum computed by ImageMagick. It works well, but it is very slow on large images (~15 seconds on a quad-core Xeon for a 21-megapixel JPEG).

Are there other "visual" methods of uniquely identifying an image that would be faster?

4 answers

You could try running MD5 on the actual bitmap data instead of the JPEG file. I tested this on my machine (also a quad-core Xeon), and the following runs in about 900 ms on a 23-megapixel image.

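/* Requires <stdint.h>, <stdlib.h>, <openssl/md5.h> and the MagickWand header. */
/* imageWand is assumed to be a MagickWand that already holds the image. */
size_t width  = MagickGetImageWidth(imageWand);
size_t height = MagickGetImageHeight(imageWand);
size_t length = width * height * 3;   /* 3 bytes per pixel: R, G, B */

unsigned char imageDigest[MD5_DIGEST_LENGTH];
uint8_t *imageData = malloc(length);

/* Export the decoded pixels and hash only them, so EXIF edits
   do not change the digest. */
if (imageData != NULL &&
    MagickExportImagePixels(imageWand, 0, 0, width, height,
                            "RGB", CharPixel, imageData) == MagickTrue) {
    MD5(imageData, length, imageDigest);
}

free(imageData);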
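
Since the goal is a catalog ID, the raw digest bytes in imageDigest still need to be turned into something printable. A minimal sketch of that step follows; the helper name digest_to_hex is hypothetical and not part of OpenSSL or ImageMagick, and it assumes the caller supplies a large enough output buffer.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format `len` raw digest bytes as lowercase hex.
   `out` must have room for 2 * len + 1 characters. */
static void digest_to_hex(const unsigned char *digest, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        /* Each byte becomes exactly two hex digits. */
        snprintf(out + 2 * i, 3, "%02x", (unsigned)digest[i]);
    }
    out[2 * len] = '\0';
}
```

For a 16-byte MD5 digest, a 33-character buffer would be enough, for example: char id[33]; digest_to_hex(imageDigest, 16, id);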

What do you mean by "visual checksum"? The algorithms you mention (MD5 / SHA / CRC) operate on bytes and do not take the visual information of the image into account. If you convert one of your images to JPEG, the two files will show the same image but have completely different MD5 / SHA / CRC checksums.
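
To make the byte-versus-pixel distinction concrete, here is a toy sketch in C. Everything in it is an assumption for illustration: FNV-1a stands in for MD5/SHA so that no external library is needed, and struct toy_image is a hypothetical in-memory "file" made of a metadata block followed by pixel bytes, not a real image format.

```c
#include <stdint.h>
#include <stddef.h>

/* 64-bit FNV-1a: a simple non-cryptographic hash, used here only as a
   stand-in for MD5/SHA-1 so the sketch needs no external library. */
static uint64_t fnv1a64(const unsigned char *data, size_t len)
{
    uint64_t hash = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 1099511628211ULL;               /* FNV prime */
    }
    return hash;
}

/* Hypothetical in-memory "file": metadata followed by pixel bytes. */
struct toy_image {
    const unsigned char *bytes;   /* whole file */
    size_t size;                  /* total length in bytes */
    size_t pixel_offset;          /* where the pixel data starts */
};

/* Checksum of the whole file: changes when the metadata changes. */
static uint64_t file_checksum(const struct toy_image *img)
{
    return fnv1a64(img->bytes, img->size);
}

/* Checksum of the pixel bytes only: unaffected by metadata edits. */
static uint64_t pixel_checksum(const struct toy_image *img)
{
    return fnv1a64(img->bytes + img->pixel_offset,
                   img->size - img->pixel_offset);
}
```

Two toy files that differ only in their metadata byte get different file_checksum values but the same pixel_checksum, which is the property an EXIF-independent ID needs.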

exif, , exiv2 . , , . , .

, : exif exiv2 ( ) .

: , ImageMagick , ( ).


, MD5, , . , - , 32- 64- CRC . , CRC; . - MD5. , CRC , , .

exiftool , JPEG, , , .

Intel Core 2 Duo L7100 CPU, 8MP JPEG 1 PPM, 1 . , md5sum, sum sha1sum. , .

, , . :

djpeg -scale 1/8 big.jpg | /usr/bin/sha1sum   # 0.70s
djpeg            big.jpg | /usr/bin/sha1sum   # 2.15s

Keep in mind that someone may crop the image or change its palette, color depth, or something else. A plain checksum will then differ, even though the original and the modified image still look almost the same. There may be an efficient algorithm that tolerates cropping or recoloring, such as whatever Google Images uses to find similar images.
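
One simple technique in this family is the "average hash", sketched below in C. This is only an illustration of the idea and is not claimed to be what Google Images uses. It assumes the image has already been reduced to an 8x8 grayscale thumbnail by some other tool; the function names are hypothetical. It tolerates uniform brightness changes, but not cropping.

```c
#include <stdint.h>

/* Average hash over an 8x8 grayscale thumbnail (64 values, row-major).
   Bit i is set when pixel i is brighter than the mean of all 64 pixels. */
static uint64_t average_hash(const uint8_t gray[64])
{
    unsigned sum = 0;
    for (int i = 0; i < 64; i++)
        sum += gray[i];
    unsigned mean = sum / 64;

    uint64_t hash = 0;
    for (int i = 0; i < 64; i++) {
        if (gray[i] > mean)
            hash |= (uint64_t)1 << i;
    }
    return hash;
}

/* Number of differing bits between two hashes; small means "looks similar". */
static int hamming_distance(uint64_t a, uint64_t b)
{
    uint64_t diff = a ^ b;
    int count = 0;
    while (diff) {
        diff &= diff - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

Two images would then be compared by the Hamming distance between their hashes rather than by equality, with the acceptable distance chosen by experiment.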


Source: https://habr.com/ru/post/1739055/

