Find Missing Photos from Flickr, Facebook, Instagram, and iPhoto

I have almost 6 years of photos spread across the following services: Flickr, Facebook and Instagram, plus a local iPhoto library.

What would be the best way to programmatically find out which photos are missing from each of these services?

Some ideas that I had:

  • Comparing MD5 hashes of the image thumbnails?
  • Comparing date / time stamps?

I am looking for a way to create a list of URLs / file names that exist on one service but not on another.

I'm not picky about the language used, as long as it works on OS X.

+4
3 answers

Using MD5 on image thumbnails will not necessarily work, because different services crop their thumbnails differently. They also compress their images in different ways, so you won't be able to run MD5 on the larger versions either.

Unfortunately, services like Facebook also strip out all EXIF data.

Here is one possible solution:

I bet you could split each image into a 2x2 grid and compute the average color of each grid cell. That gives you four points per image. To judge similarity, you simply take the sum of squared differences between the two images' points.

That is basically just four RGB comparisons per image pair, done 4 times, once for each 90-degree rotation. Doing it 4 times helps to account for rotation.
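The grid idea above can be sketched roughly as follows. This is only an illustration under some assumptions: images are already decoded into rows of (R, G, B) tuples (loading them from JPEG files would need an imaging library, which is not shown), each image is at least 2x2 pixels, and the names `grid_signature`, `ssd` and `distance` are made up for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (R, G, B) tuples; assumed already decoded


def grid_signature(img: Image) -> List[Tuple[float, ...]]:
    """Average color of each cell of a 2x2 grid.

    Cells are listed clockwise (top-left, top-right, bottom-right,
    bottom-left), so a 90-degree rotation of the image becomes a
    cyclic shift of this list.
    """
    h, w = len(img), len(img[0])
    mh, mw = h // 2, w // 2

    def mean(r0: int, r1: int, c0: int, c1: int) -> Tuple[float, ...]:
        pixels = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        return tuple(sum(p[ch] for p in pixels) / len(pixels) for ch in range(3))

    return [
        mean(0, mh, 0, mw),   # top-left
        mean(0, mh, mw, w),   # top-right
        mean(mh, h, mw, w),   # bottom-right
        mean(mh, h, 0, mw),   # bottom-left
    ]


def ssd(a, b) -> float:
    """Sum of squared differences over all cells and color channels."""
    return sum((x - y) ** 2 for ca, cb in zip(a, b) for x, y in zip(ca, cb))


def distance(a: Image, b: Image) -> float:
    """Smallest SSD over the four 90-degree rotations of b."""
    sa, sb = grid_signature(a), grid_signature(b)
    return min(ssd(sa, sb[k:] + sb[:k]) for k in range(4))
```

A distance of zero means the four cell averages agree in some rotation. In practice you would pick a threshold by trying it on a few known duplicates, since recompression shifts the averages slightly and four points per image are coarse enough to produce false matches.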

For an easier, faster and more reliable analysis, I'd also suggest the TinEye API.

If you want to write a similarity calculation algorithm yourself, look here for ideas:

Fingerprint for comparing the similarity of many images

+4

I will assume that you already know how to retrieve photos through each service's API and that the difficult part of the problem is comparing the photos. Check out the following SO answers on how to do this:

And if you don't mind paying for a web service that does this, you can try TinEye's Match Engine.

+1

I think that building a local, centralized database of your photos should be the starting point. So, if you do not already have such a database (or it is not up to date), you should go ahead and download everything from all your accounts.

This task should not be too difficult. There are several official and unofficial methods and tools for downloading entire accounts from these social networks.

  • Facebook gives you a convenient zip file with all your images, wall posts, etc.: just go to account settings and select "download a copy of your data".
  • Flickr has a nice tool called Bulkr for downloading all your photos.
  • Instagram does not provide official tools for this task, but you can use, for example, Instagram Downloader or Instaport.
  • iPhoto should already be in sync.

Now that all of your photos are on your computer, you will have to figure out which ones are identical. I think this question should provide a solution to that problem.

Personally, I vote for this method, in the hope that pHash can be compiled under OS X. If pHash compiles and works, you can do a first pass with MD5, SHA1 or any other hash to find the exact matches. Where there is no exact match, you can run pHash to see how close the two images are.
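The exact-match first pass could look something like this minimal sketch. It assumes each service's photos have been downloaded into their own folder, and the function names `index_by_md5` and `unmatched` are hypothetical. Files it reports are not necessarily missing: they are only the candidates that would still need the perceptual comparison (pHash or similar), which is not shown here.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def index_by_md5(folder: Path) -> Dict[str, Path]:
    """Map the MD5 digest of each file's bytes to the first path seen with it."""
    index: Dict[str, Path] = {}
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            index.setdefault(digest, path)
    return index


def unmatched(source: Path, target: Path) -> List[Path]:
    """Files under `source` with no byte-identical copy under `target`."""
    target_digests = index_by_md5(target).keys()
    return [
        path
        for digest, path in index_by_md5(source).items()
        if digest not in target_digests
    ]
```

Calling `unmatched(Path("flickr"), Path("iphoto"))` (folder names are placeholders) would list the Flickr downloads that have no byte-for-byte twin in the iPhoto export, regardless of file name. Swapping the arguments gives the list in the other direction.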

Given enough time, I could script all of this in bash under Linux. I suppose that might also work under Mac OS X, but you can probably achieve the same result, possibly with even less code, in Cocoa.

Once you know which photos are missing from a given service, you can finally push them to that service. But I suppose that is where another question begins :)

+1

Source: https://habr.com/ru/post/1436977/

