Current scenario
I have an application that collects Instagram images for a specific hashtag. I use pagination to retrieve all images and store the data (not the images themselves) in a local database. The first call retrieves all images. Subsequent calls collect only those that are newer than the newest image in the local database; otherwise I would have to make thousands of page requests to walk through all the images of a popular tag. This has to happen every few minutes if images are to appear in the application without too much delay. The problem: when users add the tag to old images, my application never retrieves those images, because of the performance-driven design of selecting only new images.
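To make the incremental step concrete, here is a minimal sketch of it. It is not Instagram's client API: `fetch_page`, the cursor, and the `id` / `created_time` fields are hypothetical placeholders, and it assumes pages arrive newest-first.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical record shape: {"id": ..., "created_time": ...}
Media = Dict[str, int]
# A page is (items ordered newest-first, cursor for the next page or None).
Page = Tuple[List[Media], Optional[str]]


def collect_new(fetch_page: Callable[[Optional[str]], Page],
                newest_stored: int) -> List[Media]:
    """Page through a tag feed and stop at the first already-known item.

    fetch_page is an assumed callable wrapping the paginated tag endpoint.
    newest_stored is the timestamp of the newest item in the local database.
    """
    collected: List[Media] = []
    cursor: Optional[str] = None
    while True:
        items, cursor = fetch_page(cursor)
        for item in items:
            if item["created_time"] <= newest_stored:
                # Everything from here on is assumed to be stored already.
                return collected
            collected.append(item)
        if cursor is None:
            # No more pages: the whole feed was new.
            return collected
```

The early return is what keeps the request count low, and it is also what hides old images that were tagged later: anything with a timestamp at or below `newest_stored` is never looked at.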
Attempt to solve
I looked at the real-time API, but it seems to be designed in a way that makes it unsuitable for this. This is what it sends in real time for a tag:
{ "subscription_id": "2", "object": "tag", "object_id": "nofilter", "changed_aspect": "media", "time": 1297286541 }
I would have expected a list of media identifiers representing the new or changed content, from which I could fetch the actual content, but there is none. My current solution is to fetch new content every few minutes and do a full rescan every hour. This is suboptimal from both the user and the performance perspective.
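Roughly, the workaround looks like the sketch below. The function names are made up for illustration, the one-hour interval is the one mentioned above, and accepting either a single object or a list of updates is an assumption on my part. Note that all a notification yields is the tag name.

```python
import json
from typing import Set

FULL_RESCAN_INTERVAL = 3600  # seconds; the hourly full rescan


def tags_to_refresh(notification_body: str) -> Set[str]:
    """Return the tag names a notification refers to.

    The payload names the tag that changed but carries no media IDs,
    so the caller still has to page through the tag feed itself.
    """
    payload = json.loads(notification_body)
    # Assumption: the body may be one update object or a list of them.
    updates = payload if isinstance(payload, list) else [payload]
    return {
        update["object_id"]
        for update in updates
        if update.get("object") == "tag"
        and update.get("changed_aspect") == "media"
    }


def choose_scan(now: float, last_full_rescan: float) -> str:
    """Pick 'full' once an hour has passed, otherwise 'incremental'."""
    if now - last_full_rescan >= FULL_RESCAN_INTERVAL:
        return "full"
    return "incremental"
```

With the sample payload above, `tags_to_refresh` gives only `nofilter`, which is why the hourly full rescan is still needed to pick up old images that were tagged later.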
Question
Is it really impossible to do this in a more elegant way? I understand that Instagram does not send the full content in a real-time update, but sending identifiers should not be a problem in terms of payload size. The API seems useless in this respect: the only use case I can think of where it would be useful is "there is new content for a hashtag you are watching" notifications.
Best, Torbena