How to download an image from HTTP only if the image is newer?

Question

I would like to implement the following functionality:

The C# client connects to an HTTP server and downloads an image to disk.
The next time the client starts, it checks whether the image on the server is newer than the image on disk; if so, it overwrites the image on disk.

Downloading the image is easy for me, but I'm not sure how to check whether the image on the server is newer. How could I implement this? I suppose I could check the timestamp or the image size (or both), but I don't know how to do that.

+6

http c# .net

Daniel Peñalba Aug 7 '12 at 16:34

3 answers

Try setting the If-Modified-Since request header (see http://en.wikipedia.org/wiki/List_of_HTTP_header_fields). I am not sure that every server fully supports it. If the server does not support it, you will still receive the file (rather than the 304 you would get if it were supported); in that case you can compute checksums of the downloaded file and of your local copy and, if they differ, treat the file as modified. Or just overwrite every time, and you will always have the latest version.
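The checksum fallback described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ImageChecksum, the method names, and the choice of SHA-256 are not from the answer, and the sketch assumes the image is small enough to hold in memory as a byte array.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (names are assumptions, not from the answer).
static class ImageChecksum
{
    // Returns the SHA-256 digest of a byte array as a lowercase hex string.
    public static string Sha256Hex(byte[] data)
    {
        using (var sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(data))
                               .Replace("-", "")
                               .ToLowerInvariant();
        }
    }

    // Writes the downloaded bytes to localPath only if the file is missing
    // or its content differs. Returns true when the file was (re)written.
    public static bool SaveIfChanged(byte[] downloaded, string localPath)
    {
        if (File.Exists(localPath) &&
            Sha256Hex(File.ReadAllBytes(localPath)) == Sha256Hex(downloaded))
        {
            return false; // same content, leave the file on disk alone
        }
        File.WriteAllBytes(localPath, downloaded);
        return true;
    }
}
```

Note that this only saves the disk write; the image has already been transferred by the time the checksums are compared.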

+6

Andrey Aug 7 '12 at 16:40

You need to read RFC 2616 and the related RFCs (search for 2616 at http://www.rfc-editor.org/cgi-bin/rfcsearch.pl ). Section 13, Caching in HTTP, pages 47 - 62, is of particular interest. Then read up on the relevant request/response headers and the status codes that may be returned.

You get access to all headers and status values through the HttpWebRequest and HttpWebResponse classes.

Note, though, that you can ask the server for whatever you like; in the end, it is the server that decides whether to send you a new representation of the URI. You can use the HTTP HEAD verb rather than GET to query the server about the resource. As RFC 2616 puts it:

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.
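A HEAD probe with HttpWebRequest might look something like the sketch below. The class name HeadProbe and both method names are assumptions made for illustration; only HttpWebRequest, HttpWebResponse, and the HEAD verb come from the answer.

```csharp
using System;
using System.Net;

// Hypothetical helper (names are assumptions, not from the answer).
static class HeadProbe
{
    // Builds (but does not send) a HEAD request for the given URI.
    public static HttpWebRequest CreateHeadRequest(string uri)
    {
        var req = (HttpWebRequest)WebRequest.Create(uri);
        req.Method = "HEAD"; // ask for the headers only, no body
        return req;
    }

    // Sends the HEAD request and returns the raw Last-Modified header,
    // or null when the server did not send one.
    public static string GetLastModifiedHeader(string uri)
    {
        using (var rsp = (HttpWebResponse)CreateHeadRequest(uri).GetResponse())
        {
            return rsp.Headers["Last-Modified"];
        }
    }
}
```

One trade-off to keep in mind: a HEAD followed by a GET costs two round trips when the image has changed, whereas a conditional GET does the check and the download in a single request.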

+2

Nicholas Carey Aug 7 '12 at 16:45

Jon Hanna · Accepted Answer · 2012-08-07T17:06:16+0000

HttpWebRequest can use the IE cache, so if all the images will be in that cache anyway, and the cost of rewriting the file (but not of downloading it) is acceptable, you can simply rely on that.

If you need to deal with this yourself, then:

Given:

    string uri; // URI of the image.
    DateTime? lastMod; // Last-modified date of the image as previously recorded. Null if not known yet.
    string eTag; // ETag of the image as previously recorded. Null if not known yet.

You will need to store these at the end of this and retrieve them again (when it is not a new image) at the start. How you do that is up to you. Given that, the rest works like this:

    var req = (HttpWebRequest)WebRequest.Create(uri);
    // Note: must be UTC; use lastMod.Value.ToUniversalTime() if you store it
    // somewhere that converts to local time, as SQL Server does.
    if (lastMod.HasValue)
        req.IfModifiedSince = lastMod.Value;
    if (eTag != null)
        req.Headers.Add("If-None-Match", eTag);
    try
    {
        using (var rsp = (HttpWebResponse)req.GetResponse())
        {
            // Headers[...] is null when the header wasn't sent. Without Last-Modified
            // we're just going to have to download the whole thing next time to be sure.
            lastMod = rsp.Headers["Last-Modified"] != null ? rsp.LastModified : (DateTime?)null;
            eTag = rsp.Headers["ETag"]; // will be null if absent.
            using (var stm = rsp.GetResponseStream())
            {
                // your code to save the stream here.
            }
        }
    }
    catch (WebException we)
    {
        var hrsp = we.Response as HttpWebResponse;
        if (hrsp != null && hrsp.StatusCode == HttpStatusCode.NotModified)
        {
            // Unfortunately, a 304 when dealt with directly (rather than letting
            // the IE cache be used automatically) is treated as an error. Which is a bit of
            // a nuisance, but manageable. Note that if we weren't doing this manually,
            // 304s would be disguised to look like 200s to our code.

            // Update these, because possibly only one of them was the same.
            lastMod = hrsp.Headers["Last-Modified"] != null ? hrsp.LastModified : (DateTime?)null;
            eTag = hrsp.Headers["ETag"]; // will be null if absent.
        }
        else // some other exception happened!
            throw; // or other handling of your choosing
    }
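The answer leaves the storage of lastMod and eTag between runs to the reader. One possible approach, sketched here purely as an assumption, is a small text file kept next to the image. The class name ValidatorStore, the two-line file layout, and the method names are all hypothetical.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical sidecar-file store (names and format are assumptions).
static class ValidatorStore
{
    // Saves the two validators, one per line. The date is written in UTC
    // using the round-trip ("o") format; empty lines stand for null.
    public static void Save(string path, DateTime? lastMod, string eTag)
    {
        string date = lastMod.HasValue
            ? lastMod.Value.ToUniversalTime().ToString("o", CultureInfo.InvariantCulture)
            : "";
        File.WriteAllLines(path, new[] { date, eTag ?? "" });
    }

    // Loads the validators; both come back null when the file is missing.
    public static void Load(string path, out DateTime? lastMod, out string eTag)
    {
        lastMod = null;
        eTag = null;
        if (!File.Exists(path)) return;
        string[] lines = File.ReadAllLines(path);
        if (lines.Length > 0 && lines[0].Length > 0)
            lastMod = DateTime.Parse(lines[0], CultureInfo.InvariantCulture,
                                     DateTimeStyles.RoundtripKind);
        if (lines.Length > 1 && lines[1].Length > 0)
            eTag = lines[1];
    }
}
```

Calling Load before building the request and Save after a successful response (or a 304) would slot into the code above; storing the date as UTC sidesteps the local-time conversion issue mentioned in its comments.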

E-tags are more reliable than last-modified dates when implemented correctly (they capture changes at sub-second resolution and reflect different responses due to different Accept-* headers). However, some implementations are buggy (IIS6 on a web farm without a particular tweak, Apache with mod-gzip), so it can be worth taking out the code relating to e-tags and just going by the date.

Edit: if you want to go even further in implementing HTTP caching, you can also store the Expires and max-age values (use the latter if both are present and it disagrees with the former) and skip the download entirely if the current time is earlier than those values suggest. I have done this and it works well (I had an in-memory cache of objects created from the XML returned by various URIs, and if the XML was fresh or had not changed, I reused the object), but it may not be relevant to your needs (if you want to be fresher than the server suggests, or if you will always be outside that window).
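As a rough sketch of that freshness check, the helper below computes the time until which a stored copy could be reused, preferring max-age over Expires as described. The class name Freshness and the method FreshUntil are assumptions; the sketch only understands the max-age directive and an RFC 1123 formatted Expires value, and deliberately ignores everything else in Cache-Control (no-cache, no-store, the Age header, and so on).

```csharp
using System;
using System.Globalization;

// Hypothetical helper (names are assumptions, not from the answer).
static class Freshness
{
    // Returns the UTC time until which a stored response may be reused without
    // contacting the server, or null if the headers give no such time.
    // max-age takes precedence over Expires when both are present.
    public static DateTime? FreshUntil(DateTime fetchedUtc, string cacheControl, string expires)
    {
        if (cacheControl != null)
        {
            foreach (string part in cacheControl.Split(','))
            {
                string directive = part.Trim();
                if (directive.StartsWith("max-age=", StringComparison.OrdinalIgnoreCase))
                {
                    int seconds;
                    if (int.TryParse(directive.Substring("max-age=".Length), NumberStyles.None,
                                     CultureInfo.InvariantCulture, out seconds))
                        return fetchedUtc.AddSeconds(seconds);
                }
            }
        }
        DateTime parsed;
        if (expires != null &&
            DateTime.TryParseExact(expires, "r", CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal, out parsed))
            return parsed;
        return null;
    }
}
```

If DateTime.UtcNow is earlier than the returned value, the request can be skipped altogether; otherwise fall back to the conditional request shown earlier in this answer.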
