How do I indicate that the HTTP status code 304 (NotModified) is not an error condition inside the Amazon S3 GetObject API?

Background

I am trying to use S3 as an "infinitely" large caching layer for some "fairly" static XML documents. I want to make sure that the client application (which will run on thousands of machines concurrently and request XML documents many times per hour) downloads these XML documents only if their contents have changed since the last time the client application downloaded them.

An approach

On Amazon S3, we can use the HTTP ETag for this. By default, the ETag of an Amazon S3 object is set to the MD5 hash of the object.

We can then put the MD5 hash of the locally held XML document into the GetObjectRequest.ETagToNotMatch property. When AmazonS3.GetObject is called (or, in my case, the async versions AmazonS3.BeginGetObject and AmazonS3.EndGetObject), and the requested document has the same MD5 hash as the one in GetObjectRequest.ETagToNotMatch, S3 returns HTTP status code 304 (NotModified) and the content of the XML document is not downloaded.
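To make the ETag idea concrete, here is a minimal sketch of computing the MD5 hex digest of a locally cached document so it can be passed as the ETag to compare against. The class and method names (ETagHelper, ComputeMd5Hex, Normalize, MatchesContent) are hypothetical and not part of the AWS SDK. It also assumes the object's ETag really is a plain MD5 of its content, which does not hold for every object (multipart uploads, for example, get a different kind of ETag).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the AWS SDK.
public static class ETagHelper
{
    // Lowercase hex MD5 digest of the given bytes, e.g. of the locally cached XML file.
    public static string ComputeMd5Hex(byte[] content)
    {
        if (content == null) throw new ArgumentNullException("content");

        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(content);
            var hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }

    // Strips surrounding whitespace and double quotes, since ETag header values
    // are often quoted while a locally computed digest is not.
    public static string Normalize(string etag)
    {
        return etag == null ? null : etag.Trim().Trim('"');
    }

    // Case-insensitive comparison of an ETag against the MD5 of local content.
    public static bool MatchesContent(string etag, byte[] content)
    {
        return string.Equals(Normalize(etag), ComputeMd5Hex(content),
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

With something like this, the value handed to ETagToNotMatch could be produced with ETagHelper.ComputeMd5Hex(File.ReadAllBytes(path)), where path is wherever the client keeps its cached copy.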

Problem

However, the problem is that when calling AmazonS3.GetObject (or the async equivalent), the AWS SDK for .NET treats HTTP status code 304 (NotModified) as an error, retries the request three times, and then finally throws Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3.

Obviously, I could change this implementation to call AmazonS3.GetObjectMetadata, compare the ETags, and call AmazonS3.GetObject only if they do not match, but then there are two requests to S3 instead of one whenever the file is out of date. I would prefer a single request regardless of whether the XML document needs to be downloaded.

Any ideas? Is this a bug, or am I missing something? Is there at least a way to reduce the number of retry attempts to one and "handle" the exception (although I feel "yuck" about this route)?
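For reference, "handling" the exception in a task continuation means unwrapping the AggregateException first. The following is only a sketch under stated assumptions: StatusException is a hypothetical stand-in that lets the snippet compile without the SDK, and it assumes the real exception exposes the HTTP status (with the SDK this would presumably be the StatusCode property of AmazonS3Exception, which has not been verified against version 1.3.14).

```csharp
using System;
using System.Linq;
using System.Net;

// Hypothetical stand-in for the SDK exception so the sketch compiles on its own.
public class StatusException : Exception
{
    public HttpStatusCode StatusCode { get; private set; }

    public StatusException(HttpStatusCode statusCode, string message)
        : base(message)
    {
        StatusCode = statusCode;
    }
}

public static class NotModifiedCheck
{
    // True when any (possibly nested) inner exception of a faulted task reports HTTP 304.
    public static bool IsNotModified(AggregateException aggregate)
    {
        if (aggregate == null)
        {
            return false;
        }

        // Flatten() collapses nested AggregateExceptions into a single level.
        return aggregate.Flatten().InnerExceptions
            .OfType<StatusException>()
            .Any(e => e.StatusCode == HttpStatusCode.NotModified);
    }
}
```

In an OnlyOnFaulted continuation this would be called as NotModifiedCheck.IsNotModified(x.Exception), treating a true result as "cached copy is still current". On the SDK version described here the exception only surfaces after the three retries, so this hides the error but does not save the extra requests.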

Implementation

I am using the AWS SDK for .NET (version 1.3.14).

Here is my implementation (slightly trimmed for brevity):

    public Task<GetObjectResponse> DownloadString(string key, string etag = null)
    {
        var request = new GetObjectRequest { Key = key, BucketName = Bucket };

        // Only make the request conditional when the caller supplies an ETag.
        if (etag != null)
        {
            request.ETagToNotMatch = etag;
        }

        var task = Task<GetObjectResponse>.Factory.FromAsync(
            _s3Client.BeginGetObject, _s3Client.EndGetObject, request, null);

        return task;
    }

Then I call it like this:

    var dlTask = s3Manager.DownloadString("new one", "d7db7bc318d6eb9222d728747879b52e");

    var responseTasks = new[]
    {
        dlTask.ContinueWith(x => _log.Error("Error downloading string.", x.Exception), TaskContinuationOptions.OnlyOnFaulted),
        dlTask.ContinueWith(x => _log.Warn("Downloading string was cancelled."), TaskContinuationOptions.OnlyOnCanceled),
        dlTask.ContinueWith(x => _log.Info(string.Format("Done with download: {0}", x.Result.ETag)), TaskContinuationOptions.OnlyOnRanToCompletion)
    };

    try
    {
        Task.WaitAny(responseTasks);
    }
    catch (AggregateException aex)
    {
        _log.Error("Error while processing download string.", aex);
    }

    _log.Info("Exiting...");

Then it produces this log file:

    2011-10-11 13:21:20,376 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.6140812.
    2011-10-11 13:21:20,385 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
    2011-10-11 13:21:20,789 [11] INFO Amazon.S3.AmazonS3Client - Retry number 1 for request GetObject.
    2011-10-11 13:21:22,329 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.1400356.
    2011-10-11 13:21:22,329 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
    2011-10-11 13:21:23,929 [11] INFO Amazon.S3.AmazonS3Client - Retry number 2 for request GetObject.
    2011-10-11 13:21:26,508 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:00.9790314.
    2011-10-11 13:21:26,508 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
    2011-10-11 13:21:32,908 [11] INFO Amazon.S3.AmazonS3Client - Retry number 3 for request GetObject.
    2011-10-11 13:21:40,604 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.2950718.
    2011-10-11 13:21:40,605 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
    2011-10-11 13:21:40,621 [11] ERROR Amazon.S3.AmazonS3Client - Error for GetResponse
    Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
       at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
       at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
       at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
    2011-10-11 13:21:40,635 [10] INFO Example.Program - Exiting...
    2011-10-11 13:21:40,638 [19] ERROR Example.Program - Error downloading string.
    System.AggregateException: One or more errors occurred. ---> Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
       at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
       at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
       at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
       at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
       at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)
       --- End of inner exception stack trace ---
    ---> (Inner Exception #0) Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
       at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
       at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
       at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
       at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
       at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)<---
1 answer

I also posted this question on the Amazon Developer Forum and received a response from an AWS employee:

After investigating this, we understand the problem, but we are looking for feedback on how best to handle it.

The first approach is to have this operation return with a property on the GetObjectResponse indicating that the object was not returned, or with the output stream set to null. This would be cleaner to code against, but it creates a slight breaking change in behavior for anyone who relies on an exception being thrown, albeit after three retries. It would also be inconsistent with the CopyObject operation, which throws an exception without all the crazy retries.

The other option is that we throw an exception similar to CopyObject, which keeps us consistent and introduces no breaking changes, but it is harder to code against.

If anyone has opinions on which way to handle this, please respond to this thread.

Norm

I have already added my thoughts to the thread. If anyone else is interested in participating, here is the link:

AmazonS3.GetObject sees HTTP 304 (NotModified) as an error. A way to resolve this?


NOTE: When this is resolved by Amazon, I will update my answer to reflect the outcome.


UPDATE: (2012-01-24) Still waiting for more information from Amazon.

UPDATE: (2018-12-06) This was fixed in AWS SDK version 1.5.20 in 2013: https://forums.aws.amazon.com/thread.jspa?threadID=77995&tstart=0


Source: https://habr.com/ru/post/899054/

