Downloading from AWS S3 while a file is being updated

This may seem like a really basic question, but if I download a file from S3 while it is being updated by another process, do I need to worry about getting an incomplete file?

Example: a 200 MB CSV file. User A starts overwriting the file with 200 MB of new content at 1 Mbps. After 16 seconds, user B starts downloading the file at 200 Mbps. Does user B get all 200 MB of the original file, or does user B get ~2 MB of user A's changes and nothing else?

1 answer

User B receives all 200 MB of the original file.

Here's why:

PUT operations on S3 are atomic. Technically, there is no such thing as "modifying" an object. What actually happens when an object is overwritten is that it is replaced by another object with the same key. But the original object is not replaced until the new (overwriting) object has been uploaded completely and successfully, and even then the overwritten object is technically not "gone" yet: it has only been replaced in the bucket's index, so that future requests are served the new object.
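The index-swap behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not how S3 is implemented: `ToyBucket`, its methods, and the key `data.csv` are hypothetical names invented for the illustration, and replication delay is ignored.

```python
class ToyBucket:
    """Toy stand-in for a bucket: an index mapping keys to immutable objects."""

    def __init__(self):
        self._index = {}  # key -> bytes

    def put(self, key, chunks):
        # Stage the upload off to the side; the index is untouched
        # while chunks are still arriving.
        staged = bytearray()
        for chunk in chunks:
            staged.extend(chunk)
        # Only a complete upload is swapped into the index.
        self._index[key] = bytes(staged)

    def get(self, key, chunk_size=4):
        # Resolve the key once, at request time. The stream keeps reading
        # that same object even if the index changes afterwards.
        obj = self._index[key]

        def stream():
            for start in range(0, len(obj), chunk_size):
                yield obj[start:start + chunk_size]

        return stream()


bucket = ToyBucket()
bucket.put("data.csv", [b"old-", b"content"])


def slow_upload(observed):
    yield b"new-"
    # Mid-upload: a download that starts now still gets the whole old object.
    observed.append(b"".join(bucket.get("data.csv")))
    yield b"content!"


seen_during_upload = []
bucket.put("data.csv", slow_upload(seen_during_upload))
seen_after_upload = b"".join(bucket.get("data.csv"))

# A download already in progress is not disturbed by a later overwrite.
reader = bucket.get("data.csv")
first = next(reader)
bucket.put("data.csv", [b"third"])
straddling = first + b"".join(reader)
```

In this model a reader sees either the complete old object or the complete new one, never a mixture, which matches the behavior user B would see in the question.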

(The serving of the new object is actually documented as not being guaranteed to happen immediately. Unlike uploads of new objects, which are immediately available for download, overwrites of existing objects are eventually consistent, which means that it is possible, though unlikely, that for a short period after you overwrite an object the old copy can still be served for subsequent requests.)

But when you overwrite an object and versioning is not enabled on the bucket, the old and new objects are actually stored independently in S3, despite having the same key. The old object is no longer referenced by the bucket index, so you are no longer billed for storing it, and it will be purged from S3's backing store some time later. How much later is not really documented, but (tl;dr) overwriting an object that is currently being downloaded should not cause any unexpected side effects.

Updates to a single key are atomic. For example, if you PUT to an existing key, a subsequent read might return the old data or the updated data, but it will never return corrupted or partial data.

http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
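On the client side, a download can still be sanity-checked, since a dropped connection (as opposed to an overwrite) can truncate a transfer. The following is a minimal sketch, assuming a boto3 S3 client is passed in; the function name `download_checked` and the bucket and key shown in the usage note are hypothetical. It relies on the `Body`, `ContentLength`, and `ETag` fields of the `get_object` response.

```python
def download_checked(s3_client, bucket, key):
    """Fetch one object and confirm the body matches the advertised length."""
    resp = s3_client.get_object(Bucket=bucket, Key=key)
    body = resp["Body"].read()
    expected = resp["ContentLength"]
    if len(body) != expected:
        # A short read points at a transfer problem, not at a partial object.
        raise IOError(f"short read: got {len(body)} of {expected} bytes")
    # The ETag lets the caller tell which copy (old or new) was served.
    return body, resp["ETag"]
```

With boto3 installed and credentials configured, this could be called as `download_checked(boto3.client("s3"), "my-bucket", "data.csv")`, where the bucket and key are placeholders.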


Source: https://habr.com/ru/post/1244384/

