Amazon S3 - A different lifecycle rule for a "subdirectory" than for a parent "directory"

Let's say I have the following data structure:

  • /
  • /foo
  • /foo/bar
  • /foo/baz

Can I assign the following lifecycle rules to it:

  • / (1 month)
  • /foo (2 months)
  • /foo/bar (3 months)
  • /foo/baz (6 months)
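For reference, here is roughly what such a set of rules would look like as a lifecycle configuration (a sketch only; the rule IDs are invented, and as the answer below explains, the prefixes overlap, so this layering does not behave the way the indentation suggests):

```python
# Hypothetical lifecycle configuration attempting per-"subdirectory" rules.
# S3 evaluates each rule's prefix independently, so an object under foo/bar/
# matches the "", "foo/", and "foo/bar/" rules at the same time.
lifecycle_config = {
    "Rules": [
        {"ID": "root",    "Filter": {"Prefix": ""},         "Status": "Enabled",
         "Expiration": {"Days": 30}},
        {"ID": "foo",     "Filter": {"Prefix": "foo/"},     "Status": "Enabled",
         "Expiration": {"Days": 60}},
        {"ID": "foo-bar", "Filter": {"Prefix": "foo/bar/"}, "Status": "Enabled",
         "Expiration": {"Days": 90}},
        {"ID": "foo-baz", "Filter": {"Prefix": "foo/baz/"}, "Status": "Enabled",
         "Expiration": {"Days": 180}},
    ]
}
# With boto3 this would be applied via something like:
# s3.put_bucket_lifecycle_configuration(Bucket="my-bucket",
#                                       LifecycleConfiguration=lifecycle_config)
```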

The official documentation is unfortunately ambiguous on this point. It doesn't seem to work in the AWS console, which makes me doubt that the SDK / REST API will behave any differently :)

Otherwise, my main problem: I have 4 types of projects. The most common type has several thousand projects, while the others have a few dozen each. Each type must be retained for a different period of time. Each project contains hundreds of thousands of objects. It looks roughly like this:

  • type A, 90% of projects, x storage required
  • type B, 6% of projects, 2x storage required
  • type C, 3% of projects, 4x storage required
  • type D, 1% of projects, 8x storage required

So far so simple. But projects can be updated or changed from one type to another. And, as I said, I have several thousand projects of the first type, so I can't write a specific rule for each of them (remember, the limit is 1,000 rules per bucket). And since they can change from one type to another, I can't simply sort them into per-type folders or buckets. Or can I? Are there any other options besides iterating over every object each time I want to clean up expired files, which I would rather avoid given the large number of objects?

Maybe there is some kind of "move / transfer" of files between buckets that doesn't change the creation-time metadata and isn't expensive for our server?

Any help would be greatly appreciated :)

+11
2 answers

Lifecycle policies are based on a prefix, not a subdirectory.

So, if objects matching the prefix foo/ should be deleted after 2 months, it makes no sense to ask for objects with the prefix foo/bar/ to be deleted after 3 months: they will already have been deleted after 2 months, since they also match the prefix foo/. Prefix means prefix. Delimiters are not a factor in lifecycle rules.

Also note that keys and prefixes in S3 do not begin with /. A policy that applies to the entire bucket uses the empty string as its prefix, not /.

Also, be mindful of trailing slashes when specifying prefixes: foo/bar matches the object foo/bart.jpg, while foo/bar/ does not.
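To make both points concrete, prefix matching in lifecycle rules is nothing more than plain string matching (the keys below are invented for illustration):

```python
# Example object keys (invented for illustration).
keys = ["foo/bart.jpg", "foo/bar/a.jpg", "foo/baz/b.jpg", "top.txt"]

def matches(prefix, key):
    """S3 prefix semantics: a simple starts-with test, no delimiter logic."""
    return key.startswith(prefix)

print([k for k in keys if matches("foo/", k)])      # everything under foo/
print([k for k in keys if matches("foo/bar", k)])   # includes foo/bart.jpg!
print([k for k in keys if matches("foo/bar/", k)])  # only the "folder" contents
```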

Iterating over objects for deletion is not as bad as you think. The list-objects API call returns up to 1,000 objects per request (or fewer if you wish) and lets you specify both a prefix and a delimiter (you would usually use / as the delimiter if you want the responses grouped according to the pseudo-folder model the console uses for its hierarchical display). Each object's key and last-modified date are included in the XML response. There is also an API call for deleting multiple objects in a single request.
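A minimal sketch of that clean-up loop's core logic (the function names and the (key, last_modified) pair format are mine; with boto3, the pairs would come from a list_objects_v2 paginator, and each batch would go to delete_objects):

```python
from datetime import datetime, timedelta, timezone

def expired_keys(objects, max_age_days, now=None):
    """Pick keys older than max_age_days from (key, last_modified) pairs,
    i.e. the two fields a list-objects response provides for each object."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [key for key, last_modified in objects if last_modified < cutoff]

def batches(keys, size=1000):
    """Group keys into chunks of up to 1,000, the multi-object delete limit."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]
```

With boto3 (untested sketch), you would iterate `s3.get_paginator("list_objects_v2").paginate(Bucket=..., Prefix=...)`, feed each page's contents into `expired_keys`, and pass each batch to `s3.delete_objects(...)`.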

Any kind of moving, transferring, copying, etc. will always reset the object's creation date. So will changing its metadata, because objects are immutable: every time you move, copy, or "rename" an object (which is actually a copy followed by a delete), or change its metadata (which is actually a copy to the same key with different metadata), you are creating a new object.

+10

@Zardii you can use unique S3 object tags [1] on the objects under these prefixes.

You can then apply a lifecycle policy per tag, each with a different retention period.

[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/object-tagging.html

Prefix   => S3 tag

/        => delete_after_one_month
/foo     => delete_after_two_months
/foo/bar => delete_after_three_months
/foo/baz => delete_after_six_months

0

Source: https://habr.com/ru/post/1243539/
