Deleting S3 subdirectories from code, and defining the 'delimiter' parameter (Groovy code)

I have an observation that might help others working with S3, followed by a question. The sample code here is in Groovy using the Java JetS3t library, but the concepts apply to any programming language.

I have found a lot of documentation, here and elsewhere, claiming that S3 has no concept of subdirectories inside buckets. This is mostly true. If you want to delete files, you first have to list them:

  // assume we are looking for all files in the 'stuff' directory
  files = s3.listObjects(bucket, 'stuff/', null)

Now, if you delete these files, you will still be left with something that looks very much like a subdirectory in the bucket: "stuff/" still shows up in listings. That made me wonder whether it was really true that there are no subdirectories. It turns out that there are no real subdirectories, but there is an object masquerading as the subdirectory that shows up in the listing. After a little spelunking, I determined that this is just another S3 object whose key is the directory name with the special string _$folder$ appended. So you can remove it as follows (continuing the example above):

  s3.deleteObject(bucket, 'stuff_$folder$') 

Now you will no longer see any subdirectory listed in this bucket. Although I have not tested it, I assume the stuff/ folder must already be empty before you remove the key "stuff_$folder$". It amazes me that none of the posts here mention this, so anyone who has tried to delete an entire subdirectory probably still has the subdirectory itself lying around!
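The two steps above (empty the folder, then delete its marker object) can be sketched as one helper. This is only a sketch: `FakeS3` is a hypothetical in-memory stand-in so the example runs without credentials, and `deleteFolder` is a name made up for illustration. It only relies on the `listObjects(bucket, prefix, delimiter)` and `deleteObject(bucket, key)` calls used in this post and on each listed object exposing a `key` property.

```groovy
// Hypothetical in-memory stand-in for an S3 service (illustration only).
class FakeS3 {
    // bucket name -> sorted set of object keys
    Map<String, TreeSet<String>> buckets = [:]

    // Mimics a prefix listing; the delimiter argument is ignored in this sketch.
    List<Map> listObjects(String bucket, String prefix, String delimiter) {
        buckets[bucket].findAll { it.startsWith(prefix) }.collect { [key: it] }
    }

    void deleteObject(String bucket, String key) {
        buckets[bucket].remove(key)
    }
}

// Hypothetical helper: empty the "folder", then delete its marker object.
def deleteFolder(s3, String bucket, String folder) {
    // Trailing slash so that only keys inside the folder match.
    s3.listObjects(bucket, folder + '/', null).each { obj ->
        s3.deleteObject(bucket, obj.key)
    }
    // Single quotes keep Groovy from interpolating the $ signs.
    s3.deleteObject(bucket, folder + '_$folder$')
}

def s3 = new FakeS3(buckets: [mybucket: new TreeSet(
    ['stuff/a.txt', 'stuff/b.txt', 'stuff_$folder$', 'stuffing.txt'])])
deleteFolder(s3, 'mybucket', 'stuff')
println s3.buckets.mybucket   // only the unrelated key 'stuffing.txt' remains
```

Deleting the contents first matches the assumption above that the folder should be empty before its marker key is removed.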

If you go back to my original call to listObjects and do this instead:

  files = s3.listObjects(bucket, 'stuff', null) //note, no trailing slash 

You will see the stuff_$folder$ key returned in the results. The problem is that you may also get other objects whose keys start with "stuff" but that are not contained in the "subdirectory", so you have to be careful. I therefore prefer to pass "stuff/" as the prefix and handle the stuff_$folder$ object separately.
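To make the difference concrete, here is a minimal sketch that models prefix matching with `startsWith` on a plain list of keys. The key names are made up for illustration, and no S3 call is involved.

```groovy
// Keys as they might exist in a bucket (illustrative data only).
def keys = ['stuff/a.txt', 'stuff/b.txt', 'stuff_$folder$', 'stuffing/c.txt']

// A prefix is compared as a plain string, modelled here with startsWith.
def withoutSlash = keys.findAll { it.startsWith('stuff') }
def withSlash    = keys.findAll { it.startsWith('stuff/') }

println withoutSlash  // all four keys, including the unrelated 'stuffing/c.txt'
println withSlash     // only 'stuff/a.txt' and 'stuff/b.txt'
```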

This leads me to my final question. I cannot find a clear explanation of what the last parameter means in the call to listObjects(bucket, key, delimiter). What is a "delimiter"? It does not seem to mean "file delimiter" (as in '/'). I have searched and cannot find an example that illustrates what it means or how it is used. I would like to know in case it makes listObjects even more useful and flexible. Can someone give an example illustrating the use and value of the delimiter parameter? I am sure it is something simple, and I just cannot find a good example.

1 answer

"Delimiter" is a clumsy name. It makes more sense if you think of it as a suffix. From the S3 documentation (http://aws.amazon.com/releasenotes/Amazon-S3/213) or, if you prefer a slightly different explanation, http://www.bucketexplorer.com/documentation/amazon-s3--search-on-objects-in-bucket.html:

Groups of keys that share a common prefix terminated by a special delimiter can now be rolled up by that prefix for listing purposes. This allows applications to browse their keys hierarchically, much like navigating directories in a file system.

For example, if you have a bucket containing the following keys (named with embedded slashes to simulate directories):

  photos/2006/index.html
  photos/2006/January/img0001.jpg
  ...
  photos/2006/January/img0999.jpg
  photos/2006/February/img1000.jpg
  ...

A list with Prefix="photos/2006/" and Delimiter="/" will return the keys and "subdirectories" at the photos/2006 level (index.html, January, February, ...), but will not include any of the .jpg keys at deeper levels.
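The roll-up described above can be modelled on a plain list of keys. This is a sketch of the grouping rule only, not the JetS3t API: `listWithDelimiter` and the `objects` / `commonPrefixes` names are assumptions chosen for illustration, and the function returns full key strings rather than the shortened names used in the quoted example.

```groovy
// Sketch of how prefix + delimiter group keys; not a real S3 or JetS3t call.
def listWithDelimiter(List<String> keys, String prefix, String delimiter) {
    def result = [objects: [], commonPrefixes: new TreeSet()]
    keys.findAll { it.startsWith(prefix) }.each { key ->
        // Look for the delimiter only in the part of the key after the prefix.
        int idx = delimiter ? key.indexOf(delimiter, prefix.length()) : -1
        if (idx >= 0) {
            // Roll the key up into everything through the first delimiter.
            result.commonPrefixes << key.substring(0, idx + delimiter.length())
        } else {
            // No delimiter after the prefix: the key is listed individually.
            result.objects << key
        }
    }
    result
}

def keys = [
    'photos/2006/index.html',
    'photos/2006/January/img0001.jpg',
    'photos/2006/January/img0999.jpg',
    'photos/2006/February/img1000.jpg'
]

def listing = listWithDelimiter(keys, 'photos/2006/', '/')
println listing.objects          // [photos/2006/index.html]
println listing.commonPrefixes   // [photos/2006/February/, photos/2006/January/]
```

Passing a null or empty delimiter in this sketch disables the roll-up, so every key under the prefix is listed individually.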

Think of it as a suffix. Your delimiter could be .html, .jpg, or anything like that.

Source: https://habr.com/ru/post/1383064/

