Zip an entire directory on S3

If I have a directory with ~5000 small files on S3, is there a way to easily zip up the entire directory and leave the resulting zip file on S3? I need to do this without having to manually access every file myself.

Thanks!

1 answer

No, there is no magic bullet.

(As an aside, you should understand that in S3 there is no such thing as a “directory.” There are only objects with paths. You can list objects by prefix, but the “/” character is not magic: you can group keys by any prefix character you want.)
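
For example, here is a minimal sketch with boto3 that lists a “directory” by prefix; the bucket and prefix names are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Delimiter="/" groups keys that share a prefix up to the next "/",
# which is how consoles and tools fake folders on top of flat keys.
resp = s3.list_objects_v2(Bucket="my-bucket", Prefix="photos/", Delimiter="/")

for cp in resp.get("CommonPrefixes", []):
    print("pseudo-directory:", cp["Prefix"])
for obj in resp.get("Contents", []):
    print("object:", obj["Key"])
```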

As someone noted, “pre-zipping” the files can help both download speed and upload speed (at the cost of duplicating storage).
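
A rough sketch of that pre-zipping step, assuming boto3 and a hypothetical bucket/prefix. For ~5000 small files the zip can be built in memory; for anything larger, spool to a temp file instead of `io.BytesIO`:

```python
import io
import zipfile
import boto3

s3 = boto3.client("s3")
bucket, prefix = "my-bucket", "photos/"  # placeholders

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    # Paginate because a single list call returns at most 1000 keys.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            zf.writestr(obj["Key"], body)

buf.seek(0)
s3.put_object(Bucket=bucket, Key="photos.zip", Body=buf.getvalue())
```

Note this still reads every object once; the win is that later consumers make one request instead of 5000.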

If downloading is the bottleneck, it sounds like you are downloading serially. S3 can support 1000 simultaneous connections to the same object without breaking a sweat. You will need to run benchmarks to see how many connections work best, since too many connections from one box may get throttled by S3. You may also need to do some TCP tuning when running 1000 connections per second.
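
For illustration, a sketch of parallel fetching with a thread pool; the bucket name, key list, and worker count are assumptions to benchmark, not recommendations:

```python
import concurrent.futures
import boto3

s3 = boto3.client("s3")  # boto3 clients are safe to share across threads
bucket = "my-bucket"
keys = [f"photos/img_{i}.jpg" for i in range(5000)]  # hypothetical keys

def fetch(key):
    return key, s3.get_object(Bucket=bucket, Key=key)["Body"].read()

# Start with a modest pool and measure; raise max_workers until
# throughput stops improving or S3 starts throttling.
with concurrent.futures.ThreadPoolExecutor(max_workers=64) as pool:
    for key, data in pool.map(fetch, keys):
        print(key, len(data))
```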

The "solution" is highly dependent on your data access patterns. Try reinstalling the problem. If your downloads in a single file are infrequent, it might make sense to group them 100 at a time in S3, and then split them as needed. If these are small files, it makes sense to cache them in the file system.

Or it might make sense to store all 5,000 files as a single large zip file in S3 and use a “smart client” that can download specific byte ranges of the zip file to serve individual files. (S3 supports byte ranges, as I recall.)
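
S3 does honor HTTP `Range` headers, so a ranged read looks like the sketch below. The key and byte offsets are made up for illustration; a real smart client would first parse the zip’s central directory to find each member’s offset:

```python
import boto3

s3 = boto3.client("s3")

resp = s3.get_object(
    Bucket="my-bucket",
    Key="photos.zip",
    Range="bytes=1024-2047",  # fetch only the second KiB of the object
)
chunk = resp["Body"].read()
print(len(chunk))  # 1024
```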


Source: https://habr.com/ru/post/944253/

