How to delete multiple files in an S3 bucket using AWS CLI

Suppose I have an S3 bucket named xyz

In this bucket, I have hundreds of files. But I want to delete only 2 files named purple.gif and worksheet.xlsx

Can I do this from an AWS command-line tool with one call to rm?

This did not work:

 $ aws s3 rm s3://xyz/worksheet.xlsx s3://xyz/purple.gif
 Unknown options: s3://xyz/purple.gif

From the manual, it doesn't seem like you can delete a list of files explicitly by name. Does anyone know how to do this? I would prefer not to use the --recursive flag.

+14
8 answers

You cannot do this with s3 rm, but you can use s3api delete-objects:

 aws s3api delete-objects --bucket xyz --delete '{"Objects":[{"Key":"worksheet.xlsx"},{"Key":"purple.gif"}]}' 
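If the list of keys is long, the same call can read the delete structure from a file via the CLI's file:// syntax instead of inlining the JSON. A sketch (delete.json is a hypothetical file name; note that delete-objects accepts at most 1000 keys per call):

 aws s3api delete-objects --bucket xyz --delete file://delete.json

where delete.json contains, for example:

 {"Objects": [{"Key": "worksheet.xlsx"}, {"Key": "purple.gif"}], "Quiet": true}

("Quiet": true makes the API report only errors in the response.)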
+12

You can do this by specifying --exclude or --include multiple times, but you have to use --recursive.

If you use multiple filters, remember that the order of the filter parameters matters: filters that appear later in the command take precedence over filters that appear earlier.

 aws s3 rm s3://xyz/ --recursive --exclude "*" --include "purple.gif" --include "worksheet.xlsx" 

Here, all files are excluded from the command except purple.gif and worksheet.xlsx.

If you are not sure, always try --dryrun first and check which files will be deleted.
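For example, to preview the command above before running it for real:

 aws s3 rm s3://xyz/ --recursive --exclude "*" --include "purple.gif" --include "worksheet.xlsx" --dryrun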

Source: Using Exclusion and Inclusion Filters

+35

USING UNIX WILDCARDS WITH AWS S3 (AWS CLI)

The AWS Command Line Interface does not currently support UNIX wildcards in a command's path argument. However, it is fairly easy to replicate this functionality using the --exclude and --include options available in several aws s3 commands.

Wildcards available for use:

"*" - matches everything

"?" - Matches any single character

"[]" - matches any single character in brackets

"[!]" - matches any single character not enclosed in brackets

A few things to keep in mind when using --include and --exclude with the aws s3 command:

You can use any number of --include and --exclude options.

Parameters passed later take precedence over parameters passed earlier (in the same command).

All files/objects are included by default, so to include only certain files you have to "exclude" everything first and then "include" what you want. --recursive must be used together with --include and --exclude, otherwise the commands will only perform operations on a single file/object.

Examples: Copy all files from the working directory to the big-datums bucket:

aws s3 cp ./ s3://big-datums/ --recursive

Delete all ".java" files from the big-datums bucket:

aws s3 rm s3://big-datums/ --recursive --exclude "*" --include "*.java"

Delete all files in the big-datums bucket with a file extension beginning with "j" or "c" (".csv", ".java", ".json", ".jpeg", etc.):

aws s3 rm s3://big-datums/ --recursive --exclude "*" --include "*.[jc]*"

Copy the ".txt" and ".csv" files from the large S3 database to the local working directory:

aws s3 cp s3://big-datums/ . --recursive --exclude "*" --include "*.txt" --include "*.csv"

+4

I found it useful to do this from the command line. I had over 4 million files and it took almost a week to empty the bucket. This comes in handy, since the AWS console is not very descriptive when it comes to logs.

Note: you need to have jq installed.

 aws s3api list-object-versions --bucket YOUrBUCKEtNAMeHERe --output json \
   --query 'Versions[].[Key, VersionId]' \
   | jq -r '.[] | "--key '\''" + .[0] + "'\'' --version-id " + .[1]' \
   | xargs -L1 aws s3api delete-object --bucket YOUrBUCKEtNAMeHERe
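If the bucket has versioning enabled, delete markers are still left behind once the versions are gone, and they need the same treatment before the bucket is truly empty. A sketch following the same pattern (same placeholder bucket name, jq assumed to be installed):

 aws s3api list-object-versions --bucket YOUrBUCKEtNAMeHERe --output json \
   --query 'DeleteMarkers[].[Key, VersionId]' \
   | jq -r '.[] | "--key '\''" + .[0] + "'\'' --version-id " + .[1]' \
   | xargs -L1 aws s3api delete-object --bucket YOUrBUCKEtNAMeHERe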
+1
source

Note that:

aws s3 rm s3://xyz/ --recursive --include "*.gif" deletes all files in the path, including the "*.gif" files, because everything is included by default

aws s3 rm s3://xyz/ --recursive --exclude "*" --include "*.gif" deletes only the files matching "*.gif"

+1

Apparently, aws s3 rm only works on a single file/object.

The following is a bash command that works with some success (a bit slow, but it works):

 aws s3 ls s3://bucketname/foldername/ | awk '{print "aws s3 rm s3://bucketname/foldername/" $4}' | bash

Please note that you may have problems if your object names contain spaces or unusual characters, because awk splits the listing on whitespace, so $4 will only capture the first part of such a key.
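If you need something that copes with spaces in key names, one option is to list the keys with s3api and loop over them explicitly. A rough sketch (bucketname and foldername are placeholders, jq assumed to be installed):

 aws s3api list-objects-v2 --bucket bucketname --prefix foldername/ \
   --query 'Contents[].Key' --output json \
   | jq -r '.[]?' \
   | while IFS= read -r key; do aws s3 rm "s3://bucketname/$key"; done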

0

This solution will work when you want to specify a wildcard for the object name.

 aws s3 ls dmap-live-dwh-files/backup/mongodb/oms_api/hourly/ | grep order_2019_08_09_* | awk {'print "aws s3 rm s3://dmap-live-dwh-files/backup/mongodb/oms_api/hourly/" $4'} | bash 
0

If you use the AWS CLI, you can filter the ls results with a grep regex and delete the matches. For instance:

aws s3 ls s3://BUCKET | awk '{print $4}' | grep -E -i '^2015-([0-9][0-9])\-([0-9][0-9])\-([0-9][0-9])\-([0-9][0-9])\-([0-9][0-9])\-([0-9a-zA-Z]*)' | xargs -I% bash -c 'aws s3 rm s3://BUCKET/%'

It is slow, but it works.

0

Source: https://habr.com/ru/post/1263030/

