How to list objects by extension using the S3 API?

Is there any way to search for objects in S3 by extension, and not just a prefix?

Here is what I have now:

ListObjectsResponse r = s3Client.ListObjects(new Amazon.S3.Model.ListObjectsRequest()
{
    BucketName = BucketName,
    Marker = marker,
    Prefix = folder, 
    MaxKeys = 1000
});

So, I need to list all *.xls files in my bucket.

+3
4 answers

I do not think this is possible with S3.

The best solution is to "index" S3 using a database (SQL Server, MySQL, SimpleDB, etc.) and run your queries against it.
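As a rough sketch of the indexing idea, assuming you record every key in a database when you upload it: the example below uses an in-memory SQLite table purely for illustration. The table name, column names, helper functions, and sample keys are all hypothetical, and a real setup would use whichever database you already run.

```python
import os
import sqlite3


def build_index(keys):
    """Create an in-memory SQLite index mapping each key to its lowercased extension."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (key TEXT PRIMARY KEY, ext TEXT)")
    conn.execute("CREATE INDEX idx_ext ON objects (ext)")
    rows = [(k, os.path.splitext(k)[1].lower()) for k in keys]
    conn.executemany("INSERT INTO objects (key, ext) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def keys_with_extension(conn, ext):
    """Return all indexed keys whose extension matches ext (case-insensitive)."""
    cur = conn.execute(
        "SELECT key FROM objects WHERE ext = ? ORDER BY key", (ext.lower(),)
    )
    return [row[0] for row in cur]


# Hypothetical keys, standing in for what you would record at upload time.
conn = build_index(["reports/q1.xls", "reports/q2.XLS", "notes/readme.txt"])
print(keys_with_extension(conn, ".xls"))
```

The lookup never touches S3, so the index is only as accurate as the process that keeps it in sync with the bucket.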

+5

You really don't need a separate database to do this for you.

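S3 lets you list the objects in a bucket by prefix. Your problem is that the ".xls" extension comes at the end of the key, so a prefix search does not help. However, when you upload a file, you can name the object so that its key starts with the extension (for example: XLS-myfile.xls). Then you can call the S3 API list operation with the prefix "XLS".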
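A minimal sketch of that naming scheme, assuming you control the key at upload time. The helper name and the sample keys are hypothetical. The last line only simulates a prefix listing on a local list, which is the same match S3 would perform for Prefix="XLS".

```python
import posixpath


def key_with_extension_prefix(key):
    """Put the uppercased extension in front of the file name part of the key.

    'docs/myfile.xls' -> 'docs/XLS-myfile.xls'
    Keys without an extension are returned unchanged.
    """
    folder, name = posixpath.split(key)
    ext = posixpath.splitext(name)[1].lstrip(".").upper()
    if not ext:
        return key
    return posixpath.join(folder, f"{ext}-{name}")


# Hypothetical file names, renamed before upload.
stored_keys = [key_with_extension_prefix(k) for k in ["myfile.xls", "photo.png", "README"]]

# Local stand-in for listing the bucket with Prefix="XLS".
xls_keys = [k for k in stored_keys if k.startswith("XLS")]
print(xls_keys)
```

The trade-off is that every consumer of the bucket has to know about the renamed keys, and objects uploaded before the scheme was introduced will not be found.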

+3

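I was working in Python with boto3, and this is the solution I came up with.

It is not elegant, but it works: list all the objects, then filter them in code down to the ones with the "suffix" / "extension" you want.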

import boto3

s3_client = boto3.client('s3')
bucket = 'my-bucket'
prefix = 'my-prefix/foo/bar'
paginator = s3_client.get_paginator('list_objects_v2')
response_iterator = paginator.paginate(Bucket=bucket, Prefix=prefix)

file_names = []

for response in response_iterator:
    # 'Contents' is absent when a page holds no objects
    for object_data in response.get('Contents', []):
        key = object_data['Key']
        if key.endswith('.json'):
            file_names.append(key)

print(file_names)
+3

I iterate over the objects after fetching the file information. The end result is a list of dicts.

import os

import boto3

s3 = boto3.resource('s3')

bucket = s3.Bucket('bucket_name')

# get information about all files in the bucket
files = bucket.objects.all()

# create an empty list for the final information
files_information = []

# your list of known extensions; file names are compared against it
extensions = ['png', 'jpg', 'txt', 'docx']

# Iterate through 'files', convert each entry to a dict, and add an extension key.
for file in files:
    # splitext handles extensions of any length, e.g. 'docx'
    extension = os.path.splitext(file.key)[1].lstrip('.')
    if extension not in extensions:
        extension = 'unknown'
    files_information.append({'file_name': file.key, 'extension': extension})


print(files_information)
+1

Source: https://habr.com/ru/post/1785818/

