Python S3 download zip file

I have zip files uploaded to S3. I would like to download them for processing. I do not need to store them permanently, only to process them temporarily. How can I do this?

3 answers

Because working software > full documentation:

Boto2

import zipfile
import boto
import io

# Connect to s3
# This will need your s3 credentials to be set up 
# with `aws configure` using the aws CLI.
#
# See: https://aws.amazon.com/cli/
conn = boto.connect_s3()

# get hold of the bucket
bucket = conn.get_bucket("my_bucket_name")

# Get hold of a given file
key = boto.s3.key.Key(bucket)
key.key = "my_s3_object_key"

# Create an in-memory bytes IO buffer
with io.BytesIO() as b:

    # Read the file into it
    key.get_file(b)

    # Reset the file pointer to the beginning
    b.seek(0)

    # Read the file as a zipfile and process the members
    with zipfile.ZipFile(b, mode='r') as zipf:
        for subfile in zipf.namelist():
            do_stuff_with_subfile(subfile)  # placeholder for your own processing

Boto3

import zipfile
import boto3
import io

# This is just for demonstration. Real use should rely on the
# configuration environment variables or the config file.
#
# See: http://boto3.readthedocs.org/en/latest/guide/configuration.html

session = boto3.session.Session(
    aws_access_key_id="ACCESSKEY", 
    aws_secret_access_key="SECRETKEY"
)

s3 = session.resource("s3")
bucket = s3.Bucket('stackoverflow-brice-test')
obj = bucket.Object('smsspamcollection.zip')

with io.BytesIO(obj.get()["Body"].read()) as tf:

    # rewind the file
    tf.seek(0)

    # Read the file as a zipfile and process the members
    with zipfile.ZipFile(tf, mode='r') as zipf:
        for subfile in zipf.namelist():
            print(subfile)

Tested on macOS with Python 3.
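
The loops above only list member names. To read each member's contents as well, a minimal sketch along these lines might work. It assumes the zip bytes are already in memory (for example from obj.get()["Body"].read() in the Boto3 snippet); read_zip_members and make_sample_zip are hypothetical helper names, and the sample archive is built locally purely so the example runs without S3.

```python
import io
import zipfile


def read_zip_members(zip_bytes):
    """Return {member name: member bytes} for a zip archive held in memory."""
    contents = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes), mode="r") as zipf:
        for info in zipf.infolist():
            # Skip directory entries; they carry no data.
            if info.is_dir():
                continue
            with zipf.open(info) as member:
                contents[info.filename] = member.read()
    return contents


def make_sample_zip():
    """Build a small zip in memory as a stand-in for the S3 object body."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w") as zipf:
        zipf.writestr("hello.txt", "hello")
        zipf.writestr("data/numbers.csv", "1,2,3\n")
    return buf.getvalue()


if __name__ == "__main__":
    for name, data in read_zip_members(make_sample_zip()).items():
        print(name, len(data))
```

Reading every member into a dict keeps everything in memory, which fits the "no permanent storage" requirement but may not suit very large archives.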

If speed is a concern, a good approach would be to select an EC2 instance close enough to your S3 bucket (in the same region), and use this instance to unpack / process your zip files.

I believe you have heard of boto, which is a Python interface to Amazon Web Services.

You can download a key from S3 to a local file.

import os
import boto
from zipfile import ZipFile

s3 = boto.connect_s3()  # connect
bucket = s3.get_bucket(bucket_name)  # get bucket
key = bucket.get_key(key_name)  # get key (the file in S3)
key.get_contents_to_filename(local_name)  # download to a temporary local file

with ZipFile(local_name, 'r') as myzip:
    # do something with myzip, e.g. list its members
    print(myzip.namelist())

os.unlink(local_name)  # delete it

You can also use the tempfile module. See "creating and reading from tempfile" for more information.
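
As a rough sketch of the tempfile approach: the function below copies any readable binary stream into an anonymous temporary file, opens it as a zip, and lets the file disappear when the block exits. The name list_zip_via_tempfile is hypothetical, and the demo feeds it a locally built archive instead of a real S3 download.

```python
import io
import shutil
import tempfile
import zipfile


def list_zip_via_tempfile(source):
    """Spool a readable binary stream to a temp file and list the zip members."""
    # TemporaryFile is removed automatically when the with-block exits,
    # so no explicit os.unlink() is needed.
    with tempfile.TemporaryFile() as tmp:
        shutil.copyfileobj(source, tmp)
        tmp.seek(0)  # rewind before handing the file to ZipFile
        with zipfile.ZipFile(tmp, mode="r") as zipf:
            return zipf.namelist()


if __name__ == "__main__":
    # Stand-in for the S3 download: a small zip built in memory.
    sample = io.BytesIO()
    with zipfile.ZipFile(sample, mode="w") as zipf:
        zipf.writestr("a.txt", "x")
    sample.seek(0)
    print(list_zip_via_tempfile(sample))
```

With boto3, the streaming body returned by obj.get()["Body"] could presumably be passed as source, since it exposes a read() method; that part is an assumption and is not exercised here.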

Source: https://habr.com/ru/post/1538650/