Streaming uploads to S3 with boto

I have a server that files get uploaded to, and I want to send them on to S3 using boto. Mainly I need to do some processing of the data as it is streamed through to S3.

The problem I am facing is how the two streams fit together. The incoming upload needs a writable stream that it can push data into, while boto needs a readable stream that it can pull data from. So it's as if I have two ends that don't connect. Is there any way to drive the S3 upload as a writable stream? If so, it would be easy: I could hand that stream to the incoming upload and the data would chain straight through.

If not, I have two loose ends and need something in between: a buffer that the incoming upload can write into to stash the data, and that exposes a read method I can hand to boto. But if I do that, I'm fairly sure I'll have to spin the S3 part off into its own thread, which I would prefer to avoid since I'm using Twisted.

I have the feeling that I am making the situation much more complicated than it is, but I cannot come up with a simple solution. This must be a common problem; I'm just not sure how to put it into words.

+4
2 answers

boto is a Python library with a blocking API. This means you will have to use threads to use it while maintaining the concurrent operation that Twisted provides (just as you would have to use threads to have any concurrency when using boto without Twisted; that is, Twisted does not help make boto non-blocking or concurrent).
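If you do take the thread route, Twisted's deferToThread is the usual bridge. A minimal sketch, where the bucket, key, and path names are illustrative rather than from the question: the blocking boto call runs in the reactor's thread pool and the result comes back as a Deferred.

    from twisted.internet.threads import deferToThread
    from twisted.python import log

    def upload_with_boto(bucket_name, key_name, path):
        # Plain blocking boto code; acceptable because it runs
        # in a worker thread, not in the reactor thread.
        import boto
        conn = boto.connect_s3()
        bucket = conn.get_bucket(bucket_name)
        key = bucket.new_key(key_name)
        key.set_contents_from_filename(path)

    # Called from reactor code: returns a Deferred that fires when done.
    d = deferToThread(upload_with_boto, 'my_bucket', 'my_key', '/tmp/incoming')
    d.addCallback(lambda result: log.msg('upload finished'))
    d.addErrback(log.err)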

Instead, you could use txAWS, a Twisted-based library for interacting with AWS. txaws.s3.client provides methods for interacting with S3. If you are familiar with boto, or with AWS in general, some of these should already look familiar, for example create_bucket or put_object .
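For instance, a minimal sketch of an upload with txAWS. The credential values and file_contents are placeholders, and the exact constructor and put_object arguments may differ between txAWS versions, so treat the signatures as an assumption to check against your installed version:

    from txaws.credentials import AWSCredentials
    from txaws.s3.client import S3Client
    from twisted.python import log

    creds = AWSCredentials(access_key='AKIA...', secret_key='...')
    client = S3Client(creds)

    file_contents = b'...'  # the bytes to upload (placeholder)

    # put_object returns a Deferred, so the reactor is never blocked.
    d = client.put_object('my_bucket', 'my_key', data=file_contents)
    d.addCallback(lambda ignored: log.msg('stored in S3'))
    d.addErrback(log.err)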

txAWS would be even better if it offered a streaming API, so that you could upload to S3 as the file is being uploaded to you. I think this is currently under development (based on the new HTTP client in Twisted, twisted.web.client.Agent ), but it may not be available in a release yet.

+3

You just need to wrap your stream in a file-like object. So essentially, the stream object needs a read method that blocks until the data from the upload is available, returning an empty result only once the file has been completely uploaded to you.
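A minimal sketch of such a wrapper, with illustrative names that are not from the answer: the receiving side writes chunks in, close() marks end-of-file, and read() blocks until enough data has arrived, which is the shape boto expects of a file-like object.

    try:
        import queue           # Python 3
    except ImportError:
        import Queue as queue  # Python 2

    class BlockingStream(object):
        """File-like buffer: one thread writes, another blocks in read()."""

        def __init__(self):
            self._chunks = queue.Queue()
            self._buffer = b''
            self._eof = False

        def write(self, data):
            # Called by the side receiving the upload.
            self._chunks.put(data)

        def close(self):
            # Signal that the upload has finished.
            self._chunks.put(None)

        def read(self, size=-1):
            # Called by boto; blocks until `size` bytes or end-of-file.
            while not self._eof and (size < 0 or len(self._buffer) < size):
                chunk = self._chunks.get()  # blocks waiting for data
                if chunk is None:
                    self._eof = True
                    break
                self._buffer += chunk
            if size < 0:
                data, self._buffer = self._buffer, b''
            else:
                data, self._buffer = self._buffer[:size], self._buffer[size:]
            return data

Because read() blocks, the boto call has to run in a different thread from the code feeding write(), which is exactly the threading caveat from the first answer.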

After that you just use the boto S3 API:

    from boto.s3.key import Key

    bucketname = 'my_bucket'
    conn = create_storage_connection()  # answerer's helper; e.g. boto.connect_s3()
    buckets = conn.get_all_buckets()
    bucket = None
    for b in buckets:
        if b.name == bucketname:
            bucket = b
    if not bucket:
        raise Exception('Bucket with name ' + bucketname + ' not found')

    k = Key(bucket)
    k.key = key  # the S3 key name to store the object under (placeholder)
    # set_contents_from_file, not set_contents_from_filename: the latter
    # expects a path string, not a file-like object.
    k.set_contents_from_file(MyFileLikeStream)
-1
