AWS: how to fix an S3 event replacing a space with a + sign in the object key in the JSON

I have a Lambda function that copies objects from bucket 'A' to bucket 'B', and everything works fine until an object called 'New Text Document.txt' is created in bucket 'A'. In the JSON that the S3 event builds, the key arrives as "key": "New+Text+Document.txt".

The spaces are replaced by "+". I know this is a known issue with the URL encoding of the event payload, but I'm not sure how to fix it, because the incoming file name itself may legitimately contain "+" characters, e.g. "New + Text Document.txt".

Therefore, I cannot blindly replace every '+' with a space in my Lambda function.

Because of this problem, when the code tries to find the file in the bucket, it does not find it.
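To illustrate why a blind replacement is unsafe, here is a small sketch; the keys are hypothetical, assuming S3 form-encodes keys in the event so that a space becomes '+' and a literal '+' becomes '%2B':

```python
# Two different hypothetical file names as they would appear in the event JSON:
key_with_spaces = "New+Text+Document.txt"      # file named "New Text Document.txt"
key_with_plus   = "New+%2B+Text+Document.txt"  # file named "New + Text Document.txt"

# Blindly replacing '+' with ' ' only happens to work for the first one:
print(key_with_spaces.replace("+", " "))  # New Text Document.txt
print(key_with_plus.replace("+", " "))    # New %2B Text Document.txt -- still encoded
```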

Please advise.

3 answers

What I did to fix it

java.net.URLDecoder.decode(b.getS3().getObject().getKey(), "UTF-8")


{
    "Records": [
        {
            "s3": {
                "object": {
                    "key": "New+Text+Document.txt"
                }
            }
        }
    ]
}

So now the JSON value "New+Text+Document.txt" is correctly converted to "New Text Document.txt".

This fixed my problem, but please suggest whether this is the correct solution. Is there any edge case that might break my implementation?
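As far as I understand the encoding, there should be no ambiguity: a space in the name arrives as '+', while a literal '+' arrives percent-encoded as '%2B', so decoding both is safe. A quick check using Python's `unquote_plus`, which behaves like `java.net.URLDecoder.decode(..., "UTF-8")` here (the second key is a hypothetical example):

```python
from urllib.parse import unquote_plus

# Spaces come back, and a literal '+' (sent as %2B) survives the round trip:
print(unquote_plus("New+Text+Document.txt"))  # New Text Document.txt
print(unquote_plus("a+%2B+b.txt"))            # a + b.txt
```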


I came across this while looking for a solution for a Lambda written in Python instead of Java; "urllib.parse.unquote_plus" worked for me, handling file names with both spaces and '+' signs correctly:

from urllib.parse import unquote_plus
import boto3


bucket = 'testBucket1234'
# For a test upload named 'foo + bar.txt', the S3 Put event passes the key
# form-encoded: spaces become '+' and the literal '+' becomes '%2B'.
object_key = 'foo+%2B+bar.txt'
print(object_key)                   # foo+%2B+bar.txt
object_key = unquote_plus(object_key)
print(object_key)                   # foo + bar.txt

client = boto3.client('s3')
client.get_object(Bucket=bucket, Key=object_key)
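For completeness, here is a minimal sketch of how the decoding step fits into a handler; the event dict mirrors the S3 Put notification JSON shown in the first answer, and the bucket and file names are placeholders:

```python
from urllib.parse import unquote_plus

# A minimal S3 Put event, modeled on the notification JSON above.
event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "testBucket1234"},
                "object": {"key": "foo+%2B+bar.txt"},  # file actually named "foo + bar.txt"
            }
        }
    ]
}

def decoded_keys(event):
    """Yield (bucket, key) pairs with the event's URL-encoded keys decoded."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key

for bucket, key in decoded_keys(event):
    print(bucket, key)  # testBucket1234 foo + bar.txt
```

The decoded key is what you would then pass to `get_object` or `copy_object`.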

Alternatively, in Java you can use the accessor that returns the key already URL-decoded:

getS3().getObject().getUrlDecodedKey()

instead of

getS3().getObject().getKey()

Source: https://habr.com/ru/post/1680201/

