Load a zipped file from S3 into Redshift

I have a zipped file in S3 that I would like to load into my Redshift database. The only approach my research has turned up is to run an EC2 instance, move the file there, unzip it, send it back to S3, and then COPY it into my Redshift table. But I am trying to do all of this through the Java SDK from an external machine, and I do not want to use an EC2 instance. Is there a way to simply unzip the file with EMR? Or COPY the zipped file directly into Redshift?

Note: .zip files are not the same as .gz (gzip) files.

+4
3 answers

You cannot COPY the zipped file directly into Redshift, per Guy's comments.

Assuming this is not a one-time task, I would suggest using AWS Data Pipeline for this job. See the example of copying data between S3 buckets; modify it to unzip and re-gzip your data rather than just copy it.

Use ShellCommandActivity to execute a shell script that does the work. The script could call your Java code, provided you choose an appropriate AMI for your EC2 resource (YMMV).

Data Pipeline is well suited to this type of work because it automatically starts and shuts down the EC2 resource, and you don't have to worry about tracking down a new instance name in your scripts.

+9
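The unzip-then-gzip step above can also be done in plain Java, since Redshift's COPY understands gzip but not zip. A minimal sketch using only java.util.zip (class and method names are my own, and it assumes the zip contains a single data file as its first entry):

```java
import java.io.*;
import java.util.zip.*;

public class ZipToGzip {
    // Re-compresses the first entry of a .zip stream as .gz.
    public static void convert(InputStream zipIn, OutputStream gzOut) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(zipIn);
             GZIPOutputStream gzos = new GZIPOutputStream(gzOut)) {
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IOException("zip archive is empty");
            }
            byte[] buf = new byte[8192];
            int n;
            while ((n = zis.read(buf)) > 0) {
                gzos.write(buf, 0, n);
            }
        }
    }
}
```

You could feed this the S3 object's content stream (from the Java SDK's getObject) and upload the gzipped result back to S3 before running COPY, all without an EC2 instance, as long as the file fits through your external machine's bandwidth.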

Add the gzip option to your COPY command; see http://docs.aws.amazon.com/redshift/latest/dg/c_loading-encrypted-files.html . You can use a Java SQL client (JDBC) to execute the COPY statement.

+2
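For example, running COPY over JDBC could look roughly like this. This is a sketch, not an official AWS sample: the table name, S3 path, JDBC URL, and credentials are placeholders, and the Redshift JDBC driver must be on the classpath.

```java
import java.sql.*;

public class RedshiftCopy {
    // Builds a COPY statement that loads a gzip-compressed CSV from S3.
    public static String buildCopySql(String table, String s3Path, String creds) {
        return "copy " + table
             + " from '" + s3Path + "'"
             + " credentials '" + creds + "'"
             + " delimiter ',' gzip;";
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder cluster endpoint and credentials.
        String url = "jdbc:redshift://examplecluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev";
        String sql = buildCopySql("mytable",
                "s3://mybucket/data/yourfilename.gz",
                "aws_access_key_id=xxxxx;aws_secret_access_key=yyyyyy");
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement()) {
            stmt.execute(sql); // Redshift pulls and decompresses the file itself
        }
    }
}
```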

If your file is gzipped, try running this command:

copy mytable from 's3://abc/def/yourfilename.gz'
credentials 'aws_access_key_id=xxxxx;aws_secret_access_key=yyyyyy'
delimiter ',' gzip;

-2

Source: https://habr.com/ru/post/1492305/
