How to copy data in bulk from Kinesis to Redshift

When I read about AWS Data Pipeline, the idea hit me right away: publish statistics to Kinesis, and define a pipeline task that consumes the data from Kinesis and COPYs it into Redshift every hour. All in one go.

But it looks like there is no pipeline node that can consume from Kinesis. So now I have two possible plans:

  • Run an instance that consumes the Kinesis data and writes it to S3 hourly; the pipeline then COPYs it from there into Redshift.
  • Consume from Kinesis and run COPY against Redshift directly, in place.

What should I do? Is there any way to connect Kinesis to Redshift using only AWS services, without custom code?
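For the first option, the hourly load step boils down to issuing a Redshift COPY against the S3 prefix the consumer wrote to. A minimal sketch of building that statement, assuming JSON records; the table, bucket, prefix, and IAM role names are all hypothetical:

```python
def build_copy_statement(table, bucket, prefix, iam_role):
    """Build a Redshift COPY statement that loads JSON records
    staged under an S3 prefix. All names here are placeholders."""
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role}' "
        f"FORMAT AS JSON 'auto';"
    )

# The hourly pipeline task would run this against the cluster.
sql = build_copy_statement(
    "stats_events",
    "my-staging-bucket",
    "kinesis/hourly/",
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",
)
```

The statement itself would then be executed over a normal Redshift (PostgreSQL-compatible) connection by whatever runs the hourly task.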

3 answers

This can now be done without custom code through a managed service called Kinesis Firehose. It handles the required buffering intervals, staging uploads to S3, loads into Redshift, error handling, and automatic throughput management.
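A Firehose delivery stream for this setup is mostly configuration: you point it at a Redshift cluster, give it an intermediate S3 location, and set buffering hints. A sketch of the configuration passed to boto3's `create_delivery_stream`; every name, ARN, JDBC URL, and credential below is a placeholder:

```python
# Delivery stream configuration: Firehose buffers records, stages them
# in S3, and issues the Redshift COPY automatically. All identifiers
# below are hypothetical placeholders.
redshift_destination = {
    "RoleARN": "arn:aws:iam::123456789012:role/FirehoseDeliveryRole",
    "ClusterJDBCURL": "jdbc:redshift://my-cluster.example"
                      ".us-east-1.redshift.amazonaws.com:5439/mydb",
    "CopyCommand": {
        "DataTableName": "stats_events",
        "CopyOptions": "FORMAT AS JSON 'auto'",
    },
    "Username": "firehose_user",
    "Password": "REPLACE_ME",
    "S3Configuration": {
        "RoleARN": "arn:aws:iam::123456789012:role/FirehoseDeliveryRole",
        "BucketARN": "arn:aws:s3:::my-staging-bucket",
        "Prefix": "firehose/",
        # Flush to S3 (and then Redshift) every 64 MB or 5 minutes.
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        "CompressionFormat": "GZIP",
    },
}

# The actual call (requires AWS credentials, so commented out here):
# import boto3
# boto3.client("firehose").create_delivery_stream(
#     DeliveryStreamName="stats-to-redshift",
#     RedshiftDestinationConfiguration=redshift_destination,
# )
```

Producers then write to the delivery stream instead of a raw Kinesis stream, and the hourly-batching question goes away: the buffering hints control load frequency.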


This is already done for you! If you use the Kinesis Connector Library, there is a built-in connector for Redshift:

https://github.com/awslabs/amazon-kinesis-connectors

Depending on the logic you need, the connector can be very simple to implement.
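Conceptually, the connector library's pipeline is four small steps: transform a raw record, filter it, buffer records into a batch, and emit the batch (the Redshift connector emits by staging to S3 and issuing COPY). The library itself is Java; the following is only a Python sketch of that flow, with a made-up record format and batch size:

```python
import json

class SimplePipeline:
    """Toy version of the connector flow: transform -> filter -> buffer -> emit.
    The real Redshift emitter would stage the batch to S3 and COPY it."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.buffer = []
        self.emitted = []  # stands in for batches loaded into Redshift

    def transform(self, raw_bytes):
        # Decode a raw Kinesis record payload (assumed JSON here).
        return json.loads(raw_bytes)

    def keep(self, record):
        # Drop records with no usable value.
        return record.get("value") is not None

    def process(self, raw_bytes):
        record = self.transform(raw_bytes)
        if self.keep(record):
            self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.emitted.append(list(self.buffer))
            self.buffer.clear()

pipe = SimplePipeline(batch_size=2)
for payload in (b'{"value": 1}', b'{"value": null}',
                b'{"value": 2}', b'{"value": 3}'):
    pipe.process(payload)
```

The real library gives you these as interfaces to implement, plus checkpointing against the Kinesis shard iterator, so your custom code reduces to the transform and filter steps.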


You can build and orchestrate a complete pipeline using InstantStack to read data from Kinesis, transform it, and insert it into Redshift or S3.


Source: https://habr.com/ru/post/1207420/

