Can Spark replace an ETL tool?

Existing process: data from the source systems is copied to a staging layer in Redshift. ETL tools such as Informatica or Talend then load the data mart / data warehouse step by step into fact and dimension tables. All joins happen within the database layer (the ETL tool pushes queries down into the database).
- Can Spark replace the ETL tool, perform the same processing, and load the data into Redshift?
- What are the advantages and disadvantages of this architecture?
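To make the existing pushdown process concrete, here is a minimal sketch of it, not the actual jobs. It assumes an in-memory SQLite database as a stand-in for Redshift, and the table names (stg_sales, dim_customer, fact_sales) and function names are hypothetical. The point it illustrates is that the tool only issues SQL, while the joins and loads run inside the database.

```python
import sqlite3


def build_demo():
    """Create a hypothetical staging table plus empty dimension and fact tables."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for Redshift here
    conn.executescript("""
        CREATE TABLE stg_sales (customer_code TEXT, amount REAL);
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_code TEXT UNIQUE
        );
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            amount REAL
        );
        INSERT INTO stg_sales VALUES ('A', 10.0), ('B', 5.0), ('A', 2.5);
    """)
    return conn


def load_datamart(conn):
    """Run the dimension and fact loads as SQL inside the database (pushdown)."""
    cur = conn.cursor()
    # Step 1: add customers that are in staging but not yet in the dimension.
    cur.execute("""
        INSERT INTO dim_customer (customer_code)
        SELECT DISTINCT s.customer_code
        FROM stg_sales s
        WHERE s.customer_code NOT IN (SELECT customer_code FROM dim_customer)
    """)
    # Step 2: load facts, resolving surrogate keys with a join in the database.
    cur.execute("""
        INSERT INTO fact_sales (customer_key, amount)
        SELECT d.customer_key, s.amount
        FROM stg_sales s
        JOIN dim_customer d ON d.customer_code = s.customer_code
    """)
    conn.commit()


if __name__ == "__main__":
    conn = build_demo()
    load_datamart(conn)
    print(conn.execute("SELECT customer_key, amount FROM fact_sales").fetchall())
```

A Spark-based replacement would typically move the join in step 2 out of the database and into the Spark job, writing only the finished rows back to Redshift. That difference is what the advantages and disadvantages in the question come down to.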

2 answers

Over the past four years, I have worked extensively on projects migrating to new ETL jobs.

ETL

  1. SLA. , . business critical.

  2. , ETL , .

  3. . ETL , .

, ETL . Spark hadoop - , , .

Spark SQL . ML/Graph ETL . Spark- . .

spark . , . ETL .

Redshift , , , - spark.

, ETL Spark.

- , , , , Hadoop ETL. , ETL , .


Informatica Spark. Informatica BDM 10.1 Spark, Informatica Spark ( Scala) . , , Spark , , , ETL, !!!


Source: https://habr.com/ru/post/1661986/

