Creating a Parquet file in AWS Lambda

I receive a set (1 MB) of CSV / JSON files in S3 that I would like to convert to Parquet. I expected to be able to convert these files to Parquet easily with a Lambda function.

Searching Google, I did not find a solution that works without some kind of Hadoop setup.

Since this is just a file conversion, I cannot believe there is no easy solution. Does anyone have Java / Scala sample code for this conversion?

1 answer

I don't think there is a way to convert to Parquet format using AWS Lambda. However, one simple approach is to use a Glue crawler to pick up the files from S3, then run a Glue ETL job that converts them to Parquet and stores the result wherever you need it.
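
If the conversion is done by a Glue ETL job, a Lambda function can still be the piece that starts that job when a new file lands in S3. Below is a minimal Python sketch of that idea, not a tested setup. The job name `csv-json-to-parquet`, the argument names `--source_path` and `--target_path`, and the `parquet/` output prefix are all hypothetical; the Glue job itself would have to exist and read those arguments.

```python
import urllib.parse

# Hypothetical name of an existing Glue ETL job that writes Parquet.
GLUE_JOB_NAME = "csv-json-to-parquet"


def build_job_arguments(event):
    """Turn an S3 event notification into arguments for the Glue job."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys in S3 event notifications are URL-encoded.
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return {
        # Hypothetical argument names; the job script must read the same ones.
        "--source_path": f"s3://{bucket}/{key}",
        "--target_path": f"s3://{bucket}/parquet/",
    }


def handler(event, context):
    """Lambda entry point: start the Glue job for the uploaded file."""
    # Imported here so build_job_arguments can be used without the AWS SDK.
    import boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=GLUE_JOB_NAME,
        Arguments=build_job_arguments(event),
    )
    return response["JobRunId"]
```

The Lambda function only passes along the S3 location; the actual CSV / JSON to Parquet conversion still happens inside the Glue job, as described above.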


Source: https://habr.com/ru/post/1013822/

