Saving / Exporting Spark SQL Zeppelin Query Results

We use Apache Zeppelin to analyze our datasets. We have several queries that return a large number of results; we would like to run them in Zeppelin but save the full result sets, since the display is limited to 1000 rows. Is there an easy way to have Zeppelin save all query results to an S3 bucket?

+4
1 answer

I managed to hack together a notebook that effectively does what I want using the Scala interpreter.

// Pull in the spark-csv package (run this z.load in its own paragraph first)
z.load("com.databricks:spark-csv_2.10:1.4.0")

val df = sqlContext.sql("""
  select * from table
""")

// Collapse to a single partition so the output is one CSV file, then write to S3
df.repartition(1).write
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .save("s3://amazon.bucket.com/csv_output/")

Note: the z.load call must be run in its own %dep paragraph before the Scala paragraph.
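For what it's worth, on Spark 2.x and later the CSV writer is built in, so the spark-csv dependency and the %dep step are no longer needed. A rough sketch of the equivalent, assuming the Zeppelin-provided SparkSession is available as spark (the table name and bucket path are placeholders carried over from the question):

```scala
// Spark 2.x+ sketch: the built-in CSV data source replaces
// com.databricks:spark-csv, so no z.load / %dep paragraph is required.
val df = spark.sql("select * from table")

df.coalesce(1)                        // single output file; fine for modest result sizes
  .write
  .mode("overwrite")                  // replace any previous export at this path
  .option("header", "true")
  .csv("s3://amazon.bucket.com/csv_output/")
```

coalesce(1) avoids the full shuffle that repartition(1) triggers, but either way a single output file means one executor writes all the data, so for very large results it may be better to drop the coalesce and let Spark write multiple part files.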

+2

Source: https://habr.com/ru/post/1653850/

