Saving / Exporting Spark SQL Zeppelin Query Results

We use Apache Zeppelin to analyze our datasets. We have several queries that return a large number of results; we would like to run them in Zeppelin but save the full result sets, since the display is limited to 1000 rows. Is there an easy way to have Zeppelin save all query results to an S3 bucket?

+4
1 answer

I managed to hack together a notebook that effectively does what I want using the Scala interpreter.

// Pull in the spark-csv package (run this z.load in its own paragraph first)
z.load("com.databricks:spark-csv_2.10:1.4.0")

val df = sqlContext.sql("""
  select * from table
""")

// Collapse to a single partition so the output is one CSV file, then write to S3
df.repartition(1).write
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .save("s3://amazon.bucket.com/csv_output/")

Note: the z.load call must be run in its own %dep paragraph before the Scala paragraph.
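For what it's worth, on Spark 2.x and later the CSV writer is built in, so the spark-csv dependency and the %dep step are no longer needed. A rough sketch of the equivalent, assuming the Zeppelin-provided SparkSession is available as spark (the table name and bucket path are placeholders carried over from the question):

```scala
// Spark 2.x+ sketch: the built-in CSV data source replaces
// com.databricks:spark-csv, so no z.load / %dep paragraph is required.
val df = spark.sql("select * from table")

df.coalesce(1)                        // single output file; fine for modest result sizes
  .write
  .mode("overwrite")                  // replace any previous export at this path
  .option("header", "true")
  .csv("s3://amazon.bucket.com/csv_output/")
```

coalesce(1) avoids the full shuffle that repartition(1) triggers, but either way a single output file means one executor writes all the data, so for very large results it may be better to drop the coalesce and let Spark write multiple part files.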

+2

Source: https://habr.com/ru/post/1653850/

