How to convert an Avro data file to CSV

I have an Avro data file that I need to convert to a CSV file. The avro-tools totext command does not currently support custom schemas. Are there any tools that do this? Or should I just encode it myself with avro-tools?

3 answers
    // Spark 2.0+, with the Databricks spark-avro package on the classpath
    import com.databricks.spark.avro._

    // Read the Avro file
    val df = spark.read.avro("/FileStore/tables/279ltrs61490238208016/twitter.avro")
    df.printSchema()
    df.count()
    df.show()

    // Write as CSV (Spark creates a directory of part files at this path)
    df.write
      .option("header", "true")
      .csv("/FileStore/tables/279ltrs61490238208016/twitter_out.csv")

    // Read the CSV back and display its contents
    val df1 = spark.read.option("header", true).csv("/FileStore/tables/279ltrs61490238208016/twitter_out.csv")
    df1.printSchema()
    df1.count()
    df1.show()
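
For newer clusters, the same read-then-write approach can be sketched with the Avro data source that ships as the separate spark-avro module since Spark 2.4, assuming that module is on the classpath (for example via --packages). The object name AvroToCsv and the parameters inputPath and outputDir are hypothetical, chosen only for illustration. This is a sketch and not a drop-in job.

```scala
import org.apache.spark.sql.SparkSession

object AvroToCsv {
  // inputPath and outputDir are hypothetical parameter names.
  def convert(spark: SparkSession, inputPath: String, outputDir: String): Unit = {
    // "avro" is the short name of the data source in the spark-avro module (Spark 2.4+).
    val df = spark.read.format("avro").load(inputPath)

    df.coalesce(1)                 // one part file; only sensible for small data
      .write
      .mode("overwrite")
      .option("header", "true")
      .csv(outputDir)              // outputDir becomes a directory holding part-*.csv
  }
}
```

Note that the CSV writer only handles flat columns, so an Avro schema with nested records, arrays or maps would need to be flattened (or those columns dropped) before the write.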

I asked the same question and I just used the Spark API to do this:

Read the data as:

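import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val avro = sqlContext.read.format("com.databricks.spark.avro").load("/path/to/your/data")

or

import org.apache.spark.sql.SQLContext
import com.databricks.spark.avro._  // implicit that adds avroFile to SQLContext

val sqlContext = new SQLContext(sc)
val avro = sqlContext.avroFile("/path/to/your/data")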

And then you can do something like:

val csv = avro.map(_.mkString(","))

To check the result, print the first few lines:

csv.take(2).foreach(println)
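
One caveat with mkString(","): it does no quoting, so a field that itself contains a comma, a double quote or a line break would produce a malformed CSV line. A minimal sketch of a quoting helper is below. CsvLine, escape and toLine are hypothetical names and not part of Spark or spark-avro, and the quoting rule is the common RFC 4180 style, which may differ from what your CSV consumer expects.

```scala
object CsvLine {
  // Quote a field when it contains a comma, a double quote, or a line break,
  // doubling any embedded double quotes. Null becomes an empty field.
  def escape(field: Any): String = {
    val s = if (field == null) "" else field.toString
    if (s.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + s.replace("\"", "\"\"") + "\""
    else s
  }

  // Join already-extracted row values into one CSV line.
  def toLine(values: Seq[Any]): String = values.map(escape).mkString(",")
}
```

With this in place, the map step could become avro.rdd.map(row => CsvLine.toLine(row.toSeq)), which goes through the underlying RDD of rows.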

CSV-Avro conversion can also be done with the Avro Encoder and Decoder implementations in spf4j-avro (as with JSON). The CSV Encoder/Decoders convert Avro data to and from CSV.

The code is in CSV . If you want to see how you can use it, there is an example of how you can implement JAX-RS MessageBody (Reader / Writer) in .


Source: https://habr.com/ru/post/1547968/

