Is there an easy way to convert a given Row object to json?
Found this about converting a whole Dataframe to json output:
Spark Row for JSON
But I just want to convert one line to json. Here is the pseudo code for what I'm trying to do.
More precisely, I read json as input in a Dataframe. I am creating a new output, which is mainly based on columns, but with one json field for all the information that does not fit into the columns.
My question is: what is the easiest way to write this function: convertRowToJson ()
def convertRowToJson(row: Row): String = ???
def transformVenueTry(row: Row): Try[Venue] = {
Try({
val name = row.getString(row.fieldIndex("name"))
val metadataRow = row.getStruct(row.fieldIndex("meta"))
val score: Double = calcScore(row)
val combinedRow: Row = metadataRow ++ ("score" -> score)
val jsonString: String = convertRowToJson(combinedRow)
Venue(name = name, json = jsonString)
})
}
Psidom Solutions:
def convertRowToJSON(row: Row): String = {
val m = row.getValuesMap(row.schema.fieldNames)
JSONObject(m).toString()
}
only works if the row has only one level that does not contain nested rows. This is the diagram:
StructType(
StructField(indicator,StringType,true),
StructField(range,
StructType(
StructField(currency_code,StringType,true),
StructField(maxrate,LongType,true),
StructField(minrate,LongType,true)),true))
I also tried Artem's suggestion, but this did not compile:
def row2DataFrame(row: Row, sqlContext: SQLContext): DataFrame = {
val sparkContext = sqlContext.sparkContext
import sparkContext._
import sqlContext.implicits._
import sqlContext._
val rowRDD: RDD[Row] = sqlContext.sparkContext.makeRDD(row :: Nil)
val dataFrame = rowRDD.toDF() //XXX does not compile
dataFrame
}