For example, suppose I have a DataFrame:
var myDF = sc.parallelize(Seq(("one",1),("two",2),("three",3))).toDF("a", "b")
I can convert it to RDD[(String, Int)] with a map:
var myRDD = myDF.map(r => (r(0).asInstanceOf[String], r(1).asInstanceOf[Int]))
Is there a better way to do this, possibly using a DF schema?
source share