I am using Hortonworks 2.6 with 5 nodes. I spark-submit to YARN (with 16 GB of RAM and 4 cores).
I have an RDD transformation that works fine in local mode, but not with the yarn master URL.
rdd1 has values such as:
id name date
1 john 10/05/2001 (dd/mm/yyyy)
2 steve 11/06/2015
I would like to change the date format from dd/mm/yyyy to mm/dd/yy, so I wrote a method transformations.transform that I use in RDD.map as follows:
val rdd2 = rdd1.map { rec => (rec.split(",")(0), transformations.transform(rec)) }
The transformations.transform method is as follows:
object transformations {
  def transform(t: String): String = {
    val msg = s">>> transformations.transform($t)"
    println(msg)
    msg
  }
}
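The date conversion itself could be sketched as follows. This is only an illustration: the DateReformat object and its reformat method are hypothetical names, and it assumes Java 8+ so that java.time is available. It takes just the date field, not the whole record.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical helper, not part of the code above.
object DateReformat {
  // Input pattern: day/month/four-digit year, e.g. 10/05/2001
  private val inputFormat = DateTimeFormatter.ofPattern("dd/MM/yyyy")
  // Output pattern: month/day/two-digit year, e.g. 05/10/01
  private val outputFormat = DateTimeFormatter.ofPattern("MM/dd/yy")

  // Parses a dd/MM/yyyy date string and renders it as MM/dd/yy.
  // Throws java.time.format.DateTimeParseException on malformed input.
  def reformat(date: String): String =
    LocalDate.parse(date, inputFormat).format(outputFormat)
}
```

With the sample rows above, DateReformat.reformat("10/05/2001") gives "05/10/01".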
The above code works fine locally, but not on the cluster. The method simply returns the result as if the map looked like this:
val rdd2 = rdd1.map { rec => (rec.split(",")(0), rec) }
rec does not seem to be passed to the transformations.transform method.
I use an action to trigger the execution of transformations.transform(), but no luck.
val rdd3 = rdd2.count()
println(rdd3)
Why is the println output from transformations.transform not showing up?