I have 4 RDDs, each of type RDD: ((Int, Int, Int), value). My RDDs are:
rdd1: ((a,b,c), value)
rdd2: ((a,d,e), valueA)
rdd3: ((f,b,g), valueB)
rdd4: ((h,i,c), valueC)
How can I join these RDDs in Scala, i.e. rdd1 join rdd2 on "a", rdd1 join rdd3 on "b", and rdd1 join rdd4 on "c",
so that the result is finalRdd: ((a,b,c),valueA,valueB,valueC,value)?
I tried to do this with collectAsMap, but it did not work well and throws an exception.
Code for joining rdd1 with rdd2 only:
val newrdd2 = rdd2.map { case ((a, b, c), d) => (a, d) }.collectAsMap
val joined = rdd1.map { case ((a, b, c), d) => (newrdd2.get(a).get, b, c, d) }
Example:
rdd1: ((1,2,3), animals)
rdd2: ((1,anyInt,anyInt), cat)
rdd3: ((anyInt,2,anyInt), cow)
rdd4: ((anyInt,anyInt,3), parrot)
The result should be ((1,2,3),animals,cat,cow,parrot)
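
To make the expected semantics concrete, here is a minimal sketch using plain Scala collections in place of the RDDs, so it is not Spark code. The integers 7, 8 and 9 are placeholders for anyInt, and the names JoinSketch, byA, byB and byC are made up for illustration. It assumes each of a, b and c matches at most one row in rdd2, rdd3 and rdd4 respectively, and it drops rdd1 rows that have no match instead of throwing. The output order follows the example above.

```scala
// Plain Scala collections stand in for the four RDDs (illustration only, not Spark).
object JoinSketch {
  type Key = (Int, Int, Int)

  // Sample data from the example; 7, 8, 9 are arbitrary stand-ins for "anyInt".
  val rdd1: Seq[(Key, String)] = Seq(((1, 2, 3), "animals"))
  val rdd2: Seq[(Key, String)] = Seq(((1, 7, 8), "cat"))
  val rdd3: Seq[(Key, String)] = Seq(((7, 2, 9), "cow"))
  val rdd4: Seq[(Key, String)] = Seq(((8, 9, 3), "parrot"))

  // Re-key each lookup dataset by the single component it shares with rdd1.
  // Assumption: keys are unique; with duplicates, toMap keeps only the last value.
  val byA: Map[Int, String] = rdd2.map { case ((a, _, _), v) => a -> v }.toMap
  val byB: Map[Int, String] = rdd3.map { case ((_, b, _), v) => b -> v }.toMap
  val byC: Map[Int, String] = rdd4.map { case ((_, _, c), v) => c -> v }.toMap

  // Inner-join semantics: an rdd1 row is kept only if all three lookups succeed.
  val finalRdd: Seq[(Key, String, String, String, String)] =
    for {
      ((a, b, c), value) <- rdd1
      valueA <- byA.get(a)
      valueB <- byB.get(b)
      valueC <- byC.get(c)
    } yield ((a, b, c), value, valueA, valueB, valueC)

  def main(args: Array[String]): Unit =
    finalRdd.foreach(println) // prints ((1,2,3),animals,cat,cow,parrot)
}
```

In Spark, the same re-keying idea would presumably be expressed with keyBy and join on pair RDDs rather than with collectAsMap, but I have not verified that variant here.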