In Scala, each element of a pair RDD is a tuple (Tuple2), and you can access its members using _1 and _2. Java, however, has no built-in tuples, so Spark's Java API hands you scala.Tuple2 objects whose elements you read through the methods _1() and _2(). The parentheses are needed because Java requires them on every method call. It should look like this:
JavaRDD<String> corpus = sc.wholeTextFiles("docs/*.md").map(a -> a._2());
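To show the accessor-method point in isolation, here is a minimal sketch that runs without Spark. The Pair class is a hypothetical stand-in for scala.Tuple2, and a plain List stands in for the pair RDD; the class name, the secondElements helper, and the file names are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TupleAccessDemo {

    // Hypothetical stand-in for scala.Tuple2: elements are read through
    // methods, so Java callers must write _1() and _2() with parentheses.
    static final class Pair<A, B> {
        private final A first;
        private final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        A _1() { return first; }
        B _2() { return second; }
    }

    // Keeps only the second element of each pair, mirroring map(a -> a._2()).
    static <A, B> List<B> secondElements(List<Pair<A, B>> pairs) {
        return pairs.stream().map(p -> p._2()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // (file name, file content) pairs, similar in shape to what
        // wholeTextFiles produces; the values here are invented.
        List<Pair<String, String>> files = Arrays.asList(
                new Pair<>("docs/a.md", "alpha"),
                new Pair<>("docs/b.md", "beta"));

        System.out.println(secondElements(files)); // prints [alpha, beta]
    }
}
```

With real Spark the lambda has the same shape; only the container changes from a List to a JavaPairRDD.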
Edit: It seems that in Scala an implicit parameter is passed to the map method, which means that in Java you have to pass it explicitly. See here for the Javadoc and here for the Scala documentation.
Edit 2: After several hours of searching, the answer turned out to be JavaRDD.