Convert Scala Expression to Java 1.8

I am trying to convert this Scala expression to Java:

val corpus: RDD[String] = sc.wholeTextFiles("docs/*.md").map(_._2) 

This is what I have in Java:

 RDD<String> corpus = sc.wholeTextFiles("docs/*.md").map(a -> a._2); 

But I get an error message on a._2:

Bad return type in lambda expression: String cannot be converted to R

If I navigate to the declaration of the functional interface that map expects, this is what I see:

 package org.apache.spark.api.java.function;
 import java.io.Serializable;
 public interface Function<T1, R> extends Serializable {
     R call(T1 var1) throws Exception;
 }
2 answers

In Scala, a pair RDD holds tuples, and you can access a tuple's members with _1 and _2. Java has no built-in tuples, so you call accessor methods to get these elements. Java requires parentheses on every method call, so it should look like this:

 JavaRDD<String> corpus = sc.wholeTextFiles("docs/*.md").map(a -> a._2()); 

Edit: It seems that in Scala an implicit parameter is passed to the map method, which means you would have to pass it explicitly when calling it from Java. See the Java and Scala API documentation for map.

Edit 2: After several hours of searching, the answer turned out to be JavaRDD: declare the result as JavaRDD<String> rather than RDD<String>, as in the line above.
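To make the relationship between the lambda and the Function interface from the question concrete, here is a minimal sketch that runs without Spark. Everything in it is a hypothetical stand-in: Pair, mapAll and contents are not Spark classes or methods, and the file names and contents are made up. It only shows how the compiler infers R as String once the lambda calls the accessor method _2().

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these types are Spark's.
final class TupleAccessSketch {

    // Same shape as the Function interface quoted in the question.
    interface Function<T1, R> extends Serializable {
        R call(T1 var1) throws Exception;
    }

    // Minimal pair with accessor methods named like a Scala tuple's.
    static final class Pair<A, B> {
        private final A first;
        private final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        A _1() { return first; }
        B _2() { return second; }
    }

    // Generic map: the compiler infers R from the lambda's return type.
    static <T, R> List<R> mapAll(List<T> input, Function<T, R> f) {
        List<R> out = new ArrayList<>();
        for (T item : input) {
            try {
                out.add(f.call(item));
            } catch (Exception e) {
                // call() declares a checked exception, so wrap it here.
                throw new RuntimeException(e);
            }
        }
        return out;
    }

    // Made-up (file name, file content) pairs, keeping only the content.
    static List<String> contents() {
        List<Pair<String, String>> files = Arrays.asList(
            new Pair<String, String>("docs/a.md", "alpha"),
            new Pair<String, String>("docs/b.md", "beta"));
        // a -> a._2() returns String, so R is inferred as String.
        return mapAll(files, a -> a._2());
    }
}
```

Calling TupleAccessSketch.contents() returns the second element of each pair, which is the same shape of result the map call in this answer is after.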


In Java you can also use values() to get the result you want:

 JavaRDD<String> corpus = sc.wholeTextFiles("docs/*.md").values(); 

Please note that the type here is JavaRDD, not RDD.


Source: https://habr.com/ru/post/1244480/

