Sorting an RDD[String]

Consider

 val animals = List("penguin","ferret","cat").toSeq
 val rdd = sc.makeRDD(animals, 1)

I would like to sort this RDD. I am new to Scala and a bit confused about how to do this.

1 answer

See the RDD API documentation, and take a look at sortBy:

 sortBy[K]( f: (T) ⇒ K, ascending: Boolean = true, numPartitions: Int = this.partitions.size ) 

K is the type of the key you are sorting on, that is, the value f computes for each element of the RDD. f is a function that you can either define elsewhere with def and pass by name, or write inline as an anonymous function (which is more Scala-like). ascending and numPartitions should be self-explanatory.
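To make the two ways of passing f concrete, here is a minimal sketch. It assumes Spark is on the classpath and creates its own local SparkContext instead of relying on the shell's sc; the object name SortKeyDemo and the helper nameLength are made up for illustration and are not part of the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SortKeyDemo {
  // Hypothetical named key function: the sort key is the length of the name.
  def nameLength(animal: String): Int = animal.length

  def main(args: Array[String]): Unit = {
    // Local, single-threaded context (an assumption made for this sketch).
    val conf = new SparkConf().setAppName("SortKeyDemo").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.makeRDD(Seq("penguin", "ferret", "cat"), 1)

      // Key function defined with def and passed by name.
      val byLengthNamed = rdd.sortBy[Int](nameLength)

      // The same key written inline as an anonymous function.
      val byLengthInline = rdd.sortBy[Int](animal => animal.length)

      // Both lines list the names from shortest to longest.
      println(byLengthNamed.collect().mkString(", "))
      println(byLengthInline.collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

In both calls K is Int, because the key is a length, even though the RDD itself still contains Strings.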

So, considering all this, try:

 rdd.sortBy[String]({animal => animal}) 

Then try the following, which sorts in descending order:

 rdd.sortBy[String]({animal => animal}, false) 

And then this one, which sorts the RDD by the number of letters "e" in the animal's name, from most to least:

 rdd.sortBy[Int]({a => a.split("").filter(char => char == "e").size}, false) 

Note that the original rdd is not sorted in place: the operation returns a new, sorted RDD.
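A small sketch of that last point, under the same assumptions as before (Spark on the classpath, a local SparkContext created just for the example). The object name SortReturnsNewRdd and the helper originalAndSorted are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SortReturnsNewRdd {
  // Returns the contents of the original RDD and of the sorted RDD,
  // both collected after sortBy has been called.
  def originalAndSorted(sc: SparkContext): (Seq[String], Seq[String]) = {
    val rdd = sc.makeRDD(Seq("penguin", "ferret", "cat"), 1)
    val sorted = rdd.sortBy[String](animal => animal)
    (rdd.collect().toSeq, sorted.collect().toSeq)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SortReturnsNewRdd").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      val (original, sorted) = originalAndSorted(sc)
      // The original RDD keeps its input order.
      println(original.mkString(", "))
      // The sorted RDD is a separate value in alphabetical order.
      println(sorted.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

So to keep working with the sorted data, assign the result of sortBy to a new val rather than expecting rdd itself to change.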


Source: https://habr.com/ru/post/988206/
