The RDD documentation can be found here. Take a look at sortBy:
sortBy[K]( f: (T) ⇒ K, ascending: Boolean = true, numPartitions: Int = this.partitions.size )
K is the type of the key that f extracts from each element; the RDD is sorted by that key. f is a function that you can define elsewhere with def and pass by name, or you can write it inline as an anonymous function (more Scala-like). ascending and numPartitions should be self-explanatory.
So, considering all this, try:
rdd.sortBy[String]({animal => animal})
Then try the following:
rdd.sortBy[String]({animal => animal}, false)
And then this one, which sorts the RDD by the number of letters "e" in the animal's name, from most to least:
rdd.sortBy[Int]({a => a.split("").filter(char => char == "e").size}, false)
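Since f is an ordinary function, the key logic can be checked without Spark. The sketch below is plain Scala, not RDD code: it assumes a hypothetical object name (SortKeyDemo) and made-up animal names, and it uses List.sortBy with a reversed Ordering as a stand-in for passing ascending = false to the RDD version.

```scala
object SortKeyDemo {
  // Same key as in the example above: how many "e" characters the name contains.
  def countE(name: String): Int =
    name.split("").filter(char => char == "e").size

  // Shorter equivalent using String.count on characters.
  def countEShort(name: String): Int = name.count(_ == 'e')

  // Hypothetical sample data, not taken from the question.
  val animals: List[String] = List("cat", "eel", "bear")

  // Plain-collection analogue of rdd.sortBy[Int](f, false): most "e"s first.
  val byMostE: List[String] = animals.sortBy(countE)(Ordering[Int].reverse)

  def main(args: Array[String]): Unit =
    println(byMostE.mkString(", ")) // prints: eel, bear, cat
}
```

Defining the key with def like this also makes it easy to pass by name to the RDD version, for example rdd.sortBy[Int](SortKeyDemo.countE, false).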
Note that the original rdd is not sorted in place: the operation returns a new, sorted RDD.
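To see that the original is left untouched, here is a minimal sketch that runs Spark in local mode. It assumes Spark is on the classpath; the object name (SortByDemo), the app name, the local[*] master, and the sample animal names are all illustrative choices, not part of the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SortByDemo {
  // Returns (contents of the original RDD, contents of the sorted RDD).
  def run(): (Seq[String], Seq[String]) = {
    // Local-mode context, meant for experimentation only (assumption).
    val conf = new SparkConf().setAppName("sortBy-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample data.
      val rdd = sc.parallelize(Seq("cat", "eel", "bear"))

      // sortBy builds a new RDD; rdd itself is not modified.
      val sorted = rdd.sortBy[String]({ animal => animal })

      (rdd.collect().toSeq, sorted.collect().toSeq)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (original, sorted) = run()
    println(s"original: ${original.mkString(", ")}")
    println(s"sorted:   ${sorted.mkString(", ")}") // sorted: bear, cat, eel
  }
}
```

To keep working with the sorted data, assign the result to a new val as above, since calling rdd.sortBy(...) without using its return value has no visible effect.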