Spark RDD: the take and takeOrdered methods

I'm a little confused about how Spark's rdd.take(n) and rdd.takeOrdered(n) work. Can someone explain these two methods to me with a few examples? Thanks.

1 answer

To explain how ordering works, we create an RDD with integers from 0 to 99:

val myRdd = sc.parallelize(Seq.range(0, 100))

Now we can do:

myRdd.take(5)

which extracts the first 5 elements of the RDD. The result is an Array[Int] containing the first 5 integers of myRdd: '0 1 2 3 4' (no ordering function is applied; these are simply the elements in the first 5 positions).
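To make the difference from a sort visible, here is a minimal sketch that calls take on an RDD whose elements are not already in order. It assumes a local SparkSession (the `local[2]` master, the app name, and the `TakeExample` / `firstElements` names are illustrative, not part of the original answer); in spark-shell the predefined `sc` can be used instead.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

object TakeExample {
  // Returns the first n elements in partition order; no sorting is involved.
  def firstElements(sc: SparkContext, n: Int): Array[Int] = {
    // Elements deliberately out of order, kept in a single partition.
    val unsorted = sc.parallelize(Seq(42, 7, 99, 0, 13, 58), numSlices = 1)
    unsorted.take(n)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this illustration.
    val spark = SparkSession.builder().master("local[2]").appName("take-example").getOrCreate()
    try println(firstElements(spark.sparkContext, 3).mkString(" ")) // prints: 42 7 99
    finally spark.stop()
  }
}
```

The output follows the order in which the data was laid out, so take(n) on its own says nothing about which elements are smallest or largest.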

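takeOrdered(5) works in a similar way: it extracts 5 elements of the RDD as an Array[Int], but it lets us specify the ordering criterion:

myRdd.takeOrdered(5)(Ordering[Int].reverse)

This extracts the first 5 elements according to the specified ordering. In our case the result is: '99 98 97 96 95'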
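For comparison, the following sketch puts three calls side by side on the same unsorted data: takeOrdered with the implicit (ascending) ordering, takeOrdered with a reversed ordering, and top, which returns the largest elements in descending order. The object and method names and the local session settings are assumptions made for the example.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object TakeOrderedExample {
  // Smallest n elements, using the implicit ascending Ordering[Int].
  def smallest(rdd: RDD[Int], n: Int): Array[Int] = rdd.takeOrdered(n)

  // Largest n elements, obtained by reversing the ordering.
  def largest(rdd: RDD[Int], n: Int): Array[Int] = rdd.takeOrdered(n)(Ordering[Int].reverse)

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this illustration.
    val spark = SparkSession.builder().master("local[2]").appName("take-ordered-example").getOrCreate()
    try {
      val data = spark.sparkContext.parallelize(Seq(42, 7, 99, 0, 13, 58))
      println(smallest(data, 3).mkString(" ")) // prints: 0 7 13
      println(largest(data, 3).mkString(" "))  // prints: 99 58 42
      println(data.top(3).mkString(" "))       // prints: 99 58 42
    } finally spark.stop()
  }
}
```

Unlike take, these results do not depend on how the elements are distributed across partitions, because each call selects by the ordering rather than by position.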

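If your RDD holds a more complex data structure, you may want to supply your own ordering function:

myRdd.takeOrdered(5)(Ordering[Int].reverse.on { x => ??? })

This extracts the first 5 elements of your RDD as an array, ordered by your custom function (the ??? stands for the expression that maps an element to the Int it should be ordered by).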
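As an illustration of a custom ordering on a richer element type, the sketch below selects the highest-paid records from an RDD of a hypothetical `Employee` case class. The case class, its fields, the sample data, and the local session settings are all invented for the example; only `takeOrdered`, `Ordering.by`, `on`, and `reverse` come from Spark and the Scala standard library.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical record type, defined at top level so it serializes cleanly.
case class Employee(name: String, salary: Int)

object CustomOrderingExample {
  // Orders employees by salary, highest first.
  val bySalaryDesc: Ordering[Employee] = Ordering.by[Employee, Int](_.salary).reverse

  def topPaid(rdd: RDD[Employee], n: Int): Array[Employee] = rdd.takeOrdered(n)(bySalaryDesc)

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this illustration.
    val spark = SparkSession.builder().master("local[2]").appName("custom-ordering-example").getOrCreate()
    try {
      val employees = spark.sparkContext.parallelize(Seq(
        Employee("Ann", 5200), Employee("Bob", 4100),
        Employee("Cid", 6100), Employee("Dee", 3900)))
      println(topPaid(employees, 2).map(_.name).mkString(" ")) // prints: Cid Ann
    } finally spark.stop()
  }
}
```

The same ordering can be written in the style of the snippet above as `Ordering[Int].reverse.on[Employee](_.salary)`; which form reads better is a matter of taste.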


Source: https://habr.com/ru/post/1614700/

