I don't know how efficient this is, as it depends on current and future optimizations in the Spark engine, but you can try the following:
rdd.zipWithIndex.filter(_._2==9).map(_._1).first()
zipWithIndex converts the RDD into pairs (value, idx), with idx starting at 0. filter keeps the element whose idx == 9 (i.e., the 10th element). map extracts the original value, and first() returns it.
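To see the whole thing in context, here is a minimal runnable sketch. The SparkSession setup, app name, and sample data are illustrative additions, not part of the original answer:

// Minimal sketch, assuming a local Spark installation.
import org.apache.spark.sql.SparkSession

object NthElementSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("nth-element-sketch") // hypothetical app name
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Example RDD: the strings "a" through "z"
    val rdd = sc.parallelize(('a' to 'z').map(_.toString))

    // zipWithIndex -> (value, idx); keep idx == 9; map back to the value
    val tenth = rdd.zipWithIndex.filter(_._2 == 9).map(_._1).first()
    println(tenth) // prints "j"

    spark.stop()
  }
}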
The zipWithIndex step triggers a Spark job of its own when the RDD has more than one partition, so it can influence the behaviour of the whole processing. Give it a try.
In any case, this method has the advantage that even if n is very large, it does not need to collect an array of the first n elements on the driver node.
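For contrast, here is a sketch of the driver-side alternative the answer alludes to (my addition, not from the original): take(n) materializes the first n elements as an array on the driver, which gets expensive as n grows.

// Driver-side approach: builds an Array of 10 elements on the driver
val nthViaTake = rdd.take(10).last

// Distributed approach from the answer: only one value reaches the driver
val nthViaIndex = rdd.zipWithIndex.filter(_._2 == 9).map(_._1).first()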