repartition() does not change the number of RDD partitions

I am trying to change the number of partitions of an RDD using the repartition() method. The call on the RDD succeeds, but when I explicitly check the number of partitions with rdd.partitions.size, I get the same number of partitions the RDD originally had:

scala> rdd.partitions.size
res56: Int = 50

scala> rdd.repartition(10)
res57: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[19] at repartition at <console>:27

At this point I run an action such as rdd.take(1) to force evaluation, in case that matters. Then I check the number of partitions again:

scala> rdd.partitions.size
res58: Int = 50

As you can see, it does not change. Can anyone answer why?

1 answer

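First, repartition is a lazy transformation, so no data is actually moved until an action runs. Second, repartition returns a new RDD with the changed partitioning; it does not modify the RDD it is called on, so you must use the returned RDD, otherwise you are still looking at the old partitioning. Finally, when you are reducing the number of partitions, prefer coalesce, because it merges existing partitions and avoids a full shuffle of the data.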
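A minimal sketch of the point above, assuming a local Spark session. The sample data and the names repartitioned and coalesced are hypothetical and only stand in for the RDD from the question. In spark-shell, spark and sc already exist, so the builder call simply returns the existing session.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell this returns the session that already exists;
// in a standalone app it starts a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("repartition-demo")
  .getOrCreate()
val sc = spark.sparkContext

// Hypothetical sample data with 50 partitions, standing in for the RDD in the question.
val rdd = sc.parallelize(1 to 1000, numSlices = 50).map(_.toString)

// Both calls return a NEW RDD; the original `rdd` is left untouched.
val repartitioned = rdd.repartition(10) // full shuffle
val coalesced = rdd.coalesce(10)        // merges partitions, no full shuffle

println(rdd.partitions.size)           // 50: the original RDD is unchanged
println(repartitioned.partitions.size) // 10
println(coalesced.partitions.size)     // 10
```

In the session from the question, the result of rdd.repartition(10) was bound to res57, so res57.partitions.size would report 10 while rdd.partitions.size still reports 50.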


Source: https://habr.com/ru/post/1598637/

