You can use a counter with the foreachPartition() API to achieve this.
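Here is a Java program that prints the contents of each partition:

    import java.util.Arrays;
    import java.util.Iterator;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.VoidFunction;

    // conf is an existing SparkConf
    JavaSparkContext context = new JavaSparkContext(conf);
    JavaRDD<Integer> myArray = context.parallelize(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9));
    // Redistribute the data into two partitions
    JavaRDD<Integer> partitionedArray = myArray.repartition(2);
    System.out.println("partitioned array size is " + partitionedArray.count());
    // The callback runs once per partition and receives that partition's elements
    partitionedArray.foreachPartition(new VoidFunction<Iterator<Integer>>() {
        public void call(Iterator<Integer> arg0) throws Exception {
            while (arg0.hasNext()) {
                System.out.println(arg0.next());
            }
        }
    });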
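The counter idea itself can be sketched without Spark. The following is only a single-JVM model, not Spark code: each inner list stands in for one partition, and the hypothetical helper `describePartition` plays the role of the `call` method, keeping a local counter while it walks the iterator. The class name, the helper names, and the way the nine numbers are split across the two lists are all assumptions made for illustration; `repartition(2)` does not promise any particular split. On a real cluster the callback is shipped to executors, so a counter shared across partitions would not behave like the `partitionIndex` variable here; a counter declared inside `call` is the safer choice, and `TaskContext.getPartitionId()` is one way to obtain the partition's index there.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class PartitionCounterSketch {

    // Hypothetical helper: mimics the body of a foreachPartition callback.
    // It walks one partition's iterator and keeps a local element counter.
    static String describePartition(int partitionIndex, Iterator<Integer> elements) {
        int count = 0;
        StringBuilder contents = new StringBuilder();
        while (elements.hasNext()) {
            if (count > 0) {
                contents.append(", ");
            }
            contents.append(elements.next());
            count++;
        }
        return "partition " + partitionIndex + ": " + count
                + " element(s) [" + contents + "]";
    }

    // Hypothetical driver: applies the callback to every modelled partition.
    static List<String> describeAll(List<List<Integer>> partitions) {
        List<String> lines = new ArrayList<>();
        int partitionIndex = 0; // valid only in this single-JVM model
        for (List<Integer> partition : partitions) {
            lines.add(describePartition(partitionIndex, partition.iterator()));
            partitionIndex++;
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed split of 1..9 into two partitions, for illustration only.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 3, 5, 7, 9),
                Arrays.asList(2, 4, 6, 8));
        for (String line : describeAll(partitions)) {
            System.out.println(line);
        }
    }
}
```

In the Spark version, the same counting loop would go inside `call`, replacing the bare `System.out.println(arg0.next())`.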