Your idea is very neat, and it is a pity that no such function exists out of the box (AFAIK).
I just rephrased your idea into slightly shorter code. First, I believe that for parallel folding it is useful to rely on the concept of a monoid: a structure with an associative operation and an identity (zero) element. Associativity matters because we do not know the order in which the results computed in parallel will be combined, and the zero element matters so that we can split the computation into blocks and start folding each block from zero. This is nothing new; it is exactly what `fold` expects for Scala collections.
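A minimal sketch of such a monoid abstraction, matching how it is used below (the trait name `Monoid` and the encoding as a binary function are assumptions of this answer; libraries like Cats define theirs differently):

```scala
// A monoid: an associative binary operation with an identity (zero) element.
// Extending (A, A) => A lets an instance be passed directly as a fold function.
trait Monoid[A] extends ((A, A) => A) {
  def zero: A
}

// Example instance: integer addition, with identity 0.
object IntSumMonoid extends Monoid[Int] {
  override val zero: Int = 0
  override def apply(x: Int, y: Int): Int = x + y
}
```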
Further, Scala's `Iterator` already has a useful method `grouped(size: Int): GroupedIterator[Seq[A]]`, which slices the iterator into fixed-size sequences. This is very similar to your `split`. It lets us cut the input data into blocks of a fixed size and then apply the parallel Scala collection methods to each block:
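To illustrate what `grouped` produces, here is a small sketch; note that the final block may be shorter when the input size is not a multiple of the block size:

```scala
// Slice 1..10 into blocks of 3 elements; the last block holds the remainder.
val groups = Iterator.range(1, 11).grouped(3).toList
// groups == List(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9), Seq(10))
```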
```scala
// In Scala 2.13+ the .par extension requires the scala-parallel-collections module:
import scala.collection.parallel.CollectionConverters._

def parFold[A](c: Iterator[A], blockSize: Int)(implicit monoid: Monoid[A]): A =
  c.grouped(blockSize)
    .map(_.par.fold(monoid.zero)(monoid)) // fold each block in parallel
    .fold(monoid.zero)(monoid)            // combine the per-block results
```
We fold each block using a parallel collection, and then (without any parallelization) combine the intermediate results.
Example:
```scala
object SumMonoid extends Monoid[Long] {
  override val zero: Long = 0
  override def apply(x: Long, y: Long): Long = x + y
}

val it = Iterator.range(1, 10000001).map(_.toLong)
println(parFold(it, 100000)(SumMonoid)) // prints 50000005000000
```