Passing the functor using std::cref is likely to be counterproductive, but I make no promises. Only empirical testing in the exact context of interest can be definitive. In general, for tbb::parallel_for, my recommendation is:
- Pass the lambda by value.
- Unless semantic considerations dictate the capture mode, have the lambda capture objects by reference if they are not small objects that are cheap to copy. Remember that captured variables will normally be accessed far more often than the lambda is copied.
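The effect of the capture mode on copying can be sketched without TBB. The block below is only an illustration under stated assumptions: `Payload`, `g_payload_copies`, `run_in_tasks` and the two `copies_with_*` helpers are hypothetical names, and `run_in_tasks` is a sequential stand-in that copies the functor once per "task". It is not TBB code and does not model TBB's scheduling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical counter: how many times a Payload has been copy-constructed.
int g_payload_copies = 0;

// Hypothetical "large" object that is expensive to copy.
struct Payload {
    std::vector<double> data;
    explicit Payload(std::size_t n) : data(n, 1.0) {}
    Payload(const Payload& other) : data(other.data) { ++g_payload_copies; }
    double at(std::size_t i) const {
        assert(i < data.size());
        return data[i];
    }
};

// Sequential stand-in for an algorithm that takes the functor by value and
// gives every task its own copy of it. NOT the TBB implementation.
template <typename F>
double run_in_tasks(std::size_t tasks, F f) {
    double total = 0.0;
    for (std::size_t t = 0; t < tasks; ++t) {
        F task_copy = f;        // each "task" holds a copy of the functor
        total += task_copy(t);
    }
    return total;
}

// Lambda captures the payload by value: copying the lambda copies the payload.
int copies_with_value_capture(std::size_t tasks) {
    Payload p(1000);
    g_payload_copies = 0;
    run_in_tasks(tasks, [p](std::size_t i) { return p.at(i); });
    return g_payload_copies;
}

// Lambda captures the payload by reference: copying the lambda copies a reference.
int copies_with_reference_capture(std::size_t tasks) {
    Payload p(1000);
    g_payload_copies = 0;
    run_in_tasks(tasks, [&p](std::size_t i) { return p.at(i); });
    return g_payload_copies;
}
```

In this sketch the lambda itself is passed by value in both cases. With the by-value capture, every copy of the lambda drags a copy of the payload along, so `copies_with_value_capture(4)` reports at least one payload copy per task; with the by-reference capture, `copies_with_reference_capture(4)` reports none.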
Does TBB pay the cost of a heap allocation for the functor? The answer is definitely no for signatures of the form parallel_for(first, last, functor), because that form passes the functor by reference.
For signatures of the form parallel_for(range, functor), as in the question, the answer is "no extra cost". The functor itself is not heap-allocated directly. But each task created by TBB holds a copy of the functor, and tasks are heap-allocated (usually quickly, via local free lists). Using std::cref will not change the fact that tasks are heap-allocated; it will merely add an extra level of indirection.
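What std::cref changes can be sketched in plain C++. This is a rough illustration, not TBB code: `Body`, `Counts`, `copy_per_task`, `run_by_value` and `run_with_cref` are hypothetical names, and `copy_per_task` is a sequential stand-in that, like the range form described above, stores a copy of whatever it is given in each "task". It says nothing about how TBB allocates its tasks.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical functor that records how often it is copied and called.
struct Body {
    int* copy_counter;
    int* call_counter;
    Body(int* copies, int* calls) : copy_counter(copies), call_counter(calls) {
        assert(copies != nullptr && calls != nullptr);
    }
    Body(const Body& o) : copy_counter(o.copy_counter), call_counter(o.call_counter) {
        ++*copy_counter;
    }
    void operator()(std::size_t) const { ++*call_counter; }
};

struct Counts {
    int functor_copies;
    int calls;
};

// Sequential stand-in: takes the functor by value and copies it into each task.
template <typename F>
void copy_per_task(std::size_t tasks, F f) {
    for (std::size_t t = 0; t < tasks; ++t) {
        F task_copy = f;   // the task's own copy (of Body, or of the wrapper)
        task_copy(t);
    }
}

// Plain pass by value: Body is copied into the parameter and into every task.
Counts run_by_value(std::size_t tasks) {
    int copies = 0, calls = 0;
    Body body(&copies, &calls);
    copy_per_task(tasks, body);
    return {copies, calls};
}

// std::cref: only the std::reference_wrapper is copied; every call is
// forwarded through the wrapper to the single original Body.
Counts run_with_cref(std::size_t tasks) {
    int copies = 0, calls = 0;
    Body body(&copies, &calls);
    copy_per_task(tasks, std::cref(body));
    return {copies, calls};
}
```

In this sketch `run_with_cref` makes no copies of `Body`, but each task still holds its own object (the wrapper), and every invocation goes through one more level of indirection to reach the functor. Whether that trade pays off for a given functor is exactly what has to be measured.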
In fact, I was a little surprised that one of the forms of tbb::parallel_for passes the functor by reference and the other by value. I forget the reason, and I'm sure the TBB group must have discussed this. The choice may have been motivated by which benchmarks and machines were available at the time each form was introduced, or possibly by PPL compatibility for the "first, last" form, which does not appear to require the functor to be copy-constructible. As outlined earlier, the performance trade-off between pass-by-reference and pass-by-value is not simple. Pass-by-reference makes passing the functor cheap, but adds the cost of an indirection on each access (if the compiler cannot optimize it away).
As for the lifetime of the functor argument, it merely has to exist for the duration of the call to parallel_for.