How to parallelize std::partition using TBB

Does anyone have tips for efficiently parallelizing std::partition using TBB? Has this already been done?

Here is what I have in mind:

  • if the array is small, std::partition it (serially) and return
  • otherwise, treat the array as 2 interleaved arrays using custom iterators (interleaved in cache-sized blocks)
  • start a parallel partition task for each pair of iterators (recurse to step 1)
  • swap elements between the two partition/middle pointers *
  • return the merged partition/middle pointer

* I hope that in the average case this region will be small compared to the length of the array, or compared to the swaps that would be needed if the array were partitioned in contiguous chunks.

Any thoughts before I try?

+4
4 answers

I would treat this as a degenerate case of parallel sample sort. (Parallel code for sample sort can be found here.) Let N be the number of items. The degenerate sample sort requires Θ(N) temporary space, has Θ(N) work, and has Θ(P + lg N) span (critical path). The last two values matter for the analysis, since speedup is limited to work/span.

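I assume that the input is a random-access sequence. The steps are:

  • Allocate a temporary array large enough to hold a copy of the input sequence.
  • Divide the input into K blocks. K is a tuning parameter. For a system with P hardware threads, K = max(4 * P, L) may be a good choice, where L is a constant that prevents ridiculously small blocks. The "4 * P" leaves some room for load balancing.
  • Move each block to its corresponding position in the temporary array and partition it there with std::partition. Blocks can be processed in parallel. Remember the offset of the "middle" of each block. It may be worth writing a custom routine that both moves (in the C++11 sense) and partitions a block.
  • Compute the offset at which each part of each block belongs in the final result. The offsets for the first parts come from an exclusive prefix sum over the middle offsets from step 3. The offsets for the second parts are computed the same way, using each middle's offset relative to the end of its block. Unless you are dealing with more than 100 hardware threads, I recommend a serial exclusive scan.
  • Move the two parts of each block from the temporary array back to their places in the original sequence. Blocks can be copied in parallel.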
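As a rough illustration of the block-wise scheme (copy into a temporary array, partition each block, scan the block middles, move the parts back), here is a minimal sketch. It is not the answerer's code. It uses plain std::thread so that it has no dependencies; in a TBB build the two threaded loops would presumably become tbb::parallel_for calls. The function name `blocked_partition` and the parameter `num_blocks` (standing in for K) are made up for this example, and the element type is assumed to be default-constructible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: partitions `data` so that elements satisfying `pred`
// come first, and returns the index of the first element that does not.
template <typename T, typename Pred>
std::size_t blocked_partition(std::vector<T>& data, Pred pred, std::size_t num_blocks) {
    assert(num_blocks > 0);
    const std::size_t n = data.size();
    if (n == 0) return 0;
    num_blocks = std::min(num_blocks, n);

    std::vector<T> tmp(n);                                   // temporary copy space
    std::vector<std::size_t> lo(num_blocks + 1), mid(num_blocks);
    for (std::size_t b = 0; b <= num_blocks; ++b) lo[b] = b * n / num_blocks;

    // Move each block into tmp and partition it there (one thread per block;
    // a task scheduler would map blocks onto fewer worker threads).
    std::vector<std::thread> workers;
    for (std::size_t b = 0; b < num_blocks; ++b)
        workers.emplace_back([&, b] {
            std::move(data.begin() + lo[b], data.begin() + lo[b + 1], tmp.begin() + lo[b]);
            auto m = std::partition(tmp.begin() + lo[b], tmp.begin() + lo[b + 1], pred);
            mid[b] = static_cast<std::size_t>(m - tmp.begin());
        });
    for (auto& w : workers) w.join();
    workers.clear();

    // Serial exclusive scans: destination of each block's "true" and "false" part.
    std::vector<std::size_t> true_dst(num_blocks), false_dst(num_blocks);
    std::size_t total_true = 0;
    for (std::size_t b = 0; b < num_blocks; ++b) {
        true_dst[b] = total_true;
        total_true += mid[b] - lo[b];
    }
    std::size_t next_false = total_true;                     // false parts follow all true parts
    for (std::size_t b = 0; b < num_blocks; ++b) {
        false_dst[b] = next_false;
        next_false += lo[b + 1] - mid[b];
    }

    // Move both parts of each block back into the original sequence.
    for (std::size_t b = 0; b < num_blocks; ++b)
        workers.emplace_back([&, b] {
            std::move(tmp.begin() + lo[b], tmp.begin() + mid[b], data.begin() + true_dst[b]);
            std::move(tmp.begin() + mid[b], tmp.begin() + lo[b + 1], data.begin() + false_dst[b]);
        });
    for (auto& w : workers) w.join();
    return total_true;
}
```

The sketch computes the destinations of the "false" parts from the front (starting at the total count of "true" elements) rather than from the end of the output, which gives the same layout.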

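There is a way to fold the scan of step 4 into steps 3 and 5, which would reduce the span to Θ(lg N), but I doubt it is worth the extra complexity.

If you use tbb::parallel_for loops to parallelize steps 3 and 5, consider an affinity_partitioner, so that threads in step 5 can pick up what they left in cache during step 3.

Note that partitioning does only Θ(N) work for Θ(N) memory loads and stores. Memory bandwidth can easily become the limiting resource for speedup.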

+3

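Why not parallelize something like std::partition_copy instead? The reasons:

  • For std::partition, in-place swaps, as in the approach from the question, require logarithmic complexity because the results have to be merged recursively.
  • You pay memory for parallelism anyway when you use threads and tasks.
  • If the objects are heavy, it makes more sense to swap (shared) pointers anyway.
  • If the results can be stored concurrently, the threads can work independently.

It is fairly straightforward to apply parallel_for (for random-access iterators) or tbb::parallel_for_each (for non-random-access iterators) to process the input range. Each task can store its "true" and "false" results independently. There are many ways to store the results, for example in temporary storage that is combined afterwards.

The exact semantics of std::partition_copy can be achieved by copying from that temporary storage, or:

  • (random-access output iterators only) use atomic<size_t> cursors to synchronize where the results are stored (assuming there is enough space)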
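The atomic-cursor idea can be sketched roughly as follows. This is an illustration under assumptions, not code from the answer: the name `parallel_partition_copy`, the use of std::vector outputs, and the `num_threads` parameter are all hypothetical, and std::thread stands in for a TBB parallel loop. Both outputs are assumed to be pre-sized to hold the whole input, and the predicate is assumed to be safe to call from several threads at once.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: copies each element of `in` into `out_true` or
// `out_false` depending on `pred`, and returns {true count, false count}.
// The relative order of elements within each output is not deterministic.
template <typename T, typename Pred>
std::pair<std::size_t, std::size_t> parallel_partition_copy(
        const std::vector<T>& in, std::vector<T>& out_true, std::vector<T>& out_false,
        Pred pred, std::size_t num_threads) {
    assert(num_threads > 0);
    assert(out_true.size() >= in.size() && out_false.size() >= in.size());

    // Shared cursors: each fetch_add reserves one unique output slot.
    std::atomic<std::size_t> true_cursor{0}, false_cursor{0};
    const std::size_t n = in.size();

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t)
        workers.emplace_back([&, t] {
            const std::size_t first = t * n / num_threads;
            const std::size_t last = (t + 1) * n / num_threads;
            for (std::size_t i = first; i < last; ++i) {
                if (pred(in[i]))
                    out_true[true_cursor.fetch_add(1, std::memory_order_relaxed)] = in[i];
                else
                    out_false[false_cursor.fetch_add(1, std::memory_order_relaxed)] = in[i];
            }
        });
    for (auto& w : workers) w.join();   // join makes all writes visible to the caller

    return {true_cursor.load(), false_cursor.load()};
}
```

Reserving one slot per element keeps the sketch short but makes every thread contend on the same two counters; reserving a batch of slots per chunk of input would likely reduce that contention.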
+2

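Your approach should be correct, but why not follow the regular divide-and-conquer (or parallel_for) method? For two threads:

  • Split the array in two: turn your [start, end) into [start, middle) and [middle, end).
  • Run std::partition on both ranges in parallel.
  • Merge the partitioned results. This can be done with a parallel_for.

This should make better use of the cache.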
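For the two-thread case, the split/partition/merge idea might look like the sketch below. It is only an illustration: `two_way_partition` is a made-up name, std::async stands in for a TBB task, random-access iterators are assumed, and the merge is done serially with std::rotate instead of a parallel loop.

```cpp
#include <algorithm>
#include <cassert>
#include <future>
#include <iterator>
#include <vector>

// Hypothetical helper: partitions [first, last) using two threads and
// returns an iterator to the first element that does not satisfy `pred`.
template <typename It, typename Pred>
It two_way_partition(It first, It last, Pred pred) {
    assert(std::distance(first, last) >= 0);
    It middle = first + std::distance(first, last) / 2;

    // Partition the second half on another thread while this thread
    // partitions the first half.
    auto right = std::async(std::launch::async,
                            [=] { return std::partition(middle, last, pred); });
    It left_mid = std::partition(first, middle, pred);
    It right_mid = right.get();

    // Merge step: [left_mid, middle) holds "false" elements and
    // [middle, right_mid) holds "true" ones. Rotating moves the "true"
    // elements in front; std::rotate returns the new partition point.
    return std::rotate(left_mid, middle, right_mid);
}
```

The rotate touches only the elements between the two partition points, so the merge cost depends on how the "false" elements of the left half and the "true" elements of the right half are distributed.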

0

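It seems to me that this should parallelize nicely. Any thoughts before I try?

Well... maybe a few:

  • There is no real reason to create more tasks than you have cores. Since your algorithm is recursive, you also need to make sure you stop creating additional threads once you reach that limit, because it would just be wasted effort.
  • Keep in mind that splitting and merging the arrays costs processing power, so choose the split size so that it does not actually slow the computation down. Splitting a 10-element array is fiddly and will not get you where you want to be. Since the complexity of std::partition is linear, it is easy to overestimate the speedup of the task.

Since you asked and provided an algorithm, I hope you actually need parallelization here. If so, there is not much to add: the algorithm itself looks fine :)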

0

Source: https://habr.com/ru/post/1542401/

