Best way to copy multiple std::vectors into one? (Multithreaded)

Here's what I'm doing:

I take Bezier control points, run Bezier interpolation on them, and store the result in a std::vector<std::vector<POINT> >.

The Bezier calculation was slowing me down, so this is what I did.

I start with a std::vector<USERPOINT>, where USERPOINT is a struct holding a point plus two other points for the Bezier curve.

I divide them into ~4 groups and assign each thread 1/4 of the work. To do this, I created 4 std::vector<std::vector<POINT> > to store the results from each thread. In the end, all the points must be in one contiguous vector. Before I used multithreading, I wrote to that vector directly; now I reserve space for the 4 vectors produced by the threads and insert them into the original vector in the correct order. This works, but unfortunately the copy step is very slow and makes the whole thing slower than without multithreading. So my new bottleneck is copying the results into one vector. How could I do this more efficiently?

Thanks

+3
3 answers

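Have all the threads write their results into a single contiguous vector, just as before. You only have to ensure that each thread accesses a part of the vector that is separate from the others. As long as that holds (and it should anyway, since you don't want to generate the same output twice), each thread is working with its own memory, and you don't need any locking (etc.) for things to work. You do, however, need to make sure the result vector has the correct size for all the results first: multiple threads calling resize() or push_back() on the same vector will cause havoc (and will cause copying, which is what you want to avoid here).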
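As a rough, portable illustration of that idea (one result vector sized up front, each thread writing only its own index range), here is a minimal sketch using std::thread rather than Win32. `Point`, `evaluate`, `fill_range` and `compute_all` are hypothetical names invented for this example; `evaluate` merely stands in for the real Bezier calculation.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

struct Point { double x, y; };  // hypothetical stand-in for POINT

// Hypothetical per-element work; stands in for the Bezier evaluation.
Point evaluate(std::size_t i) {
    return Point{static_cast<double>(i), static_cast<double>(i) * 2.0};
}

// Writes result[begin, end) only; no other thread touches that range.
void fill_range(std::vector<Point>& result, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        result[i] = evaluate(i);
}

std::vector<Point> compute_all(std::size_t total, unsigned thread_count) {
    std::vector<Point> result(total);  // sized once, before any thread starts
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        // The slices cover [0, total) with no gaps and no overlap.
        std::size_t begin = total * t / thread_count;
        std::size_t end   = total * (t + 1) / thread_count;
        workers.emplace_back(fill_range, std::ref(result), begin, end);
    }
    for (auto& w : workers) w.join();  // every slice is complete after this
    return result;
}
```

Because the vector is never resized while the workers run, there is no final merge step and no locking; a call such as `compute_all(points_needed, 4)` returns the finished contiguous vector.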

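Edit: the usual way to do this is to pass each thread a pointer to the part of the vector where its results should go. For the sake of argument, assume we're using the std::vector<std::vector<POINT> > mentioned in the question. I'm skipping the details of creating the threads (they vary between systems). For simplicity, I'm also assuming that the number of curves is an exact multiple of the number of threads; in reality you will have to "fudge" the count for one thread, but that is unrelated to the question at hand.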

#include <windows.h>
#include <vector>

std::vector<USERPOINT> inputs; // input data
std::vector<std::vector<POINT> > outputs; // space for output data

const int thread_count = 4;

struct work_packet {           // describe the work for one thread
    USERPOINT *inputs;         // where to get its input
    std::vector<POINT> *outputs;   // where to put its output
    int num_points;                // how many points to process
    HANDLE finished;               // signaled when it's done.
};

std::vector<work_packet> packets(thread_count); // storage for the packets.
std::vector<HANDLE> events(thread_count);       // storage for parent's handles to events

outputs.resize(inputs.size());                  // can't resize output after processing starts.

for (int i=0; i<thread_count; i++) {
    int offset = i * inputs.size() / thread_count;
    packets[i].inputs = &inputs[0]+offset;
    packets[i].outputs = &outputs[0]+offset;
    packets[i].num_points = inputs.size()/thread_count;
    // unnamed auto-reset event, initially non-signaled
    events[i] = packets[i].finished = CreateEvent(NULL, FALSE, FALSE, NULL);

    // threads[i] stands in for however you create your threads (omitted here).
    threads[i].process(&packets[i]);
}


// wait for curves to be generated (Win32 style, for the moment).
WaitForMultipleObjects(thread_count, &events[0], TRUE, INFINITE);

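Note that, although you have to be sure the outputs vector itself is not resized while multiple threads are working on it, the individual vectors of points inside outputs can be, because each one is only ever touched by one thread.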
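To make that last point concrete, here is a small sketch (again with std::thread, not the Win32 calls above) in which the outer vector is sized once and each inner vector is grown by exactly one thread. `Point`, `make_curve` and `make_curves` are hypothetical names for this example, and the points generated are placeholders rather than a real Bezier interpolation.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

struct Point { double x, y; };  // hypothetical stand-in for POINT

// Hypothetical stand-in for interpolating one curve into `steps` points.
void make_curve(std::size_t curve, std::size_t steps, std::vector<Point>& out) {
    out.reserve(steps);  // growing an inner vector is safe: only one thread owns it
    for (std::size_t s = 0; s < steps; ++s)
        out.push_back(Point{static_cast<double>(curve), static_cast<double>(s)});
}

std::vector<std::vector<Point> > make_curves(std::size_t curve_count,
                                             std::size_t steps,
                                             unsigned thread_count) {
    // The outer vector gets its final size before any thread starts.
    std::vector<std::vector<Point> > curves(curve_count);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        std::size_t begin = curve_count * t / thread_count;
        std::size_t end   = curve_count * (t + 1) / thread_count;
        workers.emplace_back([&curves, begin, end, steps] {
            for (std::size_t c = begin; c < end; ++c)
                make_curve(c, steps, curves[c]);  // each inner vector has one writer
        });
    }
    for (auto& w : workers) w.join();
    return curves;
}
```

The outer vector is only indexed, never resized, while the workers run; all growth happens inside inner vectors that belong to a single thread.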

+4

, , Mulitthreading, , , , . - , , , .

, .. .

, , ? std::copy?

0

. .

-1

Source: https://habr.com/ru/post/1753456/