Sorting a vector using streams

Are vectors defined in C ++ STL re-entrant or thread safe? Can I use two threads and work (in this case sorting) on ​​two halves of a vector without using a mutex? For instance:

int size = (Data_List.size())/2; Thread A() { ................ // Do we need to lock Data_list with a mutex sort(Data_List.begin(),Data_List.begin()+size,cmp); } Thread B() { ....// Do we need to lock Data_list with a mutex sort(Data_List.begin()+(size+1),Data_List.end(),cmp); } 

My question is: do we need to block Data_List access using a mutex?

Note: cmp function is a normal comparison function.

+4
source share
5 answers

As long as the threads work in different areas of memory, and your comparison function only works with this memory and local variables, you should be fine. In fact, you β€œblock” each half of the table, dividing the work between threads and allowing the thread to work only on its half of the data.

+6
source

Almost, but not quite. This code as a whole is not thread safe, even assuming that the implementation provides reasonable guarantees of thread safety for the vector.

Suppose Data_List::value_type is a type for which your architecture does not provide atomic access. For example, on x86, byte records and aligned words and dword records are atomic, but there are no short records, and user-type records are not unless they are good size and aligned. If your UDT is size 3, you may have a problem.

To perform a non-atomic short write, an implementation can perform two byte writes. Or he can load a word, change two bytes and write the word back. On architectures where byte-writing is expensive, the latter is plausible.

Now suppose your implementation does the last. Suppose further that your division of a vector, as a rule, leaves the first half of the word in one part, and the second half of the word in the other. Finally, suppose that two different threads simultaneously try to change the word at the same time. This may go wrong - they can read the same value, both change it, but then one record will happen before the other, so one of the modifications will be overwritten and lost.

Atomic byte access by default is not universal, and I doubt that any implementation makes atomic byte access by default. Thus, a vector<bool> is a possible problem, even if other types of vectors are safe: your division of the vector may drop to the middle of the byte. Not that you usually sort bools this way, but there are other operations in which you can try to separate the vectors, or your code may be common.

You can usually expect access to words to be atomic (and in C or C ++, int access). But this is not guaranteed by the locale, hence sig_atomic_t . You say that your cmp is an int comparison function that assumes that your vector possibly contains an int. But you can perfectly compare shorts using the int comparison function, so I don't know.

What you really need to do is very important to check your implementation / platform and see what it says about secure memory access from multiple threads. This is probably good for an int vector.

+3
source

You can use the parallel GCC mode (if you use GCC) for streaming versions of various algorithms. http://gcc.gnu.org/onlinedocs/libstdc++/manual/parallel_mode.html

+2
source

technically, the standard says that these classes are not thread safe, so if you install data from things like the [] operator, then technically there are several records on the same object. On the other hand, it is safe to work with individual parts of the c array ... so there seems to be a conflict here. If you want to be scratchy, clear the address of the first element and consider it as a c-array. Many of the answers here say this is fine as long as you are on separate parts of the array, and although this is probably true for many implementations

-> I think it is important to note that the implementation is not safe.

+2
source

You should not have a problem if the threads work with disjoint parts of the vector.

+1
source

Source: https://habr.com/ru/post/1302943/


All Articles