Based on your explanations, you want local lists to be merged at the end of your algorithm. If each thread finds a lot of solutions, using mutexes will significantly slow down your calculations. From what I understand from your context, you should think that memory is pretty cheap today, but it's hard to understand without knowing the size of the solution.
There may also be list merging algorithms that are minimal in memory space (i.e. by copying a small amount of data at a time), which can solve problems with memory size.
Thus, there are hybrid solutions to your problem. For example, you can create a shared container and split it to assign partitions for each stream, interlaced or block. This will allow you to remove the mutex for each access, but this will require a complex container growth mechanism (since it does not seem possible to know how many decisions you will have in advance).
source share