I recommend something like AMD CodeXL or Intel VTune. CodeXL is free; Intel VTune has a free academic license, if that applies to you, or you can try the 30-day trial. Both work on Linux.
At the most basic level, these tools can identify hot spots, for example by measuring how much time you spend inside the std::mutex methods. Each tool also has more advanced analysis modes that may help you later on. You do not need to change your code at all, although you may need to make sure that you compile with debugging symbols and that the binaries have not been stripped. You probably also want to avoid aggressive optimization levels such as -O3 and stick with -O1, -O2 or -Og, so that the profiler output still maps cleanly onto your source.
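To get a feel for what a mutex hot spot looks like in the profiler, it can help to profile a deliberately contended toy program first. The following is only a minimal sketch; the function name `run_contended_counter` and its parameters are made up for illustration and are not part of either tool.

```cpp
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical workload: every thread increments one shared counter under a
// single std::mutex, so the threads spend much of their time competing for
// the lock. Returns the final counter value.
std::uint64_t run_contended_counter(unsigned num_threads, std::uint64_t iterations) {
    std::mutex m;
    std::uint64_t counter = 0;

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&m, &counter, iterations] {
            for (std::uint64_t i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> guard(m);  // contended acquisition
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Call it from `main` with, say, 8 threads and a few million iterations, build with something like `g++ -O2 -g -pthread`, and run the resulting binary under the profiler. With several threads, the locking calls should show up prominently in the hot spot view.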
PS: As with all optimization questions, I have to remind you to always measure where your performance problems really are before you start optimizing. No matter how worried you are about lock contention, confirm the problem with a profiler before putting a huge effort into alleviating contention you may or may not have.
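If you want a rough cross-check without a profiler, you can also time the lock acquisition yourself. This is a crude sketch, not a replacement for the tools above: `timed_lock` is a hypothetical helper, and the timing calls add overhead of their own.

```cpp
#include <chrono>
#include <mutex>

// Hypothetical helper: locks `m` and adds the time spent waiting for it to
// `waited`. The accumulator is updated only while `m` is held, so keep one
// accumulator per mutex (or per thread) to avoid racing on it.
std::unique_lock<std::mutex> timed_lock(std::mutex& m,
                                        std::chrono::nanoseconds& waited) {
    const auto start = std::chrono::steady_clock::now();
    std::unique_lock<std::mutex> lock(m);  // blocks here under contention
    const auto stop = std::chrono::steady_clock::now();
    waited += std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return lock;  // caller holds the lock until the returned object is destroyed
}
```

If the accumulated wait time is a tiny fraction of the total run time, the lock is probably not your bottleneck.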