The advantage of using multiple sets of SIMD instructions simultaneously

I am writing a highly parallel, multi-threaded application. I already have an SSE-accelerated thread class. If I were to write an MMX-accelerated thread class and then run both at the same time (one SSE thread and one MMX thread per core), would performance improve noticeably?

I would think that this setup would help hide memory latency, but I would like to be sure before I start investing time in it.

+4
2 answers

The SSE and MMX instruction sets use the same vector execution units in the CPU. Running one SSE thread and one MMX thread therefore gives each thread the same resources it would have if you ran two SSE threads (or two MMX threads). The only difference is the instructions that exist in SSE but not in MMX (SSE extends what MMX offers). If anything, the MMX thread is likely to be slower, because it does not have those more advanced instructions available.

So the answer is: No, you will not see a performance improvement compared to running two SSE threads.
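To make the overlap concrete, here is a minimal sketch in C of the same 16-bit integer addition written once with MMX intrinsics and once with SSE2 intrinsics. The function names `add_i16_mmx` and `add_i16_sse2` are hypothetical and chosen for illustration; the sketch assumes an x86 or x86-64 compiler such as GCC or Clang with MMX and SSE2 enabled. It shows only that the two versions compute the same result, with the SSE2 one handling twice as many lanes per instruction; it does not measure anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <mmintrin.h>   /* MMX:  __m64,   _mm_add_pi16  */
#include <emmintrin.h>  /* SSE2: __m128i, _mm_add_epi16 */

/* Hypothetical helper: add 16-bit lanes with MMX, 4 lanes (64 bits) per step. */
void add_i16_mmx(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    assert(n % 4 == 0);  /* sketch only: no scalar tail loop */
    for (size_t i = 0; i < n; i += 4) {
        __m64 va, vb, vr;
        memcpy(&va, a + i, sizeof va);   /* 64-bit load without alignment assumptions */
        memcpy(&vb, b + i, sizeof vb);
        vr = _mm_add_pi16(va, vb);       /* wrapping add of four 16-bit lanes */
        memcpy(out + i, &vr, sizeof vr);
    }
    _mm_empty();  /* MMX registers alias the x87 registers; reset state before FP code */
}

/* Hypothetical helper: the same addition with SSE2, 8 lanes (128 bits) per step. */
void add_i16_sse2(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    assert(n % 8 == 0);  /* sketch only: no scalar tail loop */
    for (size_t i = 0; i < n; i += 8) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* wrapping add of eight 16-bit lanes */
        _mm_storeu_si128((__m128i *)(out + i), _mm_add_epi16(va, vb));
    }
}
```

Calling both helpers on the same eight-element arrays should produce identical output buffers. Whether either version is faster on a given CPU is something only a benchmark on that CPU can tell you.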

+8

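SSE and MMX use the same execution units, so it doesn't matter which of the two you use (and that certainly holds when using MMX and SSE together). Note that the registers themselves are separate: MMX registers alias the x87 floating-point registers, while SSE has its own XMM registers.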

The better question is how SSE is implemented on your target CPU. Does it have an SSE unit per core? (Probably.) If so, then you may as well run SSE instructions on every thread.

If the cores share a single SSE unit, then the threads will compete for it, and little will be gained by running SSE instructions on several threads. (I don't know whether any processors actually share an SSE unit this way, so treat this as a hypothetical case.)

-1

Source: https://habr.com/ru/post/1309860/

