Why is only one of the warps executed by an SM in CUDA?

In some CUDA materials, I often come across the following statement:

"At any time, only one of the warps is executed by an SM."

I do not quite understand this. If each SM can run hundreds or thousands of threads simultaneously, why can only one warp, which is 32 threads, be executed at a time?

Thanks!

+1
2 answers

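On the CUDA hardware that this statement describes, each SM has 8 cores, and each core processes 4 threads of a warp in turn (one warp instruction is spread over 4 clock cycles). In effect this is 4-way SMT, so 32 threads run on each SM.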

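However, a GPU has many SMs. With 30 of them, for example, 30 SMs x 32 threads per warp = 960 threads execute in parallel. So the GPU as a whole keeps 960 threads "in flight" at a time, even though each SM is working on a single warp.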

+4

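That statement is true for the Tesla architecture, but it does not hold for later ones. Each SM contains more execution units than warp schedulers. A warp scheduler selects a warp and dispatches an instruction from that warp to the execution units. The set of execution units differs from one SM design to another: CUDA cores, load/store units, and so on.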

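Compute Capability 1.x (Tesla)

  • 1 warp scheduler per SM
  • 1 instruction dispatched per warp

Compute Capability 2.0 (Fermi 1st Generation)

  • 2 warp schedulers per SM
  • 1 instruction dispatched per warp

Compute Capability 2.1 (Fermi 2nd Generation)

  • 2 warp schedulers per SM
  • 1 or 2 instructions dispatched per warp

Compute Capability 3.x (Kepler)

  • 4 warp schedulers per SM
  • 1 or 2 instructions dispatched per warp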
+3

Source: https://habr.com/ru/post/1527116/

