Why does an SM execute only one warp at a time in CUDA?

Question

In some CUDA materials, I often come across the following statement:

"At any time, only one of the warps is executed by an SM."

I don't quite understand this. Since each SM can run hundreds or thousands of threads simultaneously, why can only one warp (32 threads) be executed at a time?

Thanks!

cuda

Hailiang zhang Nov 19 '12 at 22:27

2 answers

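That statement is true only for the Tesla architecture. An SM manages many resident warps. On each cycle, a warp scheduler selects an eligible warp (that is, one that is not stalled) and issues instructions from it. The number of warp schedulers per SM, and the number of instructions each one can dispatch per cycle, depends on the compute capability: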

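Compute Capability 1.x (Tesla)

1 warp scheduler per SM
Dispatches 1 instruction per warp scheduler

Compute Capability 2.0 (Fermi 1st Generation)

2 warp schedulers per SM
Dispatches 1 instruction per warp scheduler

Compute Capability 2.1 (Fermi 2nd Generation)

2 warp schedulers per SM
Dispatches 1 or 2 instructions per warp scheduler

Compute Capability 3.x (Kepler)

4 warp schedulers per SM
Dispatches 1 or 2 instructions per warp scheduler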
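As a rough illustration of the list above, the following Python sketch tabulates the scheduler counts and dispatch widths and multiplies them to get an upper bound on instructions dispatched per SM per cycle. The names `SM_ISSUE`, `warps_selected_per_cycle`, and `peak_dispatch_per_cycle` are hypothetical, chosen only for this example. The result is a simple product of the listed figures, not a measured throughput; real issue rates also depend on stalls and available functional units.

```python
# Warp schedulers per SM and the maximum number of instructions each
# scheduler can dispatch per cycle, copied from the list in the answer.
# (The dictionary and helper names are assumptions made for this sketch.)
SM_ISSUE = {
    "1.x": (1, 1),  # Tesla
    "2.0": (2, 1),  # Fermi, 1st generation
    "2.1": (2, 2),  # Fermi, 2nd generation
    "3.x": (4, 2),  # Kepler
}


def warps_selected_per_cycle(compute_capability):
    """Each warp scheduler selects at most one eligible warp per cycle."""
    schedulers, _ = SM_ISSUE[compute_capability]
    return schedulers


def peak_dispatch_per_cycle(compute_capability):
    """Upper bound: schedulers per SM times max dispatch per scheduler."""
    schedulers, max_dispatch = SM_ISSUE[compute_capability]
    return schedulers * max_dispatch


if __name__ == "__main__":
    for cc in SM_ISSUE:
        print(cc, warps_selected_per_cycle(cc), peak_dispatch_per_cycle(cc))
```

Under these figures, only compute capability 1.x is limited to one warp per cycle per SM, which matches the point that the quoted statement holds for Tesla only.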

Greg Smith 20 . '12 2:49

Paul R · Accepted Answer · 2012-11-19T22:36:19+0000

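On early CUDA hardware, as I recall, each SM had 8 cores, and each core executed the same instruction for 4 threads (over 4 clock cycles). That is effectively 4-way SMT, which gives 32 threads per SM.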

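Also, a GPU has more than one SM. With 30 SMs, for example, 30 x 32-thread warps = 960 threads are executing at any instant. In practice, far more than 960 threads are usually "in flight" so that stalls such as memory latency can be hidden, but only 960 of them are executing at a time.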
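The arithmetic in this answer can be checked with a short Python sketch. It assumes a warp size of 32, the 8 cores per SM and 30 SMs used as examples in the answer, and one warp issuing per SM at a time. The figure of 1024 resident threads per SM is an assumed example value, not taken from the answer; the actual limit depends on the device. All function names are hypothetical.

```python
WARP_SIZE = 32  # threads per warp


def cycles_per_warp_instruction(cores_per_sm):
    """Cycles needed to push one warp instruction through the SM's cores."""
    return WARP_SIZE // cores_per_sm


def threads_executing_at_once(num_sms, warps_issuing_per_sm=1):
    """Threads executing at one instant across the whole GPU."""
    return num_sms * warps_issuing_per_sm * WARP_SIZE


def threads_in_flight(num_sms, resident_threads_per_sm):
    """Threads resident on the GPU, most of them waiting to be scheduled."""
    return num_sms * resident_threads_per_sm


if __name__ == "__main__":
    print(cycles_per_warp_instruction(8))   # 8 cores: 4 cycles per warp
    print(threads_executing_at_once(30))    # 30 SMs x 32 threads = 960
    # 1024 resident threads per SM is an assumed value for illustration.
    print(threads_in_flight(30, 1024))
```

The gap between the last two numbers is what lets the scheduler switch to another resident warp whenever the current one stalls.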
