NVIDIA GPU Scheduling

I have a couple of questions about how NVIDIA GPUs schedule work.

(1) If one warp of a block (CTA) has finished, but other warps of the same block are still running, does the finished warp keep its resources until the rest are done? In other words, a block's resources are freed only when all of its threads have finished, correct? I believe this must be the case, since the threads of a block share shared memory and other resources, and those resources are allocated at block (CTA) granularity.

(2) What if all threads of a block (CTA) stall for a long time, for example on a global memory access? Can a new block take over their resources, such as the processor? In other words, once a block (CTA) has been dispatched to an SM (streaming multiprocessor), does it hold its resources until it completes?

I would also be grateful for any recommendations of books or articles on GPU architecture. Thanks!

2 answers

When a block (CTA) is dispatched to an SM, the SM allocates resources for the entire block (registers, shared memory, warp slots, barriers, and so on). The block then stays resident on that SM until it completes. Each cycle, the SM's warp schedulers select eligible warps and issue their next instruction; a warp that cannot issue is simply not selected (this description applies to architectures up to and including Pascal). Every warp resident on the SM is identified by a warp-id.

When one warp finishes while other warps of the same block are still running, the finished warp simply stops being scheduled; its warp slot and warp-id are not handed to another block until the whole CTA has completed.

So yes: a block's resources are released only when all of its threads have finished, and until then the block remains resident on the SM.

As for stalls: when a warp issues a long-latency operation such as a global memory load, the scheduler just stops selecting it and switches to another eligible warp. Switching between warps is essentially free, while issuing an instruction takes 1-2 cycles. The hundreds of cycles of memory latency are hidden by having many resident warps to pick from. A stalled warp (or an entirely stalled block) does not give up its resources; instead, the SM keeps its ALUs busy by issuing from whatever other warps are ready.
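
As a back-of-the-envelope sketch of that latency-hiding argument (the function name and all numbers below are illustrative, not taken from the answer), a Little's-law style estimate of how many resident warps it takes to cover a stall:

```cpp
// Toy estimate: if a warp stalls for stall_cycles after issuing, and the
// scheduler can issue from a different warp every issue_interval_cycles,
// roughly this many warps must be in flight for the SM to never sit idle.
int warps_to_hide_latency(int stall_cycles, int issue_interval_cycles) {
    return stall_cycles / issue_interval_cycles + 1;
}
```

With an illustrative 400-cycle memory latency and a 4-cycle issue interval, `warps_to_hide_latency(400, 4)` gives 101 warps, which is why an SM keeps many more warps resident than it has execution units.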

This is where occupancy comes in with CUDA: the CUDA Occupancy Calculator shows how many blocks/warps can be resident on an SM for a given kernel, based on its register and shared-memory usage. The more warps an SM has resident, the more latency it can hide, whether from memory accesses, pipeline dependencies, or anything else.
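
The occupancy calculation itself is just a minimum over per-resource limits. A minimal sketch of that arithmetic (the function and all limit values are hypothetical, chosen only for illustration; real limits vary by architecture):

```cpp
#include <algorithm>

// How many blocks fit on one SM: the minimum over each resource constraint
// (warp slots, shared memory, registers). This is essentially what the
// CUDA Occupancy Calculator computes.
int blocks_per_sm(int warps_per_block, int smem_per_block, int regs_per_thread,
                  int max_warps_per_sm, int smem_per_sm, int regs_per_sm) {
    int threads_per_block = warps_per_block * 32;  // 32 threads per warp
    int by_warps = max_warps_per_sm / warps_per_block;
    int by_smem  = smem_per_block > 0 ? smem_per_sm / smem_per_block : by_warps;
    int by_regs  = regs_per_sm / (regs_per_thread * threads_per_block);
    return std::min({by_warps, by_smem, by_regs});
}
```

For example, with illustrative limits of 64 warps, 48 KB of shared memory, and 64K registers per SM, a kernel using 8 warps, 12 KB of shared memory, and 32 registers per thread is shared-memory limited to 4 resident blocks.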


To add to the other answer: the exact details are architecture-specific and vary between generations, but the overall picture, up to and including Pascal, is the same.

Briefly:

Q1. Does a warp that has finished keep its resources until the rest of the block finishes?

Yes. Resources such as shared memory are allocated per block, not per warp, so they can only be released once every thread of the block has finished; a warp that finishes early simply stops being scheduled.

Q2. If all warps of a block stall, for example on global memory, does the block give up its resources? Can a new block take its place on the SM?

No. Once a block is resident on an SM, it holds its resources until it completes. Stalled warps are simply skipped by the warp schedulers, which issue from whatever warps are eligible; a new block can launch only when a resident block finishes and frees its resources.
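
A toy model of these two rules, stalled or finished warps are just skipped, and the block's resources come back only when every warp is done. The types and names here are mine, purely for illustration, not real hardware or CUDA structures:

```cpp
#include <vector>

struct Warp {
    bool stalled;  // waiting on a long-latency operation (e.g. a global load)
    bool done;     // all threads of the warp have exited
};

// Each cycle the scheduler picks some eligible warp: not stalled, not done.
// Returns the index of the chosen warp, or -1 if nothing can issue this cycle.
int pick_eligible(const std::vector<Warp>& warps) {
    for (int i = 0; i < (int)warps.size(); ++i)
        if (!warps[i].stalled && !warps[i].done) return i;
    return -1;
}

// A block's resources (shared memory, registers, warp slots) are released
// only once every warp of the block has finished.
bool block_resources_freed(const std::vector<Warp>& warps) {
    for (const Warp& w : warps)
        if (!w.done) return false;
    return true;
}
```

Note that a fully stalled block (every warp stalled) makes `pick_eligible` return -1 for that block, yet `block_resources_freed` still reports false: the SM issues from other resident blocks instead of evicting this one.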

One caveat: some of this behaves differently below version 3.2 than on 3.2+, in particular with respect to (concurrent) parallelism on the GPU.

As for reading material: the official CUDA documentation is the natural starting point, as it covers the execution model in detail; from there, NVIDIA's per-architecture material goes deeper into the hardware.


Source: https://habr.com/ru/post/1677900/

