Disabling ALL asynchronous execution in CUDA programs

According to the CUDA programming guide, you can disable asynchronous kernel launches at run time by setting the environment variable CUDA_LAUNCH_BLOCKING=1.

This is a useful debugging tool. I also want to use it to measure how much my code benefits from concurrent kernels and transfers.

I also want to disable other asynchronous calls, in particular cudaMemcpyAsync.

Does CUDA_LAUNCH_BLOCKING affect these kinds of calls in addition to kernel launches? I suspect not. What would be the best alternative? I could add cudaStreamSynchronize calls, but I would prefer a runtime solution. I could run in the debugger, but that would affect the timing and defeat the purpose.
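To make the cudaStreamSynchronize idea concrete, here is a minimal sketch of how it could be turned into a runtime switch. It assumes a thin wrapper around cudaMemcpyAsync that synchronizes the stream only when an environment variable is set. The variable name FORCE_SYNC and the helpers forceSyncRequested and memcpyMaybeAsync are hypothetical names made up for this illustration; they are not part of CUDA.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical application-level switch (not a CUDA variable):
// FORCE_SYNC=1 in the environment makes every wrapped call block.
static bool forceSyncRequested() {
    const char* v = std::getenv("FORCE_SYNC");
    return v != nullptr && std::strcmp(v, "1") == 0;
}

// Hypothetical drop-in replacement for cudaMemcpyAsync.
static cudaError_t memcpyMaybeAsync(void* dst, const void* src, size_t count,
                                    cudaMemcpyKind kind, cudaStream_t stream) {
    cudaError_t err = cudaMemcpyAsync(dst, src, count, kind, stream);
    if (err == cudaSuccess && forceSyncRequested()) {
        // Block the host until all work queued on this stream, including
        // the copy just issued, has completed.
        err = cudaStreamSynchronize(stream);
    }
    return err;
}

int main() {
    const size_t n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float* host = nullptr;
    float* dev = nullptr;
    cudaStream_t stream;

    // Pinned host memory, so the copy can actually overlap with other work.
    cudaMallocHost((void**)&host, bytes);
    cudaMalloc((void**)&dev, bytes);
    cudaStreamCreate(&stream);
    for (size_t i = 0; i < n; ++i) host[i] = 1.0f;

    cudaError_t err = memcpyMaybeAsync(dev, host, bytes,
                                       cudaMemcpyHostToDevice, stream);
    if (err != cudaSuccess) {
        std::printf("copy failed: %s\n", cudaGetErrorString(err));
    }

    // Final synchronization so cleanup is safe in either mode.
    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
    cudaFree(dev);
    cudaFreeHost(host);
    return err == cudaSuccess ? 0 : 1;
}
```

Running the same binary with and without FORCE_SYNC=1 (optionally together with CUDA_LAUNCH_BLOCKING=1 for the kernel launches) would then give the two timings to compare, with no recompilation and no debugger involved.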


CUDA_LAUNCH_BLOCKING API . , 0, , , .


Source: https://habr.com/ru/post/1786259/

