Hi, I have a question about programming in CUDA. I have the following code:
int main() {
    for (;;) {
        kernel_1(x1, x2, ....);
        kernel_2(x1, x2 ...);
        kernel_3_Reduction(x1);
    }
}
So, when are the copies made, and how can I make sure that kernel_1, kernel_2 and kernel_3 have completed their tasks?