Porting a program to CUDA - a kernel inside another kernel?

I am trying to parallelize a function that contains several procedures. The function looks like this:

void _myfunction(M1, M2) {
    for (a = 0; a < A; a++) {
        Amatrix = procedure1(M1);  /* contains for loops */
        Bmatrix = procedure2(M1);  /* contains for loops */

        ...
        for (z = 1; z < Z; z++) {
            /* calculations with Amatrix(z) to obtain AAmatrix */
            /* calculations with Bmatrix(z) to obtain BBmatrix */
            for (e = 1; e < E; e++) {
                /* calculations with AAmatrix(e) to obtain CCmatrix */
                /* calculations with BBmatrix(e) to obtain DDmatrix */
            }
        }
        for (q = 0; q < Q; q++) { /* calculations with CCmatrix(q) */ }
        for (m = 0; m < M; m++) { /* calculations with DDmatrix(m) */ }
    }
}

As for the functions procedure1() and procedure2(), I have already ported them to CUDA and everything works well (each of these procedures has its own for loops). They are kept as separate functions because they are conceptually independent algorithms, unlike the rest of the code, which is more general-purpose.
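
A minimal sketch of what that porting might look like, assuming procedure1 became a kernel with a thin host wrapper (the names, element type and placeholder arithmetic below are hypothetical, not the original code):

__global__ void procedure1_kernel(const float *M1, float *Amatrix, int n)
{
    /* one thread per element of M1 */
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        Amatrix[i] = 2.0f * M1[i];   /* placeholder for the real per-element work */
}

/* host wrapper: launches the kernel over the whole matrix */
void procedure1(const float *d_M1, float *d_Amatrix, int n)
{
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    procedure1_kernel<<<blocks, threads>>>(d_M1, d_Amatrix, n);
}

procedure2 would follow the same pattern with its own kernel.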

Now I would like to port the rest of the code to CUDA as well. The obvious idea is to turn _myfunction(arg1, arg2, ...) itself into a kernel, but then it would have to call procedure1() and procedure2(), which are already kernels themselves. In other words, a kernel would have to launch another kernel, and as far as I understand that is not possible on my hardware.

The question: how should I proceed, i.e. what is the right way to port this program to CUDA?

P.S.: My card is a GeForce 9600GT (Compute Capability 1.1) and I am using CUDA Toolkit 5.0.
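
For the record, launching one kernel from inside another, which is what the question is asking about, looks roughly like the hypothetical sketch below; it is only legal with Dynamic Parallelism (Compute Capability 3.5 or later, compiled with -rdc=true), so it is not an option on a 9600GT:

__global__ void procedure1_kernel(const float *M1, float *Amatrix, int n);  /* as in the sketch above */

/* Requires Dynamic Parallelism (Compute Capability 3.5+, CUDA 5.0+);
   it will NOT compile for a Compute 1.1 device. Names are hypothetical. */
__global__ void myfunction_kernel(const float *M1, float *Amatrix, int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        /* a kernel launched from inside another kernel */
        procedure1_kernel<<<(n + 255) / 256, 256>>>(M1, Amatrix, n);
    }
}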

1 answer

You cannot call a CUDA kernel from inside another kernel unless the device supports Dynamic Parallelism, and Dynamic Parallelism only appeared with CUDA 5.0 on Kepler GPUs (Compute Capability 3.5 and later). A Compute 1.1 card cannot use it. That does not mean the rest of the code cannot be ported; it only means the work has to be organized differently (for example, with the outer loops staying on the host). Essentially you have two options:

Option 1: keep the outer loops on the CPU and launch your existing kernels (plus new ones for the remaining loops) from host code, one after another. Option 2: turn procedure1 and procedure2 into __device__ functions and call them from one larger CUDA kernel that also does the rest of the work. Either way, no kernel ever has to launch another kernel.
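
As an illustration of option 2 (combined with option 1 for the outer loop), here is a minimal hypothetical sketch; the names, sizes and per-element arithmetic are placeholders, not the real algorithm:

/* The former kernels become __device__ functions callable from device code. */
__device__ float procedure1_elem(float x) { return 2.0f * x; }   /* placeholder math */
__device__ float procedure2_elem(float x) { return x + 1.0f; }   /* placeholder math */

/* One kernel does the per-element work of the inner z/e loops. */
__global__ void myfunction_kernel(const float *M1, float *CC, float *DD, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float a = procedure1_elem(M1[i]);   /* was the procedure1 kernel */
    float b = procedure2_elem(M1[i]);   /* was the procedure2 kernel */

    CC[i] = a * a;   /* placeholder for the AAmatrix/CCmatrix calculations */
    DD[i] = b * b;   /* placeholder for the BBmatrix/DDmatrix calculations */
}

/* The outer 'a' loop stays on the host (option 1) and launches the kernel. */
void _myfunction(const float *d_M1, float *d_CC, float *d_DD, int n, int A)
{
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    for (int a = 0; a < A; a++)
        myfunction_kernel<<<blocks, threads>>>(d_M1, d_CC, d_DD, n);
    cudaDeviceSynchronize();
}

Whether one large kernel or several smaller kernels launched from the host is faster depends on how much intermediate data has to be carried between the stages; both variants are worth profiling.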


Source: https://habr.com/ru/post/1616349/

