Let's say you declare a new local variable inside a CUDA kernel, which is then executed by multiple threads, for example:

__global__ void kernel(float* delt, float* deltb)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float a;
    a = delt[i] + deltb[i];
    a += 1;
}
and the kernel call looks something like the following, with several threads and blocks:
int threads = 200;
uint3 blocks = make_uint3(200, 1, 1);
kernel<<<blocks, threads>>>(d_delt, d_deltb);
- Is "a" stored on the stack?
- Is a separate "a" created for each thread when the kernel launches?
- Or will all threads access the same "a" at unpredictable times, potentially breaking the algorithm?