I am learning how to write reliable OpenCL code, and I am looking at the following kernel:
string kernel_code =
" void kernel simple_add(global const int *A, "
" global const int *B, "
" global int *C, int n) { "
" "
" int index = get_global_id(0); "
" C[index]=A[index]+B[index]; "
" } ";
And this is the host code I use to launch it on the GPU:
Kernel ker(program, "simple_add");
ker.setArg(0, buffer_A);
ker.setArg(1, buffer_B);
ker.setArg(2, buffer_C);
ker.setArg(3, N);
q.enqueueNDRangeKernel(ker,NullRange,NDRange(32),NDRange(32));
q.finish();
The thing is, I launch more work items than necessary, so I think the kernel should check that the index is within bounds. But I do not do that check, and when I look at the error codes returned by enqueueNDRangeKernel and by finish, both give me CL_SUCCESS. Shouldn't this produce an error of some kind, or do I just not know how to retrieve it? What is the answer?
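To make concrete what I mean by checking the index, here is a minimal sketch that imitates the kernel on the host in plain C++, so it runs without an OpenCL device. The names `simple_add_guarded` and `run_simulated` are made up for illustration and are not part of the OpenCL API; `gid` stands in for `get_global_id(0)`, and I am assuming `n` is the number of valid elements in the buffers. In the real kernel the equivalent guard would presumably be `if (index >= n) return;` before the write to `C`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side stand-in for ONE work item of a guarded simple_add kernel.
// `gid` plays the role of get_global_id(0); `n` is the number of valid elements.
void simple_add_guarded(std::size_t gid,
                        const std::vector<int>& A,
                        const std::vector<int>& B,
                        std::vector<int>& C,
                        int n) {
    // Surplus work items (gid >= n) return without touching any buffer.
    if (n < 0 || gid >= static_cast<std::size_t>(n)) return;
    C[gid] = A[gid] + B[gid];
}

// Hypothetical helper: imitates a 1-D launch by calling the "kernel"
// once for every id in [0, global_size). Not an OpenCL call.
std::vector<int> run_simulated(const std::vector<int>& A,
                               const std::vector<int>& B,
                               std::size_t global_size) {
    assert(B.size() >= A.size());  // assumption: B is at least as long as A
    const int n = static_cast<int>(A.size());
    std::vector<int> C(A.size(), 0);
    for (std::size_t gid = 0; gid < global_size; ++gid) {
        simple_add_guarded(gid, A, B, C, n);
    }
    return C;
}
```

For example, calling `run_simulated` with two 5-element vectors and a global size of 32 fills only the 5 valid results; the ids 5 through 31 hit the guard and return early instead of reading or writing past the end.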