cudaMemcpyToSymbol copies to a constant variable. Here you are trying to copy multiple bytes of type int (an allocated array) to a pointer of type int *. These types do not match, hence the invalid type error. To make this work, you need to copy an allocated array of int to a statically sized constant array of int on the device, for example:
__device__ __constant__ int dic[LEN];
An example from the CUDA C Programming Guide (which I suggest you read, it is quite good!):
__constant__ float constData[256];
float data[256];
cudaMemcpyToSymbol(constData, data, sizeof(data));
cudaMemcpyFromSymbol(data, constData, sizeof(data));
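To make the array-to-constant-array route concrete, here is a minimal end-to-end sketch. The names LEN, flatDic, copyOut, and d_out, and the array contents, are placeholders chosen for illustration rather than taken from the question, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define LEN 8  // hypothetical length; a __constant__ array needs a compile-time size

__constant__ int dic[LEN];  // statically sized array in device constant memory

// Each thread copies one entry of the constant array to global memory.
__global__ void copyOut(int *out)
{
    int i = threadIdx.x;
    if (i < LEN) out[i] = dic[i];
}

int main()
{
    int flatDic[LEN];  // host-side source data (illustrative values)
    for (int i = 0; i < LEN; ++i) flatDic[i] = i * i;

    // Host array -> constant array: the symbol itself is the destination.
    cudaMemcpyToSymbol(dic, flatDic, sizeof(flatDic));

    int *d_out = nullptr;
    cudaMalloc((void **)&d_out, sizeof(flatDic));
    copyOut<<<1, LEN>>>(d_out);

    int result[LEN];
    // A blocking cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(result, d_out, sizeof(result), cudaMemcpyDeviceToHost);
    for (int i = 0; i < LEN; ++i) printf("%d ", result[i]);
    printf("\n");

    cudaFree(d_out);
    return 0;
}
```

Note that no cudaMalloc is needed for dic itself: the storage for a __constant__ array is reserved when the module is loaded.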
As far as I know, you can also cudaMemcpyToSymbol a pointer to a pointer (unlike your example, where you copy an array to a pointer), but be careful: only the pointer will be constant, not the memory it points to on your device. If you go this route, you need to add a cudaMalloc and then cudaMemcpyToSymbol the resulting device pointer to your __constant__ device variable. AGAIN, in this case the array values will NOT be constant; ONLY the pointer to the memory will be.
Your call for this case will look something like this:
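// Assumes c_dic_ptr is declared at file scope as: __constant__ int *c_dic_ptr;
int *d_dic;
cudaMalloc((void **) &d_dic, num_codewords*(bSizeY*bSizeX)*sizeof(int));
// Copy the pointer value itself, not the array it points to.
cudaMemcpyToSymbol(c_dic_ptr, &d_dic, sizeof(int *));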
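A fuller sketch of the pointer route may help show what is and is not constant. It assumes a file-scope declaration of c_dic_ptr, and the kernel sumDic, the element count n, and the data are invented for illustration; error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__constant__ int *c_dic_ptr;  // only this pointer lives in constant memory

// Sums n ints through the constant pointer; the data itself is in global memory.
__global__ void sumDic(int n, int *out)
{
    int s = 0;
    for (int i = 0; i < n; ++i) s += c_dic_ptr[i];
    *out = s;
}

int main()
{
    const int n = 4;               // hypothetical element count
    int h_dic[n] = {1, 2, 3, 4};   // illustrative host data

    // Allocate ordinary global memory and fill it with the array.
    int *d_dic = nullptr;
    cudaMalloc((void **)&d_dic, n * sizeof(int));
    cudaMemcpy(d_dic, h_dic, n * sizeof(int), cudaMemcpyHostToDevice);

    // Store the device address (sizeof(int *) bytes) in the constant symbol.
    cudaMemcpyToSymbol(c_dic_ptr, &d_dic, sizeof(int *));

    int *d_out = nullptr;
    cudaMalloc((void **)&d_out, sizeof(int));
    sumDic<<<1, 1>>>(n, d_out);

    int h_out = 0;
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("sum = %d\n", h_out);

    cudaFree(d_out);
    cudaFree(d_dic);
    return 0;
}
```

Because the array sits in global memory, reads through c_dic_ptr do not go through the constant cache, and a kernel could still overwrite the values.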
You should also wrap your CUDA calls in error-checking logic while debugging. I borrowed the following logic from talonmies:
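// Prints the CUDA error string with file and line, then optionally exits.
// file is const char * because __FILE__ expands to a string literal.
inline __host__ void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }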
To make a call, simply wrap your CUDA call as follows:
gpuErrchk(cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int)));
The program will exit with an error message if you have an allocation problem or another common error.
To test your kernel, do something like:
MyKernel<<<BLK,THRD>>>(vars...);
// Make sure nothing went wrong.
gpuErrchk(cudaPeekAtLastError());    // catches launch errors
gpuErrchk(cudaDeviceSynchronize());  // catches errors raised while the kernel runs
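To see the check fire, the sketch below deliberately launches an empty kernel with an invalid configuration. The kernel name noop and the block size of 4096 are assumptions for illustration; 4096 threads per block is above the per-block limit of CUDA GPUs that I am aware of (1024 on current hardware), so the launch should be rejected. The gpuAssert helper is repeated so the example compiles on its own.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Error-checking helper in the style of the talonmies snippet.
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort = true)
{
    if (code != cudaSuccess) {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

__global__ void noop() {}  // hypothetical empty kernel

int main()
{
    // Deliberately too many threads per block, so the launch itself should fail.
    noop<<<1, 4096>>>();

    // Expected to report a launch configuration error and exit here.
    gpuErrchk(cudaPeekAtLastError());

    // Only reached if the launch was accepted; reports errors from kernel execution.
    gpuErrchk(cudaDeviceSynchronize());
    return 0;
}
```

Without the cudaPeekAtLastError check, a rejected launch like this passes silently, because the kernel launch syntax itself returns nothing.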
Thanks to talonmies for the error-checking code!
Note:
Even if you were doing a vanilla cudaMemcpy, your code would fail because you have not cudaMalloc'ed memory for your array. In that case, though, the failure would most likely be the GPU equivalent of a segfault, since the pointer would hold some junk value and you would be trying to write memory at the address given by that junk value.
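For comparison, a minimal sketch of the vanilla route with the allocation in place looks roughly like this. The names h_src, h_back, and d_buf and the element count are made up for the example, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int n = 16;  // hypothetical element count
    int h_src[n], h_back[n];
    for (int i = 0; i < n; ++i) h_src[i] = i;

    // Give the device pointer a valid address BEFORE copying to it.
    int *d_buf = nullptr;
    cudaMalloc((void **)&d_buf, n * sizeof(int));

    // Round trip: host -> device -> host.
    cudaMemcpy(d_buf, h_src, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(h_back, d_buf, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Verify that the data survived the round trip.
    bool same = true;
    for (int i = 0; i < n; ++i) same = same && (h_back[i] == h_src[i]);
    printf("round trip %s\n", same ? "ok" : "mismatch");

    cudaFree(d_buf);
    return 0;
}
```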