CUDA cudaMemcpyToSymbol() throws error "invalid argument"

Problem

I am trying to copy an int array to the device read-only memory, but I keep getting the following error:

[ERROR] 'invalid argument' (11) in 'main.cu' on line '386'

The code

There is a lot of code, so I'll show a simplified version of what I have.

I declared a device __constant__ variable at the top of my main.cu file, outside of any function.

 __device__ __constant__ int* dic; 

I also have a host variable, flatDic, which is allocated with malloc inside main():

 int* flatDic = (int *)malloc(num_codewords*(bSizeY*bSizeX)*sizeof(int)); 

Then I try to copy the contents of flatDic to dic by doing this, also in main():

 cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int)); 

This cudaMemcpyToSymbol() call is on line 386 of main.cu, which is where the above error is raised.

What I tried

Here is what I have tried so far to solve the problem. All of the following calls return the same error:

 cudaMemcpyToSymbol(dic, &flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int));
 cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int));
 cudaMemcpyToSymbol(dic, &flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int), 0, cudaMemcpyHostToDevice);
 cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int), 0, cudaMemcpyHostToDevice);

I also tried calling cudaMalloc() on the dic variable before calling cudaMemcpyToSymbol(). cudaMalloc() reports no error, but the cudaMemcpyToSymbol() error persists.

 cudaMalloc((void **) &dic, num_codewords*(bSizeY*bSizeX)*sizeof(int)); 

I have also carefully searched web pages, documentation, forums, and examples, but to no avail.

Does anyone see something wrong with my code? Thanks in advance.

Answer

cudaMemcpyToSymbol copies to a constant variable. Here you are trying to copy many bytes of type int (an allocated ARRAY) into a pointer of type int *. These types do not match (the symbol dic is only as large as a single pointer), hence the "invalid argument" error. To make this work, you need to copy an ARRAY of int (allocated on the host) to a statically sized ARRAY of int (constant) on the device, for example:

 __device__ __constant__ int dic[LEN]; 

An example from the CUDA C Programming Guide (which I suggest you read, it's quite good!):

 __constant__ float constData[256];
 float data[256];
 cudaMemcpyToSymbol(constData, data, sizeof(data));
 cudaMemcpyFromSymbol(data, constData, sizeof(data));
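Applied to the question's dictionary, the same pattern might look like the sketch below. The sizes (NUM_CODEWORDS, B_SIZE_Y, B_SIZE_X), the kernel readDic and the helper runConstantArrayDemo are hypothetical names made up for illustration. The sketch also assumes the array length is known at compile time and fits within the 64 KB of constant memory; if the real sizes are only known at run time, this approach does not apply as written.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical compile-time sizes standing in for num_codewords, bSizeY, bSizeX.
#define NUM_CODEWORDS 16
#define B_SIZE_Y 8
#define B_SIZE_X 8
#define DIC_LEN (NUM_CODEWORDS * B_SIZE_Y * B_SIZE_X)

// Constant memory is limited to 64 KB in total.
static_assert(DIC_LEN * sizeof(int) <= 64 * 1024,
              "dic does not fit in constant memory");

// Statically sized array in constant memory (not a pointer).
__constant__ int dic[DIC_LEN];

// Hypothetical kernel: each thread copies one entry out of constant memory.
__global__ void readDic(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = dic[i];
}

// Returns the number of mismatching entries, or -1 on a CUDA error.
int runConstantArrayDemo()
{
    const size_t bytes = DIC_LEN * sizeof(int);
    int *flatDic = (int *)malloc(bytes);
    int *result = (int *)malloc(bytes);
    for (int i = 0; i < DIC_LEN; ++i) flatDic[i] = i;

    int *d_out = NULL;
    int mismatches = -1;

    // Host array -> constant array: the byte count matches the symbol size.
    cudaError_t err = cudaMemcpyToSymbol(dic, flatDic, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_out, bytes);
    if (err == cudaSuccess) {
        readDic<<<(DIC_LEN + 255) / 256, 256>>>(d_out, DIC_LEN);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(result, d_out, bytes, cudaMemcpyDeviceToHost);

    if (err == cudaSuccess) {
        mismatches = 0;
        for (int i = 0; i < DIC_LEN; ++i)
            if (result[i] != flatDic[i]) ++mismatches;
    } else {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_out);
    free(result);
    free(flatDic);
    return mismatches;
}

int main()
{
    int mismatches = runConstantArrayDemo();
    printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

The kernel refers to dic by name rather than receiving it as a parameter, since a __constant__ symbol is visible to every kernel in the same translation unit.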

As far as I know, you can also use cudaMemcpyToSymbol to copy a pointer into a pointer (unlike your example, where you copy an array into a pointer), but beware: only the pointer will be constant, not the memory it points to on your device. If you go this route, you need to add a cudaMalloc and then cudaMemcpyToSymbol the resulting device pointer into your __constant__ device variable. AGAIN, in this case the array values will NOT be constant; ONLY the pointer to the memory will be.

Your call for this case will look something like this:

 // c_dic_ptr is assumed to be declared at file scope as: __constant__ int *c_dic_ptr;
 int *d_dic;
 cudaMalloc((void **) &d_dic, num_codewords*(bSizeY*bSizeX)*sizeof(int));
 cudaMemcpyToSymbol(c_dic_ptr, &d_dic, sizeof(int *));
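A fuller sketch of this pointer route is below. The kernel readThroughPtr and the helper runConstantPointerDemo are hypothetical names, and the data still has to be copied into the cudaMalloc'ed buffer with an ordinary cudaMemcpy, because cudaMemcpyToSymbol only transfers the pointer value here.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Only this pointer lives in constant memory; the data it points to
// is ordinary global memory.
__constant__ int *c_dic_ptr;

// Hypothetical kernel: reads the array through the constant pointer.
__global__ void readThroughPtr(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = c_dic_ptr[i];
}

// Returns the number of mismatching entries, or -1 on a CUDA error.
int runConstantPointerDemo(int n)
{
    const size_t bytes = (size_t)n * sizeof(int);
    int *flatDic = (int *)malloc(bytes);
    int *result = (int *)malloc(bytes);
    for (int i = 0; i < n; ++i) flatDic[i] = 2 * i;

    int *d_dic = NULL;
    int *d_out = NULL;
    int mismatches = -1;

    cudaError_t err = cudaMalloc((void **)&d_dic, bytes);
    // Copy the array contents into global memory.
    if (err == cudaSuccess)
        err = cudaMemcpy(d_dic, flatDic, bytes, cudaMemcpyHostToDevice);
    // Copy the pointer value itself (sizeof(int *) bytes) into the symbol.
    if (err == cudaSuccess)
        err = cudaMemcpyToSymbol(c_dic_ptr, &d_dic, sizeof(int *));
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_out, bytes);
    if (err == cudaSuccess) {
        readThroughPtr<<<(n + 255) / 256, 256>>>(d_out, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(result, d_out, bytes, cudaMemcpyDeviceToHost);

    if (err == cudaSuccess) {
        mismatches = 0;
        for (int i = 0; i < n; ++i)
            if (result[i] != flatDic[i]) ++mismatches;
    } else {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_out);
    cudaFree(d_dic);
    free(result);
    free(flatDic);
    return mismatches;
}

int main()
{
    int mismatches = runConstantPointerDemo(1024);
    printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Unlike a statically sized __constant__ array, this variant lets the array length be chosen at run time, at the cost of the data itself not living in constant memory.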

You should also wrap your CUDA calls in error-checking logic while debugging. I have borrowed the following logic from talonmies:

 // Prints the CUDA error string with file and line, and optionally exits.
 // file is const char * because __FILE__ expands to a string literal.
 inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
 {
     if (code != cudaSuccess)
     {
         fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
         if (abort) exit(code);
     }
 }
 #define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

To use it, simply wrap your CUDA call as follows:

 gpuErrchk(cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int))); 

The program will exit with an error message if you have allocation problems or other common errors.

To check your kernel launch, do something like:

 MyKernel<<<BLK,THRD>>>(vars...);
 // Make sure nothing went wrong.
 gpuErrchk(cudaPeekAtLastError());
 gpuErrchk(cudaDeviceSynchronize());

Thanks to talonmies for the error-checking code!

Note:
Even if you were doing a vanilla cudaMemcpy, your code would fail because you have not cudaMalloc'ed memory for your array. In that case, though, the failure would most likely be the GPU equivalent of a segfault, since the pointer would hold some junk value and you would be trying to write to memory at the address given by that junk value.


Source: https://habr.com/ru/post/1401076/

