Using Eigen 3.3 in a CUDA kernel

As of November 2016, you can compile CUDA code that references Eigen 3.3 - see this answer.

That answer is not what I am looking for, and it may now be "deprecated" in the sense that there could be an easier way, since the docs now state:

Starting with Eigen 3.3, you can now use Eigen objects and algorithms within CUDA kernels. However, only a subset of features is supported, to ensure that no dynamic allocation is triggered inside a CUDA kernel.
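To illustrate what that restriction means in practice, here is a minimal sketch (the kernel name `cu_norm` is mine; assumes nvcc with an Eigen >= 3.3 include path): operations on fixed-size types such as `Vector3d` work, while dynamically-sized types like `MatrixXd` do not, because they allocate on the heap.

```cuda
#include <Eigen/Dense>

// Fixed-size Eigen types (Vector3d, Matrix3d, ...) need no heap allocation,
// so their operations can run inside a device kernel.
__global__ void cu_norm(const Eigen::Vector3d *v, double *out, size_t N)
{
    size_t idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
    {
        out[idx] = v[idx].norm();  // supported: fixed-size, no allocation
    }
    // Dynamically-sized types such as Eigen::MatrixXd allocate on the heap
    // and therefore cannot be used inside a kernel.
}
```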

See also here. Unfortunately, I could not find any example of how this might look.

My question

Can one now write a kernel, for example the following, which should simply compute a bunch of dot products?

 __global__ void cu_dot(Eigen::Vector3d *v1, Eigen::Vector3d *v2, double *out, size_t N)
 {
     int idx = blockIdx.x * blockDim.x + threadIdx.x;
     if (idx < N)
     {
         out[idx] = v1[idx].dot(v2[idx]);
     }
     return;
 }

I can compile this, but it does not seem to work: when I try to copy the data back to the host, I get an illegal memory access. Note that I initially store the Vector3d in a std::vector<Eigen::Vector3d> and then, accordingly, use

 cudaMalloc((void **)&p_device_v1, sizeof(Eigen::Vector3d)*n);
 cudaMemcpy(p_device_v1, v1.data(), sizeof(Eigen::Vector3d)*n, cudaMemcpyHostToDevice);
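For reference, the full host-side plumbing around such a kernel might look roughly like this (a sketch; the function name `dot_on_device` and the `dev_*` pointer names are mine, assuming `v1` and `v2` are `std::vector<Eigen::Vector3d>` of equal length):

```cuda
#include <vector>
#include <Eigen/Dense>
#include <cuda_runtime.h>

// Sketch: allocate device buffers, copy inputs over, launch cu_dot,
// and copy the per-pair dot products back to the host.
void dot_on_device(const std::vector<Eigen::Vector3d> &v1,
                   const std::vector<Eigen::Vector3d> &v2,
                   std::vector<double> &out)
{
    size_t n = v1.size();
    Eigen::Vector3d *dev_v1 = nullptr, *dev_v2 = nullptr;
    double *dev_ret = nullptr;

    cudaMalloc((void **)&dev_v1, sizeof(Eigen::Vector3d) * n);
    cudaMalloc((void **)&dev_v2, sizeof(Eigen::Vector3d) * n);
    cudaMalloc((void **)&dev_ret, sizeof(double) * n);

    // Eigen::Vector3d is a fixed-size type stored contiguously in the
    // std::vector, so a raw memcpy of the backing storage is valid.
    cudaMemcpy(dev_v1, v1.data(), sizeof(Eigen::Vector3d) * n, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_v2, v2.data(), sizeof(Eigen::Vector3d) * n, cudaMemcpyHostToDevice);

    cu_dot<<<(n + 1023) / 1024, 1024>>>(dev_v1, dev_v2, dev_ret, n);

    out.resize(n);
    cudaMemcpy(out.data(), dev_ret, sizeof(double) * n, cudaMemcpyDeviceToHost);

    cudaFree(dev_v1);
    cudaFree(dev_v2);
    cudaFree(dev_ret);
}
```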

I created a MWE project using CMake at https://github.com/GPMueller/eigen-cuda

1 answer

In the MWE project on github, you wrote:

 double dot(std::vector<Eigen::Vector3d> v1, std::vector<Eigen::Vector3d> v2)
 {
     ...
     // Dot product
     cu_dot<<<(n+1023)/1024, 1024>>>(v1.data(), v2.data(), dev_ret, n);

v1.data() and v2.data() are pointers into CPU memory. You need to use pointers into GPU memory, i.e.

     // Dot product
     cu_dot<<<(n+1023)/1024, 1024>>>(dev_v1, dev_v2, dev_ret, n);

The CPU and GPU results are also not identical because of a further problem with the code: you are not reducing the multiple dot products into a single result.
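If a single scalar is wanted, the per-pair results still need a reduction. One way to do that on the device is with Thrust (a sketch; the helper name `sum_dot_products` is mine, and `dev_ret` is the device array written by cu_dot):

```cuda
#include <thrust/reduce.h>
#include <thrust/device_ptr.h>

// Sum the n per-pair dot products directly on the device,
// without first copying them back to the host.
double sum_dot_products(double *dev_ret, size_t n)
{
    thrust::device_ptr<double> p(dev_ret);
    return thrust::reduce(p, p + n, 0.0);
}
```

Alternatively, after copying the array back to the host, a plain `std::accumulate` over it gives the same sum.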


Source: https://habr.com/ru/post/1261185/

