Creating a static CUDA library to link against a C++ program

I am trying to link CUDA code into a C++ autotools project, but it fails at the linking step.

I have a file GPUFloydWarshall.cu that contains the kernel and a C wrapper function, which I would like to put into a library libgpu.a. That library would then be linked with the rest of the project. Is this possible?
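For context, here is a minimal sketch of what such a file might look like. The kernel body is a placeholder and the wrapper signature gpu_fw(double*, int) is only an assumption taken from the linker error quoted below; the real file is not shown here.

    // GPUFloydWarshall.cu (sketch, not the actual implementation)
    #include <cuda_runtime.h>

    __global__ void fw_kernel(double *dist, int n)
    {
        // placeholder for the Floyd-Warshall relaxation step
    }

    // Plain C++ wrapper callable from the non-CUDA parts of the project
    void gpu_fw(double *dist, int n)
    {
        double *d_dist = nullptr;
        size_t bytes = (size_t)n * n * sizeof(double);
        cudaMalloc(&d_dist, bytes);
        cudaMemcpy(d_dist, dist, bytes, cudaMemcpyHostToDevice);
        fw_kernel<<<1, 1>>>(d_dist, n);
        cudaMemcpy(dist, d_dist, bytes, cudaMemcpyDeviceToHost);
        cudaFree(d_dist);
    }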

Second, the library then has to be linked against about ten other libraries to build the main executable, which is currently built with mpicxx.

I am currently using/generating the following commands to compile and create the libgpu.a library:

    nvcc -rdc=true -c -o temp.o GPUFloydWarshall.cu
    nvcc -dlink -o GPUFloydWarshall.o temp.o -L/usr/local/cuda/lib64 -lcuda -lcudart
    rm -f libgpu.a
    ar cru libgpu.a GPUFloydWarshall.o
    ranlib libgpu.a

When all of this is linked into the main executable, I get the following error:

    problem/libproblem.a(libproblem_a-UTRP.o): In function `UTRP::evaluate(Solution&)':
    UTRP.cpp:(.text+0x1220): undefined reference to `gpu_fw(double*, int)'

The gpu_fw function is my wrapper function.

1 answer

Is this possible?

Yes, it is possible. Creating a (non-CUDA) wrapper function around the CUDA code makes it even easier. You can make your life easier still if you rely on C++ linkage throughout (you mention a C wrapper function). mpicxx is a C++ compiler/linker alias, and CUDA files (.cu) follow C++ compiler/linker behavior by default. A simple related question discusses building CUDA code (encapsulated in a wrapper function) into a static library.
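One way to keep the linkage consistent, sketched here as an assumption rather than the setup actually used in the question, is to declare the wrapper in a small header that both the .cu file and the C++ callers include, so both sides agree on the same C++-mangled symbol (the undefined reference to gpu_fw(double*, int) in the error above is a demangled C++ symbol):

    // gpu_fw.h (hypothetical shared header)
    #ifndef GPU_FW_H
    #define GPU_FW_H

    // C++ linkage on both sides: the .cu file defines it, the .cpp callers use it.
    void gpu_fw(double *dist, int n);

    #endif

Since nvcc compiles .cu files as C++ by default, no extern "C" is needed as long as every translation unit includes the same declaration.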

Second, the library then has to be linked against about ten other libraries to build the main executable, which is currently built with mpicxx.

Once you have a (non-CUDA) C/C++ wrapper exposed in your library, linking should be no different from ordinary linking of ordinary libraries. You may still have to pass the CUDA runtime libraries, and any other CUDA libraries you use, at the link stage, but that is conceptually the same as any other libraries your project may depend on.
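As a rough sketch, assuming the CUDA install path from the question and using placeholder names for the other project libraries, the final link driven by mpicxx might look something like this:

    mpicxx main.o -o main \
        -L. -lgpu \
        -Lproblem -lproblem \
        -L/usr/local/cuda/lib64 -lcudart

Only -lgpu and -lcudart are specific to the CUDA part; the rest follows whatever the project's existing link line already does.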

EDIT:

It's not clear that you actually need device linking for what you want to do (it's acceptable, it just complicates things slightly). In any case, your library construction is not quite correct, now that you have shown the actual command sequence. The device-link command produces a device-linked object that does not contain all the necessary host pieces. To get everything in one place, we want to add both GPUFloydWarshall.o (which has the device-linked pieces) and temp.o (which has the host code pieces) to the library.

Here is a complete example:

    $ cat GPUFloydWarshall.cu
    #include <stdio.h>

    __global__ void mykernel(){
      printf("hello\n");
    }

    void gpu_fw(){
      mykernel<<<1,1>>>();
      cudaDeviceSynchronize();
    }

    $ cat main.cpp
    #include <stdio.h>

    void gpu_fw();

    int main(){
      gpu_fw();
    }
    $ nvcc -rdc=true -c -o temp.o GPUFloydWarshall.cu
    $ nvcc -dlink -o GPUFloydWarshall.o temp.o -lcudart
    $ rm -f libgpu.a
    $ ar cru libgpu.a GPUFloydWarshall.o temp.o
    $ ranlib libgpu.a
    $ g++ main.cpp -L. -lgpu -o main -L/usr/local/cuda/lib64 -lcudart
    $ ./main
    hello
    $
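If in doubt, you can confirm that both objects made it into the archive, and the same final link also works through mpicxx when an MPI toolchain is installed (a sketch; the ar listing is simply what we expect after the ar command above):

    $ ar t libgpu.a
    GPUFloydWarshall.o
    temp.o
    $ mpicxx main.cpp -L. -lgpu -o main -L/usr/local/cuda/lib64 -lcudart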

Source: https://habr.com/ru/post/978102/

