Calling a CUDA library from Go

I am trying to call a CUDA function from Go code. I have the following three files.

test.h:

int test_add(void); 

test.cu:

 __global__ void add(int *a, int *b, int *c) {
     *c = *a + *b;
 }

 int test_add(void) {
     int a, b, c;           // host copies of a, b, c
     int *d_a, *d_b, *d_c;  // device copies of a, b, c
     int size = sizeof(int);

     // Allocate space for device copies of a, b, c
     cudaMalloc((void **)&d_a, size);
     cudaMalloc((void **)&d_b, size);
     cudaMalloc((void **)&d_c, size);

     // Setup input values
     a = 2;
     b = 7;

     // Copy inputs to device
     cudaMemcpy(d_a, &a, size, cudaMemcpyHostToDevice);
     cudaMemcpy(d_b, &b, size, cudaMemcpyHostToDevice);

     // Launch add() kernel on GPU
     add<<<1,1>>>(d_a, d_b, d_c);

     // Copy result back to host
     cudaMemcpy(&c, d_c, size, cudaMemcpyDeviceToHost);

     // Cleanup
     cudaFree(d_a);
     cudaFree(d_b);
     cudaFree(d_c);

     return 0;
 }

test.go:

 package main

 import "fmt"

 //#cgo CFLAGS: -I.
 //#cgo LDFLAGS: -L. -ltest
 //#cgo LDFLAGS: -lcudart
 //#include <test.h>
 import "C"

 func main() {
     fmt.Printf("Invoking cuda library...\n")
     fmt.Println("Done ", C.test_add())
 }

I am compiling CUDA code with:

 nvcc -m64 -arch=sm_20 -o libtest.so --shared -Xcompiler -fPIC test.cu 

All three files (test.h, test.cu and test.go) are in the same directory. The error I get when I try to build with go is "undefined reference to 'test_add'".

I have very little experience with C/C++ and am new to CUDA.

I have been trying to solve this problem for two days and would be very grateful for any input.

Thanks.

1 answer

It seems that, at least in this case, Go's import "C" (cgo) expects the function to have C linkage.

CUDA (i.e. nvcc) largely follows C++ conventions and uses C++ linkage by default (including function name mangling, etc.).
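To make the mangling concrete, here is a small sketch in plain C++ (no CUDA required). The functions add and c_add are hypothetical and not part of the question's code, and the mangled names in the comments assume the Itanium C++ ABI that GCC and Clang use on Linux; other toolchains encode names differently.

```cpp
// Two overloads share one source-level name, so the compiler has to encode
// the parameter types into each symbol. Under the Itanium C++ ABI (an
// assumption here) these are typically emitted as _Z3addii and _Z3adddd.
int add(int a, int b) { return a + b; }
double add(double a, double b) { return a + b; }

// With C linkage the symbol is emitted under the plain name "c_add",
// which is the kind of name a C caller (or cgo) asks the linker for.
// (c_add is a hypothetical name used only for this illustration.)
extern "C" int c_add(int a, int b) { return a + b; }
```

By the same scheme, test_add(void) compiled as C++ would typically be exported as _Z8test_addv rather than test_add, which is why the linker cannot find the name that cgo requests. Listing the library's symbols, for example with nm -D libtest.so, is one way to check which name was actually exported.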

You can force part of the code to use C linkage instead of C++ linkage by wrapping it in extern "C" {...code...} . This is a C++ language feature and is not specific to CUDA or nvcc.

Therefore, the problem can be solved with the following test.cu modification:

 extern "C" {
     int test_add(void) { ... code ... };
 }
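A related pattern, sketched here as an assumption rather than as part of the original answer, is to put the linkage specification in test.h behind an __cplusplus guard. The same header then stays valid when cgo compiles it as C and when nvcc compiles it as C++. This assumes test.cu includes test.h (the question's version does not); the function body below is a host-only stand-in so the sketch compiles without nvcc.

```cpp
// --- test.h (sketch): usable from both C and C++ compilers ---
#ifdef __cplusplus
extern "C" {
#endif

int test_add(void);

#ifdef __cplusplus
}
#endif

// --- test.cu (sketch): would start with #include "test.h" ---
// The definition inherits C linkage from the earlier declaration, so no
// second extern "C" block is needed around it.
// Host-only stand-in for the CUDA body (an assumption for illustration).
int test_add(void) {
    int a = 2, b = 7;
    int c = a + b;  // the real code computes this sum on the GPU
    (void)c;        // result is unused, as in the original function
    return 0;       // same return value as the original
}
```

With the guard in the header, a C compiler sees only the plain declaration, while a C++ compiler sees it wrapped in extern "C".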

Source: https://habr.com/ru/post/1244325/

