All explicit specialization declarations must be visible at the point where the template is instantiated. Your explicit specialization declaration appears only in the kernels.cu translation unit, not in main.cu.
The following code works correctly (once the __global__ qualifier is also added to the explicit instantiation directive).
#include <cuda.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <conio.h>

template<bool approx>
__global__ void kernel()
{
    if (approx)
    {
        printf("True branch\n");
    }
    else
    {
        printf("False branch\n");
    }
}

template __global__ void kernel<false>();

int main(void)
{
    kernel<false><<<1,1>>>();
    getch();
    return 0;
}
EDIT
In C++, template functions are not compiled until an instantiation of the function is encountered. In this respect CUDA, which now fully supports templates, behaves exactly like C++.
To make a concrete example, when the compiler finds something like
template<class T>
__global__ void kernel(...params...)
{
    ...
    T a;
    ...
}
it just checks the syntax of the function, but does not create object code. So, if you compile a file that contains only a template function like the one above, you get an "empty" object file. This is reasonable, since the compiler does not yet know what type to assign to a.
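A small host-only sketch of the same behavior, written in plain C++ so it can be tried without a GPU. The names HasName, NoName, describe and describe_has_name are hypothetical and exist only for this illustration. The template body is accepted even though it would be invalid for some types, because the type-dependent call is only checked when an instantiation is requested.

```cpp
#include <string>

// Hypothetical types used only for this illustration.
struct HasName {
    std::string name() const { return "HasName"; }
};
struct NoName {};  // deliberately has no name() member

// The call t.name() depends on T. The compiler parses this definition,
// but it cannot check the call (or emit code) until T is known.
template <class T>
std::string describe(const T& t) {
    return t.name();
}

// Using describe with HasName is what triggers instantiation of
// describe<HasName> and the generation of object code for it.
inline std::string describe_has_name() {
    return describe(HasName{});
}

// describe(NoName{}) would be rejected, but only at the point of use.
// As long as nobody writes that call, this file compiles cleanly.
```

Adding a line such as describe(NoName{}) anywhere in the file should make the compiler report the missing member at that point, which is a quick way to see when instantiation actually happens.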
The compiler creates object code only when it encounters an instantiation of the function template. This is how compilation of template functions works, and this behavior introduces a restriction for multi-file projects: the implementation (definition) of a template function must be visible in the same file as its declaration wherever it is instantiated. Thus, you cannot keep the interface in a header file kernels.cuh separate from the implementation in kernels.cu, which is the main reason why the first version of your code does not compile. Accordingly, you must include both the interface and the implementation in any file that uses the templates, namely: you must include both kernels.cuh and kernels.cu.
Since no code is generated until a template is instantiated, compilers tolerate the same template file, containing both declarations and definitions, being included more than once in a project without producing linker errors.
There are several guides on using templates in C++. An Idiot's Guide to C++ Templates - Part 1, despite the irritating title, will give you a step-by-step introduction to the topic.