I am sure that the really short answer is no.
Although CUDA has a dynamic/JIT device linker, it is important to remember that the linking process itself is still static.
So you cannot delay-load a particular function from an existing compiled GPU payload at runtime, as you can in a conventional dynamic link loading environment. The linker still requires that a single instance of every code object and symbol be present at link time, whether the link happens ahead of time or at runtime. So you are free to JIT link precompiled objects containing different versions of the same code, as long as exactly one instance of everything is present when the link session is finalised and the code is loaded into the context. But that is as far as you can go.