A simple console program will not exit if cudaMalloc is called

Question

The following simple program never exits if the cudaMalloc call is executed. Commenting out just the cudaMalloc call makes it exit normally.

#include <iostream>
using std::cout;
using std::cin;

#include "cuda.h"
#include "cutil_inline.h"

void PrintCudaVersion(int version, const char *name)
{
    int versionMaj = version / 1000;
    int versionMin = (version - (versionMaj * 1000)) / 10;
    cout << "CUDA " << name << " version: " << versionMaj << "." << versionMin << "\n";
}

void ReportCudaVersions()
{
    int version = 0;
    cudaDriverGetVersion(&version);
    PrintCudaVersion(version, "Driver");

    cudaRuntimeGetVersion(&version);
    PrintCudaVersion(version, "Runtime");
}

int main(int argc, char **argv)
{
    //CUresult r = cuInit(0);                 << These two lines were in original post
    //cout << "Init result: " << r << "\n";   << but have no effect on the problem

    ReportCudaVersions();

    void *ptr = NULL;
    cudaError_t err = cudaSuccess;
    err = cudaMalloc(&ptr, 1024*1024);
    cout << "cudaMalloc returned: " << err << " ptr: " << ptr << "\n";

    err = cudaFree(ptr);
    cout << "cudaFree returned: " << err << "\n";

    return(0);
}

This is running on Windows 7 with the CUDA 4.1 driver and the CUDA 3.2 runtime. I have traced the return from main through the CRT to ExitProcess(), from which it never returns (as expected), but the process never ends either. From the VS2008 debugger I can stop debugging OK. From the command line I have to kill the console window.

Program output:

Init result: 0
CUDA Driver version: 4.1
CUDA Runtime version: 3.2
cudaMalloc returned: 0 ptr: 00210000
cudaFree returned: 0
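The version lines in this output come from the single integer that cudaDriverGetVersion and cudaRuntimeGetVersion report, which encodes the version as 1000 * major + 10 * minor. Below is a minimal sketch of that decoding with no CUDA dependency, so it can be checked in isolation. The names CudaVersion and DecodeCudaVersion are hypothetical and are not part of the CUDA API.

```cpp
// Hypothetical names for illustration; not part of the CUDA API.
struct CudaVersion
{
    int majorPart;
    int minorPart;
};

// Splits an encoded version (1000 * major + 10 * minor) into its parts.
constexpr CudaVersion DecodeCudaVersion(int encoded)
{
    return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}
```

Under that encoding, 4010 decodes to 4 and 1, and 3020 decodes to 3 and 2, which matches the "4.1" and "3.2" lines printed by the program.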

I tried making the allocation size so large that cudaMalloc would fail. It did fail and reported an error, but the program still would not exit. So the hang apparently comes from merely calling cudaMalloc, not from the existence of allocated memory.

Any ideas on what's going on here?

EDIT: I was wrong in the second sentence: I have to remove both the cudaMalloc and the cudaFree calls to get the program to exit. Leaving either one in causes the hang.

EDIT: Although there are many references to CUDA driver versions being backward compatible, this problem went away when I rolled the driver back to V3.2.
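For reference, the backward-compatibility rule mentioned here amounts to requiring that the installed driver be at least as new as the runtime the program was built against. A minimal sketch of that comparison on the encoded version integers (1000 * major + 10 * minor) follows. DriverCoversRuntime is a hypothetical helper, not a CUDA API call, and it only expresses the nominal rule; it cannot say whether a particular driver and runtime pairing actually behaves.

```cpp
// Hypothetical helper for illustration; not part of the CUDA API.
// Both arguments use the encoding 1000 * major + 10 * minor, as reported by
// cudaDriverGetVersion and cudaRuntimeGetVersion.
constexpr bool DriverCoversRuntime(int driverVersion, int runtimeVersion)
{
    // Nominal rule: a driver supports runtimes of the same or an older version.
    return driverVersion >= runtimeVersion;
}
```

With the versions from the question (driver 4010, runtime 3020) the check passes, so the setup counts as supported on paper, which is why the hang was unexpected.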

c++ windows cuda

Steve Fallows Dec 15 '11 at 21:18

1 answer

Vlad · Answer 1 · 2011-12-15T22:25:50+0000

It looks like you are mixing the driver API (cuInit) with the runtime API (cudaMalloc).

I don't know whether anything odd happens (or should happen) behind the scenes, but one thing you could try is removing cuInit and seeing what happens.
