I am currently working on a gpu server that has 4 Tesla T10 gpu. Although I continue to test kernels and often have to kill processes with ctrl-C, I have added a few lines to the end of the simple device request code. The code is below:
#include <stdio.h> // Print device properties void printDevProp(cudaDeviceProp devProp) { printf("Major revision number: %d\n", devProp.major); printf("Minor revision number: %d\n", devProp.minor); printf("Name: %s\n", devProp.name); printf("Total global memory: %u\n", devProp.totalGlobalMem); printf("Total shared memory per block: %u\n", devProp.sharedMemPerBlock); printf("Total registers per block: %d\n", devProp.regsPerBlock); printf("Warp size: %d\n", devProp.warpSize); printf("Maximum memory pitch: %u\n", devProp.memPitch); printf("Maximum threads per block: %d\n", devProp.maxThreadsPerBlock); for (int i = 0; i < 3; ++i) printf("Maximum dimension %d of block: %d\n", i, devProp.maxThreadsDim[i]); for (int i = 0; i < 3; ++i) printf("Maximum dimension %d of grid: %d\n", i, devProp.maxGridSize[i]); printf("Clock rate: %d\n", devProp.clockRate); printf("Total constant memory: %u\n", devProp.totalConstMem); printf("Texture alignment: %u\n", devProp.textureAlignment); printf("Concurrent copy and execution: %s\n", (devProp.deviceOverlap ? "Yes" : "No")); printf("Number of multiprocessors: %d\n", devProp.multiProcessorCount); printf("Kernel execution timeout: %s\n", (devProp.kernelExecTimeoutEnabled ? "Yes" : "No")); return; } int main() { // Number of CUDA devices int devCount; cudaGetDeviceCount(&devCount); printf("CUDA Device Query...\n"); printf("There are %d CUDA devices.\n", devCount); // Iterate through devices for (int i = 0; i < devCount; ++i) { // Get device properties printf("\nCUDA Device #%d\n", i); cudaDeviceProp devProp; cudaGetDeviceProperties(&devProp, i); printDevProp(devProp); } printf("\nPress any key to exit..."); char c; scanf("%c", &c); **for (int i = 0; i < devCount; i++) { cudaSetDevice(i); cudaDeviceReset(); }** return 0; }
My request is related to a for loop immediately before the end of main (), in which I install each device one at a time, and then use the cudaResetDevice command. I have a strange feeling that this code, although it does not produce any errors, but I can not reset all devices. Instead, the program only resets the default device ie device 0 each time. Can someone tell me what to do with reset of each of the 4 devices.
thanks
source share