I recently switched my algorithm from a surface reference to a surface object, and noticed that the program now runs slower.
Here is a comparison for a simple example in which I fill a 3D float array [400 * 400 * 400] with a constant value.
Surface reference API
Time: 9.068928 ms
surface<void, cudaSurfaceType3D> s_volumeSurf; ... surf3Dwrite(value, s_volumeSurf, px*sizeof(float), py, pz, cudaBoundaryModeTrap);
Surface object API
Time: 14.960256 ms
cudaSurfaceObject_t l_volSurfObj; ... surf3Dwrite(value, l_volSurfObj, px*sizeof(float), py, pz, cudaBoundaryModeTrap);
This was tested on a GTX 680 (compute capability 3.0) with CUDA 5.0.
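For context, here is a minimal sketch of how the surface object variant could be set up and timed with CUDA events. It is not the original benchmark: the names `fillKernel` and `fillAndVerify`, the 8x8x8 block size, and the warm-up launch are all assumptions made for illustration. Only the `surf3Dwrite` call mirrors the snippet above.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: writes `value` to every voxel through a surface object.
__global__ void fillKernel(cudaSurfaceObject_t surf, int n, float value)
{
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    int pz = blockIdx.z * blockDim.z + threadIdx.z;
    if (px < n && py < n && pz < n) {
        // The x coordinate is a byte offset, hence the sizeof(float) factor.
        surf3Dwrite(value, surf, px * sizeof(float), py, pz, cudaBoundaryModeTrap);
    }
}

// Hypothetical helper: fills an n^3 float volume, stores the kernel time in
// *elapsedMs, and returns true if every voxel holds `value` afterwards.
bool fillAndVerify(int n, float value, float *elapsedMs)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaExtent extent = make_cudaExtent(n, n, n);
    cudaArray_t array = NULL;
    // The array must be allocated with the surface load/store flag.
    if (cudaMalloc3DArray(&array, &desc, extent, cudaArraySurfaceLoadStore) != cudaSuccess)
        return false;

    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;
    cudaSurfaceObject_t surf = 0;
    if (cudaCreateSurfaceObject(&surf, &resDesc) != cudaSuccess) {
        cudaFreeArray(array);
        return false;
    }

    dim3 block(8, 8, 8);  // assumed block size, not taken from the question
    dim3 grid((n + block.x - 1) / block.x,
              (n + block.y - 1) / block.y,
              (n + block.z - 1) / block.z);

    // Warm-up launch so one-time start-up costs do not end up in the timing.
    fillKernel<<<grid, block>>>(surf, n, value);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    fillKernel<<<grid, block>>>(surf, n, value);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(elapsedMs, start, stop);
    bool ok = (cudaGetLastError() == cudaSuccess);

    // Copy the volume back to the host and check every voxel.
    std::vector<float> host((size_t)n * n * n);
    cudaMemcpy3DParms copy;
    memset(&copy, 0, sizeof(copy));
    copy.srcArray = array;
    copy.dstPtr = make_cudaPitchedPtr(host.data(), n * sizeof(float), n, n);
    copy.extent = extent;
    copy.kind = cudaMemcpyDeviceToHost;
    ok = ok && (cudaMemcpy3D(&copy) == cudaSuccess);
    for (size_t i = 0; ok && i < host.size(); ++i)
        ok = (host[i] == value);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaDestroySurfaceObject(surf);
    cudaFreeArray(array);
    return ok;
}

int main()
{
    float ms = 0.0f;
    bool ok = fillAndVerify(400, 1.0f, &ms);
    printf("Surface object fill: %s, time: %f ms\n", ok ? "ok" : "FAILED", ms);
    return ok ? 0 : 1;
}
```

The surface reference variant would differ only in declaring `surface<void, cudaSurfaceType3D>` at file scope and binding the array with `cudaBindSurfaceToArray` instead of creating an object, so both versions can share the same timing code.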
Does anyone have an explanation for this difference?