Half-precision floating point in CUDA

Is there half-precision floating point support in CUDA?

Background: I want to update an OpenGL texture using glTexSubImage3D with data from a PBO that I generate with CUDA. The texture is stored in the GL_INTENSITY16 format (which is half-precision, afaik), and I don't want to use glPixelTransferf(GL_x_SCALE, ...) to scale the integer values, since everything seems much faster without the scaling.
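
For reference, the update path I have in mind looks roughly like this (a minimal sketch, assuming a current GL context; updateVolumeTexture and fillKernel are illustrative names, not existing code):

    #include <cuda_gl_interop.h>

    __global__ void fillKernel(unsigned short *texels, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) texels[i] = (unsigned short)i;   // placeholder texel data
    }

    void updateVolumeTexture(GLuint pbo, GLuint tex, int w, int h, int d)
    {
        // Let CUDA write into the GL pixel buffer object.
        cudaGraphicsResource *res;
        cudaGraphicsGLRegisterBuffer(&res, pbo, cudaGraphicsMapFlagsWriteDiscard);
        cudaGraphicsMapResources(1, &res, 0);
        void *devPtr; size_t bytes;
        cudaGraphicsResourceGetMappedPointer(&devPtr, &bytes, res);
        int n = w * h * d;
        fillKernel<<<(n + 255) / 256, 256>>>((unsigned short *)devPtr, n);
        cudaGraphicsUnmapResources(1, &res, 0);
        cudaGraphicsUnregisterResource(res);

        // Upload from the PBO into the 3D texture, with no pixel transfer scaling.
        glBindTexture(GL_TEXTURE_3D, tex);
        glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
        glTexSubImage3D(GL_TEXTURE_3D, 0, 0, 0, 0, w, h, d,
                        GL_LUMINANCE, GL_UNSIGNED_SHORT, 0);  // 0 = offset into PBO
        glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    }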

Any tips?

1 answer

CUDA natively supports only 32- and 64-bit floating point types.

Both the driver and runtime APIs support binding textures to half-float data, but texture reads inside a kernel return a value promoted to a 32-bit float. The standard CUDA headers include the __half2float() and __float2half_rn() intrinsics for converting between half- and single-precision floating point (the half value being stored in a 16-bit unsigned integer). So you can do the manipulation at 32-bit precision inside your kernels, reading and writing the data as 16-bit values. But for native 16-bit floating point arithmetic, I think you're out of luck.
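
A minimal sketch of that read-convert-manipulate-convert pattern, using the pre-CUDA-7.5 intrinsics that store the half value in an unsigned short (the kernel name and the scaling operation are just illustrative):

    // Data stored as raw 16-bit half values, arithmetic done in 32-bit floats.
    __global__ void scaleHalfBuffer(unsigned short *data, float scale, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = __half2float(data[i]);   // promote 16-bit half to float
            v *= scale;                        // do the math in 32-bit precision
            data[i] = __float2half_rn(v);      // round back to half, store 16 bits
        }
    }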


EDIT to add that in 2015 NVIDIA expanded its half-precision floating point support with the CUDA 7.5 toolkit, adding the half and half2 types along with intrinsic functions for manipulating them. It was also announced that the (then unreleased) Pascal architecture would have hardware support for IEEE 754-2008 compliant half-precision operations.
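
A sketch of what that looks like with the cuda_fp16.h header from CUDA 7.5 (the kernel is illustrative; note that the half arithmetic intrinsics such as __hadd2 need compute capability 5.3 or later, so on older GPUs you still widen to float as above):

    #include <cuda_fp16.h>

    __global__ void addHalf2(const __half2 *a, const __half2 *b, __half2 *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
    #if __CUDA_ARCH__ >= 530
        out[i] = __hadd2(a[i], b[i]);               // native two-way half addition
    #else
        float2 fa = __half22float2(a[i]);           // no half ALU here: widen to float
        float2 fb = __half22float2(b[i]);
        out[i] = __floats2half2_rn(fa.x + fb.x, fa.y + fb.y);
    #endif
    }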
