CUDA only supports native arithmetic on 32-bit and 64-bit floating-point types.
Both the driver and runtime APIs support binding textures to half-float data, but reads inside the kernel return a value promoted to a 32-bit float. The standard CUDA headers include the __half2float() and __float2half_rn() functions for converting between half and single precision (the half value is stored in a 16-bit unsigned integer). So you can do all the arithmetic in 32-bit floats while reading and writing the data as 16-bit values. But if you need native 16-bit floating-point arithmetic, I think you're out of luck.
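A minimal sketch of that pattern, assuming the pre-CUDA-7.5 forms of __half2float()/__float2half_rn() that operate on half bits stored in an unsigned short (the kernel and buffer names are just illustrative):

```cuda
#include <cuda_runtime.h>

// Store data as 16-bit half values, but do all arithmetic in 32-bit floats.
__global__ void scale_half_buffer(const unsigned short *in,
                                  unsigned short *out,
                                  float factor,
                                  int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = __half2float(in[i]);   // promote the 16-bit half to a 32-bit float
        float y = x * factor;            // do the math in single precision
        out[i] = __float2half_rn(y);     // round back to half for storage
    }
}
```

This halves the memory footprint and bandwidth of the buffers while the actual computation still runs on the 32-bit floating-point units.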
EDIT: I'll add that in 2015 NVIDIA expanded its half-precision floating-point support with the CUDA 7.5 toolkit, adding the half and half2 types and intrinsic functions to handle them. It has also been announced that the (not yet released) Pascal architecture will have hardware support for IEEE 754-2008 half-precision operations.
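A hedged sketch of the CUDA 7.5 path: the half2 type and the conversion helpers come from cuda_fp16.h, while the packed arithmetic intrinsic __hadd2 used here only works on hardware with compute capability 5.3 or higher (compile with e.g. -arch=sm_53); the kernel name is illustrative.

```cuda
#include <cuda_fp16.h>

// Element-wise addition of packed half2 values.
__global__ void add_half2(const half2 *a, const half2 *b, half2 *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
#if __CUDA_ARCH__ >= 530
        c[i] = __hadd2(a[i], b[i]);  // two half additions in one instruction
#else
        // Fallback for older GPUs: unpack to float, add, repack.
        float2 fa = __half22float2(a[i]);
        float2 fb = __half22float2(b[i]);
        c[i] = __floats2half2_rn(fa.x + fb.x, fa.y + fb.y);
#endif
    }
}
```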