OpenCL: 32-bit and 64-bit popcnt instruction on the GPU?

I want to write a program for the GPU (preferably in OpenCL), and most of the computation consists of counting the 1 bits in a bitmap (packed as long or int).

On a modern CPU I would just use the native __popcnt intrinsic. I have read in several places that modern GPUs also implement this instruction in hardware, which would be a huge speedup for me (at least for 32-bit; I am not sure about 64-bit).

However, I have not found out anywhere how to access this instruction. So:

1) How do I know which GPUs have this instruction? (I still need to buy a GPU, and it will be a modern high-end one, perhaps from the Radeon HD 7000 or Nvidia Kepler series.)

2) How do I call this instruction from OpenCL (or a similar GPU language)?

1 answer

This is available as the cl_amd_popcnt extension. I have a Radeon 6870 card and an Opteron 6128 CPU, and both support the extension.

Even better news for you: as of OpenCL 1.2 it is no longer an extension but a core built-in. See the popcount entry on the reference card and in the specification. The AMD 7xxx series hardware supports OpenCL 1.2, and I think the new Nvidia hardware does too.

"T is type char, charn, uchar, ucharn, short, shortn, ushort, ushortn, int, intn, uint, uintn, long, longn, ulong, or ulongn, where n is 2, 3, 4, 8, or 16."

"T popcount (T x)" returns the number of non-zero bits in x.
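As a minimal sketch of how the built-in could be used for the bitmap case from the question, assuming an OpenCL 1.2 device: the kernel name count_bits and its arguments are made up for illustration. Each work-item counts the set bits of one 64-bit word; the per-word counts still have to be summed afterwards, on the host or in a reduction kernel.

```c
/* OpenCL C, requires OpenCL 1.2 or later for the popcount built-in.
 * Kernel name and argument names are illustrative, not from any library. */
__kernel void count_bits(__global const ulong *bitmap, /* packed bitmap, 64 bits per word */
                         __global uint *counts,        /* one result per word */
                         const uint n_words)
{
    size_t i = get_global_id(0);

    /* Guard against a global size that was rounded up past the bitmap. */
    if (i < n_words) {
        /* popcount(ulong) returns ulong; a single word has at most 64 set bits. */
        counts[i] = (uint)popcount(bitmap[i]);
    }
}
```

The same call works on uint data (and on vector types such as uint4 or ulong2), so the 32-bit and 64-bit cases differ only in the argument type. On a device that exposes only cl_amd_popcnt, the extension's own function would be needed instead; check AMD's documentation for its exact name and the pragma that enables it.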

http://www.khronos.org/registry/cl/sdk/1.2/docs/OpenCL-1.2-refcard.pdf

http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf
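To find out at run time whether a particular device offers either route, the host program can query the device with clGetDeviceInfo for CL_DEVICE_VERSION and CL_DEVICE_EXTENSIONS and inspect the two strings it gets back. The sketch below only shows the string checks; the helper names (has_extension, is_opencl_1_2_or_later, device_has_popcount) are hypothetical, and the strings are assumed to have been fetched already.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if `name` occurs as a whole token in the space-separated
 * extension list reported for CL_DEVICE_EXTENSIONS. */
static bool has_extension(const char *ext_list, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p++; /* partial match (e.g. a longer name); keep searching */
    }
    return false;
}

/* True if the CL_DEVICE_VERSION string, which has the form
 * "OpenCL <major>.<minor> <vendor-specific info>", reports 1.2 or later. */
static bool is_opencl_1_2_or_later(const char *version)
{
    int major = 0, minor = 0;
    if (sscanf(version, "OpenCL %d.%d", &major, &minor) != 2)
        return false;
    return major > 1 || (major == 1 && minor >= 2);
}

/* Hypothetical convenience wrapper: popcount is usable either as a core
 * built-in (OpenCL >= 1.2) or through the cl_amd_popcnt extension. */
static bool device_has_popcount(const char *version, const char *ext_list)
{
    return is_opencl_1_2_or_later(version) ||
           has_extension(ext_list, "cl_amd_popcnt");
}
```

Matching whole tokens rather than using a bare strstr avoids a false positive if some other extension name happens to contain the one being searched for.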


Source: https://habr.com/ru/post/907644/

