Using the __constant qualifier in OpenCL kernels

I'm having problems using the __constant qualifier in OpenCL kernels. My platform is Snow Leopard.

I tried to initialize a read-only CL memory object on the GPU by copying my constant array from the host into it. Then I set the kernel argument the same way as for __global memory arguments. This does not work properly, yet I see no errors or warnings. I also tried passing the data directly to clSetKernelArg, as with float and int arguments, and that does not work either.

Am I making a mistake, or is something wrong with Apple's implementation? I would like to see a working example of how this is done, both the OpenCL (GPU) kernel and the host code.

+2
3 answers

I doubt there is anything so fundamentally wrong with Apple's implementation. I used the OpenCL Hello World example application to get a feel for the basics.

In this example, I replaced __global float* input with __constant float* input, and it worked fine. You should also make sure your buffer is created with CL_MEM_READ_ONLY, using something like clCreateBuffer(context, CL_MEM_READ_ONLY, sizeof(float) * count, NULL, NULL).
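A minimal sketch of that substitution, with the kernel kept as a C string the way host programs usually pass it to clCreateProgramWithSource. The kernel name square, its body, and the variable name kernel_source are illustrative assumptions, not quoted from the example application:

```c
/* OpenCL C kernel source held in a host-side string.
   The only change from a __global version is the qualifier on `input`. */
const char *kernel_source =
    "__kernel void square(__constant float* input,\n"
    "                     __global float* output,\n"
    "                     const unsigned int count)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    if (i < count)\n"
    "        output[i] = input[i] * input[i];\n"
    "}\n";
```

On the host side the argument is still a buffer object. Pass it with clSetKernelArg(kernel, 0, sizeof(cl_mem), &input), where kernel and input are hypothetical handle names. Passing the array contents directly, as you would for a float or int argument, is not valid for a pointer argument.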

From reading the spec, I think __constant => __global + CL_MEM_READ_ONLY.

I am running Snow Leopard on a 15" MBP.

+4

There are some bugs in the way the Apple OpenCL compiler handles __constant variables on the GPU. If the compiler log says something like

 OpenCL Build Error : Compiler build log: Error while compiling the ptx module: CLH_ERROR_NO_BINARY_FOR_GPU PTX Info log: PTX Error log: 

then I ran into the same error as you and filed a bug report for it. Apple marked it as a duplicate (apparently of rdar://7217974), so I guess this is a known issue and they are working on it.

+3

"From reading the spec, I think __constant => __global + CL_MEM_READ_ONLY."

In fact, when you specify __constant instead of __global, you tell your device to store this data in a different part of memory. On some devices the two may be the same, but on others they are not. For example, on NVIDIA cards you have only 64 KB of constant memory but many megabytes of __global memory. The advantage of __constant is that it is cached on NVIDIA devices :)

You can query your device for these limits (example output from my device):

CL_DEVICE_MAX_MEM_ALLOC_SIZE: 128 MB

CL_DEVICE_GLOBAL_MEM_SIZE: 255 MB

CL_DEVICE_LOCAL_MEM_SIZE: 16 KB

CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KB
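Because the constant limit is so much smaller than the global one, it can help to check on the host whether an array fits before declaring the kernel argument __constant. A minimal sketch in plain C, assuming the limit has already been read with clGetDeviceInfo and CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE; the function name fits_in_constant and its parameters are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when `count` elements of `elem_size` bytes each fit into a
   constant buffer of at most `max_constant_bytes` bytes.
   `max_constant_bytes` is assumed to come from the device query. */
bool fits_in_constant(size_t count, size_t elem_size,
                      uint64_t max_constant_bytes)
{
    /* Reject sizes whose byte count would overflow size_t. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return false;
    return (uint64_t)(count * elem_size) <= max_constant_bytes;
}
```

With the 64 KB limit reported above and 4-byte floats, fits_in_constant(16384, sizeof(float), 64 * 1024) holds exactly, while 16385 floats no longer fit and would have to stay in __global memory.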

+3

Source: https://habr.com/ru/post/1494709/

