Creating a local array in an OpenCL kernel

I have an OpenCL kernel that should treat an array as multiple sub-arrays, where the sum of each sub-array is stored in a local cache array.

For example, imagine the following array:

[[1, 2, 3, 4], [10, 30, 1, 23]] 
  • Each workgroup receives one sub-array (in our example, two workgroups);
  • Each work item processes two elements of the sub-array (for example, it multiplies each element by its local_id value), and the work item's result is stored in the workgroup's shared array.

    __kernel void test(__global int **values, __global int *result, const int array_size){
        __local int cache[array_size];

        // initialise
        if (get_local_id(0) == 0){
            for (int i = 0; i < array_size; i++)
                cache[i] = 0;
        }
        barrier(CLK_LOCAL_MEM_FENCE);

        if (get_global_id(0) < 4){
            for (int i = 0; i < 2; i++)
                cache[get_local_id(0)] += values[get_group_id(0)][i] * get_local_id(0);
        }
        barrier(CLK_LOCAL_MEM_FENCE);

        if (get_local_id(0) == 0){
            for (int i = 0; i < array_size; i++)
                result[get_group_id(0)] += cache[i];
        }
    }

The problem is that I cannot set the size of the cache array from a kernel parameter, but I need the kernel to be dynamic.

How can I create it dynamically, like the malloc function in C?

Or is the only available solution to pass a temporary array to my kernel function?

1 answer

This can be achieved by adding a __local pointer as a kernel parameter:

 __kernel void test(__global int **values, __global int *result, const int array_size, __local int * cache) 

and passing the desired size in bytes when setting that kernel argument, with NULL as the argument value:

 clSetKernelArg(kernel, 3, array_size*sizeof(int), NULL); 
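For completeness, here is a minimal sketch of what a whole kernel could look like with the extra parameter. It is not the asker's exact kernel: it assumes the sub-arrays are packed back to back in one flat buffer (OpenCL C does not allow pointer-to-pointer kernel arguments such as __global int **values), and it simply sums each sub-array. The source is kept in a C string, the way a host program would hand it to clCreateProgramWithSource; kernel_src and kernel_src_len are illustrative names only.

```c
#include <stddef.h>
#include <string.h>

/* OpenCL C source held as a host-side string.
   Assumption: sub-arrays are stored back to back in one flat buffer. */
const char *const kernel_src =
    "__kernel void test(__global const int *values,\n"
    "                   __global int *result,\n"
    "                   const int array_size,\n"
    "                   __local int *cache)\n"
    "{\n"
    "    const int lid   = get_local_id(0);\n"
    "    const int lsize = get_local_size(0);\n"
    "    const int group = get_group_id(0);\n"
    "\n"
    "    /* Work items copy this group's sub-array into local memory. */\n"
    "    for (int i = lid; i < array_size; i += lsize)\n"
    "        cache[i] = values[group * array_size + i];\n"
    "    barrier(CLK_LOCAL_MEM_FENCE);\n"
    "\n"
    "    /* One work item adds up the cached values for the group. */\n"
    "    if (lid == 0) {\n"
    "        int sum = 0;\n"
    "        for (int i = 0; i < array_size; i++)\n"
    "            sum += cache[i];\n"
    "        result[group] = sum;\n"
    "    }\n"
    "}\n";

/* Length of the source, usable as the optional `lengths` entry
   when the string is passed to clCreateProgramWithSource. */
size_t kernel_src_len(void)
{
    return strlen(kernel_src);
}
```

With two workgroups and array_size = 4, the example input [[1, 2, 3, 4], [10, 30, 1, 23]], flattened to eight ints, should give result = [10, 64].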

Local memory is allocated when the kernel is launched. Note that you may need to check that the requested local memory size does not exceed the device limit (CL_DEVICE_LOCAL_MEM_SIZE, queried with clGetDeviceInfo).
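The limit check could be sketched roughly as follows. local_cache_bytes is a hypothetical helper, not part of the OpenCL API; it assumes the caller has already obtained the device limit, for example from clGetDeviceInfo with CL_DEVICE_LOCAL_MEM_SIZE, which reports a cl_ulong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an OpenCL API call).
   Computes how many bytes of local memory `array_size` ints need and
   checks that against `local_mem_limit`, the device's local memory size.
   Returns 1 and stores the byte count in *bytes_out if it fits, else 0. */
int local_cache_bytes(int array_size, uint64_t local_mem_limit, size_t *bytes_out)
{
    assert(bytes_out != NULL);

    if (array_size <= 0)
        return 0;

    /* An OpenCL C int is always 32 bits wide. */
    uint64_t needed = (uint64_t)array_size * sizeof(int32_t);
    if (needed > local_mem_limit)
        return 0;

    *bytes_out = (size_t)needed;
    return 1;
}
```

On the host, the byte count that comes back would then be used as the size in clSetKernelArg(kernel, 3, bytes, NULL). The kernel may use other local memory as well, so fitting under the limit here is a necessary check rather than a guarantee.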


Source: https://habr.com/ru/post/949135/

