Calling BLAS Routines Inside OpenCL Kernels

I am currently implementing some image processing algorithms using OpenCL. Basically, my algorithm requires solving a linear system of equations for each pixel. Each system is independent of the others, so a parallel implementation is the natural choice.

I looked at several BLAS packages, such as ViennaCL and AMD APPML, but they all seem to have the same usage pattern: the BLAS functions are called from the host and then executed on the CL device.

I need a BLAS library that can be called from inside an OpenCL kernel, so that I can solve many linear systems in parallel.

I found this similar question on AMD forums.

thanks

1 answer

Impossible. clBLAS routines launch a series of kernels, and for some of the "solve" routines the kernel launches are really complex. clBLAS routines take cl_mem buffers and command queues as arguments, so if your buffer is already on the device, clBLAS operates on it directly. It does not accept host buffers or manage host-device transfers.

If you want to see which kernels are generated and run, uncomment this line https://github.com/clMathLibraries/clBLAS/blob/master/src/library/blas/generic/common.c#L461 and build clBLAS. It dumps all the kernels that are called.
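Since clBLAS cannot be called from inside a kernel, one possible workaround (not part of the original answer, and only a sketch) is to hand-write a small dense solver that each work-item runs on its own pixel's system. The code below is plain C so it can be compiled and checked on the host. It assumes each per-pixel system is small and of fixed size; `SYS_N`, `solve_small_system`, and `solve_all_pixels` are hypothetical names chosen for illustration. In an actual OpenCL kernel, the body of the loop in `solve_all_pixels` would be the kernel body, with `gid` taken from `get_global_id(0)` and the arrays placed in the appropriate address spaces.

```c
#include <math.h>

/* Hypothetical fixed system size: each pixel is assumed to have a small
 * SYS_N x SYS_N system. */
#define SYS_N 3

/* Solve A x = b for one small system using Gaussian elimination with
 * partial pivoting. A is row-major and is overwritten; b is overwritten
 * with the solution x. Returns 0 on success, -1 if no usable pivot exists. */
static int solve_small_system(float *A, float *b)
{
    for (int k = 0; k < SYS_N; ++k) {
        /* Pick the row with the largest entry in column k as the pivot. */
        int p = k;
        float best = fabsf(A[k * SYS_N + k]);
        for (int i = k + 1; i < SYS_N; ++i) {
            float v = fabsf(A[i * SYS_N + k]);
            if (v > best) { best = v; p = i; }
        }
        if (best < 1e-12f) return -1;   /* singular or nearly singular */

        /* Swap rows k and p (columns before k are no longer read). */
        if (p != k) {
            for (int j = k; j < SYS_N; ++j) {
                float t = A[k * SYS_N + j];
                A[k * SYS_N + j] = A[p * SYS_N + j];
                A[p * SYS_N + j] = t;
            }
            float t = b[k]; b[k] = b[p]; b[p] = t;
        }

        /* Eliminate column k from the rows below the pivot. */
        for (int i = k + 1; i < SYS_N; ++i) {
            float f = A[i * SYS_N + k] / A[k * SYS_N + k];
            for (int j = k + 1; j < SYS_N; ++j)
                A[i * SYS_N + j] -= f * A[k * SYS_N + j];
            b[i] -= f * b[k];
        }
    }

    /* Back substitution on the upper-triangular system. */
    for (int i = SYS_N - 1; i >= 0; --i) {
        float s = b[i];
        for (int j = i + 1; j < SYS_N; ++j)
            s -= A[i * SYS_N + j] * b[j];
        b[i] = s / A[i * SYS_N + i];
    }
    return 0;
}

/* Host-side stand-in for the kernel launch: one iteration per pixel.
 * A holds num_pixels matrices back to back, b holds the right-hand sides.
 * Returns the number of systems that could not be solved. */
static int solve_all_pixels(float *A, float *b, int num_pixels)
{
    int failures = 0;
    for (int gid = 0; gid < num_pixels; ++gid) {
        if (solve_small_system(A + gid * SYS_N * SYS_N, b + gid * SYS_N) != 0)
            ++failures;
    }
    return failures;
}
```

Because every system is independent, no synchronization between work-items is needed. Whether this beats a batched host-side library depends on the system size and the device, so it is worth measuring before committing to it.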


Source: https://habr.com/ru/post/1501390/

