OpenCL: Running on Multiple CPU/GPU Devices

I want to run parallel tasks on the GPU and CPU using multiple OpenCL devices. The standard examples in the AMD SDK are not very clear on this subject. Can you recommend any additional tutorials or examples? Any advice will do.

Thanks.

+6
5 answers

For a tutorial and information on using multiple devices, see section 4.12 of the AMD APP SDK Programming Guide.

+1

Running parallel tasks on multiple devices requires dynamic scheduling for good efficiency, because you never know the exact performance of any device. It depends on the current load (not only from your program, but from all others as well) and on the current clock speed, which can change significantly on most CPUs and GPUs depending on the active power-saving profile or load. In addition, actual performance may depend on your input data.
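To make the difference concrete, here is a minimal sketch in plain C that only simulates two strategies: an even static split, and handing out fixed-size chunks to whichever device becomes idle first. The `sim_device` struct, the function names, and the per-item cost are all hypothetical and invented for this illustration; in a real program the cost is unknown in advance, and the "idle first" decision would come from OpenCL events or from a runtime such as StarPU.

```c
#include <stddef.h>

/* Hypothetical model of a device, used only for this simulation. */
typedef struct {
    double sec_per_item;  /* assumed cost of one work item on this device */
    double busy_until;    /* simulated time at which the device is idle again */
    size_t items_done;    /* work items this device ended up processing */
} sim_device;

/* Static split: every device gets an equal share regardless of its speed.
   Returns the simulated finish time (the slowest device decides).
   Assumes ndev > 0. */
double static_split(sim_device *dev, size_t ndev, size_t total_items)
{
    double finish = 0.0;
    size_t share = total_items / ndev, rest = total_items % ndev;
    for (size_t i = 0; i < ndev; ++i) {
        size_t n = share + (i < rest ? 1 : 0);
        dev[i].items_done = n;
        dev[i].busy_until = (double)n * dev[i].sec_per_item;
        if (dev[i].busy_until > finish) finish = dev[i].busy_until;
    }
    return finish;
}

/* Dynamic scheduling: the work is cut into chunks and each chunk goes to
   whichever device becomes idle first (ties go to the lower index).
   Assumes ndev > 0 and chunk > 0. */
double dynamic_chunks(sim_device *dev, size_t ndev, size_t total_items, size_t chunk)
{
    for (size_t i = 0; i < ndev; ++i) {
        dev[i].busy_until = 0.0;
        dev[i].items_done = 0;
    }
    size_t remaining = total_items;
    while (remaining > 0) {
        size_t n = remaining < chunk ? remaining : chunk;
        size_t next = 0;
        for (size_t i = 1; i < ndev; ++i)
            if (dev[i].busy_until < dev[next].busy_until) next = i;
        dev[next].busy_until += (double)n * dev[next].sec_per_item;
        dev[next].items_done += n;
        remaining -= n;
    }
    double finish = 0.0;
    for (size_t i = 0; i < ndev; ++i)
        if (dev[i].busy_until > finish) finish = dev[i].busy_until;
    return finish;
}
```

If one simulated device is several times faster than the other, the even split is bound by the slow device, while the chunked version lets the fast device take most of the work. Smaller chunks balance better but would cost more launch overhead on real hardware.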

Of course, you can write all the necessary code yourself, as the other answers suggest, but in my opinion this is a waste of time, and it is much better to use an existing solution. I recommend StarPU. I used StarPU in my OpenCL project and it worked very well. StarPU comes with examples of how to write code that can efficiently use multiple GPUs and CPUs.

StarPU:

Traditional processors have reached architectural limits which heterogeneous multicore designs and hardware specialization (for example, coprocessors, accelerators, ...) intend to address. However, exploiting such machines introduces numerous challenging issues at all levels, ranging from programming models and compilers to the design of scalable hardware solutions. The design of efficient runtime systems for these architectures is a critical issue. StarPU typically makes it much easier for high-performance libraries or compiler environments to exploit heterogeneous multicore machines, possibly equipped with GPGPUs or Cell processors: rather than handling low-level issues, programmers may concentrate on algorithmic concerns.

There is also another project, SkePU, but I have not tried it myself:

SkePU:

SkePU is such a skeleton programming framework for multicore CPU and multi-GPU systems. It is a C++ template library with six data-parallel skeletons and one task-parallel skeleton, two container types, and support for execution on multi-GPU systems with both CUDA and OpenCL. Recently, support for hybrid execution, performance-aware dynamic scheduling and load balancing was developed in SkePU by implementing a backend for the StarPU runtime system.

If you search Google for "dynamic scheduling gpu cpu opencl", you can find even more useful free or commercial projects and documentation.

+5

There is nothing stopping you from doing this. You will need to pass all the devices you want to use to clCreateContext(), and then create at least one command queue for each of them. Depending on what you are trying to do, you may need to look into more sophisticated task scheduling methods, for example using command queues and events to schedule tasks across the different devices.

+1

With clGetPlatformIDs, you can find out whether you have more than one platform. If you run an nVidia GPU board and an AMD CPU, you will find two platforms: one for the AMD SDK and one for nVidia's CUDA OpenCL implementation. With clGetDeviceIDs, you will find the available devices for each platform. There may be one per platform, such as 1xGPU and 1xCPU.

For each device, create a context with clCreateContext, and then you can run both in parallel.

+1

The OpenCL Programming Guide by Aftab Munshi and others will provide you with more details.

0

Source: https://habr.com/ru/post/885807/

