STL, iostream, new, delete in C/C++ for CUDA

Can I use STL, iostream, new, and delete in C/C++ for CUDA?

2 answers

If you have a Fermi-class GPU (that is, compute capability >= 2.0) and use CUDA 4.0 or later, then both new and delete can be used in device code. STL containers and algorithms, and iostream, are not supported.

If you want to use "STL-like" operations with CUDA, you might be interested in the Thrust template library. It allows host code to interact transparently with the GPU using container types, and it implements a number of very useful data-parallel primitives such as sort, reduction, and scan. Note that this is still a host-side library: Thrust and its containers cannot be used inside your own kernel code.
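To make the host-side nature of Thrust concrete, here is a minimal sketch. The helper name sort_and_sum is made up for illustration; the Thrust calls are issued from ordinary host code, and the library launches the GPU work internally.

```cuda
#include <cstdio>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>

// Hypothetical helper: sorts the values on the GPU and returns their sum.
int sort_and_sum(thrust::host_vector<int>& values)
{
    // Copy the host data to the device; the container owns the device memory.
    thrust::device_vector<int> d_values = values;

    // Both algorithms are called from host code and run on the GPU.
    thrust::sort(d_values.begin(), d_values.end());
    int sum = thrust::reduce(d_values.begin(), d_values.end(), 0);

    // Copy the sorted data back to the host.
    thrust::copy(d_values.begin(), d_values.end(), values.begin());
    return sum;
}

int main()
{
    thrust::host_vector<int> values(4);
    values[0] = 3; values[1] = 1; values[2] = 4; values[3] = 2;

    int sum = sort_and_sum(values);
    std::printf("sum = %d, smallest = %d\n", sum, values[0]);
    return 0;
}
```

Nothing in this sketch is a kernel written by you; it would typically be built with nvcc like any other .cu file.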

Let me break it down a little.

General case: can I use standard C++ library facility XYZ on the GPU?

No, you cannot use standard library code on the GPU (i.e. in device-side code). The most immediate obstacle is that the standard library is not written with the CUDA compiler in mind: it lacks the markings indicating that its code should be compiled both for the host and for device-side execution. But even if this technical issue were not present, there are various reasons why a good chunk of the standard library would not work as-is, or at all, on a GPU.
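For illustration, the markings in question are CUDA's function qualifiers such as __host__ and __device__. The sketch below uses a made-up helper, clamp_to_range, that carries both qualifiers and can therefore be called from a kernel and from host code alike; unannotated library code cannot be called from a kernel this way.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper, annotated so nvcc compiles it for both host and device.
__host__ __device__ inline int clamp_to_range(int value, int lo, int hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}

// Device side: each thread clamps one element in place.
__global__ void clamp_kernel(int* data, int n, int lo, int hi)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = clamp_to_range(data[i], lo, hi);
}

// Runs the same helper on the GPU and on the CPU and compares the results.
bool host_and_device_agree()
{
    const int n = 4;
    int input[n] = {-5, 3, 12, 7};
    int from_device[n];
    int* d_data = nullptr;

    bool ok = cudaMalloc(&d_data, sizeof(input)) == cudaSuccess
           && cudaMemcpy(d_data, input, sizeof(input), cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        clamp_kernel<<<1, n>>>(d_data, n, 0, 10);
        ok = cudaMemcpy(from_device, d_data, sizeof(input), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_data);

    // Host side: the very same function, called directly.
    for (int i = 0; ok && i < n; ++i)
        ok = from_device[i] == clamp_to_range(input[i], 0, 10);
    return ok;
}

int main()
{
    std::printf("host and device agree: %s\n", host_and_device_agree() ? "yes" : "no");
    return 0;
}
```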

STL

As the other answers suggest, the Thrust library provides some STL-like functionality in a convenient, nicely packaged form. But this is still basically a "no" as an answer to your question, because:

  • Its interface is host-side, not device-side. That is, it will do things for you using the GPU, but they happen under the hood; it is not a toolkit for writing your own device-side code.
  • It covers only a small part of the STL: as far as data structures go, it essentially offers just vectors (AFAIK - I have not combed through the code); streaming on the GPU using iostreams or a similar abstraction is not supported.

iostreams

No, you cannot use iostream in CUDA device-side code. However, we do have C-style printf: printf("my_int_value is %05d\n", my_int_value); . This is quite a different beast from the standard library's printf(), though, since it needs to send data over the PCI bus and have the driver deliver it to the output stream of the host-side process.

See the CUDA Programming Guide section on formatted output for details.
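A minimal sketch of device-side printf follows. The kernel and helper names (print_value, launch_print) are made up for illustration. The output is buffered on the device and only reaches the host's stdout at certain points, such as the explicit synchronization used here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread prints its index and the value passed to the kernel.
__global__ void print_value(int my_int_value)
{
    printf("thread %u: my_int_value is %05d\n", threadIdx.x, my_int_value);
}

// Hypothetical helper: launches the kernel and waits for it to finish.
cudaError_t launch_print(int value)
{
    print_value<<<1, 2>>>(value);
    // Synchronizing flushes the device-side printf buffer to the host.
    return cudaDeviceSynchronize();
}

int main()
{
    cudaError_t err = launch_print(42);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The order in which lines from different threads appear should not be relied upon.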

new and delete

The new and delete operators work just like on-device malloc() and free(), which are different from their host-side counterparts and somewhat limited; see RobertCrovella's answer to this question and the links in it.

I would advise, however, that you think very carefully about whether you really need to allocate and free memory on the device; this is likely to be costly in terms of performance, and often (perhaps usually) you can achieve more by preallocating memory through an API call on the host side.
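The two approaches can be sketched side by side. All names here (kScratchInts, fill_and_sum, with_device_new, with_preallocated, variants_agree) are made up for illustration, and the sketch assumes that a failed device-side new yields a null pointer, as device-side malloc() does. It does not measure performance; it only shows that the preallocated variant computes the same result without touching the device heap.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kScratchInts = 4;  // hypothetical per-thread scratch size

// Fills the scratch buffer and returns the sum of its elements.
__device__ int fill_and_sum(int* scratch, int tid)
{
    int sum = 0;
    for (int i = 0; i < kScratchInts; ++i) {
        scratch[i] = tid * i;
        sum += scratch[i];
    }
    return sum;
}

// Variant 1: every thread allocates from the device heap.
__global__ void with_device_new(int* out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int* scratch = new int[kScratchInts];
    // Assumption: allocation failure is reported as a null pointer.
    if (scratch == nullptr) { out[tid] = -1; return; }
    out[tid] = fill_and_sum(scratch, tid);
    delete[] scratch;
}

// Variant 2: every thread uses its slice of a pool allocated by the host.
__global__ void with_preallocated(int* out, int* scratch_pool)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    out[tid] = fill_and_sum(scratch_pool + tid * kScratchInts, tid);
}

// Runs both variants with one block of `threads` threads and compares them.
bool variants_agree(int threads)
{
    int *out_new = nullptr, *out_pre = nullptr, *pool = nullptr;
    size_t out_bytes = threads * sizeof(int);

    bool ok = cudaMalloc(&out_new, out_bytes) == cudaSuccess
           && cudaMalloc(&out_pre, out_bytes) == cudaSuccess
           && cudaMalloc(&pool, out_bytes * kScratchInts) == cudaSuccess;
    if (ok) {
        with_device_new<<<1, threads>>>(out_new);
        with_preallocated<<<1, threads>>>(out_pre, pool);

        std::vector<int> a(threads), b(threads);
        ok = cudaMemcpy(a.data(), out_new, out_bytes, cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(b.data(), out_pre, out_bytes, cudaMemcpyDeviceToHost) == cudaSuccess
          && a == b;
    }
    cudaFree(out_new);
    cudaFree(out_pre);
    cudaFree(pool);
    return ok;
}

int main()
{
    std::printf("variants agree: %s\n", variants_agree(32) ? "yes" : "no");
    return 0;
}
```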

Source: https://habr.com/ru/post/906755/

