The fastest way to access device vector elements directly on the host

I refer you to the following page: http://code.google.com/p/thrust/wiki/QuickStartGuide#Vectors. The second paragraph there says:

Also note that you can access individual device_vector elements using the standard bracket notation. However, since each of these requires a cudaMemcpy call to access, they should be used sparingly. We will consider more effective methods later.

I searched the whole document but could not find the more efficient technique. Does anyone know the fastest way to do this? How can I quickly access a device_vector / device pointer from the host?

+4
2 answers

The "more efficient methods" the guide refers to are the Thrust algorithms. It is more efficient to access (or copy over the PCI-E bus) millions of elements at a time than to access a single element, because the fixed cost of each CPU/GPU transfer is amortized over all of the elements moved.
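As a minimal sketch of that difference, the two functions below compute the same sum. The names `sum_elementwise` and `sum_on_device` are hypothetical, chosen only for this illustration. The first reads every element through the bracket operator, so it pays the transfer cost once per element. The second lets `thrust::reduce` do the work on the device and brings back only the result.

```cuda
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <cstddef>

// Slow path (hypothetical helper): every d[i] read on the host triggers
// its own small device-to-host copy.
int sum_elementwise(const thrust::device_vector<int>& d)
{
    int total = 0;
    for (std::size_t i = 0; i < d.size(); ++i) {
        int value = d[i];  // one transfer per element
        total += value;
    }
    return total;
}

// Fast path (hypothetical helper): the reduction runs on the device and
// only the final int is returned to the host.
int sum_on_device(const thrust::device_vector<int>& d)
{
    return thrust::reduce(d.begin(), d.end(), 0);
}
```

Both return the same value; the gap in run time should widen as the vector grows, since only the first version scales the number of transfers with the element count.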

There is no faster way to copy data from the GPU to the CPU than calling cudaMemcpy, since it is the most primitive mechanism a CUDA programmer has for this task.
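If you want to call cudaMemcpy yourself on a device_vector, one possible sketch is below. The helper name `copy_to_host` is hypothetical; `thrust::raw_pointer_cast` is the Thrust function that yields the raw device pointer. The whole vector is moved in a single call, so the fixed cost is paid once.

```cuda
#include <thrust/device_vector.h>
#include <thrust/device_ptr.h>
#include <cuda_runtime.h>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copy an entire device_vector to the host with a
// single cudaMemcpy call.
std::vector<int> copy_to_host(const thrust::device_vector<int>& d)
{
    std::vector<int> h(d.size());
    if (d.empty()) {
        return h;  // nothing to transfer
    }

    // Obtain the raw device pointer that cudaMemcpy expects.
    const int* d_ptr = thrust::raw_pointer_cast(d.data());

    cudaError_t err = cudaMemcpy(h.data(), d_ptr,
                                 d.size() * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        throw std::runtime_error(cudaGetErrorString(err));
    }
    return h;
}
```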

+3

If you have a device_vector that needs further processing, try keeping the data on the device and processing it with Thrust algorithms or your own kernels. If you only need to read a few values from the device_vector, just access them directly using bracket notation. If you need to access many values, copy the device_vector to a host_vector and read the values from there.

 thrust::device_vector<int> D;
 ...
 thrust::host_vector<int> H = D;  // one bulk device-to-host copy
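When only a contiguous slice is needed on the host, a middle ground is to copy just that slice. The sketch below assumes a hypothetical helper named `copy_range`; `thrust::copy` and the iterator arithmetic are standard Thrust.

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/copy.h>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: copy `count` elements starting at index `first`
// from the device to a new host_vector in one transfer.
thrust::host_vector<int> copy_range(const thrust::device_vector<int>& d,
                                    std::size_t first, std::size_t count)
{
    if (first > d.size() || count > d.size() - first) {
        throw std::out_of_range("copy_range: slice exceeds vector size");
    }

    thrust::host_vector<int> h(count);
    auto begin = d.begin() + static_cast<std::ptrdiff_t>(first);
    thrust::copy(begin, begin + static_cast<std::ptrdiff_t>(count), h.begin());
    return h;
}
```

After the call, the returned host_vector can be indexed freely with no further device traffic.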
0

Source: https://habr.com/ru/post/1388358/

