The fastest way to access device vector elements directly on the host

Question


I am referring to the following page: http://code.google.com/p/thrust/wiki/QuickStartGuide#Vectors . See the second paragraph, which says:

Also note that individual elements of a device_vector can be accessed using the standard bracket notation. However, because each of these accesses requires a call to cudaMemcpy, they should be used sparingly. We'll look at some more efficient techniques later.

I searched the whole document, but I could not find the more efficient technique. Does anyone know the fastest way to do this? That is, how can I quickly access a device vector / device pointer on the host?


cuda thrust

Programmer Dec 28 '11 at 19:28


2 answers

If you have a device_vector that needs further processing, try to keep the data on the device and process it there with Thrust algorithms or your own kernels. If you only need to read a few values from the device_vector, just access them directly using bracket notation. If you need to read many values, copy the device_vector to a host_vector and read the values from there.

    thrust::device_vector<int> D;
    ...
    thrust::host_vector<int> H = D;
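
When only part of the vector is needed on the host, one bulk copy of that sub-range is usually preferable to many bracket accesses. The following is a minimal sketch, assuming a CUDA toolchain with Thrust available; the vector size, the index 42, and the range [200, 300) are arbitrary values chosen for illustration.

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/copy.h>
#include <thrust/sequence.h>
#include <cstdio>

int main() {
    // Illustrative data: D = 0, 1, 2, ..., 999 (filled on the device).
    thrust::device_vector<int> D(1000);
    thrust::sequence(D.begin(), D.end());

    // Reading one element: convenient, but each such read is its own
    // small device-to-host transfer.
    int x = D[42];

    // Reading a range: one bulk transfer of 100 elements into host memory.
    thrust::host_vector<int> H(100);
    thrust::copy(D.begin() + 200, D.begin() + 300, H.begin());

    // From here on, H is ordinary host memory and reads are cheap.
    std::printf("%d %d %d\n", x, H[0], H[99]);  // prints "42 200 299"
    return 0;
}
```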


Roger Dahl Jan 29 '12 at 20:55


Jared Hoberock · Accepted Answer · 2011-12-28T22:05:50+0000

The "more efficient techniques" that the guide refers to are the Thrust algorithms. It is more efficient to access (or copy across the PCI-E bus) millions of elements at once than to access a single element, because the fixed cost of each CPU/GPU transfer is amortized over all of the elements.
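
As a rough illustration of this point, the sketch below lets Thrust algorithms do the work on the device, so that only the small results cross the bus rather than the whole vector. It assumes a CUDA toolchain with Thrust; the data and the particular algorithms (thrust::reduce, thrust::max_element) are chosen only as examples.

```cuda
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/reduce.h>
#include <thrust/extrema.h>
#include <cstdio>

int main() {
    // Illustrative data: about one million integers that live on the device.
    thrust::device_vector<int> D(1 << 20);
    thrust::sequence(D.begin(), D.end());  // 0, 1, 2, ...

    // The reduction runs on the GPU; only the final sum is returned to the
    // host. The 0LL initial value makes the accumulator a long long so the
    // sum does not overflow int.
    long long sum = thrust::reduce(D.begin(), D.end(), 0LL);

    // max_element also runs on the GPU and returns a device iterator;
    // dereferencing it on the host transfers a single element.
    thrust::device_vector<int>::iterator it =
        thrust::max_element(D.begin(), D.end());
    int largest = *it;

    std::printf("sum = %lld, max = %d\n", sum, largest);
    return 0;
}
```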

There is no faster way to copy data from the GPU to the CPU than calling cudaMemcpy, since it is the most primitive means a CUDA programmer has for performing the transfer.
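
For readers who want to call cudaMemcpy themselves, a device_vector's storage can be exposed as a raw device pointer with thrust::raw_pointer_cast. The following is a minimal sketch under the same assumptions (CUDA toolchain with Thrust); the vector size and fill value are arbitrary, and this is not expected to be faster than copying into a host_vector, which performs the same kind of bulk transfer.

```cuda
#include <thrust/device_vector.h>
#include <cuda_runtime.h>
#include <vector>
#include <cstdio>

int main() {
    // Illustrative data: 1000 elements, all set to 7, stored on the device.
    thrust::device_vector<int> D(1000, 7);

    // Obtain the raw device pointer behind the device_vector.
    const int* d_ptr = thrust::raw_pointer_cast(D.data());

    // One explicit bulk copy from device memory into a host buffer.
    std::vector<int> h(D.size());
    cudaError_t err = cudaMemcpy(h.data(), d_ptr, D.size() * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("%d\n", h[0]);  // prints "7"
    return 0;
}
```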
