GPU programming - transfer bottlenecks

Question

I would like my GPU to do some calculations for me, so I'm interested in measuring the speed of uploading and downloading textures, because my "textures" are the data that the GPU needs to process.

I know that transferring from main memory to GPU memory is the preferred direction, so I expect such an application to be efficient only when there is a lot of data to process and only small results are read back.

Anyway, is there any such benchmark application? I mean, something to measure the main memory <-> GPU bandwidth ...

EDIT (clarification of the question):

There used to be an application that you ran and that reported 2 numbers:

MB/s transfer rate between main memory and graphics card memory, from main TO graphics card (texture upload)
MB/s transfer rate between main memory and graphics card memory, from graphics card TO main (texture download)

I would just like to get hold of it again.

ANOTHER EDIT (found something):

Here http://www.benchmarkhq.ru/english.html?/be_mm.html (search for TexBench) is an application that measures bandwidth in ONE DIRECTION only ...

+4

benchmarking cuda gpu-programming

Daniel Mošmondor Mar 10 '10 at 18:09

3 answers

First: the difference between global (GPU) memory and texture memory is the cache. Texture reads are cached; global memory reads are not.

Second: the transfer speed from the host to the device (GPU) is the same for textures and for global memory.

Third: the transfer speed from the host to the device (GPU) depends on the GPU generation, and is determined by the PCI-Express bus and the size of your data.

See, for example: http://www.accelereyes.com/wiki/index.php?title=GPU_Memory_Transfer

+1

artaak Mar 10 '10 at 20:21

You can use the CUDA profiler to tell you the time spent in each CUDA function, including memory transfer time. You can also write a very simple transfer test and measure it. In my opinion, it is better to measure your own specific test cases.
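A minimal sketch of such a transfer test, assuming a single CUDA device, pinned host memory allocated with cudaMallocHost, and a 64 MiB buffer copied 10 times in each direction. The buffer size, repetition count, and the helper names toMBps and timeCopies are choices made for this illustration, not part of any SDK:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical helper: bytes moved in `ms` milliseconds, as MB/s (1 MB = 2^20 bytes).
static double toMBps(size_t bytes, float ms) {
    return (bytes / (1024.0 * 1024.0)) / (ms / 1000.0);
}

// Hypothetical helper: total time in ms for `reps` blocking copies, timed with CUDA events.
static float timeCopies(void* dst, const void* src, size_t bytes,
                        cudaMemcpyKind kind, int reps) {
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < reps; ++i) {
        CHECK(cudaMemcpy(dst, src, bytes, kind));
    }
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms;
}

int main() {
    const size_t bytes = (size_t)64 << 20;  // assumed buffer size: 64 MiB
    const int reps = 10;                    // assumed repetition count

    void* host = nullptr;
    void* dev = nullptr;
    CHECK(cudaMallocHost(&host, bytes));    // pinned (page-locked) host memory
    CHECK(cudaMalloc(&dev, bytes));
    std::memset(host, 0, bytes);            // touch the host buffer before timing

    // One untimed copy so one-time initialization cost is not measured.
    CHECK(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice));

    float up = timeCopies(dev, host, bytes, cudaMemcpyHostToDevice, reps);
    float down = timeCopies(host, dev, bytes, cudaMemcpyDeviceToHost, reps);

    std::printf("host -> device: %.1f MB/s\n", toMBps(bytes * reps, up));
    std::printf("device -> host: %.1f MB/s\n", toMBps(bytes * reps, down));

    CHECK(cudaFree(dev));
    CHECK(cudaFreeHost(host));
    return 0;
}
```

Built with something like nvcc transfer_test.cu -o transfer_test (the file name is arbitrary), this prints one number per direction. Replacing cudaMallocHost/cudaFreeHost with malloc/free measures pageable memory instead, which is typically slower, and varying the buffer size shows how the rate depends on the amount of data.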

See CUDA_PROFILE and how to use it: http://www.drdobbs.com/cpp/209601096?pgno=2

Your question is a little difficult to understand. Do you want to measure the transfer between the host and the GPU (where the texture cache doesn't really matter), or texture reads from inside a kernel?

0

Anycorn Mar 10 '10 at 18:20

Tom · Accepted Answer · 2010-03-10T21:26:44+0000

To measure the bandwidth between host memory and the device, you can use the bandwidthTest sample from the CUDA SDK (download it from the CUDA website).
