CUDA memory release painfully slow

Question

I allocate some floating-point arrays (quite large, about 9,000,000 elements each) on the GPU using cudaMalloc((void**)&(storage->data), size * sizeof(float)). At the end of my program, I free this memory using cudaFree(storage->data);.

The problem is that the first deallocation is very slow, about 10 seconds, while the others are almost instantaneous.

My question is this: what can cause this difference? Is deallocating memory on the GPU generally slow?

+4

c memory-management cuda

Wookai Jan 28 '10 at 23:14

2 answers

It should not be that slow; on Linux with CUDA 2.2 it takes a split second. Have you tried running the host and device profilers to find out why it is slow? Also, how much are you allocating, and are these separate allocations? Separate allocations carry some penalty, but not one that large.

+1

Anycorn Jan 28 '10 at 23:22

Eric · Accepted Answer · 2010-01-29T13:16:57+0000

As stated on the NVIDIA forums, this is almost certainly a problem with the way you are timing things, not with cudaFree.
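One common way such a measurement goes wrong, offered here as an assumption about the asker's program rather than a confirmed diagnosis: kernel launches return immediately, and cudaFree waits for outstanding device work, so the first cudaFree can end up absorbing the run time of a kernel that is still executing. The sketch below separates the two costs by synchronizing before timing the frees. The kernel busy_kernel, the helper elapsed_ms, and the iteration count are hypothetical placeholders. cudaDeviceSynchronize is the current runtime call; CUDA 2.x-era code would have used cudaThreadSynchronize instead.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical long-running kernel, standing in for the work done before the free.
__global__ void busy_kernel(float* data, size_t n, int iters) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k) v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

// Milliseconds of host wall-clock time elapsed since t0.
static double elapsed_ms(std::chrono::steady_clock::time_point t0) {
    return std::chrono::duration<double, std::milli>(
               std::chrono::steady_clock::now() - t0).count();
}

int main() {
    const size_t n = 9000000;  // array size taken from the question
    float* a = nullptr;
    float* b = nullptr;
    if (cudaMalloc((void**)&a, n * sizeof(float)) != cudaSuccess ||
        cudaMalloc((void**)&b, n * sizeof(float)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(a, 0, n * sizeof(float));

    // The launch is asynchronous: control returns to the host right away.
    const unsigned int blocks = (unsigned int)((n + 255) / 256);
    busy_kernel<<<blocks, 256>>>(a, n, 10000);

    // Wait for the kernel here, so its run time is not charged to cudaFree.
    auto t0 = std::chrono::steady_clock::now();
    cudaError_t err = cudaDeviceSynchronize();
    double kernel_wait_ms = elapsed_ms(t0);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    t0 = std::chrono::steady_clock::now();
    cudaFree(a);
    double first_free_ms = elapsed_ms(t0);

    t0 = std::chrono::steady_clock::now();
    cudaFree(b);
    double second_free_ms = elapsed_ms(t0);

    std::printf("wait for kernel: %.3f ms\n", kernel_wait_ms);
    std::printf("first cudaFree:  %.3f ms\n", first_free_ms);
    std::printf("second cudaFree: %.3f ms\n", second_free_ms);
    return 0;
}
```

If removing the cudaDeviceSynchronize call moves most of the time into the first cudaFree, the delay belongs to the pending kernel and not to the deallocation itself.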
