According to NVIDIA, cublasZgemm is 6 times faster than Intel MKL.
However, on my PC (Core i7-2600, NVIDIA GTX 560, OS: 64-bit Linux) cublasZgemm is slightly slower than MKL.
I use numpy.dot() from the Enthought Python Distribution, where NumPy is linked against MKL 10.3.
The matrix multiplication function that calls cublasZgemm is compiled into a shared library and invoked from a Python script via ctypes.
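For context, here is a minimal sketch of the kind of wrapper I mean (the function name zgemm_gpu, the build command, and the exact allocation/copy pattern are illustrative rather than my exact code). Note that the host-to-device and device-to-host copies happen inside the wrapper, so they are included in the time measured from Python:

    // build (roughly): nvcc -Xcompiler -fPIC -shared zgemm_gpu.cu -o libzgemm_gpu.so -lcublas
    #include <cublas_v2.h>
    #include <cuComplex.h>
    #include <cuda_runtime.h>

    // Computes C = A * B for n x n column-major double-complex matrices.
    extern "C" int zgemm_gpu(int n, const cuDoubleComplex *A,
                             const cuDoubleComplex *B, cuDoubleComplex *C)
    {
        cuDoubleComplex *dA, *dB, *dC;
        size_t bytes = (size_t)n * n * sizeof(cuDoubleComplex);

        // Device buffers and host-to-device transfers (counted in the timing)
        cudaMalloc((void **)&dA, bytes);
        cudaMalloc((void **)&dB, bytes);
        cudaMalloc((void **)&dC, bytes);
        cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);

        cublasHandle_t handle;
        cublasCreate(&handle);

        cuDoubleComplex alpha = make_cuDoubleComplex(1.0, 0.0);
        cuDoubleComplex beta  = make_cuDoubleComplex(0.0, 0.0);

        // C = alpha * A * B + beta * C, no transposes
        cublasZgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                    &alpha, dA, n, dB, n, &beta, dC, n);

        // Copy the result back to the host (also counted in the timing)
        cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost);

        cublasDestroy(handle);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        return 0;
    }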
When multiplying two 1024x1024 complex matrices, numpy.dot() took 84 ms. The ctypes call took 110 ms in total, of which the cublasZgemm() part took 97 ms.
Why is cublasZgemm not as fast as NVIDIA states?