Fast LAPACK / BLAS for matrix multiplication

I am currently studying the Armadillo C++ linear algebra library. As I understand it, it uses LAPACK/BLAS for basic matrix operations (for example, matrix multiplication). As a Windows user, I downloaded LAPACK/BLAS from here: http://icl.cs.utk.edu/lapack-for-windows/lapack/#running . The problem is that matrix multiplication is very slow compared to Matlab or even R. For example, Matlab multiplies two 1000x1000 matrices in about 0.15 seconds on my computer and R takes about 1 second, whereas C++/Armadillo/LAPACK/BLAS takes more than 10 seconds.

So Matlab evidently relies on highly optimized linear algebra libraries. My question is: is there a faster LAPACK/BLAS replacement that Armadillo can use? Alternatively, is there a way to extract Matlab's linear algebra libraries and use them in C++?

+6
4 answers

LAPACK does not perform matrix multiplication. It is BLAS that provides matrix multiplication.

If you have a 64-bit operating system, I recommend trying a 64-bit version of BLAS first. This alone can give you an instant doubling of performance.

Secondly, look at a high-performance BLAS implementation such as OpenBLAS. OpenBLAS uses both vectorization and parallelization (i.e., multiple cores). It is a free (no cost) open-source project.
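To illustrate one of those techniques, here is a minimal sketch of loop tiling (cache blocking), the kind of optimization that libraries such as OpenBLAS layer SIMD and multithreading on top of. The function name and block size are illustrative, not actual OpenBLAS code:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// C += A * B for n x n row-major matrices; C must be zero-initialized.
// Working on small blocks keeps the operands resident in cache, which is
// one reason optimized BLAS far outperforms a naive triple loop.
void blocked_multiply(const std::vector<double>& A,
                      const std::vector<double>& B,
                      std::vector<double>& C,
                      std::size_t n, std::size_t block = 64)
{
    for (std::size_t ii = 0; ii < n; ii += block)
        for (std::size_t kk = 0; kk < n; kk += block)
            for (std::size_t jj = 0; jj < n; jj += block)
                // Multiply one block of A by one block of B.
                for (std::size_t i = ii; i < std::min(ii + block, n); ++i)
                    for (std::size_t k = kk; k < std::min(kk + block, n); ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < std::min(jj + block, n); ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

Real BLAS kernels add hand-tuned vector instructions and threading on top of blocking, which is why linking a tuned BLAS matters far more than tweaking your own loops.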

Matlab internally uses the Intel MKL library, which you can also use with the Armadillo library. Intel MKL is closed source, but is free for non-commercial use. Note that OpenBLAS can achieve matrix multiplication performance on par with, or better than, Intel MKL.

Note that high-performance linear algebra is easier to perform on Linux and Mac OS X than on Windows.

+12

In addition to the above, you should also use a high optimization level:

  • Be sure to use the -O2 or -O3 compiler flag.

  • Link against the aforementioned high-performance (and possibly multithreaded) BLAS libraries. AFAIK MKL is only available for Unix platforms, although if you use a Unix-like environment such as Cygwin inside Windows, it should be fine, I think. OpenBLAS is also multithreaded.

  • In many libraries, defining the NDEBUG symbol (for example, by passing the -DNDEBUG compiler flag) disables expensive range checking and assertions. Armadillo has its own symbol, ARMA_NO_DEBUG, which you can either define manually or enable by editing the config.hpp header file (located in the Armadillo include directory) and uncommenting the corresponding line. I assume that since you were able to make Armadillo use an external BLAS, you are already familiar with this configuration file...
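For reference, the Armadillo-specific symbol can also be defined directly in your source instead of editing config.hpp; this is just a configuration fragment (it assumes Armadillo is installed and visible to the compiler):

```cpp
// Disable Armadillo's bounds checks and assertions. The define must appear
// before the header is included; it is equivalent to uncommenting the
// ARMA_NO_DEBUG line in Armadillo's config.hpp.
#define ARMA_NO_DEBUG
#include <armadillo>
```

Only do this once the code is debugged, since out-of-range element accesses will no longer be caught.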

I made a quick comparison between Armadillo/MKL BLAS and Matlab on my Intel Core i7 workstation. For the C++ executable I used -O3, MKL BLAS, and ARMA_NO_DEBUG. I multiplied 1000x1000 random matrices 100 times and averaged the multiplication time. The C++ implementation was about 1.5 times faster than Matlab.
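The timing procedure described above (many repetitions, averaged) can be sketched with a small helper. The name `average_seconds` is hypothetical, and the workload passed in is a stand-in for the actual Armadillo multiplication (`C = A * B`):

```cpp
#include <chrono>

// Average the wall-clock time of a workload over `reps` runs, mirroring
// the benchmark described above (100 multiplications, averaged).
template <typename F>
double average_seconds(F&& workload, int reps)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int r = 0; r < reps; ++r)
        workload();  // e.g. C = A * B with Armadillo matrices
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / reps;
}
```

Averaging over many runs smooths out warm-up effects (cache population, thread-pool startup in a multithreaded BLAS), which can otherwise dominate a single-run measurement.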

Hope this helps

+4

"Is there a way to extract the Matlab linear algebra libraries in some way and use them in C++?" Yes: to call Matlab functions from C++, refer to this link: How to call Matlab functions from C++

+1

Several C++ linear algebra libraries provide an easy way to link against a highly optimized BLAS library.

Take a look at http://software.intel.com/en-us/articles/intelr-mkl-and-c-template-libraries

You should be able to link Armadillo with MKL for more performance, but note that MKL is a commercial package.

0

Source: https://habr.com/ru/post/949404/

