Why are MATLAB vectorized (element-wise) operations so fast?

I have been writing MATLAB scripts for some time, and yet I still don't understand how it works "under the hood". Consider the following script, which performs some calculations on (large) vectors in three different ways:

  • MATLAB vector operations;
  • A simple for loop that performs the same calculations component-wise;
  • An optimized loop, which should be faster than the simple loop because it avoids some allocations and some assignments.

Here is the code:

    N = 10000000;
    A = linspace(0,100,N);
    B = linspace(-100,100,N);
    C = linspace(0,200,N);
    D = linspace(100,200,N);

    % 1. MATLAB Operations
    tic
    C_ = C./A;
    D_ = D./B;
    G_ = (A+B)/2;
    H_ = (C_+D_)/2;
    I_ = (C_.^2+D_.^2)/2;
    X = G_ .* H_;
    Y = G_ .* H_.^2 + I_;
    toc
    tic
    X; Y;
    toc

    % 2. Simple cycle
    tic
    C_ = zeros(1,N);
    D_ = zeros(1,N);
    G_ = zeros(1,N);
    H_ = zeros(1,N);
    I_ = zeros(1,N);
    X = zeros(1,N);
    Y = zeros(1,N);
    for i = 1:N,
        C_(i) = C(i)/A(i);
        D_(i) = D(i)/B(i);
        G_(i) = (A(i)+B(i))/2;
        H_(i) = (C_(i)+D_(i))/2;
        I_(i) = (C_(i)^2+D_(i)^2)/2;
        X(i) = G_(i) * H_(i);
        Y(i) = G_(i) * H_(i)^2 + I_(i);
    end
    toc
    tic
    X; Y;
    toc

    % 3. Optimized cycle
    tic
    X = zeros(1,N);
    Y = zeros(1,N);
    for i = 1:N,
        X(i) = (A(i)+B(i))/2 * (( C(i)/A(i) + D(i)/B(i) ) /2);
        Y(i) = (A(i)+B(i))/2 * (( C(i)/A(i) + D(i)/B(i) ) /2)^2 + ( (C(i)/A(i))^2 + (D(i)/B(i))^2 ) / 2;
    end
    toc
    tic
    X; Y;
    toc

I know that one should always try to vectorize calculations, since MATLAB is built around matrices / vectors (although nowadays this is not always the best choice), so I expect something like:

 C = A .* B; 

to be faster than:

 for i = 1:N, C(i) = A(i) * B(i); end 

What I do not expect is for the vectorized version to be faster even in the script above, given that the second and third methods go through a single loop, while the first method performs many vector operations (each of which is, in theory, a "for" loop of its own). This leads me to conclude that MATLAB has some magic that allows (for example):

 C = A .* B; D = C .* C; 

to be faster than a single "for" loop with some work inside it.

So:

  • What is the magic that makes the first part run so fast?
  • When you write "D = A .* B", does MATLAB actually perform the component-wise calculation with a for loop, or does it just keep track that D contains some multiplication of "bla" and "bla"?

EDIT

  • Suppose I want to implement the same calculations in C++ (perhaps using some library). Will the first MATLAB method be faster than the third one implemented in C++? (I will answer this question myself, just give me some time.)

EDIT 2

As requested, here are the measured execution times (in seconds):

Part 1: 0.237143

Part 2: 4.440132 of which 0.195154 for allocation

Part 3: 2.280640 of which 0.057500 for allocation

and without JIT:

Part 1: 0.337259

Part 2: 149.602017 of which 0.033886 for allocation

Part 3: 82.167713 of which 0.010852 for allocation
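
To put these numbers in perspective, the ratios can be computed directly. This is only a small sketch that reuses the timings reported above; the variable names are made up for illustration.

```matlab
% Reported timings in seconds, copied from the measurements above.
t_jit   = [0.237143, 4.440132, 2.280640];      % parts 1, 2, 3 with JIT
t_nojit = [0.337259, 149.602017, 82.167713];   % parts 1, 2, 3 without JIT

slowdown_jit   = t_jit   / t_jit(1);     % each part relative to part 1 (JIT on)
slowdown_nojit = t_nojit / t_nojit(1);   % each part relative to part 1 (JIT off)
jit_gain       = t_nojit ./ t_jit;       % how much the JIT helps each part

fprintf('Slowdown vs part 1 (JIT on):  %8.1f %8.1f %8.1f\n', slowdown_jit);
fprintf('Slowdown vs part 1 (JIT off): %8.1f %8.1f %8.1f\n', slowdown_nojit);
fprintf('Speedup from JIT:             %8.1f %8.1f %8.1f\n', jit_gain);
```

The vectorized part barely changes when the JIT is switched off, while both loops become dramatically slower.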

2 answers

The first one is the fastest because vectorized code can easily be translated into a small number of calls to optimized C++ libraries. MATLAB can also optimize it at a higher level, for example by replacing G*H+I with an optimized mul_add(G,H,I) instead of add(mul(G,H),I) in its core.
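
As a rough illustration (a sketch, not a statement about MATLAB's internals), the vectorized form and the explicit loop compute the same values; only the way the work is dispatched differs. The inputs here are assumptions chosen to avoid division by zero, and N is kept small on purpose.

```matlab
% Sketch: vectorized code and an explicit loop produce the same numbers.
N = 1000;                        % small N, for illustration only
A = linspace(1, 100, N);         % assumed ranges that never hit zero
B = linspace(101, 200, N);
C = linspace(1, 200, N);
D = linspace(100, 200, N);

% Vectorized: each statement is one whole-array operation.
G = (A + B) / 2;
H = (C ./ A + D ./ B) / 2;
X_vec = G .* H;

% Loop: the same arithmetic, one element at a time.
X_loop = zeros(1, N);
for i = 1:N
    X_loop(i) = (A(i) + B(i)) / 2 * ((C(i) / A(i) + D(i) / B(i)) / 2);
end

% The two results should agree up to floating-point rounding.
max_diff = max(abs(X_vec - X_loop));
fprintf('max abs difference: %g\n', max_diff);
```

Note that X_vec is a fully materialized array of N doubles after the assignment, just like X_loop.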

The second cannot be easily converted to C++ calls. It needs to be interpreted or compiled. The most modern approach for scripting languages is JIT compilation. The MATLAB JIT compiler is not very good, but that does not mean it has to be that way. I do not know why MathWorks does not improve it. Thus, #2 is so slow that #1 is faster, even though it performs more "math" operations.

The Julia language was invented as a compromise between MATLAB's expressiveness and C++ speed. The same non-vectorized code (julia vs matlab) is very fast there because its JIT compilation is very good.


Regarding performance optimization, I second @memyself's suggestion to use the profiler on both approaches, as stated in "'for' loop vs vectorization in MATLAB".
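
A minimal sketch of what that could look like, assuming a small element-wise multiplication as the workload (the vectors and sizes are made up for illustration). It relies only on the built-in profile command.

```matlab
% Sketch: run both approaches under the MATLAB profiler.
N = 1e5;
A = linspace(1, 100, N);
B = linspace(101, 200, N);

profile on                  % start collecting timing data

C_vec = A .* B;             % approach 1: vectorized

C_loop = zeros(1, N);       % approach 2: explicit loop
for i = 1:N
    C_loop(i) = A(i) * B(i);
end

profile off                 % stop collecting
stats = profile('info');    % results as a struct
% 'profile viewer' opens the same data as an interactive report.
```

The report is most useful when the code lives in a saved script or function file, so the time can be attributed to specific lines.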

For educational purposes, it makes sense to experiment with numerical algorithms; for anything else, I would go with well-established libraries.


Source: https://habr.com/ru/post/1483673/

