Efficient Elementary Matrix Operations in Julia

I need to perform a discretized convolution of (complex) matrices and define the following function in Julia:

function convolve(M::Array{Complex{Float64},2}, K::Array{Float64,2}, p::Int) (n,m) = size(M) res = zeros(Complex{Float64},n) for k=1:p for l=1:n res[l] += M[l,k]*K[l,end-p+k] end end return res end 

I use it like this:

 M=complex(rand(2000,2000)) K=rand(2000,2000) @time convolve(M,K,2000,0) 

Now it is relatively fast and surprisingly faster (about 3 times) than the vectorized version, where I replace the inner loop with res += M[:,k].*K[:,end-p+k] . (I think this is due to the large memory allocation for temporary arrays, I can live with it).

But the vectorized MATLAB code runs about 5 times faster:

 function res = convolve(M, K, p) n = size(M,1); res = zeros(n,1); for k=1:p res = res + M(:,k).*K(:,end-p+k); end end 

What am I doing wrong, and how do I get Julia to perform such multiple multiplications as fast as MATLAB? Is this an indexing issue?

Note. I checked with @code_warntype that there is no funny thing with type indecision (no Any or Union , etc.), but the problem can be more subtle. The @code_llvm macro gives an amazingly long output, but I'm not an expert, so it's hard for me to understand what is happening.

+5
source share
1 answer

The next version on my machine is faster:

 function convolve2(M::Array{Complex{Float64},2}, K::Array{Float64,2}, p::Int) (n,m) = size(M) res = zeros(Complex{Float64},n) offset = size(K,2)-p (p>m || offset<0) && error("parameter p ($p) out of bounds") @inbounds for k=1:p @inbounds @simd for l=1:n res[l] += M[l,k]*K[l,offset+k] end end return res end 

Check out the @simd add- @simd , which uses vector instructions now on many CPUs.

EDIT: An attempt at performance in the OP code seems to be due to the use of end in the K index in the hot loop line. Overriding Base.trailingsize with @inline makes LLVM inline end (on my machine) and makes two versions work at the same speed. Used code:

 import Base: trailingsize @inline function Base.trailingsize(A, n) s = 1 for i=n:ndims(A) s *= size(A,i) end return s end 

See comments on issue # 19389 regarding this.

+3
source

Source: https://habr.com/ru/post/1260145/


All Articles