Dense vector update by sparse vector in Julia slow

I am using Julia version 0.4.5 and I have the following problem: As far as I know, taking an internal product between a sparse vector and a dense vector should be as fast as updating a dense vector with a sparse vector. The latter is much slower.

A = sprand(100000,100000,0.01) w = rand(100000) @time for i=1:100000 w += A[:,i] end 26.304380 seconds (1.30 M allocations: 150.556 GB, 8.16% gc time) @time for i=1:100000 A[:,i]'*w end 0.815443 seconds (921.91 k allocations: 1.540 GB, 5.58% gc time) 

I created a simple sparse matrix type, native code, and the add code is the same as the internal product.

Am I doing something wrong? I feel that there must be a special function that performs the operation w + = A [:, i], but I could not find it.

Any help is appreciated.

+5
source share
2 answers

I asked the same question about GitHub, and we came to the following conclusion. The SparseVector type was added with Julia 0.4 and with it the BLAS function LinAlg.axpy !, which updates in place the (possibly dense) vector x sparse vector y times scalar a , i.e. efficiently performs x += a*y . However, in Julia 0.4 it is not executed properly. It only works in Julia 0.5

 @time for i=1:100000 LinAlg.axpy!(1,A[:,i],w) end 1.041587 seconds (799.49 k allocations: 1.530 GB, 8.01% gc time) 

However, this code is still not optimal, as it creates SparseVector A [:, i]. You can get an even faster version with the following function:

 function upd!(w,A,i,c) rowval = A.rowval nzval = A.nzval @inbounds for j = nzrange(A,i) w[rowval[j]] += c* nzval[j] end return w end @time for i=1:100000 upd!(w,A,i,1) end 0.500323 seconds (99.49 k allocations: 1.518 MB) 

This is exactly what I needed to achieve, after some research we managed to get there, thanks to everyone!

+1
source

Assuming you want to calculate w += c * A[:, i] , there is a simple way to vectorize it:

 >>> A = sprand(100000, 100000, 0.01) >>> c = rand(100000) >>> r1 = zeros(100000) >>> @time for i = 1:100000 >>> r1 += A[:, i] * c[i] >>> end 29.997412 seconds (1.90 M allocations: 152.077 GB, 12.73% gc time) >>> @time r2 = sum(A .* c', 2); 1.191850 seconds (50 allocations: 1.493 GB, 0.14% gc time) >>> all(r1 == r2) true 

First, create a vector c constants to be multiplied. Then multiply columns A by elements by the value of c ( A .* c' , it translates inside). Finally, reduce the columns of A (part of sum(.., 2) ).

0
source

Source: https://habr.com/ru/post/1247301/


All Articles