Vectorization Steps
1] Execute sum(bsxfun(@times, V(:,t:end), K(:,t)),1) for all columns in V for all columns in K with matrix multiplication -
sum_mults = V.'*K
This will give us a 2D array with each column representing the operation sum(bsxfun(@times,.. at each iteration.
2] Step 1 gave us all the possible amounts, as well as the values โโthat need to be summed, do not align in the same line by iteration, so we need to do a little more work before summing the lines. The rest of the work involves getting a shifted version. To do this, you can use logical indexing with an upper and lower triangular boolean mask. Finally, we summarize each line for final output. So this piece of code will look like this -
valid_mask = tril(true(size(sum_mults))); sum_mults_shifted = zeros(size(sum_mults)); sum_mults_shifted(flipud(valid_mask)) = sum_mults(valid_mask); out = sum(sum_mults_shifted,2);
Runtime Tests -
%// Inputs V = rand(1000,1000); K = rand(1000,200); disp('--------------------- With original loopy approach') tic [MN] = size(K); S = zeros(1, M); for t = 1 : N S(1,1:end-t+1) = S(1,1:end-t+1) + sum(bsxfun(@times, V(:,t:end), K(:,t)),1); end toc disp('--------------------- With proposed vectorized approach') tic sum_mults = V.'*K; %//' valid_mask = tril(true(size(sum_mults))); sum_mults_shifted = zeros(size(sum_mults)); sum_mults_shifted(flipud(valid_mask)) = sum_mults(valid_mask); out = sum(sum_mults_shifted,2); toc
Exit -
--------------------- With original loopy approach Elapsed time is 2.696773 seconds. --------------------- With proposed vectorized approach Elapsed time is 0.044144 seconds.
source share