Vectorization
A first look at the for-loop code tells us that since photAbs is a binary (0/1) array whose elements merely scale the corresponding elements of Data, the random binary generation and the scaling can be pulled out of the loop and vectorized. Here is the code with that applied -
```matlab
function RNGData = RNGAnsys_vect1(N,Data)

%// Get the 2D matrix of random ones and zeros
photAbsAll = randint(N,numel(Data));

%// Take care of multData internally by summing along the columns of the
%// binary 2D matrix and then multiplying each column sum elementwise
%// with the corresponding scalar from Data
sumData = Data.*sum(photAbsAll,1);

%// Divide by N, but account for the 0.5 average by using N/2
RNGData = (sumData./(N/2))'; %//'

return;
```
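As a cross-check of the idea, here is a hypothetical NumPy port (not the author's code; the function name `rng_ansys_vect` is made up for illustration). It follows the same pattern: build the whole 0/1 matrix at once, column-sum it, scale elementwise by the data, and normalize by N/2.

```python
import numpy as np

def rng_ansys_vect(N, data, seed=None):
    """Vectorized sketch: column-sum a random 0/1 matrix, scale by data, divide by N/2."""
    rng = np.random.default_rng(seed)
    # N-by-len(data) matrix of random zeros and ones
    phot_abs_all = rng.integers(0, 2, size=(N, data.size))
    # Sum down each column, then scale elementwise by data
    sum_data = data * phot_abs_all.sum(axis=0)
    # Each column sum averages N/2, so dividing by N/2 recovers data on average
    return sum_data / (N / 2)

data = np.array([0.2, 1.0, 3.5])
out = rng_ansys_vect(15000, data, seed=0)
```

Because each column sum concentrates around N/2, the output hovers near `data` itself for large N, which is a quick sanity check that the vectorized rewrite preserves the loop's semantics.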
After profiling, the bottleneck turns out to be the generation of the random binary array. Using the faster random binary array creator proposed in this smart solution, the above function can be optimized further -
```matlab
function RNGData = RNGAnsys_vect2(N,Data)

%// Create a random binary array and sum along the columns on the fly to
%// save the variable space that would otherwise be required.
%// Also perform the elementwise multiplication as discussed before.
sumData = Data.*sum(rand(N,numel(Data))<0.5,1);

%// Divide by N, but account for the 0.5 average by using N/2
RNGData = (sumData./(N/2))'; %//'

return;
```
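The faster-generator trick has a direct analogue in other array languages: comparing uniform samples against 0.5 yields a boolean array that behaves as zeros and ones in subsequent arithmetic. A minimal NumPy sketch (illustrative only, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000

# "Smart" binary generator: compare uniform(0,1) samples against 0.5.
# The boolean result acts as 0/1 in any arithmetic that follows.
bits = rng.random(N) < 0.5

# Summing the booleans counts the ones, so the mean sits near 0.5
ones_fraction = bits.mean()
```

The appeal of this form is that it skips building an intermediate integer array: the comparison produces a lightweight boolean mask that sums and multiplies like 0/1 values.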
Using that smart random binary array creator, the original loop-based code can be optimized as well, which allows a fair comparison between loop-optimized and vectorized codes later. Here is the optimized loop code -
```matlab
function RNGData = RNGAnsys_opt1(N,Data)

multData = zeros(N,numel(Data));
for i = 1:numel(Data)
    %// Create N random 0s or 1s using the smart approach, then multiply
    %// each element of Data by those random numbers. Note the parentheses:
    %// without them, the multiplication would run before the comparison.
    multData(:,i) = Data(i)*(rand(N,1)<0.5);
end

sumData = sum(multData,1); %// sum each individual energy level data point
RNGData = (sumData/(N/2))'; %// divide by N, but account for 0.5 average by N/2

return;
```
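For completeness, the loop variant also ports cleanly; this is a hypothetical NumPy sketch (name `rng_ansys_loop` is made up), kept deliberately close to the MATLAB structure so the two versions can be benchmarked against each other.

```python
import numpy as np

def rng_ansys_loop(N, data, seed=None):
    """Loop sketch: one random 0/1 column per data element, then sum and normalize."""
    rng = np.random.default_rng(seed)
    mult_data = np.zeros((N, data.size))
    for i in range(data.size):
        # Parenthesize the comparison so the 0/1 mask is built first,
        # then scaled by data[i] (mirroring the precedence fix above)
        mult_data[:, i] = data[i] * (rng.random(N) < 0.5)
    sum_data = mult_data.sum(axis=0)  # sum each column
    return sum_data / (N / 2)         # divide by N, accounting for the 0.5 average

out = rng_ansys_loop(15000, np.array([0.2, 1.0, 3.5]), seed=1)
```

The per-column loop does the same work as the vectorized version, just one column at a time, so both should return values close to the input data for large N.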
Benchmarking
Benchmarking code
```matlab
N = 15000; %// Kept at this value as it was going out of memory with higher N's.
           %// Dataset size is more important anyway, as that decides how
           %// well the vectorized code does against the for-loop code
DS_arr = [50 100 200 500 800 1500 5000]; %// Dataset sizes

timeall = zeros(2,numel(DS_arr));
for k1 = 1:numel(DS_arr)
    DS = DS_arr(k1);
    Data = rand(1,DS);

    f = @() RNGAnsys_opt1(N,Data); %// Optimized for-loop code
    timeall(1,k1) = timeit(f);
    clear f

    f = @() RNGAnsys_vect2(N,Data); %// Vectorized code
    timeall(2,k1) = timeit(f);
    clear f
end

%// Display benchmark results
figure,hold on, grid on
plot(DS_arr,timeall(1,:),'-ro')
plot(DS_arr,timeall(2,:),'-kx')
legend('Optimized for-loop code','Vectorized code')
xlabel('Dataset size ->'),ylabel('Time(sec) ->')
avg_speedup = mean(timeall(1,:)./timeall(2,:))
title(['Average Speedup with vectorized code = ' num2str(avg_speedup) 'x'])
```
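The same benchmarking scheme can be sketched in Python with the standard-library `timeit` module (all names and sizes here are illustrative, not the MATLAB run: smaller N and fewer dataset sizes so it finishes quickly, and no plotting).

```python
import timeit
import numpy as np

def rng_ansys_loop(N, data):
    # Loop version: one random 0/1 column per data element
    mult_data = np.zeros((N, data.size))
    for i in range(data.size):
        mult_data[:, i] = data[i] * (np.random.random(N) < 0.5)
    return mult_data.sum(axis=0) / (N / 2)

def rng_ansys_vect(N, data):
    # Vectorized version: one random 0/1 matrix, column-summed on the fly
    return data * (np.random.random((N, data.size)) < 0.5).sum(axis=0) / (N / 2)

N = 2000                  # smaller than the MATLAB run, to keep the sketch fast
sizes = [50, 100, 200]    # dataset sizes to sweep
times = np.zeros((2, len(sizes)))
for k, ds in enumerate(sizes):
    data = np.random.random(ds)
    times[0, k] = timeit.timeit(lambda: rng_ansys_loop(N, data), number=3)
    times[1, k] = timeit.timeit(lambda: rng_ansys_vect(N, data), number=3)

# Average speedup of vectorized over loop, mirroring the MATLAB avg_speedup
avg_speedup = (times[0] / times[1]).mean()
```

As in the MATLAB script, the interesting output is the ratio of loop time to vectorized time averaged over dataset sizes, not the absolute timings.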
Results

Concluding observations
Based on my experience with MATLAB so far, neither for-loops nor vectorized methods win in every situation; which approach is faster depends on the specific problem.