In MATLAB, I have a for loop that goes through a lot of iterations to populate a sparse matrix. The program is very slow, and I would like to optimize it so that it finishes sooner. On two lines I use find, and the MATLAB editor warns me that using logical indexing instead of find will improve performance. My code is very similar to the one presented in a MathWorks newsreader recommendation, where there is a vector of values and a vector of unique values generated from it. It uses find to get an index into the unique values (to update values in a matrix). In short, the code is:
positions = find(X0_outputs == unique_outputs(j,1)); % should read positions = X0_outputs == unique_outputs(j,1);
But the last line does not return an index; it returns a vector of zeros and ones. Here is an illustrative example. Make a set of indices:
tt = round(rand(1,6)*10)
tt = 3 7 1 7 1 7
Make a unique vector:
ttUNI = unique(tt)
ttUNI = 1 3 7
Use find to get the position of a value in the set of unique values:
find(ttUNI(:) == tt(1))
ans = 2
Compare with logical indexing:
(ttUNI(:) == tt(1))
ans = 0 1 0
Having the value 2 is much more useful than this binary vector when I need to update indices for a matrix. For my matrix, I can say mat(find(ttUNI(:) == tt(1)), 4) and it works, whereas using (ttUNI(:) == tt(1)) requires further processing.
Is there a neat and efficient way to do what is needed? Or is the use of find inevitable under such circumstances?
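For reference, a minimal sketch of the comparison above, using the tt values from the example as fixed data. The matrix mat here is a made-up 3-by-4 placeholder, not the real matrix. It suggests that a logical mask can be used directly as a row subscript, and that ismember (a different technique from the per-value find) can return the positions of all values in one call.

```matlab
% Fixed copy of the example data (instead of rand, so results are repeatable)
tt = [3 7 1 7 1 7];
ttUNI = unique(tt);              % 1 3 7

% Hypothetical 3-by-4 matrix, one row per unique value
mat = reshape(1:12, 3, 4);

% Logical mask and numeric position for the first value of tt
mask = (ttUNI(:) == tt(1));      % logical column vector [0; 1; 0]
pos = find(mask);                % 2

% Both forms select the same element of column 4
a = mat(mask, 4);                % 11
b = mat(pos, 4);                 % 11

% Positions of every element of tt inside ttUNI, in a single call
[~, loc] = ismember(tt, ttUNI);  % 2 3 1 3 1 3
```

Whether the mask or the numeric index is more convenient depends on what is done next: the mask is enough for plain subscripting, while arithmetic on the position (such as adding an offset) needs the numeric index.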
UPDATE: I will include the code here, as recommended by user @Jonas, to better explain the problem that I have, and report some results from the profiler.
ALL_NODES = horzcat(network(:,1)',network(:,2)');
NUM_UNIQUE = unique(ALL_NODES); %unique and sorted
UNIQUE_LENGTH = length(NUM_UNIQUE);
TIME_MAX = max(network(:,3));
WEEK_NUM = floor((((TIME_MAX/60)/60)/24)/7); %divide seconds for minutes, for hours, for days and how many weeks
%initialize tensor of temporal networks
temp = length(NUM_UNIQUE);
%making the tensor a sparse 2D tensor!!! So each week is another replica of
%the matrix below
Atensor = sparse(length(NUM_UNIQUE)*WEEK_NUM,length(NUM_UNIQUE));
WEEK_SECONDS = 60*60*24*7; %number of seconds in a week
for ii=1:size(network,1) %go through all rows/observations
    WEEK_NOW = floor(network(ii,3)/WEEK_SECONDS) + 1;
    if(WEEK_NOW > WEEK_NUM)
        disp('end of weeks')
        break
    end
    data_node_i = network(ii,1);
    Atensor_row_num = find(NUM_UNIQUE(:) == data_node_i)...
        + (WEEK_NOW-1)*UNIQUE_LENGTH;
    data_node_j = network(ii,2);
    Atensor_col_num = find(NUM_UNIQUE(:) == data_node_j);
    %Atensor is sparse
    Atensor(Atensor_row_num,Atensor_col_num) = 1;
end
Here UNIQUE_LENGTH = 223482 and size(network,1) = 273209. I ran the profiler for several minutes, which was not enough for the program to complete, but enough to reach a steady state where the time ratios no longer changed much. The line Atensor_row_num = find(NUM_UNI.. takes 45.6% and the line Atensor_col_num = find(NUM_UNI... takes 43.4%. The line with Atensor(Atensor_row_num,Atenso..., which assigns the values in the sparse matrix, takes only 8.9%. The NUM_UNIQUE vector is quite large, so find dominates the run time; it matters even more than the sparse matrix assignment. Any improvement here would be significant. I don't know whether there is a more efficient overall logic for this algorithm, rather than simply replacing find.
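One direction that might be worth profiling, sketched under assumptions: do all the lookups at once with ismember and build the matrix with a single sparse(i,j,v,m,n) call, instead of calling find twice per iteration. This is a different technique from the loop above, not a drop-in replacement for find. The small network array below is made-up stand-in data, and the sketch assumes the rows are sorted by time, so that filtering on the week number matches the effect of the break in the loop.

```matlab
% Made-up stand-in for 'network': columns are [node_i, node_j, time_in_seconds]
network = [10 20 0; 20 30 100; 10 30 604800; 30 10 604900; 40 10 1300000];

WEEK_SECONDS = 60*60*24*7;                            % seconds in a week
NUM_UNIQUE = unique([network(:,1); network(:,2)]);    % unique and sorted
UNIQUE_LENGTH = numel(NUM_UNIQUE);
WEEK_NUM = floor(max(network(:,3)) / WEEK_SECONDS);

% One lookup per column of node ids, instead of one find per iteration
[~, row_idx] = ismember(network(:,1), NUM_UNIQUE);
[~, col_idx] = ismember(network(:,2), NUM_UNIQUE);

% Week of every observation; keep only weeks that fit in the matrix
% (assumption: rows are sorted by time, so this mirrors the 'break')
week_now = floor(network(:,3) / WEEK_SECONDS) + 1;
keep = week_now <= WEEK_NUM;

rows = row_idx(keep) + (week_now(keep) - 1) * UNIQUE_LENGTH;
cols = col_idx(keep);

% sparse sums values at duplicate subscripts, so spones resets them to 1
Atensor = spones(sparse(rows, cols, ones(size(rows)), ...
    UNIQUE_LENGTH * WEEK_NUM, UNIQUE_LENGTH));
```

Building the matrix in one call also avoids assigning into an existing sparse matrix element by element, which is generally slow in MATLAB, although the profile above suggests the lookups are the larger cost here.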