MATLAB: the fastest way to count unique combinations of two consecutive numbers in an integer vector

For a vector of integers such as:

X = [1 2 3 4 5 1 2] 

I would like to find a very fast way to count how often each unique combination of two consecutive elements occurs.

In this case, the combinations of two numbers are:

    [1 2] (occurs twice)
    [2 3] (occurs once)
    [3 4] (occurs once)
    [4 5] (occurs once)
    [5 1] (occurs once)

In its current form, I do it in MATLAB as follows:

    X = [1 2 3 4 5 1 2];
    N = length(X);
    X_max = max(X);
    COUNTS = nan(X_max); % store as an X_max x X_max matrix

    for i = 1:X_max
        % positions of i, and the positions immediately after them
        first_number_indices = find(X == i);
        second_number_indices = first_number_indices + 1;
        second_number_indices(second_number_indices > N) = []; % just in case the last entry equals i
        second_number_vals = X(second_number_indices);

        for j = 1:X_max
            COUNTS(i,j) = sum(second_number_vals == j);
        end
    end

Is there a faster / smarter way to do this?

2 answers

Here is a super fast way:

    >> counts = sparse(X(1:end-1),X(2:end),1)
    counts =
       (5,1)        1
       (1,2)        2
       (2,3)        1
       (3,4)        1
       (4,5)        1

You can convert the result to a full matrix with full(counts).
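As a minimal sketch of that conversion, using the example vector from the question (the names countsFull and n12 are illustrative, not from the answer): sparse sums the values of repeated subscripts, so each entry of the dense matrix holds the number of times that pair occurs.

```matlab
% Example vector from the question
X = [1 2 3 4 5 1 2];

% sparse adds up the values given for repeated (row, column) subscripts,
% so every entry ends up as the occurrence count of that pair
counts = sparse(X(1:end-1), X(2:end), 1);

% Dense view: row = first number, column = second number
countsFull = full(counts);

% Look up how often the pair [1 2] occurs
n12 = countsFull(1, 2);   % 2
```

Keeping the result sparse is usually preferable when the largest value in X is big and only a few pairs actually occur.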


Here is the equivalent solution using accumarray:

    >> counts = accumarray([X(1:end-1);X(2:end)]', 1)
    counts =
         0     2     0     0     0
         0     0     1     0     0
         0     0     0     1     0
         0     0     0     0     1
         1     0     0     0     0
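One detail worth checking with accumarray: without a size argument, the output is only as large as the largest subscript in each column, so it need not be square. The sketch below is an illustration on a made-up vector, not part of the original answer, and shows one way to force an n-by-n result through the optional third argument.

```matlab
% Hypothetical vector whose largest value appears only in the last position
X = [1 2 3];
n = max(X);

% One row per consecutive pair: [first, second]
subs = [X(1:end-1); X(2:end)]';

% Default sizing: 2-by-3, because no pair starts with 3
countsDefault = accumarray(subs, 1);

% Explicit sizing: always n-by-n
countsSquare = accumarray(subs, 1, [n n]);
```

The explicit size makes the matrix safe to index as countsSquare(a, b) for any values a and b up to n.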

EDIT: @Amro provided a much better solution. It is better in the vast majority of cases, although I suspect that my method will work better if MaxX is very large and X contains zeros. The presence of zeros precludes the use of sparse, because subscripts must be positive integers, and a large MaxX slows down the accumarray approach, because it creates a MaxX-by-MaxX matrix.
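The zero restriction comes from sparse and accumarray requiring positive integer subscripts. One possible workaround, which is an assumption of this sketch and not part of either answer, is to shift the values so that the minimum maps to 1 and to apply the same shift when reading the counts back. The names offset, shifted and n02 are illustrative.

```matlab
% Hypothetical vector that contains zeros
X = [0 2 0 2 3 0];

% Shift so that the smallest value becomes subscript 1
offset = 1 - min(X);
shifted = X + offset;

% Same counting idea as before, on the shifted values
counts = sparse(shifted(1:end-1), shifted(2:end), 1);

% The count of the pair [a b] is stored at (a + offset, b + offset)
n02 = full(counts(0 + offset, 2 + offset));   % pair [0 2] occurs 2 times
```

This does not help with the memory cost of a very large MaxX in the dense accumarray variant; it only removes the obstacle posed by zero or negative values.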

EDIT: Thanks to @EitanT for pointing out an improvement that can be made with accumarray.

Here is how I would solve it:

    % Generate some random data
    T = 20;
    MaxX = 3;
    X = randi(MaxX, T, 1);

    % Get the unique combinations and an index. Note, I am assuming X is a column vector.
    [UniqueComb, ~, Ind] = unique([X(1:end-1), X(2:end)], 'rows');
    NumComb = size(UniqueComb, 1);

    % Count the number of occurrences of each combination
    Count = accumarray(Ind, 1);

All unique consecutive combinations of two elements are now stored in UniqueComb, and the corresponding counts for each unique combination are stored in Count.
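To make the output concrete, here is a small sketch that applies the same steps to the example vector from the question instead of random data. The Result matrix is an illustrative addition that puts each combination next to its count.

```matlab
% Example vector from the question, as a column vector
X = [1 2 3 4 5 1 2]';

% Unique consecutive pairs, plus an index mapping each pair to its unique row
[UniqueComb, ~, Ind] = unique([X(1:end-1), X(2:end)], 'rows');

% Count how many times each unique pair occurs
Count = accumarray(Ind, 1);

% One row per combination: [first, second, count]
Result = [UniqueComb, Count];
% Result =
%      1     2     2
%      2     3     1
%      3     4     1
%      4     5     1
%      5     1     1
```

Because the pairs are only used as rows passed to unique, and never as subscripts, this variant also accepts zeros or negative values in X.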


Source: https://habr.com/ru/post/944497/

