Create a matrix containing all combinations of elements taken from n vectors

This question arises quite often in one form or another (see here or here ), so I decided to present it in general form and give an answer that may serve for future reference.

Given an arbitrary number of n vectors, possibly of different sizes, generate an n -column matrix whose rows describe all combinations of elements taken from these vectors (Cartesian product).

For example,

 vectors = { [1 2], [3 6 9], [10 20] } 

should give

 combs = [ 1 3 10 1 3 20 1 6 10 1 6 20 1 9 10 1 9 20 2 3 10 2 3 20 2 6 10 2 6 20 2 9 10 2 9 20 ] 
+44
arrays matrix matlab combinations cartesian-product
Feb 20 '14 at 0:02
source share
4 answers

The ndgrid function almost gives an answer, but has one caveat: n output variables must be explicitly defined for invocation. Since n arbitrary, it is best to use a comma-separated list (generated from an array of cells with n cells) for output. The resulting n matrices are then combined into the desired n -column matrix:

 vectors = { [1 2], [3 6 9], [10 20] }; %// input data: cell array of vectors n = numel(vectors); %// number of vectors combs = cell(1,n); %// pre-define to generate comma-separated list [combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two %// comma-separated lists is needed to produce the rows of the result matrix in %// lexicographical order combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1 combs = reshape(combs,[],n); %// reshape to obtain desired matrix 
+41
Feb 20 '14 at 0:02
source share

A little easier ... if you have the Neural Network toolkit, you can just use combvec :

 vectors = {[1 2], [3 6 9], [10 20]}; combs = combvec(vectors{:}).' % Use cells as arguments 

which returns the matrix in a slightly different order:

 combs = 1 3 10 2 3 10 1 6 10 2 6 10 1 9 10 2 9 10 1 3 20 2 3 20 1 6 20 2 6 20 1 9 20 2 9 20 

If you need a matrix that is in the question, you can use sortrows :

 combs = sortrows(combvec(vectors{:}).') % Or equivalently as per @LuisMendo in the comments: % combs = fliplr(combvec(vectors{end:-1:1}).') 

which gives

 combs = 1 3 10 1 3 20 1 6 10 1 6 20 1 9 10 1 9 20 2 3 10 2 3 20 2 6 10 2 6 20 2 9 10 2 9 20 

If you look at the internals of combvec (type edit combvec in the command window), you will see that it uses a different code than @LuisMendo. I can’t say it is more efficient overall.

If you have a matrix whose rows are akin to an earlier array of cells, you can use:

 vectors = [1 2;3 6;10 20]; vectors = num2cell(vectors,2); combs = sortrows(combvec(vectors{:}).') 
+25
Feb 20 '14 at 0:23
source share

I conducted a comparative analysis of the two proposed solutions. The benchmarking code is based on the timeit function and is included at the end of this post.

I consider two cases: three vectors of size n and three vectors of sizes n/10 , n and n*10 respectively (both cases give the same number of combinations). n changes to a maximum of 240 (I choose this value to avoid using virtual memory on my laptop).

The results are shown in the following figure. It can be seen that the solution based on ndgrid takes less time than combvec . It is also interesting to note that the time spent on combvec varies to a lesser extent in different cases.

enter image description here




Benchmarking code

Function for solving ndgrid :

 function combs = f1(vectors) n = numel(vectors); %// number of vectors combs = cell(1,n); %// pre-define to generate comma-separated list [combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two %// comma-separated lists is needed to produce the rows of the result matrix in %// lexicographical order combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1 combs = reshape(combs,[],n); 

Function for combvec solution:

 function combs = f2(vectors) combs = combvec(vectors{:}).'; 

Script to measure time by calling timeit for these functions:

 nn = 20:20:240; t1 = []; t2 = []; for n = nn; %//vectors = {1:n, 1:n, 1:n}; vectors = {1:n/10, 1:n, 1:n*10}; t = timeit(@() f1(vectors)); t1 = [t1; t]; t = timeit(@() f2(vectors)); t2 = [t2; t]; end 
+11
Jun 26 '14 at 22:56
source share

Here is my method that made me giggle with delight using nchoosek , although it is no better than @Luis Mendo made the decision.

In the above example, after 1000 starts, this solution took my car an average of 0.00065935 s compared to the decision made 0.00012877 s. For larger vectors following @Luis Mendo benchmarking, this solution is consistently slower than the accepted answer. However, I decided to publish it in the hope that maybe you will find something useful:

The code:

 tic; v = {[1 2], [3 6 9], [10 20]}; L = [0 cumsum(cellfun(@length,v))]; V = cell2mat(v); J = nchoosek(1:L(end),length(v)); J(any(J>repmat(L(2:end),[size(J,1) 1]),2) | ... any(J<=repmat(L(1:end-1),[size(J,1) 1]),2),:) = []; V(J) toc 

gives

 ans = 1 3 10 1 3 20 1 6 10 1 6 20 1 9 10 1 9 20 2 3 10 2 3 20 2 6 10 2 6 20 2 9 10 2 9 20 Elapsed time is 0.018434 seconds. 

Explanation:

L gets the length of each vector using cellfun . Although cellfun is basically a loop, it is effective here, given that your number of vectors should be relatively low to make this problem even practical.

V combines all vectors for easy access later (this assumes that you have entered all your vectors as strings. V 'will work for column vectors.)

nchoosek gets all the ways to select elements n=length(v) from the total number of elements L(end) . There will be more combinations than what we need.

 J = 1 2 3 1 2 4 1 2 5 1 2 6 1 2 7 1 3 4 1 3 5 1 3 6 1 3 7 1 4 5 1 4 6 1 4 7 1 5 6 1 5 7 1 6 7 2 3 4 2 3 5 2 3 6 2 3 7 2 4 5 2 4 6 2 4 7 2 5 6 2 5 7 2 6 7 3 4 5 3 4 6 3 4 7 3 5 6 3 5 7 3 6 7 4 5 6 4 5 7 4 6 7 5 6 7 

Since there are only two elements in v(1) , we need to throw out any lines where J(:,1)>2 . Similarly, where J(:,2)<3 , J(:,2)>5 , etc. Using L and repmat , we can determine if each element of J in the corresponding range, and then use any to remove lines that are any bad element.

Finally, these are not the actual values ​​from V , but only the indices. V(J) will return the desired matrix.

+2
Sep 19 '15 at 2:44
source share



All Articles