What is a quick way to group values ​​by index?

I have an array of indices, I and values, X, and I want to create an array of C cells, so that C {i} = X (I == i). What is the fastest and best way to calculate C?

The easiest way is to evaluate C{i} = X(I==i) for all unique s in s (approach 1):

 for i = unique(I) C{i} = X(I == i); end 

Another naive approach would be to loop through all of me in I and add the corresponding x to C (approach 2):

 C = cellfun(@(x)(zeros(1,0)),cell(1,max(indices)),'UniformOutput',false); for j = 1:length(I) i = I(j); C{i} = cat(2,C{i},X(j)); end 

None of the approaches are very fast. For comparison, give the opportunity to generate some test data:

 I = floor(rand(1,N)*M)+1; X = rand(1,N); 

With N = 1000000, M = 1000 , two approaches are used:

  • Approach 1: 4.79 seconds
  • Approach 2: 11.1 seconds

Here approach 1 is better (still very slow). Changing the problem parameters to N = 1000000, M = 10000 significantly changes the situation:

  • Approach 1: 48.5 seconds
  • Approach 2: 10.3 seconds

In principle, both approaches are several orders of magnitude too slow. What is the best way to evaluate C?

Edit: The correct answer is obviously Jonas below. I use the results for comparison. Compared to the methods above, the order of the elements in C is different. In addition, the following gives an identical conclusion:

 C = accumarray(I',X,[],@(x){x'})'; 
  • N = 100000, M = 1000 : 0.0397 seconds
  • N = 100000, M = 10000 : 0.145 seconds
+4
source share
1 answer

The fastest way to write (and perhaps the fastest to run) is accumarray

 C = accumarray(I,X,[],@(x){x}); 
+5
source

Source: https://habr.com/ru/post/1302506/


All Articles