Optimization of data extraction from MATLAB matrix?

Question

For an n-dimensional matrix of values: what is the most efficient way to get values with arbitrary indices (i.e. coordinates)?

E.g., in a 5x5 random matrix, if I need the values at (1,1), (2,3) and (4,5), what is the most efficient way to return only the values at these coordinates?

If I provide these coordinates in a separate matrix, for example, is there one MATLAB line that can do the job? Sort of:

x=rand(5,5); y=[[1,1];[2,3];[4,5]]; z=x(y);

Except that this does not work.
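For reference, the line above does execute, but not as intended: MATLAB reads every entry of y as a linear (column-major) index into x, so the result has the shape of y rather than one value per row of y. A minimal sketch, using magic(5) in place of rand(5,5) purely so the values are reproducible:

```matlab
x = magic(5);            % deterministic stand-in for rand(5,5)
y = [1 1; 2 3; 4 5];     % intended (row, column) pairs
z = x(y);                % each entry of y is treated as a linear index

disp(size(z))            % z has the same shape as y, not 3 values
% z(2,2) was produced from the index 3, i.e. x(3,1), not the intended x(2,3)
sameAsLinear = isequal(z, reshape(x(y(:)), size(y)));
```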

One caveat: for various reasons, I cannot use linear indexing; results must be returned using the original indices. The matrices are also potentially very large, so I do not want to use loops either.

matrix matlab

empedia Nov 05 '09 at 13:23

3 answers

Why is sub2ind not suitable for this problem? I do not see the need for a logical mask; e.g.,

 z = x(sub2ind(size(x),y(:,1),y(:,2)))

should also work.
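A small check of the sub2ind call above, plus one possible way to extend it to N dimensions. The N-D part is an assumption about how y would be laid out (one column per dimension); the names x3, y3 and subs are made up for the illustration.

```matlab
% 2-D case: compare against explicit subscripts
x = magic(5);
y = [1 1; 2 3; 4 5];
z = x(sub2ind(size(x), y(:,1), y(:,2)));
expected = [x(1,1); x(2,3); x(4,5)];

% N-D case (assumed layout: one column of y3 per dimension of x3)
x3 = reshape(1:27, 3, 3, 3);
y3 = [1 1 1; 2 2 2; 3 1 2];
subs = num2cell(y3, 1);                  % 1-by-3 cell, one column per cell
z3 = x3(sub2ind(size(x3), subs{:}));     % expands to sub2ind(sz, col1, col2, col3)
```

The num2cell(..., 1) step only exists so that the number of subscript arguments follows the number of columns of y3 instead of being hard-coded.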

shabbychef Nov 05 '09 at 19:13

Arriving at the party after the music stopped, but I could not help myself ...

If you need "full" indexing because of a bug in the toolbox, and the toolbox loads only part of the matrix at a time, you may want to think about how you use the toolbox. The big efficiency gains with large matrices come from two things.

1) Do not make copies of things that do not need to be copied. This includes, for example, creating a logical array the size of the original matrix: although it is nominally "efficient", it takes one byte per element. If your matrix is too large to fit into memory all at once, even a mask that is 1/8 of its size is probably significant.

2) Maintain memory locality: access memory "in the same region", or be slowed down by a large number of disk swap operations. Even when everything fits into memory, maintaining cache locality can lead to significant performance improvements. If you can access the elements of the matrix in the order in which they are stored, everything speeds up significantly.
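The "one byte per element" figure in the first point can be checked with whos. This is only a sketch; the 1000x1000 size is an arbitrary assumption chosen to keep the example small.

```matlab
x = zeros(1000, 1000);      % double: 8 bytes per element
L = false(size(x));         % logical mask: 1 byte per element

infoX = whos('x');
infoL = whos('L');
ratio = infoL.bytes / infoX.bytes;   % mask size relative to the data
fprintf('x: %d bytes, L: %d bytes, ratio: %g\n', infoX.bytes, infoL.bytes, ratio);
```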

To address the first point, you need a method that does not require a full-size copy (which rules out Jacob's answer). To address the second, you need to sort the indices before accessing the matrix, so that any elements that can be accessed "from the same memory block" will be.

The two ideas are combined in the following. I assume that numel(y) << numel(x); in other words, you are only interested in a relatively small number of the elements of x. If not, sorting the y-vector will actually slow you down a lot:

    x = rand(5,5);
    y = [1 1; 2 3; 4 5];
    s = sub2ind(size(x), y(:,1), y(:,2));
    % from the linear index we get access order
    [ySorted, yOrder] = sort(s);
    % find the first, second index in the right access order:
    y1 = y(yOrder, 1);
    y2 = y(yOrder, 2);
    % access the array using conventional indexing:
    z = arrayfun(@(a,b)x(a,b), y1, y2);
    % now put things back in the right order:
    [rev, revOrder] = sort(yOrder);
    z = z(revOrder);

I benchmarked this using a 10000x10000 matrix x and a 5000x2 matrix y of random lookup coordinates. Comparing with Jacob's code, I got

    my method:  51 ms
    his method: 225 ms

Increasing the size of the lookup matrix to 50000x2, the timings became

    my method:  523 ms
    his method: 305 ms

In other words, which method works best depends on the number of elements you want to access. Also note that using the logical matrix L implicitly leads to sequential access to the large matrix x, but while you are creating that mask you are accessing memory randomly ...
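A rough harness for repeating this kind of comparison on your own machine might look like the sketch below. The sizes n and k are assumptions and are much smaller than the ones quoted in this answer, tic/toc is a crude timer, and no particular timings are implied. The coordinates are generated without duplicates so that both methods return the same set of values.

```matlab
n = 2000; k = 500;                    % assumed sizes for a quick run
x = rand(n, n);
s = randperm(n*n, k)';                % k distinct linear indices
[r, c] = ind2sub(size(x), s);
y = [r c];                            % the same positions as (row, column) pairs

% Method A: sort the lookups, then use ordinary subscripts via arrayfun
tic;
sA = sub2ind(size(x), y(:,1), y(:,2));
[~, order] = sort(sA);
zA = arrayfun(@(a,b) x(a,b), y(order,1), y(order,2));
zA(order) = zA;                       % undo the sort: back to the row order of y
tA = toc;

% Method B: logical mask the size of x
tic;
L = false(size(x));
L(sub2ind(size(x), y(:,1), y(:,2))) = true;
zB = x(L);                            % values come back in column-major order
tB = toc;

fprintf('sorted arrayfun: %.1f ms, logical mask: %.1f ms\n', 1e3*tA, 1e3*tB);
```

If timeit is available in your MATLAB release, wrapping each method in a function and timing it with timeit should give steadier numbers than a single tic/toc pair.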

Note that one of your questions was "is there a one-liner?", and the answer is "yes". If you have your arrays x and y as defined, then

 z = arrayfun(@(a,b)x(a,b),y(:,1),y(:,2));

is indeed one line and does not use linear indexing ...
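It may be worth keeping in mind that arrayfun calls the anonymous function once per coordinate pair, so the one-liner behaves like a loop written compactly. A minimal sketch of that equivalence (zFun and zLoop are names made up for the comparison):

```matlab
x = magic(5);
y = [1 1; 2 3; 4 5];

% the one-liner: one function call per row of y
zFun = arrayfun(@(a,b) x(a,b), y(:,1), y(:,2));

% the same lookups as an explicit, preallocated loop
zLoop = zeros(size(y,1), 1);
for i = 1:size(y,1)
    zLoop(i) = x(y(i,1), y(i,2));
end
```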

Floris Jul 31 '13 at 19:39

Jacob · Accepted Answer · 2009-11-05T13:42:05+0000

If you are against using linear indexing and loops, the only alternative, AFAIK, is logical indexing. But if y always comes in the form you suggested, you need to create a logical matrix from the indices specified in y.

Could you explain why linear indexing is not allowed?

In any case, if you want a really stupid answer (which is all I can provide with the information given):

z = diag(x(y(:,1),y(:,2)))

Of course, this needlessly creates a huge matrix and then extracts its diagonal elements (the ones you need), but it does the job in one line, etc.
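To make the cost concrete: x(rows, cols) pairs every requested row with every requested column, so k coordinate pairs produce a k-by-k temporary, of which only the diagonal is kept. With 5000 pairs that would be a 25-million-element intermediate, about 200 MB for doubles. A small sketch (block is a name introduced here for the temporary):

```matlab
x = magic(5);
y = [1 1; 2 3; 4 5];

block = x(y(:,1), y(:,2));   % 3-by-3: all row/column combinations
z = diag(block);             % block(i,i) is x(y(i,1), y(i,2))

disp(size(block))            % k-by-k for k coordinate pairs
```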

EDIT: If the constraint only concerns linear indexing into the source data, you can use linear indexing to build a logical matrix and index x with that. For instance:

    % Each element of L is only one byte
    L = false(size(x));
    % Create the logical mask
    L(sub2ind(size(x),y(:,1),y(:,2))) = true;
    % Extract the required elements
    z = x(L);

Similarly, for a 3-dimensional matrix:

    x = rand(3,3,3);
    y = [1 1 1;2 2 2;3 3 3];
    L = false(size(x));
    L(sub2ind(size(x),y(:,1),y(:,2),y(:,3))) = true;
    z = x(L);

In addition, logical indexing should be faster than linear indexing, so apart from creating the mask, you are in good shape.
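One behaviour of the mask approach to be aware of: x(L) returns the selected values in column-major order of x, not in the order of the rows of y, and a coordinate listed twice is returned only once. A minimal sketch, with y deliberately unsorted and containing a repeated pair (zMask and zLin are names made up for the comparison):

```matlab
x = magic(5);
y = [4 5; 1 1; 2 3; 1 1];            % unsorted, with (1,1) repeated
s = sub2ind(size(x), y(:,1), y(:,2));

L = false(size(x));
L(s) = true;
zMask = x(L);    % 3 values, ordered by position in x
zLin  = x(s);    % 4 values, in the order of the rows of y
```

If the original order or the duplicates matter, the mask result needs an extra mapping step, or a different method should be used.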
