Quick MATLAB method to reorder columns using sorters

I have data (numeric M x N, n> 2) that comes sorted by the first column, and then by the second. Does anyone know of an efficient algorithm that converts data to sort by the second column and then the first? Obviously, sorts (data, [2,1]) do the trick, but I'm looking for something that uses the existing input data structure for more speed, since M is very large.

In addition, the data in the first two columns is a known set of integers (each of which is much smaller than M).

+4
source share
3 answers

Based on the reference documentation for MATLAB R2010b, the SORTROWS function uses the stable version of quicksort . Since stable sorting algorithms "support the relative order of records with equal keys," you can achieve what you want by simply resorting to the already sorted data with respect to the second column:

data = sortrows(data,2); 

This result will maintain the relative order of the elements in the first column, so that the data will be sorted first by the second column and then by the first column.

+5
source

Since the data in the first column is already sorted, you do not need to sort it again. This will be a little faster if you do:

 >> d = rand(10000,2); d = round(d*100); d = sortrows(d,1); >> tic; a1 = sortrows(d, 2); toc; Elapsed time is 0.006805 seconds. 

Versus:

 >> tic; a2 = sortrows(d, [2 1]); toc; Elapsed time is 0.010207 seconds. >> isequal(a1, a2) ans = 1 
+1
source

I continued to tear myself away from this, but could not get it faster than the sorting method. This leads to the fact that each key pair is unique, which I did not mention above.

 % This gives us unique rows of integers between one and 10000, sorted first % by column 1 then 2. x = unique(uint32(ceil(10000*rand(1e6,2))),'rows'); tic; idx = zeros(size(x,1),1); % Work out where each group of the second keys will start in the sorted output. StartingPoints = cumsum([1;accumarray(x(:,2),1)]); % Work out where each group of the first keys is in the input. Ends = find([~all(diff(x(:,1),1,1)==0,2);true(1,1)]); Starts = [1;Ends(1:(end-1))+1]; % Build the index. for i = 1:size(Starts) temp = x(Starts(i):Ends(i),2); idx(StartingPoints(temp)) = Starts(i):Ends(i); StartingPoints(temp) = StartingPoints(temp) + 1; end % Apply the index. y = x(idx,:); toc tic; z = sortrows(x,2); toc isequal(y,z) 

It gives 0.21 seconds for my algorithm and 0.18 for the second (stable for different random seeds).

If anyone sees further speed (except mex), feel free to add.

0
source

Source: https://habr.com/ru/post/1337800/


All Articles