How to remove duplicates from a list of matrices based on some equivalence relation?

Question

How to remove duplicates from a list of matrices based on some equivalence relation?

Given some list of symmetric integer matrices, I want to remove all duplicates in the following equivalence relation:

Two matrices kxk M1 , M2 equivalent if there exists some permutation s on {1,...,k} such that for all i and j in {1,...,k} we have M1_ij = M2_s(i)s(j) , i.e. two matrices are “equal” if I can get one from the other, while rearranging its rows and columns.

Unfortunately, my naive approach (when building a list, check that any of the permutations of the new matrix is already in the list) is too slow.

Probably the faster alternative that I could think of would be to put all matrices into a list, rearrange them to some “canonical permutation”, and then delete duplicates, as described, for example. here . However, I am not sure how to achieve such a "canonical permutation" in the code.

To narrow it down a bit: matrices are relatively small ( k <= 4 ), the list will contain a 5 or 6-digit number of matrices, and the dtype for matrices should be an integer type (currently intc , but I could change that).

The order of the final list does not matter, and if there is any representative of each equivalence class. If necessary, the whole process can take a number of hours, but not days.

Is there a reasonably effective way to achieve this? Have I (once again) missed some cool NumPy or SciPy tool that will help me with this?

As requested, some small examples showing how this equivalence relation works:

Matrices {{1,1,1},{1,2,0},{1,0,3}} and {{1,1,1},{1,3,0},{1,0,2}} equivalent, since the permutation {1,2,3}->{1,3,2} converts one into another.

Matrices {{1,1,1},{1,2,0},{1,0,3}} and {{1,1,0},{1,2,1},{0,1,3}} not equivalent, you cannot change the position of 1s without rearranging the diagonal.

+5

python numpy

Baum mit augen Aug 17 '17 at 13:13

source share

3 answers

PAb · Answer 1 · 2017-08-17T14:23:47+0000

This is an algebraic answer. I suspect there should be a more satisfactory combinatorial response.

You say that two matrices M and M 'are equivalent if there exists a permutation matrix P such that M' = P ^ {- 1} M P.

We use our own expansion of M and M ':

M = Q ^ {- 1} DQ

M = Q '^ {- 1} D' Q '

Where D and D 'are diagonal matrices containing eigenvalues, and Q and Q' are orthogonal matrices.

We can rewrite equality as:

D = D 'before permutation (i.e., two matrices must have the same eigenvalues )

Q '= PQ

Testing the second condition is easy. Given that Q is orthogonal, it reduces to checking whether the matrix point (Q, Q'.T) is a permutation matrix, i.e. If it has only one "1" for each row and column.

The design for the algorithm is as follows:

Take M and M '
Calculate the proper decomposition of (Q, D) and (Q ', D') from M and M '(using np.linalg.eigh)
If they do not have the same eigenvalues (up to numerical accuracy, of course), they are not equivalent
Else, compute np.dot(Q, Q'.T) and check if this is a permutation matrix

I think the bottleneck is its own decomposition, but you only need to do this once per matrix. Hopefully the first test quickly drops many matrices.

Hope this helps.

Paul · Answer 2 · 2017-08-17T13:52:09+0000

You can imagine your matrices as representing the adjacency / weight matrix of the graph, and then check if these two graphs are isomorphic to each other. networkx has a convenient function (and can be installed via pip).

 import numpy as np import networkx as nx from networkx.algorithms.isomorphism import numerical_edge_match # create matrices n = 4 a = np.random.randint(0, 10, size=(n,n)) a = a + aT # ie symmetric b = np.rot90(a, k=2) # ie a rotated by 180 degrees c = np.ones((n,n), dtype=np.int) # counter-example # create graphs ga = nx.from_numpy_matrix(a) gb = nx.from_numpy_matrix(b) gc = nx.from_numpy_matrix(c) # test if isomorphic print "a isomorphic with b:", nx.is_isomorphic(ga, gb, edge_match=numerical_edge_match('weight', 1)) # True print "a isomorphic with c:", nx.is_isomorphic(ga, gc, edge_match=numerical_edge_match('weight', 1)) # False

Jürg merlin spaak · Answer 3 · 2017-08-17T13:38:47+0000

Just use your canonical approach. Find the largest entry in the matrix, put it in the upper right corner. Sort the first column and the first row by them.

 A = np.array([[1,2,3,5], [3,6,2,6], [3,5,7,2], [1,3,6,3]]) a = np.where(A == np.amax(A)) sort_colums = np.argsort(A[a[0]].ravel())[::-1] sort_rows = np.argsort(A[:,a[1]].ravel())[::-1] Col_sorted = A[:,sort_colums] Equiv_class = Col_sorted[sort_rows] #returns [[7, 5, 3, 2], [6, 3, 1, 3], [3, 2, 1, 5], [2, 6, 3, 6]]

As noted in the comments, this only works when matrix entries are found only once. If they occur several times, but not so often, then you can adapt this method by creating several equivalence classes.

How to remove duplicates from a list of matrices based on some equivalence relation?

More articles: