Is there any way to find out if A is a submatrix of B?

I give a quote because I mean, for example:

B = [[1,2,3,4,5], [6,7,8,9,10], [11,12,13,14,15], [16,17,18,19,20]] 

suppose we select row 2,4 and col 1,3, the intersections will give us

 A = [[6,8], [16,18]] 

My question is, suppose I have A and B, is there a way to find out which rows and columns are selected from B to give A?

By the way, it would be better if you could give an answer in python / numpy. But just math or another programming language would be fine.

+6
source share
4 answers

This is a very difficult combinatorial task. In fact, the problem of isomorphism of subgraphs can be reduced to your problem (if the matrix A has only 0-1 entries, your problem is exactly the problem of isomorphism of the subgraph). This problem is known to be NP-complete.

Here is a recursive backtracking solution that is slightly better than coarse, forcing all possible solutions. Note that this still takes exponential time in the worst case. However, if you assume that there is a solution and that there is no ambiguity (for example, all entries in B different), this finds the solution in linear time.

 def locate_columns(a, b, offset=0): """Locate `a` as a sublist of `b`. Yields all possible lists of `len(a)` indices such that `a` can be read from `b` at those indices. """ if not a: yield [] else: positions = (offset + i for (i, e) in enumerate(b[offset:]) if e == a[0]) for position in positions: for tail_cols in locate_columns(a[1:], b, position + 1): yield [position] + tail_cols def locate_submatrix(a, b, offset=0, cols=None): """Locate `a` as a submatrix of `b`. Yields all possible pairs of (row_indices, column_indices) such that `a` is the projection of `b` on those indices. """ if not a: yield [], cols else: for j, row in enumerate(b[offset:]): if cols: if all(e == f for e, f in zip(a[0], [row[c] for c in cols])): for r, c in locate_submatrix(a[1:], b, offset + j + 1, cols): yield [offset + j] + r, c else: for cols in locate_columns(a[0], row): for r, c in locate_submatrix(a[1:], b, offset + j + 1, cols): yield [offset + j] + r, c B = [[1,2,3,4,5], [6,7,8,9,10], [11,12,13,14,15], [16,17,18,19,20]] A = [[6,8], [16,18]] for loc in locate_submatrix(A, B): print loc 

This will output:

 ([1, 3], [0, 2]) 
+10
source

If all you want to know is which rows and columns are selected from B to give A, and don't care about efficiency here, this is a brute force method that stores the results in an array res res [N] tells all locations A [ N] to B. Works even if A [N] exists in several places B.

 B = [[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]] A = [[6,8], [16,18]] res = [] for subsetIndex, subset in enumerate(A): k = [] res.append(k) for supersetIndex, superset in enumerate(B): loc = [] try: loc = [(supersetIndex, superset.index(item)) for item in subset] k.append(loc) print A[subsetIndex], "is at ", loc, "in B" except ValueError: pass print res 

output:

 [6, 8] is at [(1, 0), (1, 2)] in B [16, 18] is at [(3, 0), (3, 2)] in B result = [[[(1, 0), (1, 2)]], [[(3, 0), (3, 2)]]] 
+1
source

All / most of the values ​​in the matrix are unique to IE: do they appear only once in matrix B?

The more unique the values, the better the improvement you can make on subgraph isomorphism (SI). If all values ​​are unique , then you can simply do a reverse search on each value to determine its pair of rows / columns, combine the list of rows and columns (separately).

The result is a simple O(N) algorithm, where N = number of rows * number of columns . Of course, the fewer unique values, the more false positives you get that you need to check, and the closer you are to SI, the less simple things get.

+1
source

Here is a brute force solution if all you need is:

 rows = [i for aa in A for i,bb in enumerate(B) if np.in1d(aa, bb).all()] cols = [i for aa in AT for i,bb in enumerate(BT) if np.in1d(aa, bb).all()] submatrix = B[np.ix_(rows, cols)] 

It checks every row A for every row B to make sure all submatrix elements are present. Then he does the same for the columns.

You can speed up the column search by restricting it to only the relevant rows:

 cols = [i for aa in AT for i,bb in enumerate(B[rows].T) if np.equal(aa, bb).all()] 
0
source

Source: https://habr.com/ru/post/978110/


All Articles