Find a matrix in a large matrix

I have a very large n*m S matrix. I want to effectively determine if a submatrix F exists in S A large S matrix may have a size other than 500*500 .

To clarify, consider the following:

 S = 1 2 3 4 5 6 7 8 9 F1 = 2 3 5 6 F2 = 1 2 4 6 

In this case:

  • F1 is inside S
  • F2 not inside S

Each element in the matrix is โ€‹โ€‹a 32-bit integer. I can only think of using a brute force approach to determine if F submatrix of S I found googled to find an efficient algorithm, but I can not find anything.

Is there any algorithm or principle to make this faster? (Or maybe some sort of brute force approach optimization method?)

PS statistics data

 A total of 8 S On average, each S will be matched against about 44 F. The probability of success match (ie F appears in a S) is 19%. 
+4
source share
9 answers

If you want to query several times for the same large matrix and submatrix of the same size. There are many solutions for pre-processing a large matrix.

A similar (or even the same) problem here.

The fastest way to find the mxn submatrix in the MXN matrix

+1
source

It includes matrix pre-processing. It will be hard in memory, but it should be better in terms of computation time.

  • Make sure the submatrix size is smaller than the matrix size before checking.
  • When constructing a matrix, build a construct that maps the value in the matrix to an array of positions (x, y) in the matrix. This will allow you to check for the presence of a submatrix in which candidates may be present . You should use the value in (0,0) in the sub-matrix and get the possible positions of this value in the larger matrix. If the list of positions is empty, you have no candidates, so the submatrix does not exist. There is a beginning (more experienced people may consider this a naive approach).
+1
source

Since you only want to know if a given matrix is โ€‹โ€‹inside another large matrix. If you know how to use Matlab code from C ++, you can directly use ismember from Matlab. Another way might be trying to figure out how ISMB works in Matlab, and then implement the same thing in C ++.

See Find Submatrix Location

+1
source

Since you also flagged the question as C++ , I provide this code. This is a brute force method and certainly not an ideal solution to this problem. For the main matrix SXT and a MXN Sub Matrix, the time complexity of the O(STMN) algorithm.

 cout<<"\nEnter the order of the Main Matrix"; cin>>S>>T; cout<<"\nEnter the order of the Sub Matrix"; cin>>M>>N; // Read the Main Matrix into MAT[S][T] // Read the Sub Matrix into SUB[M][N] for(i=0; i<(SM); i++) { for(j=0; j<(TN); j++) { flag=0; for(p=0; p<M; p++) { for(q=0; q<N; q++) { if(MAT[i+p][j+q] != SUB[p][q]) { flag=1; break; } } if(flag==0) { cout<<"Match Found in the Main Matrix at starting location "<<(i+1) <<"X"<<(j+1); break; } } if(flag==0) { break; } } if(flag==0) { break; } } 
+1
source

Modified deepu-benson code

 int Ma[][5]= { {0, 0, 1, 0, 0}, {0, 0, 1, 0, 0}, {0, 1, 0, 0, 0}, {0, 1, 0, 0, 0}, {1, 1, 1, 1, 0} }; int Su[][3]= { {1, 0, 0}, {1, 0, 0}, }; int S = 5;// Size of main matrix row int T = 5;//Size of main matrix column int M = 2; // size of desire matrix row int N = 3; // Size of desire matrix column int flag, i,j,p,q; for(i=0; i<=(SM); i++) { for(j=0; j<=(TN); j++) { flag=0; for(p=0; p<M; p++) { for(int q=0; q<N; q++) { if(Ma[i+p][j+q] != Su[p][q]) { flag=1; break; } } } if(flag==0) { printf("Match Found in the Main Matrix at starting location %d, %d",(i+1) ,(j+1)); break; } } if(flag==0) { printf("Match Found in the Main Matrix at starting location %d, %d",(i+1) ,(j+1)); break; } } 
+1
source

My initial answer below the break, thinking about this, there are several optimizations, these optimizations relate to the steps of the original answer.

For step B), do not search the entire S tag: you can undo all columns and rows that will not allow F match. (in the example below, only search in the upper left 2x2 matrix). In cases where F is a significant fraction of S , this will save considerable time.

If the range of values โ€‹โ€‹inside S quite low, then creating a lookup table will significantly reduce the time required for step B).


Work with these two matrices

to find mat2 inside mat1

A) Choose one value from the smaller matrix:

mat4

B) find it in a larger

mat3

C) Check adjacent cells to see if they match

mat6 - mat5

0
source

Most of the answer depends on what you do repeatedly. Are you testing a bunch of huge matrices for the same sub-matrix? Are you testing one huge matrix that is looking for a bunch of different sub-matrices?

Do any of the matrices have repeating patterns, or are they good and random, or you can't make any assumptions about the data?

In addition, should the submatrix be subordinate? S contains

 F3 = 1 3 7 9 
0
source

If the data in the matrix are not randomly distributed, it would be useful to conduct some statistical analysis on it. Then you can find the submatrix by comparing its element located by their inverse probability. It can be faster and then by simple brute force.

Say you have a matrix of some normally distributed integers with a Gaussian center of 0. And you want to find a submatrix:

 1 3 -12 -3 43 -1 198 2 2 

You need to start searching 198, and then check the upper right element at 43, then the upper right element at -12, then any 3 or -3 will be executed; and so on. This would significantly reduce the number of comparisons compared to the most cruel solution.

0
source

This can be done in O(N*M*(logN+logM)) .

Equality can be expressed as the sum of squared differences 0:

 sum[i,j](square(S(n+i,m+j)-F(i,j)))=0 sum[i,j]square(S(n+i,m+j))+sum[i,j](square(F(i,j))-2*sum[i,j](S(n+i,m+j)*F(i,j))=0 

The first part can be calculated for all (n, m) in O (N * M) similarly to the current average.

The second part is calculated as usual in O (sizeof (F)), which is smaller than O (N * M).

The third part is the most interesting. This is a convolution that can be calculated in O (N * M * (logN + logM)) using the fast Fourier transform: http://en.wikipedia.org/wiki/Convolution#Fast_convolution_algorithms

0
source

Source: https://habr.com/ru/post/1482724/


All Articles