Looking for intersection of two matrices in Python within tolerance?

Question

I am looking for the most efficient way to find the intersection of two matrices of different sizes. Each matrix has three variables (columns) and a different number of observations (rows). For example, matrices a and b:

a = np.matrix('1 5 1003; 2 4 1002; 4 3 1008; 8 1 2005')
b = np.matrix('7 9 1006; 4 4 1007; 7 7 1050; 8 2 2003; 9 9 3000; 7 7 1000')

If I set the tolerance for each column as col1 = 1, col2 = 2 and col3 = 10, I need a function that outputs the indices of the rows in a and b that are within their respective tolerances of each other, for example:

[x1, x2] = func(a, b, col1, col2, col3)
print x1
>> [2 3]
print x2
>> [1 3]

You can see from the indices that element 2 of a is within the tolerance of element 1 of b.

I think that I could loop over each row of matrix a and check whether it is within the allowed tolerance of each row of b. But this is inefficient for very large datasets.

Any suggestions on alternatives to the loop method to accomplish this?

python vectorization numpy matrix intersection

user1566200 Nov 04 '15 at 3:32

1 answer

Divakar · Accepted Answer · 2015-11-04T06:14:50+0000

If you don't mind working with NumPy arrays, you can use broadcasting for a vectorized solution. Here's the implementation -

import numpy as np

# Set tolerance values for each column
tol = [1, 2, 10]

# Get absolute differences between a and b keeping their columns aligned
diffs = np.abs(np.asarray(a[:,None]) - np.asarray(b))

# Compare each row with the triplet from `tol`.
# Get mask of all matching rows and finally get the matching indices
x1,x2 = np.nonzero((diffs < tol).all(2))

Sample run -

In [46]: # Inputs
    ...: a=np.matrix('1 5 1003; 2 4 1002; 4 3 1008; 8 1 2005')
    ...: b=np.matrix('7 9 1006; 4 4 1007; 7 7 1050; 8 2 2003; 9 9 3000; 7 7 1000')
    ...: 

In [47]: # Set tolerance values for each column
    ...: tol = [1, 2, 10]
    ...: 
    ...: # Get absolute differences between a and b keeping their columns aligned
    ...: diffs = np.abs(np.asarray(a[:,None]) - np.asarray(b))
    ...: 
    ...: # Compare each row with the triplet from `tol`.
    ...: # Get mask of all matching rows and finally get the matching indices
    ...: x1,x2 = np.nonzero((diffs < tol).all(2))
    ...: 

In [48]: x1,x2
Out[48]: (array([2, 3]), array([1, 3]))
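
For reuse, the steps above could be wrapped in a function shaped roughly like the `func(a, b, col1, col2, col3)` call in the question. This is only a sketch: the name `within_tolerance` is made up here, the inputs are converted to plain arrays with `np.asarray` before any indexing, and the strict `<` comparison from the answer is kept (switch to `<=` if a difference exactly equal to the tolerance should count as a match).

```python
import numpy as np

def within_tolerance(a, b, tol):
    """Hypothetical helper: return index arrays (x1, x2) such that row
    a[x1[k]] and row b[x2[k]] differ by less than tol in every column."""
    a = np.asarray(a)
    b = np.asarray(b)
    tol = np.asarray(tol)
    # (na, 1, ncols) - (1, nb, ncols) broadcasts to (na, nb, ncols)
    diffs = np.abs(a[:, None, :] - b[None, :, :])
    # A pair matches only if every column is within its tolerance
    return np.nonzero((diffs < tol).all(axis=2))

a = np.array([[1, 5, 1003], [2, 4, 1002], [4, 3, 1008], [8, 1, 2005]])
b = np.array([[7, 9, 1006], [4, 4, 1007], [7, 7, 1050],
              [8, 2, 2003], [9, 9, 3000], [7, 7, 1000]])
x1, x2 = within_tolerance(a, b, [1, 2, 10])
```

With the sample data from the question, `x1` and `x2` come out as the same index pairs shown in the run above.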

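Large datasets: if the broadcasted array of differences causes memory issues, then, since the number of columns is small (3), you can instead loop over the 3 columns and accumulate the mask, which greatly reduces the memory footprint, like so -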

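na = a.shape[0]
nb = b.shape[0]
# Running mask: True while a pair of rows still matches in every column seen so far
accum = np.ones((na,nb),dtype=bool)
for i in range(a.shape[1]):
    # With np.matrix inputs, a[:,i] is a column and b[:,i].ravel() is a row,
    # so the difference broadcasts to shape (na, nb)
    accum &= np.abs(a[:,i] - b[:,i].ravel()) < tol[i]
x1,x2 = np.nonzero(accum)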
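
The loop above relies on `np.matrix` slicing, where `a[:,i]` stays a column and `b[:,i].ravel()` becomes a row. A sketch of the same idea for plain arrays, under a hypothetical function name `within_tolerance_lowmem`, adds the extra axes explicitly. It only ever holds `(na, nb)` arrays in memory rather than the full `(na, nb, 3)` array of differences.

```python
import numpy as np

def within_tolerance_lowmem(a, b, tol):
    """Hypothetical helper: per-column loop that accumulates a boolean mask."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Start with every (row of a, row of b) pair marked as a match
    accum = np.ones((a.shape[0], b.shape[0]), dtype=bool)
    for i in range(a.shape[1]):
        # (na, 1) - (1, nb) broadcasts to (na, nb) for column i
        accum &= np.abs(a[:, i, None] - b[None, :, i]) < tol[i]
    return np.nonzero(accum)

a = np.array([[1, 5, 1003], [2, 4, 1002], [4, 3, 1008], [8, 1, 2005]])
b = np.array([[7, 9, 1006], [4, 4, 1007], [7, 7, 1050],
              [8, 2, 2003], [9, 9, 3000], [7, 7, 1000]])
x1, x2 = within_tolerance_lowmem(a, b, [1, 2, 10])
```

The trade-off is a short Python loop (one iteration per column) in exchange for a smaller temporary array per step.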
