Boolean mask pattern matching

Question

I have an array:

    arr = np.array([1,2,3,2,3,4,3,2,1,2,3,1,2,3,2,2,3,4,2,1])
    print (arr)
    [1 2 3 2 3 4 3 2 1 2 3 1 2 3 2 2 3 4 2 1]

I would like to find this pattern and return a boolean mask:

    pat = [1,2,3]
    N = len(pat)

I am using strides:

    #https://stackoverflow.com/q/7100242/2901002
    def rolling_window(a, window):
        shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
        strides = a.strides + (a.strides[-1],)
        c = np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
        return c

    print (rolling_window(arr, N))
    [[1 2 3]
     [2 3 2]
     [3 2 3]
     [2 3 4]
     [3 4 3]
     [4 3 2]
     [3 2 1]
     [2 1 2]
     [1 2 3]
     [2 3 1]
     [3 1 2]
     [1 2 3]
     [2 3 2]
     [3 2 2]
     [2 2 3]
     [2 3 4]
     [3 4 2]
     [4 2 1]]
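On NumPy 1.20 or later, the same windows can be obtained without hand-written strides. A minimal sketch, assuming `sliding_window_view` is available in the installed NumPy; the names `windows` and `starts` are only illustrative:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.array([1, 2, 3, 2, 3, 4, 3, 2, 1, 2, 3, 1, 2, 3, 2, 2, 3, 4, 2, 1])
pat = [1, 2, 3]
N = len(pat)

# Read-only view of shape (len(arr) - N + 1, N); no data is copied.
windows = sliding_window_view(arr, N)

# Start index of every window that equals the pattern.
starts = np.flatnonzero((windows == pat).all(axis=1))
print(starts)
```

Because the view is read-only by default, it cannot be written through by accident, which is the main hazard of a raw `as_strided` view.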

This finds only the position of the first value of each match:

    b = np.all(rolling_window(arr, N) == pat, axis=1)
    c = np.mgrid[0:len(b)][b]
    print (c)
    [ 0  8 11]

Then I compute the positions of the remaining values:

    d = [i for x in c for i in range(x, x+N)]
    print (d)
    [0, 1, 2, 8, 9, 10, 11, 12, 13]

Finally, I build the mask with in1d:

    e = np.in1d(np.arange(len(arr)), d)
    print (e)
    [ True  True  True False False False False False  True  True  True  True
      True  True False False False False False False]

Check mask:

    print (np.vstack((arr, e)))
    [[1 2 3 2 3 4 3 2 1 2 3 1 2 3 2 2 3 4 2 1]
     [1 1 1 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0]]
      1 2 3           1 2 3 1 2 3

I think my solution is a bit overcomplicated. Is there a better, more Pythonic solution?

+4

python arrays numpy boolean

jezrael Feb 26 '18 at 11:59

2 answers

Not sure how safe this is, but another method is to write back through an as_strided view of the boolean output. As long as you have only one pat at a time, it should not be a problem, I think. It may work with more than one, but I cannot guarantee it, because writing back into an as_strided view can be a bit unpredictable:

    def vview(a):  #based on @jaime answer: https://stackoverflow.com/a/16973510/4427777
        return np.ascontiguousarray(a).view(np.dtype((np.void, a.dtype.itemsize * a.shape[1])))

    def roll_mask(arr, pat):
        pat = np.atleast_2d(pat)
        out = np.zeros_like(arr).astype(bool)
        vout = rolling_window(out, pat.shape[-1])
        vout[np.in1d(vview(rolling_window(arr, pat.shape[-1])), vview(pat))] = True
        return out

    np.where(roll_mask(arr, pat))
    (array([ 0,  1,  2,  8,  9, 10, 11, 12, 13], dtype=int32),)

    pat = np.array([[1, 2, 3], [3, 2, 3]])
    print([i for i in arr[roll_mask(arr, pat)]])
    [1, 2, 3, 2, 3, 1, 2, 3, 1, 2, 3]

This seems to work, but I would not give this answer to newbies!
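If writing through an overlapping as_strided view feels too risky, the same mask can be built with plain index arithmetic. This is a swapped-in technique rather than the method above: `pattern_mask` is a hypothetical helper name, it handles a single 1-D pattern only, and it assumes NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pattern_mask(arr, pat):
    """Boolean mask marking every element that belongs to a match of pat."""
    arr = np.asarray(arr)
    pat = np.asarray(pat)
    n = len(pat)
    mask = np.zeros(len(arr), dtype=bool)
    if n == 0 or n > len(arr):
        return mask  # nothing can match
    # Read-only windows, so nothing is written through a strided view.
    hits = (sliding_window_view(arr, n) == pat).all(axis=1)
    starts = np.flatnonzero(hits)
    # Broadcast each start index over the pattern length: shape (k, n).
    idx = starts[:, None] + np.arange(n)
    mask[idx] = True
    return mask

arr = np.array([1, 2, 3, 2, 3, 4, 3, 2, 1, 2, 3, 1, 2, 3, 2, 2, 3, 4, 2, 1])
mask = pattern_mask(arr, [1, 2, 3])
print(np.flatnonzero(mask))
```

Overlapping matches are harmless here, because assigning True to the same index twice has no further effect.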

+1

Daniel F Feb 27 '18 at 7:03

Divakar · Accepted Answer · 2018-02-26T12:47:01+0000

We can simplify the final steps with SciPy's binary dilation:

    # the scipy.ndimage.morphology namespace is deprecated in recent SciPy
    from scipy.ndimage import binary_dilation

    m = (rolling_window(arr, len(pat)) == pat).all(1)
    m_ext = np.r_[m,np.zeros(len(arr) - len(m), dtype=bool)]
    out = binary_dilation(m_ext, structure=[1]*N, origin=-(N//2))
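The dilation step spreads each match start forward over the next N positions. For readers without SciPy, a rough NumPy-only equivalent is a full convolution with a run of N ones. This is a sketch of the same idea, not the code above; it recomputes `m` with `sliding_window_view` (assumed available, NumPy 1.20 or later) so that it is self-contained.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.array([1, 2, 3, 2, 3, 4, 3, 2, 1, 2, 3, 1, 2, 3, 2, 2, 3, 4, 2, 1])
pat = [1, 2, 3]
N = len(pat)

# True at each index where a full match starts (length len(arr) - N + 1).
m = (sliding_window_view(arr, N) == pat).all(axis=1)

# The full convolution has length len(m) + N - 1 == len(arr). Entry i counts
# the matches starting in i-N+1 .. i, i.e. the matches that cover index i.
out = np.convolve(m.astype(int), np.ones(N, dtype=int)) > 0
print(np.flatnonzero(out))
```

No zero padding of `m` is needed in this variant, since the full convolution already extends the result to the length of `arr`.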

For performance, we can bring in OpenCV's template matching, since we are basically doing the same thing here:

    import cv2

    tol = 1e-5
    pat_arr = np.asarray(pat, dtype='uint8')
    m = (cv2.matchTemplate(arr.astype('uint8'),pat_arr,cv2.TM_SQDIFF) < tol).ravel()
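`cv2.TM_SQDIFF` scores each window by the sum of squared differences against the template, so an exact match scores zero, which is why the result is compared with a small tolerance. A NumPy-only sketch of that scoring, for illustration only: it is not how OpenCV computes the result internally, `scores` is a hypothetical name, and `sliding_window_view` assumes NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.array([1, 2, 3, 2, 3, 4, 3, 2, 1, 2, 3, 1, 2, 3, 2, 2, 3, 4, 2, 1])
pat = [1, 2, 3]
N = len(pat)
tol = 1e-5

# One row per window; a signed 64-bit dtype keeps the differences from wrapping.
windows = sliding_window_view(arr, N).astype(np.int64)

# Sum of squared differences between each window and the pattern.
scores = ((windows - np.asarray(pat)) ** 2).sum(axis=1)

# A score of zero means the window equals the pattern exactly.
m = scores < tol
print(np.flatnonzero(m))
```

Note that the OpenCV version casts to `uint8`, so it only suits data whose values fit in the range 0 to 255.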
