I want to detect runs of consecutive 1s in a NumPy array. As a first step, I want to determine whether each element belongs to a run of at least three consecutive 1s. For example, given the following array a:
    import numpy as np
    a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])
The 1s marked in bold below are the elements that satisfy this requirement:
    [**1, 1, 1**, 0, **1, 1, 1**, 0, 1, 1, 0, 0, **1, 1, 1**, 0, 0, **1, 1, 1, 1, 1**, 0]
Furthermore, if two such intervals of 1s are separated by at most two 0s, the two intervals and the 0s between them form one longer interval. The array is then marked as follows:
    [**1, 1, 1, 0, 1, 1, 1**, 0, 1, 1, 0, 0, **1, 1, 1, 0, 0, 1, 1, 1, 1, 1**, 0]
In other words, for the original array as input, I want the result to be as follows:
[True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]
I have been trying to come up with an algorithm for this, but everything I thought of seems complicated. What is the best way to implement it? Any help would be much appreciated.
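To make the rule above concrete, here is a minimal pure-Python sketch based on run-length encoding with `itertools.groupby`. The function name `mark_runs_groupby` and the parameters `min_run` and `max_gap` are hypothetical names chosen for illustration, and the sketch assumes the input contains only 0s and 1s.

```python
from itertools import groupby

def mark_runs_groupby(values, min_run=3, max_gap=2):
    """Hypothetical helper: flag runs of 1s (length >= min_run) and short 0-gaps between them."""
    # Run-length encode the input as (value, length) pairs.
    runs = [(int(v), len(list(g))) for v, g in groupby(values)]
    # A run of 1s qualifies on its own when it is long enough.
    keep = [v == 1 and n >= min_run for v, n in runs]
    # A short run of 0s is filled when both neighbouring runs qualify.
    fill = list(keep)
    for i in range(1, len(runs) - 1):
        v, n = runs[i]
        if v == 0 and n <= max_gap and keep[i - 1] and keep[i + 1]:
            fill[i] = True
    # Expand the per-run flags back to one flag per element.
    out = []
    for flag, (v, n) in zip(fill, runs):
        out.extend([flag] * n)
    return out
```

Calling `mark_runs_groupby(a)` on the array above should reproduce the expected result listed earlier. Because runs of 0s and 1s alternate, the two neighbours of any run of 0s are always runs of 1s, which is why checking `keep[i - 1]` and `keep[i + 1]` is enough.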
Update:
I apologize for not stating the question clearly. I want to identify every run of three or more consecutive 1s in the array as a span of 1s, and any two such spans with only one or two 0s between them should be identified, together with the separating 0s, as one long span. My goal can be understood as follows: if there are only one or two 0s between two runs of 1s, I treat those 0s as errors that should be corrected to 1.
@ritesht93 provided an answer that almost gives what I want. However, that answer does not handle the case where three intervals of 1s, each separated from the next by one or two 0s, should be identified as a single interval. For example, for the array
    a2 = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])
we should get the result
[False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, False]
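The chained-merge case above can be handled without loops by working on run boundaries. The following is only a sketch, not a definitive implementation: `mark_runs`, `min_run`, and `max_gap` are hypothetical names, and it relies on the observation that when `max_gap <= 2`, the elements between two neighbouring qualifying runs are necessarily all 0s.

```python
import numpy as np

def mark_runs(a, min_run=3, max_gap=2):
    """Hypothetical helper; assumes a 1-D array of 0s and 1s and max_gap <= 2."""
    a = np.asarray(a).astype(bool)
    # Pad with False so every run of 1s has both a rising and a falling edge.
    padded = np.concatenate(([False], a, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[::2], edges[1::2]  # ends are exclusive
    # Keep only the runs that are long enough.
    long_enough = (ends - starts) >= min_run
    starts, ends = starts[long_enough], ends[long_enough]
    if starts.size == 0:
        return np.zeros(a.shape, dtype=bool)
    # Neighbouring qualifying runs are merged when the gap between them is small.
    merge = (starts[1:] - ends[:-1]) <= max_gap
    merged_starts = starts[np.concatenate(([True], ~merge))]
    merged_ends = ends[np.concatenate((~merge, [True]))]
    # Mark the merged intervals with a difference array and a cumulative sum.
    delta = np.zeros(a.size + 1, dtype=int)
    delta[merged_starts] += 1
    delta[merged_ends] -= 1
    return np.cumsum(delta[:-1]) > 0
```

Since merging is decided pairwise between consecutive qualifying runs, three or more runs chained by small gaps collapse into a single interval, which is the case `a2` exercises.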
Update 2:
The answers inspired me, and I found that the regex-based algorithm is the easiest to implement and understand, although I am not sure how efficient it is compared to the other methods. In the end, I used the following approach.
    import re
    lst = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])
    # Replace every run of three or more 1s with the same number of 'c' characters.
    lst1 = re.sub(r'1{3,}', lambda x: 'c' * len(x.group()), ''.join(map(str, lst)))
    print(lst1)
which marks the intervals of 1s:
0ccc0ccc00cccc00100ccccc0
I then filled the gaps of one or two 0s between marked intervals:
    # lst1 is already a string, so it can be passed to re.sub directly.
    lst2 = re.sub(r'c0{1,2}c', lambda x: 'c' * len(x.group()), lst1)
    print(lst2)
which gives
0ccccccccccccc00100ccccc0
The final result is given by
    np.array(list(lst2)) == 'c'
    array([False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, False])
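For reuse, the two substitutions above can be wrapped in a single function. This is a minimal Python 3 sketch of the same idea. The name `mark_runs_regex` is hypothetical, and the input is assumed to contain only 0s and 1s.

```python
import re
import numpy as np

def mark_runs_regex(arr):
    """Hypothetical wrapper around the two regex substitutions; assumes only 0s and 1s."""
    s = ''.join(map(str, np.asarray(arr).astype(int).tolist()))
    # Step 1: mark every run of three or more 1s with 'c'.
    s = re.sub(r'1{3,}', lambda m: 'c' * len(m.group()), s)
    # Step 2: fill one or two 0s that sit between two marked runs.
    s = re.sub(r'c0{1,2}c', lambda m: 'c' * len(m.group()), s)
    return np.array([ch == 'c' for ch in s], dtype=bool)
```

Although `re.sub` only replaces non-overlapping matches, chained gaps are still filled in one pass: each match consumes just the first `c` of the following run, and since every marked run has at least three characters, its last `c` remains available to start the next match.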