Efficiently finding the first samples above a threshold in Python (and MATLAB comparison)

Instead of finding all the samples / data points in a list or an array that are larger than a certain threshold, I would like to find only the first samples where a signal becomes larger than the threshold. The signal can cross the threshold several times. For example, if I have an example signal:

signal = [1, 2, 3, 4, 4, 3, 2, 1, 0, 3, 2, 1, 0, 0, 1, 1, 4, 8, 7, 6, 5, 0]

and threshold = 2, then

signal = numpy.array(signal)
is_bigger_than_threshold = signal > threshold

will give me a boolean mask marking all the values in signal which are greater than threshold. However, I would like to get only the first samples where the signal becomes greater than the threshold. Therefore, I go through the entire list and make logical comparisons, for example:

first_bigger_than_threshold = list()
first_bigger_than_threshold.append(False)
for i in xrange(1, len(is_bigger_than_threshold)):
    if(is_bigger_than_threshold[i] == False):
        val = False
    elif(is_bigger_than_threshold[i]):
        if(is_bigger_than_threshold[i - 1] == False):
            val = True
        elif(is_bigger_than_threshold[i - 1] == True):
            val = False
    first_bigger_than_threshold.append(val)

This gives me the result I was looking for, namely

[False, False, True, False, False, False, False, False, False, True, False, False, False,   
False, False, False, True, False, False, False, False, False]

In MATLAB I would do it in a similar way:

for i = 2 : numel(is_bigger_than_threshold)
    if(is_bigger_than_threshold(i) == 0)
        val = 0;
    elseif(is_bigger_than_threshold(i))
        if(is_bigger_than_threshold(i - 1) == 0)
            val = 1;
        elseif(is_bigger_than_threshold(i - 1) == 1)
            val = 0;
        end
    end
    first_bigger_than_threshold(i) = val;
end % for

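Is there a more efficient (faster) way to do these calculations?

If I generate the data in Python, e.g.

signal = [round(random.random() * 10) for i in xrange(0, 1000000)]

then computing these values takes 4.45 seconds. If I generate the data in MATLAB,

signal = round(rand(1, 1000000) * 10);

it takes only 0.92 seconds.

Why is MATLAB almost 5 times faster than Python at this task?

Thanks!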


If you want a bool array in which only the first True of each run of Trues stays True, you can do this:

import numpy as np

signal = np.random.rand(1000000)
th = signal > 0.5
th[1:][th[:-1] & th[1:]] = False
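
As a quick check, the same masking trick can be applied to the example signal from the question. This is only a minimal sketch; the helper name first_crossings is made up for illustration and is not part of the answer above.

```python
import numpy as np

def first_crossings(signal, threshold):
    """Boolean mask that is True only at the first sample of each run above threshold."""
    above = np.asarray(signal) > threshold  # fresh bool array, safe to modify in place
    # Clear every True whose predecessor is also True.
    above[1:][above[:-1] & above[1:]] = False
    return above

signal = [1, 2, 3, 4, 4, 3, 2, 1, 0, 3, 2, 1, 0, 0, 1, 1, 4, 8, 7, 6, 5, 0]
mask = first_crossings(signal, 2)
print(np.flatnonzero(mask).tolist())  # [2, 9, 16]
```

Note that a signal which starts above the threshold keeps True at index 0 here, whereas the loop in the question always sets the first element to False.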

The explicit Python loop is what makes your code slower than Matlab. Look at this code instead:

import numpy as np

threshold = 2
signal = np.array([1, 2, 3, 4, 4, 3, 2, 1, 0, 3, 2, 1, 0, 0, 1, 1, 4, 8, 7, 6, 5, 0])

indices_bigger_than_threshold = np.where(signal > threshold)[0]  # indices of all samples above the threshold
print(indices_bigger_than_threshold)
# [ 2  3  4  5  9 16 17 18 19 20]
non_consecutive = np.where(np.diff(indices_bigger_than_threshold) != 1)[0] + 1  # +1 to select the index after each gap
print(non_consecutive)
# [4 5]
first_bigger_than_threshold1 = np.zeros_like(signal, dtype=bool)  # plain bool works across NumPy versions
first_bigger_than_threshold1[indices_bigger_than_threshold[0]] = True  # keep the very first one
first_bigger_than_threshold1[indices_bigger_than_threshold[non_consecutive]] = True

np.where returns the indices directly, so no explicit loop is needed. The result marks only the first sample of each run above threshold.


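Since a good way to speed things up is to pick a better algorithm, you can do this with a simple edge detector: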

import numpy

threshold = 2
signal = numpy.array([1, 2, 3, 4, 4, 3, 2, 1, 0, 3, 2, 1, 0, 0, 1, 1, 4, 8, 7, 6, 5, 0])

thresholded_data = signal > threshold
threshold_edges = numpy.convolve([1, -1], thresholded_data, mode='same')

thresholded_edge_indices = numpy.where(threshold_edges==1)[0]

print(thresholded_edge_indices)

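This prints [2 9 16], i.e. the indices of the first sample in each run above the threshold. It is faster in both Matlab and Python (using Numpy): about 12 ms on my machine, rather than the 4.5s it took your way.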
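
To reproduce a comparison like this on your own machine, something along the following lines could be used. It is only a sketch: the function names are placeholders, the array is smaller than in the question to keep the run short, and no timings are shown because they depend on hardware and library versions.

```python
import timeit
import numpy as np

def first_above_loop(above):
    """Loop version, equivalent in spirit to the code in the question."""
    out = [False]
    for i in range(1, len(above)):
        out.append(bool(above[i] and not above[i - 1]))
    return out

def first_above_convolve(above):
    """Edge detector: True where the thresholded data steps from False to True."""
    edges = np.convolve([1, -1], np.asarray(above).astype(int), mode='same')
    return edges == 1

rng = np.random.default_rng(0)
signal = np.round(rng.random(100000) * 10)
above = signal > 2

# Both versions agree everywhere except possibly index 0,
# which the loop version always forces to False.
assert first_above_loop(above)[1:] == first_above_convolve(above)[1:].tolist()

print("loop:     %.4f s" % timeit.timeit(lambda: first_above_loop(above), number=1))
print("convolve: %.4f s" % timeit.timeit(lambda: first_above_convolve(above), number=1))
```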

Edit: As @eickenberg noted, the convolution can be replaced with numpy.diff(thresholded_data), which is conceptually a little easier, although in this case the indices will be off by 1, so do not forget to add 1 back, and also convert thresholded_data to an int array with thresholded_data.astype(int). There is no noticeable difference in speed between the two methods.
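
A minimal sketch of that numpy.diff variant, applied to the example signal from the question, might look like this (the variable name rising is chosen here for illustration):

```python
import numpy as np

threshold = 2
signal = np.array([1, 2, 3, 4, 4, 3, 2, 1, 0, 3, 2, 1, 0, 0, 1, 1, 4, 8, 7, 6, 5, 0])

# Cast to int first: np.diff on a bool array does not produce -1/+1 steps.
thresholded_data = (signal > threshold).astype(int)

# diff[i] = data[i + 1] - data[i], so a rising edge at sample i + 1 shows up at i.
rising = np.where(np.diff(thresholded_data) == 1)[0] + 1

print(rising.tolist())  # [2, 9, 16]
```

One difference from the convolution: a signal that is already above the threshold at index 0 has no preceding sample to difference against, so that first sample is not reported by this variant.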


Source: https://habr.com/ru/post/1542013/

