Compress a list of numbers into unique non-overlapping time ranges using python

Question


I come from biology and am very new to Python and ML. The laboratory has a black-box ML model that outputs a sequence similar to this:

Predictions =
[1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,1,0,1,0,1,1,1,1,1,0,0,0,1,1,1,1,1,1,0]

Each value represents a prediction for a time interval of 0.25 s: 1 means High and 0 means Not-High.

How can I convert these predictions to [start, stop, label], so that runs of identical values are grouped together? In the example, the first 10 values are ones and cover 0 to 10 * 0.25 s, so the first range and label will be

[0.0, 2.5, High]

Next there are 13 zeros ===> start = 2.5, stop = 13 * 0.25 + 2.5, label = Not-High
Thus,
[2.5, 5.75, Not-High]

So the final result will be a list of lists (ranges) with unique, non-overlapping intervals, each with a label:

[[0.0,2.5, High],
[2.5, 5.75, Not-High],
[5.75,6.50, High] ..

What I tried:
1. Count the number of values in the predictions
2. Create two ranges: one starting at zero and the other starting at 0.25
3. Combine these two lists into tuples

import numpy as np  
len_pred = len(Predictions) 
range_1 = np.arange(0,len_pred,0.25)
range_2 = np.arange(0.25,len_pred,0.25)
new_range = zip(range_1,range_2)

Here I can get the ranges, but not the labels.
It seems to be a simple problem, but I'm going around in circles.

Please advise. Thank you.

+4

python algorithm python-2.7 numpy

Seirra Feb 21 '18 at 4:54


3 answers

You can use diff() and where() to find the indices where the value changes:

import numpy as np

p = np.array([1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,1,0,1,0,1,1,1,1,1,0,0,0,1,1,1,1,1,1,0])

idx = np.r_[0, np.where(np.diff(p) != 0)[0]+1, len(p)]
t = idx * 0.25

np.c_[t[:-1], t[1:], p[idx[:-1]]]

Output:

array([[  0.  ,   2.5 ,   1.  ],
       [  2.5 ,   5.75,   0.  ],
       [  5.75,   6.5 ,   1.  ],
       [  6.5 ,   6.75,   0.  ],
       [  6.75,   7.  ,   1.  ],
       [  7.  ,   7.25,   0.  ],
       [  7.25,   7.5 ,   1.  ],
       [  7.5 ,   7.75,   0.  ],
       [  7.75,   8.  ,   1.  ],
       [  8.  ,   8.25,   0.  ],
       [  8.25,   9.5 ,   1.  ],
       [  9.5 ,  10.25,   0.  ],
       [ 10.25,  11.75,   1.  ],
       [ 11.75,  12.  ,   0.  ]])
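
The array above keeps the label as a number. To get the [start, stop, label] lists with text labels that the question asks for, the rows can be post-processed in plain Python. This is only a sketch: the helper name `label_rows` and the `labels` parameter are assumptions made for illustration, and the rows are assumed to be nested lists (for a NumPy array, its `tolist()` method produces them).

```python
def label_rows(rows, labels=("Not-High", "High")):
    """Replace the numeric label in each [start, stop, value] row with text.

    Hypothetical helper: labels[0] is used for value 0, labels[1] for value 1.
    """
    return [[start, stop, labels[int(value)]] for start, stop, value in rows]


# First three rows of the output above, written as nested lists
rows = [[0.0, 2.5, 1.0], [2.5, 5.75, 0.0], [5.75, 6.5, 1.0]]
print(label_rows(rows))
# [[0.0, 2.5, 'High'], [2.5, 5.75, 'Not-High'], [5.75, 6.5, 'High']]
```

Keeping the grouping numeric and attaching the text labels in a final step means the label strings can be changed without touching the index arithmetic.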

+4

HYRY Feb 21 '18 at 9:27


compact_prediction = list()
sequence = list()  # Holds the current range as [start, end, label]

last_prediction = 0

for index, prediction in enumerate(Predictions):
    if index == 0:
        sequence.append(0.0)  # It is the first sequence, so it starts at zero

    # For every later element, we only end the sequence when the
    # last prediction is different from the current one,
    # signaling a change
    elif prediction != last_prediction:
        # The sequence ends exactly where the next one starts
        sequence.append(index * 0.25)

        # And we put the label based on the last prediction
        if last_prediction == 1:
            sequence.append('High')
        else:
            sequence.append('Not-High')

        # Append to our compact list and reset the sequence
        compact_prediction.append(sequence)
        sequence = list()

        # After resetting the sequence we append the start of the new one
        sequence.append(index * 0.25)

    # Save the last prediction so we can check if it changed
    last_prediction = prediction

# The loop never closes the final sequence, so close it here
if len(Predictions) > 0:
    sequence.append(len(Predictions) * 0.25)
    sequence.append('High' if last_prediction == 1 else 'Not-High')
    compact_prediction.append(sequence)

print(compact_prediction)

Output: [[0.0, 2.5, 'High'], [2.5, 5.75, 'Not-High'], [5.75, 6.5, 'High'], [6.5, 6.75, 'Not-High'], [6.75, 7.0, 'High'], [7.0, 7.25, 'Not-High'], [7.25, 7.5, 'High'], [7.5, 7.75, 'Not-High'], [7.75, 8.0, 'High'], [8.0, 8.25, 'Not-High'], [8.25, 9.5, 'High'], [9.5, 10.25, 'Not-High'], [10.25, 11.75, 'High'], [11.75, 12.0, 'Not-High']]
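
Since the question asks for non-overlapping ranges that cover the whole sequence, one way to sanity-check a result like this is to confirm that every range stops where the next one starts. The sketch below is an illustration only: the function name `ranges_are_contiguous` and the `tolerance` parameter are assumptions, not part of any answer here.

```python
def ranges_are_contiguous(ranges, tolerance=1e-9):
    """Return True if every range has start < stop and touches the next one.

    Hypothetical helper: each item is expected to be [start, stop, label].
    """
    for start, stop, _label in ranges:
        if not start < stop:
            return False
    # Compare each range's stop with the following range's start
    for current, following in zip(ranges, ranges[1:]):
        if abs(current[1] - following[0]) > tolerance:
            return False
    return True


touching = [[0.0, 2.5, 'High'], [2.5, 5.75, 'Not-High'], [5.75, 6.5, 'High']]
gapped = [[0.0, 2.25, 'High'], [2.5, 5.5, 'Not-High']]
print(ranges_are_contiguous(touching))  # True
print(ranges_are_contiguous(gapped))    # False
```

A gap of one step between a stop and the next start usually points to an off-by-one in how the end index is converted to time.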

+3

forayer Feb 21 '18 at 5:06

Steve · Accepted Answer · 2018-02-21T05:13:39+0000

You can iterate over the list and create a range whenever a change is detected. With this method you also need to handle the final range after the loop. It may not be super clean, but it should be effective.

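# `predictions` is the list from the question (named Predictions there)
current_time = 0
range_start = 0
current_value = predictions[0]
ranges = []
for p in predictions:
    # A change in value closes the current range and opens a new one
    if p != current_value:
        ranges.append([range_start, current_time, 'high' if current_value == 1 else 'not high'])
        range_start = current_time
        current_value = p
    current_time += .25
# Close the final range, which the loop itself never appends
ranges.append([range_start, current_time, 'high' if current_value == 1 else 'not high'])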
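
The same change-detection idea can also be written with `itertools.groupby` from the standard library, which groups consecutive equal values and so handles the final range without a special case. This is an alternative sketch rather than the method from the answer above; the function name `compress_with_groupby` and the `step` parameter are assumptions made for illustration.

```python
from itertools import groupby


def compress_with_groupby(predictions, step=0.25):
    """Return [start, stop, label] lists, one per run of equal values.

    Hypothetical helper: `step` is the duration of one prediction in seconds.
    """
    ranges = []
    start_index = 0
    for value, run in groupby(predictions):
        # groupby yields each run of consecutive equal values once
        stop_index = start_index + sum(1 for _ in run)
        label = 'high' if value == 1 else 'not high'
        ranges.append([start_index * step, stop_index * step, label])
        start_index = stop_index
    return ranges


print(compress_with_groupby([1, 1, 0, 0, 0, 1]))
# [[0.0, 0.5, 'high'], [0.5, 1.25, 'not high'], [1.25, 1.5, 'high']]
```

Computing each boundary as an index times `step`, instead of adding `step` repeatedly, avoids accumulating floating-point error for step sizes that are not exactly representable in binary.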

