How to calculate the length of runs of consecutive occurrences of a value (uptime) in a matrix?

I have the following data:

    1 0 1
    1 1 1
    0 1 1
    1 1 1
    1 1 1
    1 1 1
    1 1 0
    1 1 1
    1 1 1
    1 1 1
    1 1 1
    1 1 1
    1 1 1
    1 1 1
    0 0 1
    1 1 1
    1 1 1
    1 1 1

Each column represents a device, and each row represents a period of time. Each data point indicates whether the device was active during that time period. I am trying to calculate the length of each uptime, or "spell", during which each device is active. In other words, I want the length of each run of consecutive ones in each column. In this case, the result for the first column would be 2 11 3, and so on.

This is easy to do with one device (one data column):

    rng(1)

    %% Parameters
    lambda = 0.05;  % Pr(failure)
    N = 1;          % number of devices
    T = 18;         % number of time periods in sample

    %% Generate example data
    device_status = [rand(T, N) >= lambda ; false(1, N)];

    %% Calculate spell lengths, i.e. duration of uptime for each device
    cumul_status = cumsum(device_status);

    % The 'cumul_status > 0' condition excludes the case where the vector
    % begins with one or more zeros
    cumul_uptimes = cumul_status(device_status == 0 & cumul_status > 0);
    uptimes = cumul_uptimes - [0 ; cumul_uptimes(1:end-1)];

    % Consecutive zeros yield zero-length spells; drop them
    uptimes = uptimes(uptimes > 0);

so I could simply loop over the columns, handle one column at a time, and use parfor (for example) to run the work in parallel. Is there a way to do this for all columns simultaneously using vectorized matrix operations?

EDIT: I should add that this is complicated by the fact that each device may have a different number of uptime spells.
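For reference, the column-by-column loop described above could be sketched as follows. This is only an illustration: it assumes the example data is an 18-by-3 matrix, and the names `x`, `s`, `c`, `cu` and `u` are made up for the sketch. The results go into a cell array because the number of spells can differ between devices.

```matlab
% Assumed example data: 18 time periods (rows) x 3 devices (columns)
x = ones(18, 3);
x([3 15], 1) = 0;   % device 1 is down in periods 3 and 15
x([1 15], 2) = 0;   % device 2 is down in periods 1 and 15
x(7, 3)      = 0;   % device 3 is down in period 7

N = size(x, 2);
% Append a row of zeros so the final spell of each device is closed
device_status = [x == 1; false(1, N)];

uptimes = cell(1, N);            % one cell per device
for n = 1:N                      % this loop could be a parfor
    s  = device_status(:, n);
    c  = cumsum(s);
    cu = c(s == 0 & c > 0);      % cumulative uptime at each failure
    u  = cu - [0; cu(1:end-1)];  % individual spell lengths
    uptimes{n} = u(u > 0);       % drop zero-length spells
end
```

With this data, `uptimes{1}` is `[2; 11; 3]`, `uptimes{2}` is `[13; 3]` and `uptimes{3}` is `[6; 11]`.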

1 answer

Here is one way. I'm not sure whether it counts as vectorized.

Let your data matrix be denoted x. Then

    [ii, jj] = find([true(1,size(x,2)); ~x; true(1,size(x,2))]);
    result = accumarray(jj, ii, [], @(x){nonzeros(diff(x)-1)});

creates a cell array in which each cell corresponds to a column. For your example:

    result{1} = 2 11 3
    result{2} = 13 3
    result{3} = 6 11
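To try this end to end, here is a minimal, self-contained sketch. It assumes the question's data is an 18-by-3 matrix, and it renames the anonymous function's argument to `v` (an illustrative choice) so that it does not shadow the data matrix `x`.

```matlab
% Assumed example matrix: 18 time periods (rows) x 3 devices (columns)
x = ones(18, 3);
x([3 15], 1) = 0;   % device 1 is down in periods 3 and 15
x([1 15], 2) = 0;   % device 2 is down in periods 1 and 15
x(7, 3)      = 0;   % device 3 is down in period 7

% Pad ~x with a row of true above and below, then locate every "zero"
[ii, jj] = find([true(1, size(x, 2)); ~x; true(1, size(x, 2))]);

% Group the row indices by column; the gaps between consecutive
% zero positions, minus one, are the uptime lengths
result = accumarray(jj, ii, [], @(v) {nonzeros(diff(v) - 1)});

celldisp(result)   % result{1} = [2; 11; 3], {2} = [13; 3], {3} = [6; 11]
```

Each cell holds a column vector, since `nonzeros` returns its output as a column.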

How it works

The idea is to find the row and column indices of the zeros in x (i.e. the true values in ~x), and then use the column indices as the grouping variable (the first argument of accumarray).

Within each group, the anonymous function @(x){nonzeros(diff(x)-1)} computes the differences between the row positions of consecutive zeros. We can apply diff directly because the column indices returned by find are already sorted, thanks to MATLAB's column-major order. We subtract 1 because the zeros in x are not counted as part of the uptime, remove the uptimes of length 0 (nonzeros), and pack the resulting vector into a cell ({...}).
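The per-group computation can be traced by hand on a single column. The sketch below uses a column that is down in periods 1 and 15 (matching the second device in the example); the variable names `col`, `pos`, `gaps` and `runs` are illustrative only.

```matlab
% One device: 18 periods, down in periods 1 and 15
col = ones(18, 1);
col([1 15]) = 0;

% Positions of the zeros after padding with true at both ends
pos = find([true; ~col; true]);   % [1; 2; 16; 20]

gaps    = diff(pos);              % [1; 14; 4]  distance between zeros
runs    = gaps - 1;               % [0; 13; 3]  ones strictly in between
uptimes = nonzeros(runs);         % [13; 3]     empty run at the start dropped
```

The leading 0 in `runs` appears because the column starts with a zero directly after the padding row, which is exactly the case `nonzeros` is there to remove.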

A row of true values is prepended and appended to ~x so that the uptime spells at the very start and very end of each column are also detected.
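The effect of the padding can be seen by comparing both variants on one column. This is a small sketch with illustrative names (`col`, `unpadded`, `padded`), using a column that is down in periods 3 and 15.

```matlab
% One device: 18 periods, down in periods 3 and 15 (uptimes 2, 11, 3)
col = ones(18, 1);
col([3 15]) = 0;

% Without padding, only the spell enclosed by two zeros is found
unpadded = nonzeros(diff(find(~col)) - 1);              % 11

% With padding, the leading and trailing spells are recovered too
padded = nonzeros(diff(find([true; ~col; true])) - 1);  % [2; 11; 3]
```

The padding rows act as artificial failures just before the first period and just after the last one, so every spell is bounded by a zero on both sides.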


Source: https://habr.com/ru/post/1239457/

