Removing extreme values from a vector in MATLAB?

So I have a = [2 7 4 9 2 4 999]

And I would like to remove 999 from the vector (it is an obvious outlier).

Is there a general way to remove such values? I have a set of vectors, and not all of them contain such extreme values. prctile(a, 99.5) is going to return the largest number in the vector no matter how extreme (or not extreme) it is.

+4
3 answers

There are several ways to do this, but first you have to define what "extreme" means. Is it a value above a fixed threshold? A value more than a certain number of standard deviations above the mean? Or, if you know that you have exactly n of these extreme events and that their values are larger than all the others, you can use sort and delete the last n elements.

For example, a(a>threshold)=[] handles the fixed-threshold definition, and a(a>mean(a)+n*std(a))=[] drops everything more than n standard deviations above the mean of a.
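As a minimal sketch of those two one-liners applied to the vector from the question: the threshold of 100 and n = 2 are values assumed here for illustration, not recommendations, and the names byThreshold and byStd are hypothetical.

```matlab
a = [2 7 4 9 2 4 999];

% Definition 1: fixed threshold (100 is an assumed value for this data)
threshold = 100;
byThreshold = a;
byThreshold(byThreshold > threshold) = [];

% Definition 2: more than n standard deviations above the mean
n = 2;                          % assumed; must be chosen by the user
cutoff = mean(a) + n*std(a);    % cutoff is computed from the original data
byStd = a;
byStd(byStd > cutoff) = [];
```

Note that the outlier itself inflates both mean(a) and std(a). With only seven samples, no value can lie more than about 2.27 sample standard deviations from the mean, so n = 3 would remove nothing from this vector.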

A completely different approach is to use the median of a. If the vector is as short as the one you show, the median is a more robust reference than the mean, and you can remove everything greater than some multiple of it: a(a>n*median(a))=[].
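A short sketch of the median rule on the same vector, assuming a multiplier of n = 5 (an arbitrary choice for illustration). It only makes sense for positive data whose median is not close to zero.

```matlab
a = [2 7 4 9 2 4 999];

n = 5;                    % assumed multiplier; tune it for your data
m = median(a);            % the single extreme value barely affects the median
cleaned = a(a <= n*m);    % keep only values up to n times the median
```

Unlike the mean-based cutoff, the reference value here hardly moves when the outlier is present, which is why this works even for very short vectors.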

Finally, one way to decide how to treat these spikes is to take a histogram of the data and work from there ...
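The answer does not say what to do with the histogram, so the following is only one possible reading: count the values in equally spaced bins and, if a gap of empty bins separates the bulk of the data from a few stragglers, cut at the first empty bin. The bin count of 10 and the cut rule are assumptions, and the rule only handles outliers on the high side.

```matlab
a = [2 7 4 9 2 4 999];

% Count values in 10 equally spaced bins between min(a) and max(a)
[counts, centers] = hist(a, 10);

% Assumed rule: treat the first empty bin as the boundary of the bulk
firstEmpty = find(counts == 0, 1);
cleaned = a;
if ~isempty(firstEmpty)
    cleaned = a(a < centers(firstEmpty));
end
```

In practice you would plot the histogram first and pick the cut by eye before automating anything.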

+10

I can think of two:

  • Sort the vector and remove the n largest and the n smallest elements.
  • Calculate the mean and standard deviation and discard all values outside the range mean +/- (n * standard deviation).

In both cases, n must be selected by the user.
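Both options can be sketched as follows, assuming n = 1 for the trimming and n = 2 for the standard-deviation range (the variable names are hypothetical). The trimming version uses the index output of sort so that the surviving elements keep their original order.

```matlab
a = [2 7 4 9 2 4 999];

% Option 1: drop the n smallest and n largest elements
nTrim = 1;                                 % assumed; chosen by the user
[~, order] = sort(a);                      % order(k) = position of the k-th smallest value
drop = order([1:nTrim, end-nTrim+1:end]);  % positions of the extremes at both ends
trimmed = a;
trimmed(drop) = [];

% Option 2: keep only values within mean +/- n standard deviations
nStd = 2;                                  % assumed; chosen by the user
inRange = a(abs(a - mean(a)) <= nStd*std(a));
```

Trimming always removes 2*n elements, so here it also discards a perfectly ordinary 2 along with the 999, whereas the range test removes only what falls outside the band.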

+2

Filter your signal.

 % choose the window length
 N = 10;
 filtered = filter(ones(1,N)/N, 1, signal);

Find the noise

 noise = signal - filtered; 

Remove noisy items

 THRESH = 50;
 signal = signal(abs(noise) < THRESH);

This is better than mean+-n*stddev because it looks for local changes, so it will not be fooled by a slowly varying signal, for example [1 2 3 ... 998 998].
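Here are the three steps put together on a made-up signal: a ramp from 1 to 1000 with a single spike of 300 at position 50. The signal, the spike and the comparison against a global mean +/- 2*std test are assumptions added for illustration. N and THRESH are the values from the answer.

```matlab
% Slowly varying signal with one local spike (assumed test data)
signal = 1:1000;
signal(50) = 300;

% Step 1: moving average over the current and previous N-1 samples
N = 10;
filtered = filter(ones(1,N)/N, 1, signal);

% Step 2: deviation of each sample from its local average
noise = signal - filtered;

% Step 3: keep only samples close to their local average
THRESH = 50;
cleaned = signal(abs(noise) < THRESH);

% For comparison: a global mean +/- 2*std test does not see the spike,
% because 300 is an ordinary value for the signal as a whole
globalKept = signal(abs(signal - mean(signal)) <= 2*std(signal));
```

Two caveats with this sketch. filter treats the samples before the start as zeros, so a signal that begins far from zero would show large "noise" in its first N-1 samples. A spike also raises the moving average for the following N-1 samples, so a large enough spike can push its neighbours over THRESH as well.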

+1

Source: https://habr.com/ru/post/1468853/

