Search for local highs and lows

I am looking for an efficient computationally efficient way to find local highs / lows for a large list of numbers in R. We hope that without for loops ...

For example, if I have a data file of type 1 2 3 2 1 1 2 1 , I want the function to return 3 and 7, which are the positions of local maxima.

+54
r
Jul 26 '11 at 20:47
source share
10 answers

diff(diff(x)) (or diff(x,differences=2) : thanks to @ZheyuanLi) essentially computes a discrete analog of the second derivative, so it should be negative at local maxima. Below +1 takes care that the diff result is shorter than the input vector.

edit : added @Tommy correction for cases when delta-x is not 1 ...

 tt <- c(1,2,3,2,1, 1, 2, 1) which(diff(sign(diff(tt)))==-2)+1 

My suggestion above ( http://statweb.stanford.edu/~tibs/PPC/Rdist/ ) is for when the data is noisier.

+49
Jul 26 2018-11-21T00:
source share

@Ben solution is pretty sweet. However, it does not handle the following cases:

 # all these return numeric(0): x <- c(1,2,9,9,2,1,1,5,5,1) # duplicated points at maxima which(diff(sign(diff(x)))==-2)+1 x <- c(2,2,9,9,2,1,1,5,5,1) # duplicated points at start which(diff(sign(diff(x)))==-2)+1 x <- c(3,2,9,9,2,1,1,5,5,1) # start is maxima which(diff(sign(diff(x)))==-2)+1 

Here's a more reliable (and slower, uglier) version:

 localMaxima <- function(x) { # Use -Inf instead if x is numeric (non-integer) y <- diff(c(-.Machine$integer.max, x)) > 0L rle(y)$lengths y <- cumsum(rle(y)$lengths) y <- y[seq.int(1L, length(y), 2L)] if (x[[1]] == x[[2]]) { y <- y[-1] } y } x <- c(1,2,9,9,2,1,1,5,5,1) localMaxima(x) # 3, 8 x <- c(2,2,9,9,2,1,1,5,5,1) localMaxima(x) # 3, 8 x <- c(3,2,9,9,2,1,1,5,5,1) localMaxima(x) # 1, 3, 8 
+36
Jul 26 2018-11-21T00:
source share

Use the zoo library function:

 x <- c(1, 2, 3, 2, 1, 1, 2, 1) library(zoo) xz <- as.zoo(x) rollapply(xz, 3, function(x) which.min(x)==2) # 2 3 4 5 6 7 #FALSE FALSE FALSE TRUE FALSE FALSE rollapply(xz, 3, function(x) which.max(x)==2) # 2 3 4 5 6 7 #FALSE TRUE FALSE FALSE FALSE TRUE 

Then pull the index using "coredata" for those values ​​where "which.max" is the "central value" signaling a local maximum. You could obviously do the same for local minima using which.min instead of which.max .

  rxz <- rollapply(xz, 3, function(x) which.max(x)==2) index(rxz)[coredata(rxz)] #[1] 3 7 

I assume that you do not need start or end values, but if you do, you can fill in the ends of your vectors before processing, more like telomeres on chromosomes.

(I mark the ppc package (Peak Probability Competitors for mass spectrometric analyzes, simply because I did not know about its availability until reading the @BenBolker comment above, and I think adding these few words will increase the chances that anyone then with massive interest will see this when searching.)

+20
July 42-26 2018-11-21T00:
source share

Today I hit. I know that you said no loops, but I'm stuck with using the apply function. A bit compact and fast and allows you to set a threshold specification so you can go more than 1.

Function:

 inflect <- function(x, threshold = 1){ up <- sapply(1:threshold, function(n) c(x[-(seq(n))], rep(NA, n))) down <- sapply(-1:-threshold, function(n) c(rep(NA,abs(n)), x[-seq(length(x), length(x) - abs(n) + 1)])) a <- cbind(x,up,down) list(minima = which(apply(a, 1, min) == a[,1]), maxima = which(apply(a, 1, max) == a[,1])) } 

To visualize it / play with thresholds, you can run the following code:

 # Pick a desired threshold # to plot up to n <- 2 # Generate Data randomwalk <- 100 + cumsum(rnorm(50, 0.2, 1)) # climbs upwards most of the time bottoms <- lapply(1:n, function(x) inflect(randomwalk, threshold = x)$minima) tops <- lapply(1:n, function(x) inflect(randomwalk, threshold = x)$maxima) # Color functions cf.1 <- grDevices::colorRampPalette(c("pink","red")) cf.2 <- grDevices::colorRampPalette(c("cyan","blue")) plot(randomwalk, type = 'l', main = "Minima & Maxima\nVariable Thresholds") for(i in 1:n){ points(bottoms[[i]], randomwalk[bottoms[[i]]], pch = 16, col = cf.1(n)[i], cex = i/1.5) } for(i in 1:n){ points(tops[[i]], randomwalk[tops[[i]]], pch = 16, col = cf.2(n)[i], cex = i/1.5) } legend("topleft", legend = c("Minima",1:n,"Maxima",1:n), pch = rep(c(NA, rep(16,n)), 2), col = c(1, cf.1(n),1, cf.2(n)), pt.cex = c(rep(c(1, c(1:n) / 1.5), 2)), cex = .75, ncol = 2) 

enter image description here

+10
Mar 28 '17 at 5:36
source share

There are some good solutions, but it depends on what you need.

Just diff(tt) returns the differences.

You want to determine when you go from increasing values ​​to decreasing values. One way to do this is @Ben:

  diff(sign(diff(tt)))==-2 

The problem is that it will only detect changes that go immediately from strictly increasing to a strict decrease.

A small change will allow you to repeat the values ​​at the peak (return TRUE for the last peak value):

  diff(diff(x)>=0)<0 

Then you just need to place the front and back parts correctly if you want to detect highs at the beginning or end

Everything here is wrapped in a function (including valley search):

  which.peaks <- function(x,partial=TRUE,decreasing=FALSE){ if (decreasing){ if (partial){ which(diff(c(FALSE,diff(x)>0,TRUE))>0) }else { which(diff(diff(x)>0)>0)+1 } }else { if (partial){ which(diff(c(TRUE,diff(x)>=0,FALSE))<0) }else { which(diff(diff(x)>=0)<0)+1 } } } 
+8
Jun 21 '13 at 14:44
source share

The answer to this question @ 42- is excellent, but I had a use case where I did not want to use zoo . This is easy to implement with dplyr using lag and lead :

 library(dplyr) test = data_frame(x = sample(1:10, 20, replace = TRUE)) mutate(test, local.minima = if_else(lag(x) > x & lead(x) > x, TRUE, FALSE) 

Like the rollapply solution, you can control window sizes and edges through the arguments lag / lead n and default respectively.

+3
Jul 24 '17 at 23:58
source share

Here's a solution for the lows :

@Ben solution

 x <- c(1,2,3,2,1,2,1) which(diff(sign(diff(x)))==+2)+1 # 5 

Please look at things in the Tommy post!

Decision

@Tommy:

 localMinima <- function(x) { # Use -Inf instead if x is numeric (non-integer) y <- diff(c(.Machine$integer.max, x)) > 0L rle(y)$lengths y <- cumsum(rle(y)$lengths) y <- y[seq.int(1L, length(y), 2L)] if (x[[1]] == x[[2]]) { y <- y[-1] } y } x <- c(1,2,9,9,2,1,1,5,5,1) localMinima(x) # 1, 7, 10 x <- c(2,2,9,9,2,1,1,5,5,1) localMinima(x) # 7, 10 x <- c(3,2,9,9,2,1,1,5,5,1) localMinima(x) # 2, 7, 10 

Please note: neither localMaxima nor localMinima can handle duplicate highs / lows at startup!

+2
Sep 25 '14 at 10:16
source share

I had problems working in previous solutions, and I tried to get maximum lows and highs. The code below will do this and close it, marking the minimum green and maximums in red. Unlike the function which.max() , it will extract all the minimum / maximum indices from the data frame. A null value is added to the first diff() function to take into account the missing reduced length of the result that occurs whenever you use this function. The insertion of this into the innermost diff() function call is preserved because of the need to add an offset outside the logical expression. It doesn't really matter, but I think this is a cleaner way to do it.

 # create example data called stockData stockData = data.frame(x = 1:30, y=rnorm(30,7)) # get the location of the minima/maxima. note the added zero offsets # the location to get the correct indices min_indexes = which(diff( sign(diff( c(0,stockData$y)))) == 2) max_indexes = which(diff( sign(diff( c(0,stockData$y)))) == -2) # get the actual values where the minima/maxima are located min_locs = stockData[min_indexes,] max_locs = stockData[max_indexes,] # plot the data and mark minima with red and maxima with green plot(stockData$y, type="l") points( min_locs, col="red", pch=19, cex=1 ) points( max_locs, col="green", pch=19, cex=1 ) 
+2
Mar 17 '16 at 23:33
source share

In pracma package use

 tt <- c(1,2,3,2,1, 1, 2, 1) tt_peaks <- findpeaks(tt, zero = "0", peakpat = NULL, minpeakheight = -Inf, minpeakdistance = 1, threshold = 0, npeaks = 0, sortstr = FALSE) [,1] [,2] [,3] [,4] [1,] 3 3 1 5 [2,] 2 7 6 8 

This returns a 4-column matrix. The first column shows the absolute values ​​of the local peaks. The second column is the indices. The third and fourth columns are the beginning and end of the peaks (with potential overlap).

See https://www.rdocumentation.org/packages/pracma/versions/1.9.9/topics/findpeaks for details.

One caveat: I used it in a series of non-integer numbers, and the peak was one index too late (for all peaks), and I don't know why. So I had to manually remove β€œ1” from my index vector (it doesn't matter).

0
Apr 13 '18 at 23:59 on
source share

I posted this elsewhere, but I think this is an interesting way to do this. I'm not sure what its computational efficiency is, but it is a very concise way to solve the problem.

 vals=rbinom(1000,20,0.5) text=paste0(substr(format(diff(vals),scientific=TRUE),1,1),collapse="") sort(na.omit(c(gregexpr('[ ]-',text)[[1]]+1,ifelse(grepl('^-',text),1,NA), ifelse(grepl('[^-]$',text),length(vals),NA)))) 
-one
Oct. 14 '13 at 3:56
source share



All Articles