How to combine stat_ecdf with geom_ribbon?

I am trying to draw an ECDF of some data with a "confidence interval" represented by a shaded area using ggplot2. I am having problems combining geom_ribbon() with stat_ecdf() to achieve the effect I am after.

Consider the following example data:

    set.seed(1)
    dat <- data.frame(variable = rlnorm(100) + 2)
    dat <- transform(dat, lower = variable - 2, upper = variable + 2)

    > head(dat)
      variable     lower    upper
    1 2.534484 0.5344838 4.534484
    2 3.201587 1.2015872 5.201587
    3 2.433602 0.4336018 4.433602
    4 6.929713 4.9297132 8.929713
    5 3.390284 1.3902836 5.390284
    6 2.440225 0.4402254 4.440225

I can draw the ECDF using

    library("ggplot2")
    ggplot(dat, aes(x = variable)) +
      geom_step(stat = "ecdf")

However, I cannot use lower and upper as the ymin and ymax aesthetics of geom_ribbon() to overlay a confidence interval on the plot as another layer. I tried:

    ggplot(dat, aes(x = variable)) +
      geom_ribbon(aes(ymin = lower, ymax = upper), stat = "ecdf") +
      geom_step(stat = "ecdf")

but this causes the following error:

 Error: geom_ribbon requires the following missing aesthetics: ymin, ymax 

Is there a way to coax geom_ribbon() to work with stat_ecdf() to create a shaded confidence interval? Or, can anyone suggest an alternative means of adding a shaded polygon defined by lower and upper as a layer for the ECDF graph?

2 answers

Try this (a bit of a shot in the dark):

    ggplot(dat, aes(x = variable)) +
      geom_ribbon(aes(x = variable, ymin = ..y.. - 2, ymax = ..y.. + 2),
                  stat = "ecdf", alpha = 0.2) +
      geom_step(stat = "ecdf")

Okay, so this is not the same as what you are trying to do, but it should explain what is happening. The stat returns a data frame containing only the original x and the computed y, so I think that is all you have to work with; i.e., stat_ecdf only calculates the cumulative distribution function for one x at a time.
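
To see what the stat has to work with, the same x/y pairs can be reproduced with base R's ecdf(). This is only a minimal sketch: the names Fn and ecdf_df are illustrative, and it shows the idea rather than ggplot2's internal code.

```r
set.seed(1)
variable <- rlnorm(100) + 2

# stats::ecdf() returns a step function; evaluating it at the sorted data
# gives one cumulative proportion (y) per observed value (x).
Fn <- ecdf(variable)
ecdf_df <- data.frame(x = sort(variable), y = Fn(sort(variable)))

# Only x and y are available here; there is no lower/upper column,
# which is why ymin and ymax cannot be mapped from the original data.
head(ecdf_df)
```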

The only thing I can think of is the obvious one: calculating the lower and upper curves separately, something like this:

    # ECDFs of the lower, upper and original series
    l <- ecdf(dat$lower)
    u <- ecdf(dat$upper)
    v <- ecdf(dat$variable)

    # Evaluate each ECDF at the observed values
    dat$lower1 <- l(dat$variable)
    dat$upper1 <- u(dat$variable)
    dat$variable1 <- v(dat$variable)

    ggplot(dat, aes(x = variable)) +
      geom_step(aes(y = variable1)) +
      geom_ribbon(aes(ymin = upper1, ymax = lower1), alpha = 0.2)

I'm not sure exactly how you want to represent the CI, but ggplot_build() lets you extract the computed data from the plot, which you can then rework as you like.

This diagram shows:

  • red = the original ribbon
  • blue = takes the original CI vectors and applies them to the ECDF curve
  • green = computes the ECDFs of the upper and lower series and plots them

[Figure: ECDF plot with the three ribbons listed above]

    g <- ggplot(dat, aes(x = variable)) +
      geom_step(stat = "ecdf") +
      geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.5, fill = "red")

    # Extract the computed ECDF and match it back to the original CI columns
    inside <- ggplot_build(g)
    matched <- merge(inside$data[[1]],
                     data.frame(x = dat$variable, dat$lower, dat$upper),
                     by = "x")

    g +
      geom_ribbon(data = matched,
                  aes(x = x, ymin = y + dat.upper - x, ymax = y - x + dat.lower),
                  alpha = 0.5, fill = "blue") +
      geom_ribbon(data = matched,
                  aes(x = x, ymin = ecdf(dat.lower)(x), ymax = ecdf(dat.upper)(x)),
                  alpha = 0.5, fill = "green")
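
As a smaller illustration of the ggplot_build() step used above, the sketch below pulls out just the computed ECDF layer and inspects it. The names p, built and ecdf_layer are illustrative, and the filtering step assumes stat_ecdf's default padding, which adds rows at x = -Inf and x = Inf.

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(variable = rlnorm(100) + 2)

p <- ggplot(dat, aes(x = variable)) +
  geom_step(stat = "ecdf")

# ggplot_build() runs the stats; $data holds one data frame per layer
built <- ggplot_build(p)
ecdf_layer <- built$data[[1]]

# Keep only the finite x values (drops the padding rows, if present)
ecdf_layer <- ecdf_layer[is.finite(ecdf_layer$x), c("x", "y")]
head(ecdf_layer)
```

The resulting data frame can be merged with any precomputed bounds and passed to geom_ribbon() via its data argument, as the answer's code does.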

Source: https://habr.com/ru/post/959019/

