How to extend ggplot2 boxplot with ggproto?

I often use boxplots in my work and love the aesthetics of ggplot2. But the standard geom_boxplot lacks two things that are important to me: end caps on the whiskers and median labels. Based on the information here, I wrote a function:

    gBoxplot <- function(formula = NULL, data = NULL, font = "CMU Serif", fsize = 18) {
      require(ggplot2)
      vars <- all.vars(formula)
      response <- vars[1]
      factor <- vars[2]
      # A function for labelling medians
      fun_med <- function(x) {
        return(data.frame(y = median(x), label = round(median(x), 3)))
      }
      p <- ggplot(data, aes_string(x = factor, y = response)) +
        stat_boxplot(geom = "errorbar", width = 0.6) +
        geom_boxplot() +
        stat_summary(fun.data = fun_med, geom = "label", family = font,
                     size = fsize/3, vjust = -0.1) +
        theme_grey(base_size = fsize, base_family = font)
      return(p)
    }

There are also font settings, but this is only because I'm too lazy to create a theme. Here is an example:

 gBoxplot(hwy ~ class, mpg) 

[Image: the resulting plot]

This works for me, but it has some limitations (you cannot use automatic dodging, etc.), and it would be better to create a new geom based on geom_boxplot. I read the "Extending ggplot2" vignette, but cannot figure out how to implement it. Any help would be appreciated.

1 answer

I thought about this for a while. Basically, when you create a new primitive, you usually write a combination of:

  • A layer function
  • A stat-ggproto
  • A geom-ggproto

Only the layer function needs to be exposed to the user. You need to write a stat-ggproto if you need some new way of transforming your data to make your primitive. And you need to write a geom-ggproto if you have new grid graphics to draw.
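To make the split between the pieces concrete, here is a minimal sketch of a stat-ggproto plus its layer function. It is not needed for the solution in this answer; the names StatMedianLabel, stat_median_label and the digits parameter are hypothetical and chosen only for illustration. It assumes a vertical boxplot (discrete x, continuous y) and reuses the existing label geom, so no geom-ggproto is written.

```r
library(ggplot2)

# Hypothetical stat: reduces each group to one row holding its median
# and a rounded text label for it.
StatMedianLabel <- ggproto("StatMedianLabel", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales, digits = 3) {
    med <- median(data$y)
    # Assumes x is constant within a group (one box per group).
    data.frame(x = data$x[1], y = med, label = round(med, digits))
  }
)

# The layer function is the only part a user calls directly.
stat_median_label <- function(mapping = NULL, data = NULL, geom = "label",
                              position = "identity", ..., digits = 3,
                              na.rm = FALSE, show.legend = NA,
                              inherit.aes = TRUE) {
  layer(
    stat = StatMedianLabel, geom = geom, data = data, mapping = mapping,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(digits = digits, na.rm = na.rm, ...)
  )
}

p <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot() +
  stat_median_label(vjust = -0.1)
```

The stat only transforms data; all drawing is delegated to the label geom, which is why a geom-ggproto is unnecessary here.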

In this case, since we are mostly composing layer functions that already exist, we don't really need to write new ggprotos. It is enough to write a new layer function. This layer function will create the three layers that you are already using and map the parameters as you intend. In this case:

  • Layer 1 - uses geom_errorbar and stat_boxplot - to get our error bars
  • Layer 2 - uses geom_boxplot and stat_boxplot - to create the boxplots
  • Layer 3 - uses geom_label and stat_summary - to create text labels with the median value in the center of the boxes

Of course, you could write a new stat-ggproto and a new geom-ggproto that do it all at once. Or you could combine stat_summary and stat_boxplot into one stat, and the three geoms into one geom-ggproto, so that everything is done in a single layer. But there is little point unless we run into efficiency problems.

Anyway, here is the code:

    geom_myboxplot <- function(formula = NULL, data = NULL,
                               stat = "boxplot", position = "dodge", coef = 1.5,
                               font = "sans", fsize = 18, width = 0.6,
                               fun.data = NULL, fun.y = NULL,
                               fun.ymax = NULL, fun.ymin = NULL, fun.args = list(),
                               outlier.colour = NULL, outlier.color = NULL,
                               outlier.shape = 19, outlier.size = 1.5,
                               outlier.stroke = 0.5, notch = FALSE,
                               notchwidth = 0.5, varwidth = FALSE, na.rm = FALSE,
                               show.legend = NA, inherit.aes = TRUE, ...) {
      # Build the x/y mapping from the formula: response ~ factor
      vars <- all.vars(formula)
      response <- vars[1]
      factor <- vars[2]
      mymap <- aes_string(x = factor, y = response)

      # Median position and label for each box
      fun_med <- function(x) {
        return(data.frame(y = median(x), label = round(median(x), 3)))
      }

      # The same dodge is shared by all three layers
      position <- position_dodge(width)

      # Layer 1: error bars (whiskers with end caps)
      l1 <- layer(data = data, mapping = mymap, stat = StatBoxplot,
                  geom = "errorbar", position = position,
                  show.legend = show.legend, inherit.aes = inherit.aes,
                  params = list(na.rm = na.rm, coef = coef, width = width, ...))

      # Layer 2: the boxplots themselves
      l2 <- layer(data = data, mapping = mymap, stat = stat,
                  geom = GeomBoxplot, position = position,
                  show.legend = show.legend, inherit.aes = inherit.aes,
                  params = list(outlier.colour = outlier.colour,
                                outlier.shape = outlier.shape,
                                outlier.size = outlier.size,
                                outlier.stroke = outlier.stroke,
                                notch = notch, notchwidth = notchwidth,
                                varwidth = varwidth, na.rm = na.rm, ...))

      # Layer 3: median labels
      l3 <- layer(data = data, mapping = mymap, stat = StatSummary,
                  geom = "label", position = position,
                  show.legend = show.legend, inherit.aes = inherit.aes,
                  params = list(fun.data = fun_med, fun.y = fun.y,
                                fun.ymax = fun.ymax, fun.ymin = fun.ymin,
                                fun.args = fun.args, na.rm = na.rm,
                                family = font, size = fsize/3, vjust = -0.1, ...))

      return(list(l1, l2, l3))
    }

This lets you create your custom boxplots like this:

    ggplot(mpg) +
      geom_myboxplot(hwy ~ class, font = "sans", fsize = 18) +
      theme_grey(base_family = "sans", base_size = 18)

And they look like this:

[Image: the resulting plot]

Note: we didn't actually have to use the layer function; we could have used the original stat_boxplot, geom_boxplot and stat_summary instead. But we would still have to pass through all the parameters if we wanted to control them from our custom boxplot, so I think this version is clearer, at least in terms of structure rather than functionality. Maybe not; it's a matter of taste.
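For comparison, a minimal sketch of that alternative: a function that returns a list of the three existing layers instead of calling layer() directly. The name geom_myboxplot2, the digits argument, and the dodge width of 0.75 (chosen to match the default box width) are assumptions made for this illustration. Only a few parameters are passed through, and the aesthetics come from an ordinary aes() mapping rather than a formula.

```r
library(ggplot2)

# Hypothetical helper: composes three existing layers into one call.
geom_myboxplot2 <- function(mapping = NULL, data = NULL, width = 0.6,
                            digits = 3, ...) {
  # Median position and rounded label for each box
  median_label <- function(x) {
    data.frame(y = median(x), label = round(median(x), digits))
  }
  # One shared dodge so all three layers stay aligned (assumed width)
  dodge <- position_dodge(width = 0.75)
  list(
    # Whiskers with end caps
    stat_boxplot(mapping = mapping, data = data, geom = "errorbar",
                 width = width, position = dodge),
    # The boxes; extra arguments in ... go to geom_boxplot only
    geom_boxplot(mapping = mapping, data = data, position = dodge, ...),
    # Median labels
    stat_summary(mapping = mapping, data = data, fun.data = median_label,
                 geom = "label", position = dodge, vjust = -0.1)
  )
}

p <- ggplot(mpg, aes(x = class, y = hwy)) + geom_myboxplot2()
```

Adding a list of layers to a plot with + adds each layer in turn, which is why the function can simply return list(...).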

Also, I do not have the CMU Serif font, which looks much nicer, but I did not want to track it down and install it.


Source: https://habr.com/ru/post/1240918/

