Using ggplot2: Create faceted scatterplots with scaled and moved density

I would like to plot some data as a scatter plot using facet_wrap, overlaying additional information such as a linear regression and a density curve. I managed to do all this, but the density values are on a very different scale from my points, which is expected, since the points lie far outside the density's range. I would like to scale and shift the density curve so that it is clearly visible; I don't care about its actual values, only about its shape.

Here is an exaggerated minimum working example of what I have:

    library(ggplot2)
    library(reshape2)  # provides melt()

    set.seed(48151623)
    mydf <- data.frame(x1=rnorm(mean=5,n=100),
                       x2=rnorm(n=100,mean=10),
                       x3=rnorm(n=100,mean=20,sd=3))
    mydf$var <- mydf$x1 + mydf$x2 * mydf$x3
    # reshape to long format: one row per (var, variable, value)
    mydf.wide <- melt(mydf,id.vars='var',measure.vars=c(1:3))

    ggplot(data=mydf.wide,aes(x=value,y=var)) +
      geom_point(colour='red') +
      geom_smooth(method='lm') +
      stat_density(aes(x=value,y=..scaled..),position='identity',geom='line') +
      facet_wrap(~variable,scales='free_x')

The result is:

[figure: example of current plot]

What I would like is something like this ugly hack:

 stat_density(aes(x=value,y=..scaled..*100+200),position='identity',geom='line') 

Ideally, I would use y=..scaled..*diff(range(value)) + min(value), but when I do this, I get an error saying that 'value' was not found. I suspect the problem is the faceting, but I would rather keep my facets.

How can I scale and move the density curve in this case?

[figure: cool result but ugly hack]

+4
3 answers

I appreciate everyone's answers, which helped me better understand the underlying mechanics of ggplot. I also realize how awkward my request is; ggplot on its own will not solve my problem. I managed to get what I wanted not by using ggplot's stat_density, but by computing the densities directly in a separate data frame:

    library(ggplot2)
    library(reshape2)  # provides melt()

    set.seed(48151623)
    mydf <- data.frame(x1=rnorm(mean=5,n=100),
                       x2=rnorm(n=100,mean=10),
                       x3=rnorm(n=100,mean=20,sd=3))
    mydf$var <- mydf$x1 + mydf$x2 * mydf$x3
    mydf.wide <- melt(mydf,id.vars='var',measure.vars=c(1:3))

    # one density curve per facet, rescaled to span the range of var
    mydf.densities <- do.call('rbind',lapply(unique(mydf.wide$variable), function(var) {
      tmp <- mydf.wide[which(mydf.wide$variable==var),c('var','value')]
      dfit <- density(tmp$value,cut=0)
      scaledy <- dfit$y/max(dfit$y) * diff(range(tmp$var)) + min(tmp$var)
      data.frame(x=dfit$x,y=scaledy,variable=rep(var,length(dfit$x)))
    }))

    ggplot(data=mydf.wide,aes(x=value,y=var)) +
      geom_point(colour='red') +
      geom_smooth(method='lm') +
      geom_line(aes(x=x,y=y),data=mydf.densities) +
      facet_wrap(~variable,scales='free_x')

(I know that the construction of mydf.densities is a bit confusing, but I will work on that later.)
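One possible way to tidy up that construction is to move the rescaling into a small helper. The sketch below is only an illustration: rescale_density is a hypothetical name (it is not part of ggplot2 or of the code above), and it builds the curves straight from the columns of mydf using base R only, so no reshaping is needed for this step.

```r
# Hypothetical helper: kernel density of `x`, rescaled so the curve
# spans the range of `y`. Only the shape of the density is preserved.
rescale_density <- function(x, y, ...) {
  dfit <- density(x, ...)
  data.frame(
    x = dfit$x,
    y = dfit$y / max(dfit$y) * diff(range(y)) + min(y)
  )
}

set.seed(48151623)
mydf <- data.frame(x1 = rnorm(mean = 5, n = 100),
                   x2 = rnorm(n = 100, mean = 10),
                   x3 = rnorm(n = 100, mean = 20, sd = 3))
mydf$var <- mydf$x1 + mydf$x2 * mydf$x3

# One rescaled curve per column, stacked into long format.
# `variable` uses the same labels as the facets (x1, x2, x3).
cols <- c("x1", "x2", "x3")
mydf.densities <- do.call(rbind, lapply(cols, function(nm) {
  cbind(rescale_density(mydf[[nm]], mydf$var, cut = 0),
        variable = factor(nm, levels = cols))
}))
```

The resulting data frame has the same x, y and variable columns as before, so it should be usable in the geom_line(aes(x=x,y=y),data=mydf.densities) layer unchanged.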

I will award the bounty to the highest-voted answer at the end of the day, for your trouble.

[figure: the plot I wanted to make]

0

I suggest making two plots and combining them with grid.arrange:

    library(gridExtra)

    # uses mydf.wide from the question
    # top panel: scatter plot and regression line, x axis hidden
    p1 <- ggplot(data=mydf.wide,aes(x=value,y=var)) +
      geom_point(colour='red') +
      geom_smooth(method='lm') +
      facet_wrap(~variable,scale='free_x') +
      theme(axis.title.x=element_blank(),
            axis.text.x=element_blank(),
            axis.ticks.x=element_blank(),
            plot.margin = unit(c(1, 1, 0, 0.5), "lines"))

    # bottom panel: scaled density curves, facet strips hidden
    p2 <- ggplot(data=mydf.wide,aes(x=value,y=var)) +
      stat_density(aes(x=value,y=..scaled..),position='identity',geom='line') +
      facet_wrap(~variable,scale='free_x') +
      theme(strip.background=element_blank(),
            strip.text=element_blank(),
            plot.margin = unit(c(-1, 1, 0.5, 0.35), "lines"))

    grid.arrange(p1, p2, heights = c(2,1))


+6

I'm not sure if this answers your question completely, but it was too long for a comment, so here goes. Regarding the second piece of code in your question: since you have already defined x=value, you can use x instead of value in the definition of y.

 stat_density(aes(x=value,y=..scaled..*diff(range(x)) + min(x)),position='identity',geom='line') 

This seems to correct your error and create the following graph:

[figure: faceted scatterplot with density curves on the same y-axis]

The only problem, of course, is that if your data have low y values, the density curves will still overlap the scatter plot. If they don't, I personally think this is a rather informative figure, as long as you can communicate clearly that the y-axis values are irrelevant when interpreting the density curves; only the shape of the curves matters.
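For readers on a recent ggplot2 (3.3.0 or later), the ..scaled.. notation has been superseded by after_stat(). The following is a minimal sketch of the same layer in that notation, assuming ggplot2 is installed; the long-format data frame is rebuilt here with base R instead of melt() purely to keep the example self-contained. Note that x inside after_stat() refers to the x values computed by the density stat, not to the original value column.

```r
library(ggplot2)

set.seed(48151623)
mydf <- data.frame(x1 = rnorm(mean = 5, n = 100),
                   x2 = rnorm(n = 100, mean = 10),
                   x3 = rnorm(n = 100, mean = 20, sd = 3))
mydf$var <- mydf$x1 + mydf$x2 * mydf$x3

# Long format built by hand (stand-in for reshape2::melt)
mydf.wide <- data.frame(
  var      = rep(mydf$var, times = 3),
  variable = factor(rep(c("x1", "x2", "x3"), each = 100)),
  value    = c(mydf$x1, mydf$x2, mydf$x3)
)

p <- ggplot(mydf.wide, aes(x = value, y = var)) +
  geom_point(colour = "red") +
  geom_smooth(method = "lm") +
  # `scaled` and `x` are columns produced by the density stat
  stat_density(aes(y = after_stat(scaled * diff(range(x)) + min(x))),
               position = "identity", geom = "line") +
  facet_wrap(~variable, scales = "free_x")
```

Printing p draws the figure; the caveat about overlap with low y values applies here in the same way.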

+2

Source: https://habr.com/ru/post/1491025/

