How to specify the color of lines and points in ecdf ggplot2

Question

I have a dataset that is hard to visualize, but I think an ECDF with a few points and lines added to it will do the trick. I can build the plot the way I want; my problem is coloring things correctly.

I have the following code that puts all the correct lines and points on a chart, but now I would like to color and mark everything correctly. I looked through several articles and tried a hundred things, but I can’t get it right. Do I need to format my data differently?

My vision of a legend looks something like this:

dashed line = b
solid line = a
red = s
blue = d
dot = s.mean

Code to generate an example:

    require(ggplot2)
    require(reshape2)

    # Stochastic sets: 100 draws each
    sa = rnorm(100)*100
    sb = rnorm(100)*100+50
    # Deterministic sets: a single value each
    da = -35
    db = 20

    sdata = data.frame(cbind(sa,sb))
    ddata = data.frame(cbind(da,db))
    # Reshape to long format (columns: variable, value)
    sdata.m = melt(sdata)
    ddata.m = melt(ddata)

    ggplot(sdata.m, aes(x=value, color=variable)) +
      geom_vline(data=ddata.m, aes(xintercept = value, color=variable),
                 linetype = 2, size=2) +
      stat_ecdf(size=1) +
      labs(title = 'plotTitle', color='colorLegendTitle') +
      xlab('xLabel') +
      ylab('yLabel') +
      theme_bw(30) +
      theme(legend.position=c(.8, .2),
            legend.box="horizontal",
            text=element_text(family="Times"),
            legend.key.size = unit(1,"cm")) +
      # Means of the stochastic sets, drawn as points at y = 0.5
      geom_point(x=mean(sdata.m$value[sdata.m$variable=="sa"]), y=.5, size = 5) +
      geom_point(x=mean(sdata.m$value[sdata.m$variable=="sb"]), y=.5, size = 5)

Some context for the data I am plotting: I have stochastic data sets (s) and deterministic sets (d); each stochastic set has hundreds of values, and each deterministic set has only one value. Therefore, in my plot, I compare the distribution of the stochastic data (solid lines) and the mean of the stochastic data (points) with the deterministic values (dashed lines). For both the stochastic and the deterministic data sets, there are two "cases", (a) and (b). I would like all of the (a) data to share one color, and all of the (b) data to share another.

It seems like this should be easy with the aes and color / linetype / geom mappings, but I can't figure it out.

Thanks in advance.

+4

r ggplot2 ecdf

Ryanstochastic Jun 10 '13 at 21:57


2 answers

Didzis gets the credit for the answer; I was able to adapt my code and get to the final product that I was looking for:

    ggplot(sdata.m, aes(x=value, color=variable, linetype=variable, shape=variable)) +
      stat_ecdf(size=1) +
      geom_vline(data=ddata.m,
                 aes(xintercept = value, color=variable, linetype=variable, shape=variable),
                 size=2) +
      geom_point(aes(x=mean(sdata.m$value[sdata.m$variable=="sa"]),
                     color="samean", linetype="samean", shape="samean", y=.5),
                 size = 5) +
      geom_point(aes(x=mean(sdata.m$value[sdata.m$variable=="sb"]),
                     color="sbmean", linetype="sbmean", shape="sbmean", y=.5),
                 size = 5) +
      scale_shape_manual(breaks=c("da","db","sa","samean","sb","sbmean"),
                         values=c(16,16,16,16,16,16)) +
      scale_color_manual(breaks=c("da","db","sa","samean","sb","sbmean"),
                         values=c("blue","red","blue","blue","red","red")) +
      scale_linetype_manual(breaks=c("da","db","sa","samean","sb","sbmean"),
                            values=c(2,2,1,0,1,0)) +
      guides(color=guide_legend(override.aes=list(shape=c(NA,NA,NA,16,NA,16))))

A few things I learned:

When adding breaks / values to a manual scale (scale_*_manual), it is important to list them in alphabetical order.
When all of the aesthetics (linetype / shape / color) are mapped to the same variable, "variable", everything ends up in a single legend.
When overriding things with manual scales, you need to define one manual scale per aesthetic, and then adjust the legend with guides() if necessary.
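
On the first point: whether unnamed values follow the alphabetical order of the levels or the order of the breaks can depend on the ggplot2 version, so one way to avoid relying on ordering at all is to pass a named vector to values. The following is only a minimal sketch; the data frame `toy` and the vectors `cols` and `ltys` are hypothetical names made up for illustration, not part of the code above.

```r
library(ggplot2)

# Hypothetical toy data using the same level names as in the question
set.seed(1)
toy <- data.frame(
  value    = c(rnorm(50) * 100, rnorm(50) * 100 + 50),
  variable = rep(c("sa", "sb"), each = 50)
)

# Named vectors, deliberately listed out of alphabetical order:
# each value is matched to its level by name, not by position
cols <- c(sb = "red",    sa = "blue")
ltys <- c(sb = "dashed", sa = "solid")

p <- ggplot(toy, aes(x = value, color = variable, linetype = variable)) +
  stat_ecdf() +
  scale_color_manual(values = cols) +
  scale_linetype_manual(values = ltys)

print(p)
```

With named values, "sa" should come out blue and solid even though it is listed second, which makes the scale definitions less fragile when levels are added or renamed.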

Thanks again Didzis. Another life saved.

+3

Ryanstochastic Jun 12 '13 at 15:22


Didzis elferts · Accepted Answer · 2013-06-11T05:55:47+0000

The best way to get the legend is to put color=variable and linetype=variable inside aes() for both ggplot() and geom_vline(); that way there will be a single legend. Then, for geom_point(), put x and y inside aes(), along with color="s.mean" and linetype="s.mean". This ensures that a new level is added to the legend. Now, using scale_color_manual() and scale_linetype_manual(), you can set the desired colors and line types. Using guides() with override.aes=, you can remove the points from the first four legend entries.

    ggplot(sdata.m, aes(x=value, color=variable, linetype=variable)) +
      stat_ecdf(size=1) +
      geom_vline(data=ddata.m,
                 aes(xintercept = value, color=variable, linetype=variable),
                 size=2) +
      geom_point(aes(x=mean(sdata.m$value[sdata.m$variable=="sa"]),
                     color="s.mean", linetype="s.mean", y=.5),
                 size = 5) +
      geom_point(aes(x=mean(sdata.m$value[sdata.m$variable=="sb"]),
                     color="s.mean", linetype="s.mean", y=.5),
                 size = 5) +
      scale_color_manual(breaks=c("da","db","sa","sb","s.mean"),
                         values=c("blue","blue","red","red","green")) +
      scale_linetype_manual(breaks=c("da","db","sa","sb","s.mean"),
                            values=c(1,2,1,2,0)) +
      guides(color=guide_legend(override.aes=list(shape=c(NA,NA,NA,NA,16))))
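
As for the question of whether the data needs a different format: one possible alternative to computing each mean inline is to precompute the means into a small data frame and hand it to geom_point() through its data argument. This is a sketch of that idea, not the approach used in the answer above; `smeans` is a hypothetical helper name, and `sdata.m` is rebuilt here without reshape2 only so the block is self-contained.

```r
library(ggplot2)

set.seed(42)
# Long-format data mirroring sdata.m from the question (assumed layout)
sdata.m <- data.frame(
  variable = rep(c("sa", "sb"), each = 100),
  value    = c(rnorm(100) * 100, rnorm(100) * 100 + 50)
)

# One row per stochastic set: its mean, to be drawn at y = 0.5
smeans <- aggregate(value ~ variable, data = sdata.m, FUN = mean)
smeans$y <- 0.5

p <- ggplot(sdata.m, aes(x = value, color = variable)) +
  stat_ecdf() +
  # x and color are inherited from the plot-level aes(); only y is added
  geom_point(data = smeans, aes(y = y), size = 5)

print(p)
```

Here each point takes the color of its own set rather than getting a separate "s.mean" legend level, so this variant fits best when the mean should match its ECDF line.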
