Stacked and dodged geoms in the same figure?

Question

I have the following graph, which is essentially two histograms of distributions plotted alongside each other:

    my.barplot <- function( df, title="", ... ) {
      df.count <- aggregate( df$outcome, by=list(df$category1,df$outcome), FUN=length )
      colnames( df.count ) <- c("category1","outcome","n")
      df.total <- aggregate( df.count$n, by=list(df.count$category1), FUN=sum )
      colnames( df.total ) <- c("category1","total")
      df.dens <- merge(df.count, df.total)
      df.dens$dens <- with( df.dens, n/total )
      p <- ggplot( df.dens, aes( x=outcome, fill=category1 ), ... )
      p <- p + geom_bar( aes( y=dens ), position="dodge" )
      p <- p + opts( axis.text.x=theme_text(angle=-90,hjust=0), title=title )
      p
    }

    N <- 50*(2*8*2)
    outcome <- sample(ordered(seq(8)),N,replace=TRUE,prob=c(seq(4)/20,rev(seq(4)/20)) )
    category2 <- ifelse( outcome==1,
                         sample(c("yes","not"), prob=c(.95,.05)),
                         sample(c("yes","not"), prob=c(.35,.65)) )
    dat <- data.frame( category1=rep(c("in","out"),each=N/2), category2=category2, outcome=outcome )
    my.barplot(dat)

[Image: existing bar chart]

I would like to show, within each bar, the share that belongs to a second category. If I did not also need to group by the first category, I would just stack the bars. However, I cannot figure out how to add the second category. Basically, within each outcome-by-category1 bar, I want the proportion in category2 to be shaded darker.

Here is a mock-up, made in GIMP, of what I'm trying to create:

[Image: bar chart with stacked proportions of category2]

+6

r ggplot2

Ari B. Friedman Feb 29 '12 at 3:23

3 answers

I like the comment by @MattP; I would add that an alternative to alpha() is to specify the transparency directly in the color code. For example, #FF0000 is solid red, and #FF000033 is a pale, partially transparent red. As always, a search at http://addictedtor.free.fr/graphiques/ can help you find the code to create the exact graph style that you are after.
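To make the hex notation concrete, here is a small base-R sketch (the color is just the example from this answer; nothing here is specific to ggplot2). The optional fourth pair of hex digits is the alpha channel, so `33` (51 out of 255) means roughly 20% opacity.

```r
# Build the partially transparent red from numeric channels (grDevices, base R).
solid <- "#FF0000"
pale  <- rgb(1, 0, 0, alpha = 0.2)   # 0.2 * 255 = 51, which is 33 in hex

# Read the channels back out of an 8-digit hex string.
channels <- col2rgb("#FF000033", alpha = TRUE)

print(pale)
print(channels)
```

Either string can be used wherever a color is accepted, for example as one of the `values` passed to `scale_fill_manual()`.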

+1

Carl Witthoft Feb 29 '12 at 12:29

Well, I gave it a shot, but I didn't make a ton of progress, other than getting the relevant densities into the same data.frame:

    my.barplot <- function( df, title="", legend.title="",... ) {
      df.count12 <- aggregate( df$outcome, by=list(df$category1,df$category2,df$outcome), FUN=length )
      colnames( df.count12 ) <- c("category1","category2","outcome","n")
      df.total <- aggregate( df.count12$n, by=list(df.count12$category1), FUN=sum )
      colnames( df.total ) <- c("category1","total")

      # Densities within a bar - Categories 1 & 2
      df.dens12 <- merge(df.count12, df.total)
      df.dens12$dens12 <- with( df.dens12, n/total )

      # Total bar height - Category 1 density
      df.count1 <- aggregate( df.dens12$n, by=list(df.dens12$category1,df.dens12$outcome), FUN=sum )
      colnames( df.count1 ) <- c("category1","outcome","n")
      df.dens1 <- merge(df.count1,df.total)
      df.dens1$dens1 <- with(df.dens1, n/total)

      # Merge both into the final dataset
      df.dens <- merge(df.dens12,df.dens1,all.x=TRUE,by=c("category1","outcome"))
      df.dens <- subset(df.dens, select=c(-total.x) )
      colnames( df.dens ) <- sub("\\.x","12",colnames(df.dens))
      colnames( df.dens ) <- sub("\\.y","1",colnames(df.dens))

      # Plot
      ymax <- max(df.dens$dens1)

      # Plot 1: category1
      p <- ggplot( df.dens, aes( x=outcome, fill=category1 ), ... )
      p1 <- p + geom_bar( aes( y=dens1 ), position="dodge" )
      p1 <- p1 + opts( axis.text.x=theme_text(angle=-90,hjust=0), title=title )
      if(legend.title!="") {
        p1 <- p1 + scale_colour_discrete(name=legend.title)
      }

      # Plot 2: category2
      p2 <- p1 + geom_bar( aes( y=dens12, fill=category2 ), position="stack", stat="identity" )
      p2
    }

    N <- 50*(2*8*2)
    outcome <- sample(ordered(seq(8)),N,replace=TRUE,prob=c(seq(4)/20,rev(seq(4)/20)) )
    category2 <- ifelse( outcome==1,
                         sample(c("yes","not"), prob=c(.95,.05)),
                         sample(c("yes","not"), prob=c(.35,.65)) )
    dat <- data.frame( category1=rep(c("in","out"),each=N/2), category2=category2, outcome=outcome )
    my.barplot(dat, title="Test title", legend.title="Medical system")
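As a side note, the two sets of densities built above (per category1/category2/outcome cell, and per category1/outcome bar) can probably be obtained more compactly from a three-way table in base R. The sketch below is only an illustration under assumptions: the seed and the simplified, uniformly sampled `dat` are made up for the example and are not the asker's data.

```r
set.seed(42)  # assumption: fixed seed so the example is reproducible

# Simplified stand-in for the asker's data (uniform sampling, for illustration only)
N <- 50 * (2 * 8 * 2)
dat <- data.frame(
  category1 = rep(c("in", "out"), each = N / 2),
  category2 = sample(c("yes", "not"), N, replace = TRUE),
  outcome   = sample(factor(seq(8)), N, replace = TRUE)
)

# Three-way table of counts: category1 x outcome x category2
counts <- xtabs(~ category1 + outcome + category2, data = dat)

# Divide by the category1 totals, so each category1 group sums to 1
dens12 <- prop.table(counts, margin = 1)

# Total bar heights: sum over category2 within each category1/outcome cell
dens1 <- apply(dens12, c(1, 2), sum)

# Long format (columns category1, outcome, category2, Freq), ready for plotting
dens.long <- as.data.frame(dens12)
head(dens.long)
```

Here `dens12` plays the role of `dens12` in the function above and `dens1` the role of `dens1`, without the intermediate merges and column renaming.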

Comparing my attempts with the linked example, it is clear that the example lays out the third dimension (x = outcome, dodge = category1, stack = category2) side by side using a grid layout, whereas I really need the third dimension stacked within the second. I think I may have reached the point of torturing ggplot2 too much, and I should just write a function using base graphics. Woe.

0

Ari B. Friedman Mar 01 '12 at 0:47

Matt Parker · Accepted Answer · 2012-03-01T01:31:44+0000

Base graphics?!? NEVERRRR

Here is what I came up with. I admit I had a hard time following all of your aggregation and prep, so I just aggregated to counts and may have gotten it all wrong, but it seems like you are in a position where it is easier to start with a working plot and then get the inputs right. Does this do the trick?

    library(plyr)     # ddply(), summarise()
    library(ggplot2)

    # Aggregate
    dat.agg <- ddply(dat,
                     .var = c("category1", "outcome"),
                     .fun = summarise,
                     cat1.n = length(outcome),
                     yes = sum(category2 %in% "yes"),
                     not = sum(category2 %in% "not")
    )

    # Plot - outcome will be x for both layers
    ggplot(dat.agg, aes(x = outcome)) +

      # First layer of bars - for category1 totals by outcome
      geom_bar(aes(weight = cat1.n, fill = category1), position = "dodge") +

      # Second layer of bars - number of "yes" by outcome and category1
      geom_bar(aes(weight = yes, fill = category1), position = "dodge") +

      # Transparency to make total lighter than "yes" - I am bad at colors
      scale_fill_manual(values = c(alpha("#1F78B4", 0.5), alpha("#33A02C", 0.5))) +

      # Title
      opts(title = "A pretty plot <3")
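The code in this thread dates from 2012; `opts()` and `theme_text()` were later removed from ggplot2 in favor of `theme()`, `labs()`, and `element_text()`. The following is a rough sketch of the same two-layer idea against a current ggplot2, not the answerer's own code. The seed, the simplified sample data, the helper columns `n` and `yes`, and the choice of `geom_col()` with a fixed `alpha` on the totals layer are all assumptions made for the illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible

# Simplified stand-in for the asker's data (for illustration only)
N <- 50 * (2 * 8 * 2)
dat <- data.frame(
  category1 = rep(c("in", "out"), each = N / 2),
  category2 = sample(c("yes", "not"), N, replace = TRUE, prob = c(0.35, 0.65)),
  outcome   = sample(factor(seq(8)), N, replace = TRUE)
)

# Helper columns (hypothetical names): one row counts as 1; "yes" rows flagged
dat$n   <- 1L
dat$yes <- as.integer(dat$category2 == "yes")

# Totals and "yes" counts per category1/outcome cell
dat.agg <- aggregate(cbind(n, yes) ~ category1 + outcome, data = dat, FUN = sum)

p <- ggplot(dat.agg, aes(x = outcome, fill = category1)) +
  # First layer: total count per bar, drawn semi-transparent
  geom_col(aes(y = n), position = "dodge", alpha = 0.5) +
  # Second layer: the "yes" share, drawn opaque on top of the total
  geom_col(aes(y = yes), position = "dodge") +
  labs(title = "Totals with the 'yes' share shaded darker", y = "count") +
  theme(axis.text.x = element_text(angle = -90, hjust = 0))

print(p)
```

Because both layers use the same data and the same dodge grouping, the opaque "yes" bar sits inside the paler total bar for each outcome and category1. Dividing `n` and `yes` by the category1 totals before plotting would give densities rather than counts.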
