How to get proportions and counts from a data frame in R

I have a data frame like the one shown below, but with many more rows:

    > df <- data.frame(x1=c(1,1,0,0,1,0), x2=c("a","a","b","a","c","c"))
    > df
      x1 x2
    1  1  a
    2  1  a
    3  0  b
    4  0  a
    5  1  c
    6  0  c

From df I need a data frame where the rows are the unique values of df$x2, col1 is the proportion of 1s associated with each letter, and col2 is the count of each letter. So my output would be:

    > getprops(df)
       prop count
    a .6666     3
    b 0         1
    c 0.5       2

I can come up with some complicated, dirty ways to do this, but I'm looking for something short and efficient. Thanks.

+4
5 answers

Try installing plyr and running:

    library(plyr)
    df <- data.frame(x1=c(1, 1, 0, 0, 1, 0),
                     label=c("a", "a", "b", "a", "c", "c"))
    ddply(df, .(label), summarize, prop = mean(x1), count = length(x1))
    #   label      prop count
    # 1     a 0.6666667     3
    # 2     b 0.0000000     1
    # 3     c 0.5000000     2

which uses the split/apply/combine strategy under the hood, similar to this base R version:

    # df is the data frame defined above, where the grouping column is named label
    do.call(rbind, lapply(split(df, df$label), with,
                          list(prop = mean(x1), count = length(x1))))
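The same split/apply/combine idea can be wrapped into the getprops(df) call the question asks for. This is only a minimal base R sketch: the `value` and `group` arguments are hypothetical additions so that the column names are not hard-coded, and the data is the sample from the question.

```r
# Sample data from the question.
df <- data.frame(x1 = c(1, 1, 0, 0, 1, 0),
                 x2 = c("a", "a", "b", "a", "c", "c"))

# getprops() is the name used in the question; `value` and `group`
# are hypothetical arguments naming the 0/1 column and the letter column.
getprops <- function(data, value = "x1", group = "x2") {
  # split: one vector of 0/1 values per letter
  pieces <- split(data[[value]], data[[group]])
  # apply: a one-row data frame per letter
  rows <- lapply(pieces, function(v) {
    data.frame(prop = mean(v), count = length(v))
  })
  # combine: stack the rows; the letters become the row names
  do.call(rbind, rows)
}

res <- getprops(df)
print(res)
```

Taking the mean of a 0/1 column gives the proportion of 1s directly, which is why no separate sum and division are needed.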
+4

I like @RicardoSaporta's solution (+1), but you can also use ?prop.table:

    > df <- data.frame(x1=c(1,1,0,0,1,0), x2=c("a","a","b","a","c","c"))
    > df
      x1 x2
    1  1  a
    2  1  a
    3  0  b
    4  0  a
    5  1  c
    6  0  c
    > tab <- table(df$x2, df$x1)
    > tab
        0 1
      a 1 2
      b 1 0
      c 1 1
    > ptab <- prop.table(tab, margin=1)
    > ptab
                0         1
      a 0.3333333 0.6666667
      b 1.0000000 0.0000000
      c 0.5000000 0.5000000
    > # count is the number of rows per letter (the row totals of tab);
    > # tab[,2] would give the number of 1s instead
    > dframe <- data.frame(values=rownames(tab), prop=ptab[,2], count=rowSums(tab))
    > dframe
      values      prop count
    a      a 0.6666667     3
    b      b 0.0000000     1
    c      c 0.5000000     2

If you want, you can wrap this in a single function:

    getprops <- function(values, indicator){
      tab <- table(values, indicator)
      ptab <- prop.table(tab, margin=1)
      # row totals give the count of each letter
      dframe <- data.frame(values=rownames(tab), prop=ptab[,2], count=rowSums(tab))
      return(dframe)
    }

    > getprops(values=df$x2, indicator=df$x1)
      values      prop count
    a      a 0.6666667     3
    b      b 0.0000000     1
    c      c 0.5000000     2
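One caveat with the table approach: picking the second column assumes that both 0 and 1 occur in the indicator. If the data happens to contain only 0s or only 1s, the table has a single column and the lookup fails. A possible way around this, sketched below, is to fix the indicator levels up front. The name getprops_safe is hypothetical, and count here means the number of rows per letter, as the question asks.

```r
df <- data.frame(x1 = c(1, 1, 0, 0, 1, 0),
                 x2 = c("a", "a", "b", "a", "c", "c"))

# getprops_safe() is a hypothetical variant, not part of the answer above.
getprops_safe <- function(values, indicator) {
  # Fixing the levels guarantees that the "0" and "1" columns both exist,
  # even when one of them never occurs in the data.
  tab <- table(values, factor(indicator, levels = c(0, 1)))
  ptab <- prop.table(tab, margin = 1)
  data.frame(values = rownames(tab),
             prop   = as.vector(ptab[, "1"]),   # share of 1s per letter
             count  = as.vector(rowSums(tab)))  # rows per letter
}

res <- getprops_safe(df$x2, df$x1)
print(res)

# An indicator with no 1s at all still works.
zeros <- getprops_safe(c("a", "b"), c(0, 0))
print(zeros)
```

Selecting the column by name ("1") rather than by position also makes the intent easier to read.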
+4

I am not sure if this does what you want.

    df <- data.frame(x1=c(1,1,0,0,1,0), x2=c("a","a","b","a","c","c"))
    ones <- with(df, aggregate(x1 ~ x2, FUN = sum))
    count <- table(df$x2)
    prop <- ones$x1 / count
    df2 <- data.frame(prop, count)
    df2
    rownames(df2) <- df2[,3]
    df2 <- df2[,c(2,4)]
    colnames(df2) <- c('prop', 'count')
    df2

           prop count
    a 0.6666667     3
    b 0.0000000     1
    c 0.5000000     2
+3

Here is a one-liner in data.table:

    > DT[, list(props=sum(x1) / .N, count=.N), by=x2]
       x2     props count
    1:  a 0.6666667     3
    2:  b 0.0000000     1
    3:  c 0.5000000     2


where DT <- data.table(df)

+3

Try using table

    tbl <- table(df$x1, df$x2)
    #     a b c
    #   0 1 1 1
    #   1 2 0 1
    tbl["1",] / colSums(tbl)
    #         a         b         c
    # 0.6666667 0.0000000 0.5000000

To put the output into a data frame:

    data.frame(proportions=tbl["1",] / colSums(tbl))
      proportions
    a   0.6666667
    b   0.0000000
    c   0.5000000
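The question also asks for a count column, and the column totals of the same table already hold it. A small sketch of how both columns could be combined, using the sample data from the question (the name `res` is just illustrative):

```r
df <- data.frame(x1 = c(1, 1, 0, 0, 1, 0),
                 x2 = c("a", "a", "b", "a", "c", "c"))

tbl <- table(df$x1, df$x2)   # rows: x1 (0/1), columns: x2 (a/b/c)
counts <- colSums(tbl)       # number of rows per letter

# `res` is an illustrative name; the letters are used as row names.
res <- data.frame(prop  = as.vector(tbl["1", ] / counts),
                  count = as.vector(counts),
                  row.names = names(counts))
print(res)
```

As with any table-based lookup, tbl["1", ] assumes that at least one row of the data has x1 equal to 1.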
+2

Source: https://habr.com/ru/post/1490088/

