How to get the final statistics in R after a negative selection of a data frame

Question

I would like to do a negative selection (for each level of a factor variable, take every row except the rows with that level) and summarize the remaining data. As a simple example, I have a data frame DF with two columns.

    > DF
    Category Value
           A     5
           B     2
           C     3
           A     1
           C     1

If dplyr supported a negative selection (does it?), I imagine it would look something like this:

    > DF %>% group_by(!Category) %>% summarise(avg = mean(Value))
    !Category  avg
            A 2.00  # average of all rows where category isn't A
            B 2.50
            C 2.67

+5

r dataframe dplyr

Mark Mar 21 '16 at 20:02

3 answers

Using data.table , we can try:

    library(data.table)
    setDT(DF)[, DF[!Category %in% .BY[[1]], mean(Value)], by = Category]
    #    Category       V1
    # 1:        A 2.000000
    # 2:        B 2.500000
    # 3:        C 2.666667

+2

mtoto Mar 21 '16 at 21:08

Another way is to use a for loop:

    DF <- data.frame(Category = c("A", "B", "C", "A", "C"), Value = c(5, 2, 3, 1, 1))
    DF2 <- data.frame(Category = unique(DF$Category))
    for (letter in unique(DF$Category)) {
      DF3 <- DF[DF$Category != letter, ]
      DF2$avg[DF2$Category == letter] <- round(mean(DF3$Value), 2)
    }
    DF2
    #   Category  avg
    # 1        A 2.00
    # 2        B 2.50
    # 3        C 2.67

0

Mario Mar 22 '16 at 0:48

bouncyball · Accepted Answer · 2016-03-21T20:19:11+0000

Here is a way to do it in base R:

Edit: thanks to @Ryan for suggesting a more extensible change.

    > sapply(levels(DF$Category), FUN = function(x) mean(subset(DF, Category != x)$Value))
           A        B        C
    2.000000 2.500000 2.666667
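
For a mean specifically, the same numbers can probably be obtained without subsetting once per category: the mean of "everything except category x" is (total sum minus the sum of x) divided by (total row count minus the row count of x). The following is only a sketch of that idea in base R, not part of the original answer. It rebuilds the question's example data, and the names grp_sum, grp_n and avg_excl are made up for illustration.

```r
# Example data from the question (columns Category and Value).
DF <- data.frame(Category = c("A", "B", "C", "A", "C"),
                 Value = c(5, 2, 3, 1, 1))

# Per-category sums and row counts (hypothetical helper names).
grp_sum <- tapply(DF$Value, DF$Category, sum)
grp_n <- tapply(DF$Value, DF$Category, length)

# Mean of all rows NOT in each category:
# (total sum - category sum) / (total rows - category rows).
avg_excl <- (sum(DF$Value) - grp_sum) / (nrow(DF) - grp_n)
avg_excl
```

This gives 2, 2.5 and about 2.67 for A, B and C, matching the results above. Two caveats: the shortcut only works for statistics that can be rebuilt from sums and counts (a median would still need the subsetting approach), and if the data contain a single category the division is 0/0 and returns NaN. Also note that levels(DF$Category) in the sapply() call only returns the categories when Category is a factor; since R 4.0.0, data.frame() keeps strings as character by default, so unique(DF$Category) may be the safer choice there.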
