R: calculate the variance of data$V1 for each distinct value of data$V2

I have a data frame similar to this:

    V1 V2
    ..  1
    ..  2
    ..  1
    ..  3

and so on.

For each distinct value of V2, I would like to calculate the variance of the corresponding data in V1. I have just started my adventure with R; any hints on how to do this? For my particular case, I assume I can manually do something like

    var1 = var(data[data$V2 == 1, "V1"])
    var2 = ...

etc., because I know all the possible values of V2 (there are not many of them). However, I am curious what more general solutions there are. Any ideas?

4 answers

    library(plyr)  # ddply() is provided by plyr
    ddply(data, .(V2), summarise, variance = var(V1))

And the old tapply standby:

    dat <- data.frame(x = runif(50), y = rep(letters[1:5], each = 10))
    tapply(dat$x, dat$y, FUN = var)
             a          b          c          d          e
    0.03907351 0.10197081 0.08036828 0.03075195 0.08289562
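Applied to the question's column layout, the same call can be checked against the manual subsetting approach. This is a minimal sketch, assuming a data frame `data` with columns `V1` and `V2`; the values are invented for illustration:

```r
# Hypothetical data in the question's layout; the values are invented.
set.seed(1)
data <- data.frame(V1 = rnorm(12), V2 = rep(1:3, times = 4))

# One variance per distinct value of V2, named by that value.
vars <- tapply(data$V1, data$V2, FUN = var)

# The manual approach from the question, for the group V2 == 1.
var1 <- var(data[data$V2 == 1, "V1"])

# Both approaches give the same number for that group.
all.equal(vars[["1"]], var1)
```

The result of `tapply` is indexed by the group label as a character string, so `vars[["2"]]` would give the variance for `V2 == 2`.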

Another solution uses data.table. It is much faster, which is especially useful when you have large data sets.

    library(data.table)
    dat2 <- data.table(dat)  # dat: the data frame with columns V1 and V2
    ans <- dat2[, list(variance = var(V1)), by = V2]

There are several ways to do this; I prefer aggregate:

    dat <- data.frame(V1 = rnorm(50), V2 = rep(1:5, 10))
    dat
    # The first argument tells it to group V1 based on the values in V2;
    # the last argument simply tells it the function to apply.
    > aggregate(V1 ~ V2, data = dat, var)
      V2        V1
    1  1 0.9139360
    2  2 1.6222236
    3  3 1.2429743
    4  4 1.1889356
    5  5 0.7000294

Also look into ddply, daply, etc. in the plyr package.
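All of these approaches amount to splitting `V1` by the values of `V2` and applying `var` to each piece. A base R sketch of that idea follows; the data frame `dat` and its values are invented for illustration. It also shows that a group with a single observation yields `NA`, since the sample variance is undefined there:

```r
# Invented example data: the group V2 == 3 has only one observation.
dat <- data.frame(V1 = c(1, 2, 4, 7, 3, 5), V2 = c(1, 2, 1, 2, 1, 3))

# split() returns a list with one vector of V1 values per distinct V2.
groups <- split(dat$V1, dat$V2)

# sapply() applies var() to each group and returns a named vector.
group_vars <- sapply(groups, var)
group_vars

# aggregate() computes the same per-group variances, as a data frame.
agg <- aggregate(V1 ~ V2, data = dat, FUN = var)
agg
```

Here `group_vars[["3"]]` is `NA`, so it may be worth checking group sizes (for example with `table(dat$V2)`) before relying on the result.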


Source: https://habr.com/ru/post/1368883/

