Aggregating data using each() with reshape2::dcast

I usually use the reshape package to aggregate data (d'uh), usually together with plyr because of its uber-awesome each function. I was recently advised to switch to reshape2 and give it a try, and now it seems I can no longer use my each wizardry.

reshape

    > m <- melt(mtcars, id.vars = c("am", "vs"), measure.vars = "hp")
    > cast(m, am + vs ~ variable, each(min, max, mean, sd))
      am vs hp_min hp_max   hp_mean    hp_sd
    1  0  0    150    245 194.16667 33.35984
    2  0  1     62    123 102.14286 20.93186
    3  1  0     91    335 180.83333 98.81582
    4  1  1     52    113  80.57143 24.14441

reshape2

    > require(plyr)
    > m <- melt(mtcars, id.vars = c("am", "vs"), measure.vars = "hp")
    > dcast(m, am + vs ~ variable, each(min, max, mean, sd))
    Error in structure(ordered, dim = ns) :
      dims [product 4] do not match the length of object [16]
    In addition: Warning messages:
    1: In fs[[i]](x, ...) :
      no non-missing arguments to min; returning Inf
    2: In fs[[i]](x, ...) :
      no non-missing arguments to max; returning -Inf

I'm not in the mood to dig into this, since my previous code works like a charm with reshape, but I would really like to know:

  • Is it possible to use each with dcast?
  • Is it advisable to use reshape2 at all? Is reshape deprecated?
1 answer

The answer to your first question appears to be no. Quoting from ?reshape2:::dcast:

If the combination of variables you supply does not uniquely identify one row in the original data set, you will need to supply an aggregating function, fun.aggregate. This function should take a vector of numbers and return a single summary statistic.
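A small sketch of why this requirement rules out each (assuming plyr is installed; the variable names are only illustrative): the function built by each returns one value per statistic, not a single summary. That seems to be what the error above reports, with 4 am/vs groups on one side and 4 x 4 = 16 values on the other.

```r
library(plyr)

# each() combines several functions into one that returns a named vector
multi_stat <- each(min, max, mean, sd)

stats <- multi_stat(mtcars$hp)
length(stats)   # 4 values, whereas fun.aggregate must return exactly 1
names(stats)    # "min" "max" "mean" "sd"
```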

A look at Hadley's GitHub page for reshape2 suggests that he is aware this functionality was removed, but seems to think it is better done in plyr, presumably with something like:

 ddply(m,.(am,vs),summarise,min = min(value), max = max(value), mean = mean(value), sd = sd(value)) 

or if you really want to use each :

 ddply(m,.(am,vs),function(x){each(min,max,mean,sd)(x$value)}) 
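If you would rather stay with dcast and still get the hp_min, hp_max, ... layout that cast produced, one possible workaround is to call dcast once per statistic and merge the pieces. This is only a sketch, not something taken from the reshape2 documentation; the names funs, parts and wide are made up for illustration.

```r
library(reshape2)

m <- melt(mtcars, id.vars = c("am", "vs"), measure.vars = "hp")

# Hypothetical helper list: one single-value aggregator per statistic
funs <- list(min = min, max = max, mean = mean, sd = sd)

# One dcast call per statistic, each returning am, vs and a renamed hp column
parts <- lapply(names(funs), function(nm) {
  out <- reshape2::dcast(m, am + vs ~ variable,
                         fun.aggregate = funs[[nm]], value.var = "value")
  names(out)[names(out) == "hp"] <- paste("hp", nm, sep = "_")
  out
})

# Join the per-statistic tables on the grouping columns
wide <- Reduce(function(a, b) merge(a, b, by = c("am", "vs")), parts)
wide
```

The ddply versions above are shorter; this variant only makes sense if the rest of your code expects the wide column names from the old cast call.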

Source: https://habr.com/ru/post/1400904/

