I am trying to calculate the sum of the values ββ(in XYZmin ) that occur on the same date.
My data is as follows:
bar <- structure(list(date = structure(c(15622, 15622, 15622, 15628, 15632, 15635, 15639, 15639, 15639, 15639, 15639, 15642, 15646, 15646, 15650, 15650, 15650, 15657, 15660, 15660, 15674, 15681, 15691, 15695, 15709, 15716, 15723, 15730, 15737, 15737, 15737, 15737, 15737, 15737, 15740, 15743, 15743, 15743, 15744, 15744, 15744, 15744, 15746, 15751, 15755, 15758), class = "Date"), XYZmin = c(-20, -15, -10, -70, -60, -60, -95, -10, -10, -40, -25, -25, -20, -10, -3, -5, -25, -5, -70, -5, -30, -30, -25, 60, 60, 60, 60, 60, -10, -10, -30, -30, -10, -10, -10, -60, -30, -10, 75, -10, -10, -10, 60, 60, -15, 60)), .Names = c("date", "XYZmin"), class = "data.frame", row.names = c(NA, -46L)) head(bar) date XYZmin 1 2012-10-09 -20 2 2012-10-09 -15 3 2012-10-09 -10 4 2012-10-15 -70 5 2012-10-19 -60 6 2012-10-22 -60
What I'm trying to accomplish is to create new XYZtot variables, where within the data that occur more than once, they sum the first and second value in the 2nd data and sum the 1st, 2nd and 3rd values on 3 data. That's what I aimed from.
head(new_bar_with_XYZtot) date XYZmin XYZtot 1 2012-10-09 -20 -20 2 2012-10-09 -15 -35 3 2012-10-09 -10 -40 4 2012-10-15 -70 -70 5 2012-10-19 -60 -60 6 2012-10-22 -60 -60
updated with big microbenchmark test
alexwhan <- function(bar,date,XYZmin) ddply(bar, .(date), transform, XYZmin.sum = cumsum(XYZmin)) Arun <- function(bar,date,XYZmin) within(bar, {XYZtot <- ave( XYZmin, date, FUN=cumsum)}) agstudy <- function(bar,date,XYZmin) transform(bar, XYZtot = ave(XYZmin, date, FUN = cumsum)) # install.packages("data.table", dependencies = TRUE) library(data.table) mnel <- function(bar,date,XYZmin) bar <- data.table(bar); bar[, XYZmin.sum := cumsum(XYZmin), by = date]
numbers
Unit: microseconds expr min lq median uq max neval alexwhan(bar, date, XYZmin) 14484.077 15056.613 15237.760 15945.482 72650.126 666 Arun(bar, date, XYZmin) 963.632 1018.311 1070.759 1138.655 4988.226 666 agstudy(bar, date, XYZmin) 1967.292 2021.115 2078.261 2158.689 9240.500 666 mnel(bar, date, XYZmin) 251.312 270.295 282.821 325.040 6540.367 666 ### Plot results: boxplot(res)
