From the comments, “C2” appears to be a “character” column with % as a suffix. Before creating a group, delete % with sub , convert to "numeric" ( as.numeric ). The variable "group" ( transform(df,...) ) is created using the cut function with breaks (group buckets / intervals) and labels (for the desired group labels). After creating a group variable, the sum value for “C1” by “group” and “counting” of elements within the “group” can be performed using aggregate from “base R”
df1 <- transform(df, group=cut(as.numeric(sub('[%]', '', C2)), breaks=c(-Inf,0.005, 0.010, 0.014, Inf), labels=c('<0.005', 0.005, 0.01, 0.014))) res <- do.call(data.frame,aggregate(C1~group, df1, FUN=function(x) c(Count=length(x), Sum=sum(x)))) dNew <- data.frame(group=levels(df1$group)) merge(res, dNew, all=TRUE) # group C1.Count C1.Sum #1 <0.005 2 3491509.6 #2 0.005 NA NA #3 0.01 2 302997.1 #4 0.014 8 364609.5
or you can use data.table . setDT converts data.frame to data.table . Set the "grouping" variable with by= and sum / create the two variables "Count" and "Sum" within the list( .. .N gives the number of elements in each "group".
library(data.table) setDT(df1)[, list(Count=.N, Sum=sum(C1)), by=group][]
Or using dplyr . %>% binds LHS with RHS arguments and combines them together. Use group_by to specify the "group" variable, and then use summarise_each or summarise to get the total counter and sum corresponding column. summarise_each would be useful if there is more than one column.
library(dplyr) df1 %>% group_by(group) %>% summarise_each(funs(n(), Sum=sum(.)), C1)
Update
Using the new df dataset
df1 <- transform(df, group=cut(C2, breaks=c(-Inf,0.005, 0.010, 0.014, Inf), labels=c('<0.005', 0.005, 0.01, 0.014))) res <- do.call(data.frame,aggregate(cbind(C1,C3)~group, df1, FUN=function(x) c(Count=length(x), Sum=sum(x)))) res # group C1.Count C1.Sum C3.Count C3.Sum #1 <0.005 2 3491509.6 2 91233 #2 0.01 2 302997.1 2 88843 #3 0.014 8 364609.5 8 268809
and you can do merge as described above.
The dplyr approach will be the same except specifying an additional variable
df1%>% group_by(group) %>% summarise_each(funs(n(), Sum=sum(.)), C1, C3) #Source: local data frame [3 x 5] # group C1_n C3_n C1_Sum C3_Sum #1 <0.005 2 2 3491509.6 91233 #2 0.01 2 2 302997.1 88843 #3 0.014 8 8 364609.5 268809
data
df <-structure(list(C1 = c(49488.01172, 268221.1563, 34775.96094, 13046.98047, 2121699.75, 71155.09375, 1369809.875, 750, 44943.82813, 85585.04688, 31090.10938, 68550.40625), C2 = c("0.0512%", "0.0128%", "0.0128%", "0.07241%", "0.00453%", "0.0181%", "0.00453%", "0.2048%", "0.0362%", "0.0362%", "0.0362%", "0.0181%")), .Names = c("C1", "C2"), row.names = c(NA, -12L), class = "data.frame")