Excel SUMIFS equivalent in R

Question

I am very new to R, and I am looking for ways to recreate Excel VBA macros and Excel worksheet functions such as SUMIFS. SUMIFS sums the values in one column over the rows that match several conditions on other columns.

I have the following data frame and I want to compute a new column. For each row, the new column holds the sum of Sample over all rows whose date range overlaps that row's StartDate to EndDate range. For example, for row 1 it will be 697 (the sum of the first 3 rows). The criterion for row i is: include Sample if EndDate >= StartDate[i] & StartDate <= EndDate[i]

    StartDate  EndDate   Sample  SUMIFS example
    10/01/14   24/01/14  139     697
    12/01/14   26/01/14  136
    19/01/14   02/02/14  422
    25/01/14   08/02/14  762
    29/01/14   12/02/14  899
    05/02/14   19/02/14  850
    07/02/14   21/02/14  602
    09/02/14   23/02/14  180
    18/02/14   04/03/14  866

Any comments or pointers would be greatly appreciated.

+5

r sumifs

Barnaby1 Nov 03 '14 at 18:46

4 answers

cameron.bracken · Answer 1 · 2014-11-03T20:52:09+0000

You can do this with a loop or with a Cartesian merge. I do not know of a built-in function that does exactly this.

    library(dplyr)

    # Sample data from the question
    x <- structure(list(
      StartDate = structure(c(1389312000, 1389484800, 1390089600, 1390608000,
                              1390953600, 1391558400, 1391731200, 1391904000,
                              1392681600),
                            tzone = "UTC", class = c("POSIXct", "POSIXt")),
      EndDate = structure(c(1390521600, 1390694400, 1391299200, 1391817600,
                            1392163200, 1392768000, 1392940800, 1393113600,
                            1393891200),
                          tzone = "UTC", class = c("POSIXct", "POSIXt")),
      Sample = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 180L, 866L)),
      .Names = c("StartDate", "EndDate", "Sample"),
      row.names = c(NA, -9L), class = "data.frame")

    # Copy with renamed columns, so the two tables share no column names
    x2 <- x
    names(x2) <- c('StartDate2', 'EndDate2', 'Sample2')

    # by = NULL makes base merge() return the Cartesian product (every row pair)
    x3 <- merge(x, x2, by = NULL)

    # For each original row, sum Sample2 over the rows whose range overlaps it
    x4 <- summarise(group_by(x3, StartDate, EndDate),
                    sumifs = sum(Sample2[EndDate2 >= StartDate & StartDate2 <= EndDate]))

    # Attach the result to the original data
    x_sumifs <- merge(x, x4, by = c('StartDate', 'EndDate'))

This is what the output looks like.

    > x_sumifs
       StartDate    EndDate Sample sumifs
    1 2014-01-10 2014-01-24    139    697
    2 2014-01-12 2014-01-26    136   1459
    3 2014-01-19 2014-02-02    422   2358
    4 2014-01-25 2014-02-08    762   3671
    5 2014-01-29 2014-02-12    899   3715
    6 2014-02-05 2014-02-19    850   4159
    7 2014-02-07 2014-02-21    602   4159
    8 2014-02-09 2014-02-23    180   3397
    9 2014-02-18 2014-03-04    866   2498
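
The loop alternative mentioned at the start of this answer is not spelled out there. A minimal sketch, assuming the dates are stored as Date values (the data frame is rebuilt here only so that the example is self-contained), might look like this:

```r
# Sample data from the question, with dates stored as Date (an assumption).
x <- data.frame(
  StartDate = as.Date(c("2014-01-10", "2014-01-12", "2014-01-19",
                        "2014-01-25", "2014-01-29", "2014-02-05",
                        "2014-02-07", "2014-02-09", "2014-02-18")),
  EndDate   = as.Date(c("2014-01-24", "2014-01-26", "2014-02-02",
                        "2014-02-08", "2014-02-12", "2014-02-19",
                        "2014-02-21", "2014-02-23", "2014-03-04")),
  Sample    = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 180L, 866L)
)

# Pre-allocate the result column, then fill it one row at a time.
x$sumifs <- numeric(nrow(x))
for (i in seq_len(nrow(x))) {
  # TRUE for every row whose date range overlaps the range of row i
  overlaps <- x$EndDate >= x$StartDate[i] & x$StartDate <= x$EndDate[i]
  x$sumifs[i] <- sum(x$Sample[overlaps])
}
```

The loop avoids building the 81-row Cartesian product, at the cost of one pass over the column per row.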

janos · Answer 2 · 2014-11-03T20:23:56+0000

Assuming the data above is in a data frame named df:

    sum(df$Sample[df$EndDate >= df$StartDate & df$StartDate <= df$EndDate])

That is:

df$Sample[...] selects the values of the Sample column for which the condition in [...] is true.
df$EndDate >= df$StartDate and df$StartDate <= df$EndDate are the conditions from your example written as R expressions and joined with &, so that both must be true at the same time. Note that there are no i indices in the expression: R evaluates it element-wise over whole columns, and df$Sample[...] is a vector of only those values for which the expression in [...] was TRUE. Because each row is compared with its own dates here, the result is a single total rather than one sum per row; to get a per-row sum, compare against StartDate[i] and EndDate[i] for each row i.
sum is the built-in function for calculating the sum.
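
To make the element-wise behaviour of logical indexing concrete, here is a small sketch on the first four rows of the question's data. The names keep_all, keep_1, total and row1_sum are illustrative only, and storing the dates as Date is an assumption.

```r
# First four rows of the question's data (Date class is an assumption).
df <- data.frame(
  StartDate = as.Date(c("2014-01-10", "2014-01-12", "2014-01-19", "2014-01-25")),
  EndDate   = as.Date(c("2014-01-24", "2014-01-26", "2014-02-02", "2014-02-08")),
  Sample    = c(139L, 136L, 422L, 762L)
)

# Element-wise comparison: row k is tested only against its own dates,
# so every row passes and the sum covers the whole column.
keep_all <- df$EndDate >= df$StartDate & df$StartDate <= df$EndDate
total <- sum(df$Sample[keep_all])

# Fixing one row (here row 1) gives the SUMIFS value for that row only.
keep_1 <- df$EndDate >= df$StartDate[1] & df$StartDate <= df$EndDate[1]
row1_sum <- sum(df$Sample[keep_1])
```

Here total is the sum of all four samples, while row1_sum covers only the rows that overlap row 1.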

akrun · Answer 3 · 2014-11-04T10:57:32+0000

You can use lapply/sapply from base R for this, with x taken from @cameron.bracken's post.

    x$sumifs <- sapply(seq_len(nrow(x)), function(i)
      with(x, sum(Sample[EndDate >= StartDate[i] & StartDate <= EndDate[i]])))
    x
    #   StartDate    EndDate Sample sumifs
    #1 2014-01-10 2014-01-24    139    697
    #2 2014-01-12 2014-01-26    136   1459
    #3 2014-01-19 2014-02-02    422   2358
    #4 2014-01-25 2014-02-08    762   3671
    #5 2014-01-29 2014-02-12    899   3715
    #6 2014-02-05 2014-02-19    850   4159
    #7 2014-02-07 2014-02-21    602   4159
    #8 2014-02-09 2014-02-23    180   3397
    #9 2014-02-18 2014-03-04    866   2498
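
The same criteria can also be evaluated for all row pairs at once with outer(), which is a different technique from the sapply call above and not part of the original answer. The sketch below assumes Date columns and uses the illustrative names s, e and overlap; it builds an n-by-n matrix, so it is only reasonable for moderately sized tables.

```r
# Sample data from the question, with dates stored as Date (an assumption).
x <- data.frame(
  StartDate = as.Date(c("2014-01-10", "2014-01-12", "2014-01-19",
                        "2014-01-25", "2014-01-29", "2014-02-05",
                        "2014-02-07", "2014-02-09", "2014-02-18")),
  EndDate   = as.Date(c("2014-01-24", "2014-01-26", "2014-02-02",
                        "2014-02-08", "2014-02-12", "2014-02-19",
                        "2014-02-21", "2014-02-23", "2014-03-04")),
  Sample    = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 180L, 866L)
)

# Plain day counts, so the comparisons do not depend on date-class methods.
s <- as.numeric(x$StartDate)
e <- as.numeric(x$EndDate)

# overlap[i, j] is TRUE when the range of row j overlaps the range of row i:
# StartDate[i] <= EndDate[j] and EndDate[i] >= StartDate[j].
overlap <- outer(s, e, "<=") & outer(e, s, ">=")

# Row i of the product adds up Sample[j] over all overlapping rows j.
x$sumifs <- as.vector(overlap %*% x$Sample)
```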

Abhishek tawar · Answer 4 · 2017-07-13T18:21:45+0000

You can use the by function to get the value. by splits a data frame by row into data frames subsetted by the values of one or more factors, and applies a function to each subset in turn.

    x$sumifs <- as.vector(by(x, seq_len(nrow(x)), function(r)
      sum(x$Sample[x$EndDate >= r$StartDate & x$StartDate <= r$EndDate])))

More details about the function can be found in its help page (?by).
