R: reset cumsum to zero at the beginning of each year

I have a data frame with a bunch of donation data. I arrange the data in chronological order, from the oldest to the most recent gifts. Then I add a column containing the cumulative total of gifts over time. The data spans many years, and I am looking for a good way to reset cumsum to 0 at the beginning of each year (the year begins and ends on July 1 for tax purposes).

Here's what it looks like now:

     id   date         giftamt   cumsum()
    005   01-05-2001     20.00      20.00
    007   06-05-2001     25.00      45.00
    009   12-05-2001     20.00      65.00
    012   02-05-2002     30.00      95.00
    015   08-05-2002     50.00     145.00
    025   12-05-2002     25.00     170.00
    ...   ...              ...        ...

Here's what I would like it to look like:

     id   date         giftamt   cumsum()
    005   01-05-2001     20.00      20.00
    007   06-05-2001     25.00      45.00
    009   12-05-2001     20.00      20.00
    012   02-05-2002     30.00      50.00
    015   08-05-2002     50.00      50.00
    025   12-05-2002     25.00      75.00
    ...   ...              ...        ...

Any suggestions?

UPDATE:

Here's the code that finally worked, thanks to Seb:

    # tweak for changing the calendar year to fiscal year
    df$year <- as.numeric(format(as.Date(df$giftdate), format="%Y"))
    df$month <- as.numeric(format(as.Date(df$giftdate), format="%m"))
    df$year <- ifelse(df$month<=6, df$year, df$year+1)

    # cum-summing :)
    library(plyr)
    finalDf <- ddply(df, .(year), summarize, cumsum(as.numeric(as.character(giftamt))))
+4
3 answers

I would try it like this (df is your data frame):

    # tweak for changing the calendar year to fiscal year
    # (the format string assumes mm-dd-yyyy text dates, as in the question)
    dates <- as.Date(df$date, format="%m-%d-%Y")
    df$year <- as.numeric(format(dates, format="%Y"))
    df$month <- as.numeric(format(dates, format="%m"))
    df$year <- ifelse(df$month<=6, df$year, df$year+1)

    # cum-summing :)
    library(plyr)
    ddply(df, .(year), summarize, cumsum(giftamt))
+9

There are two tasks: create a column in the data frame that represents each year, then split the data, apply cumsum, and recombine. R has many ways to do both parts.

Probably the most readable way to complete the first task is the year function from the lubridate package.

    library(lubridate)
    df$year <- year(df$date)

Please note that R has many date formats, so check whether your dates are currently stored as POSIXct, Date, chron, zoo, xts, or one of the other formats.
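As a quick illustration of that check, here is a minimal base R sketch. The sample data frame is hypothetical and simply mirrors the mm-dd-yyyy strings shown in the question; if your dates are stored differently, the format string will need to change.

```r
# Hypothetical sample mirroring the question's mm-dd-yyyy strings
df <- data.frame(
  date    = c("01-05-2001", "06-05-2001", "12-05-2001"),
  giftamt = c(20, 25, 20),
  stringsAsFactors = FALSE
)

class(df$date)  # "character": plain text, not a date class yet

# Convert explicitly; the format string must match how the text is written
df$date <- as.Date(df$date, format = "%m-%d-%Y")

class(df$date)  # "Date"

# Year and month can now be extracted reliably
date_year  <- format(df$date, "%Y")
date_month <- format(df$date, "%m")
```

Without the explicit format, as.Date would try to read these strings as year-month-day, so it is safer to spell the format out.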

Seb's choice of ddply for the second task is the one I would recommend. For completeness, you can also use tapply or aggregate.

    with(df, tapply(giftamt, year, cumsum))
    aggregate(giftamt ~ year, df, cumsum)
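One thing to keep in mind with tapply is that it returns one vector per year rather than a column. The sketch below uses a small made-up data frame (the values and the cumgift column name are only for illustration) and shows one way to flatten the result back into a column, assuming the rows are already sorted by year.

```r
# Made-up data, already sorted by year
df <- data.frame(
  year    = c(2001, 2001, 2002, 2002),
  giftamt = c(20, 25, 30, 50)
)

# tapply returns a list-like array with one cumsum vector per year
by_year <- with(df, tapply(giftamt, year, cumsum))
by_year[["2001"]]
by_year[["2002"]]

# Flatten back to one value per row; this relies on df being sorted by year
df$cumgift <- unlist(by_year, use.names = FALSE)
df
```

If the rows are not sorted by year, the flattened values will not line up with the original rows, so sort first or use an approach that preserves row order.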

With the new information that you want the year to change over on July 1, update the year column to

 df$year <- with(df, year(date) + (month(date) >= 7)) 
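To see the whole thing end to end, here is a self-contained sketch on the six sample rows from the question. It is not the lubridate/plyr code above: it swaps in base R only (format for the year and month, ave for the grouped cumsum) so that it runs without extra packages, and the cumgift column name is made up for the example.

```r
# Sample rows copied from the question (dates written as mm-dd-yyyy)
df <- data.frame(
  id      = c("005", "007", "009", "012", "015", "025"),
  date    = c("01-05-2001", "06-05-2001", "12-05-2001",
              "02-05-2002", "08-05-2002", "12-05-2002"),
  giftamt = c(20, 25, 20, 30, 50, 25),
  stringsAsFactors = FALSE
)

d <- as.Date(df$date, format = "%m-%d-%Y")

# Base R equivalent of year(date) + (month(date) >= 7):
# gifts from July onward count toward the next fiscal year
df$year <- as.integer(format(d, "%Y")) + (as.integer(format(d, "%m")) >= 7)

# ave() applies cumsum within each fiscal year and keeps the original row order
df$cumgift <- ave(df$giftamt, df$year, FUN = cumsum)

print(df)
```

The cumgift column comes out as 20, 45, 20, 50, 50, 75, which matches the desired output in the question.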
+3
    gifts <- read.table("gifts.txt", header=TRUE, quote="\"")
    NbGifts <- nrow(gifts)

    # Determination of the relevant fiscal year ending dates
    CalYear <- as.numeric(substr(gifts$date,7,10))    # calendar years
    TCY <- as.numeric(names(table(CalYear)))          # list of calendar years
    MDFY <- "07-01-"                                  # ending date for the current fiscal year
    EFY <- paste(MDFY,TCY,sep="")                     # list of fiscal year ending dates
    EFYplus <- cbind(TCY,EFY)                         # table of fiscal year ending dates
    colnames(EFYplus) <- c("CalYear","EndDate")

    # Manipulation of data frames in order to match
    # the fiscal year end dates to the relevant dates
    giftsPlusYear <- data.frame(CalYear, gifts, stringsAsFactors = FALSE)
    giftsPlusEFY <- merge(giftsPlusYear,EFYplus)      # using the CalYear

    # Date comparison in order to associate a gift to its fiscal year
    # (%Y rather than %y, because the years in the data have four digits)
    DateGift <- as.Date(giftsPlusEFY$date,"%m-%d-%Y")     # date conversion for comparison
    DateEFY <- as.Date(giftsPlusEFY$EndDate,"%m-%d-%Y")
    FiscYear <- ifelse(DateGift<DateEFY,giftsPlusEFY$CalYear,giftsPlusEFY$CalYear+1)

    # Computation of cumulative totals per fiscal year
    LastFY <- 0
    CumGift <- rep(0,NbGifts)
    for (g in 1:NbGifts){
      if (LastFY==FiscYear[g]){
        CumGift[g] <- CumGift[g-1] + gifts$giftamt[g]
      } else {
        CumGift[g] <- gifts$giftamt[g]
        LastFY <- FiscYear[g]
      }
    }
    (CumGifts <- cbind(gifts,CumGift))
+1

Source: https://habr.com/ru/post/1386726/

