I wanted an implementation that did not add any dependencies beyond data.table, which is usually my only dependency. Here data.table supplies the date accessors year(), month(), and mday() (the day of the month).
This function follows the way I work out someone's age in my head:
    library(data.table)

    agecalc <- function(origin, current){
        # Years elapsed, assuming this year's birthday has not happened yet
        y <- year(current) - year(origin) - 1
        offset <- 0
        # Add the year back once the birthday has been reached
        if(month(current) > month(origin)) offset <- 1
        if(month(current) == month(origin) && mday(current) >= mday(origin)) offset <- 1
        age <- y + offset
        return(age)
    }
This is the same logic, reorganized and vectorized:
    agecalc <- function(origin, current){
        age <- year(current) - year(origin) - 1
        ii <- (month(current) > month(origin)) |
              (month(current) == month(origin) & mday(current) >= mday(origin))
        age[ii] <- age[ii] + 1
        return(age)
    }
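If even data.table is unavailable, the same comparison can be sketched in base R by reading the calendar fields from as.POSIXlt(). This is only an illustrative variant of the logic above; the name agecalc_base is hypothetical.

```r
# Base-R sketch of the vectorized logic (agecalc_base is a hypothetical name).
agecalc_base <- function(origin, current) {
  o   <- as.POSIXlt(as.Date(origin))
  cur <- as.POSIXlt(as.Date(current))
  # $year counts from 1900 and $mon from 0, which does not matter here
  # because only differences and comparisons are used.
  age <- cur$year - o$year - 1
  reached <- (cur$mon > o$mon) | (cur$mon == o$mon & cur$mday >= o$mday)
  age + reached  # TRUE counts as 1 once the birthday has been reached
}

agecalc_base("2000-02-29", c("2004-02-28", "2004-02-29"))
# 3 4
```

Adding the logical vector instead of subsetting keeps the function vectorized and lets a missing date propagate as NA.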
I can imagine scenarios in which a string comparison is faster, for example when the dates are already stored as strings. This version relies on the YYYY-MM-DD format and, because of the if, handles only one pair of dates at a time:
    agecalc <- function(origin, current){
        origin <- as.character(origin)
        current <- as.character(current)
        # Characters 1-4 are the year, characters 6-10 are "MM-DD"
        age <- as.numeric(substr(current, 1, 4)) - as.numeric(substr(origin, 1, 4)) - 1
        if(substr(current, 6, 10) >= substr(origin, 6, 10)){
            age <- age + 1
        }
        return(age)
    }
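The string version can be vectorized by replacing the if with logical arithmetic. A minimal sketch, assuming both inputs print as "YYYY-MM-DD" (Date, IDate, or ISO strings); the name agecalc_str is hypothetical.

```r
# Vectorized string comparison (agecalc_str is a hypothetical name).
agecalc_str <- function(origin, current) {
  origin  <- as.character(origin)
  current <- as.character(current)
  age <- as.numeric(substr(current, 1, 4)) - as.numeric(substr(origin, 1, 4)) - 1
  # Zero-padded "MM-DD" strings sort in calendar order,
  # so >= means the birthday has been reached this year.
  reached <- substr(current, 6, 10) >= substr(origin, 6, 10)
  age + reached
}

agecalc_str("2000-02-29", c("2001-02-28", "2001-03-01", "2004-02-29"))
# 0 1 4
```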
Some tests:
    agecalc(as.IDate("1985-08-13"), as.IDate("1985-08-12"))  # -1 (before birth)
    agecalc(as.IDate("1985-08-13"), as.IDate("1985-08-13"))  # 0
    agecalc(as.IDate("1985-08-13"), as.IDate("1986-08-12"))  # 0
    agecalc(as.IDate("1985-08-13"), as.IDate("1986-08-13"))  # 1
    agecalc(as.IDate("1985-08-13"), as.IDate("1986-09-12"))  # 1
    agecalc(as.IDate("2000-02-29"), as.IDate("2000-02-28"))  # -1 (before birth)
    agecalc(as.IDate("2000-02-29"), as.IDate("2000-02-29"))  # 0
    agecalc(as.IDate("2000-02-29"), as.IDate("2001-02-28"))  # 0
    agecalc(as.IDate("2000-02-29"), as.IDate("2001-02-29"))  # 2001 is not a leap year, so this date does not exist
    agecalc(as.IDate("2000-02-29"), as.IDate("2001-03-01"))  # 1
    agecalc(as.IDate("2000-02-29"), as.IDate("2004-02-28"))  # 3
    agecalc(as.IDate("2000-02-29"), as.IDate("2004-02-29"))  # 4
    agecalc(as.IDate("2000-02-29"), as.IDate("2011-03-01"))  # 11

    ## Requires vectorized version:
    d <- data.table(d=as.IDate("2000-01-01") + 0:10000)
    d[ , b1 := as.IDate("2000-08-15")]
    d[ , b2 := as.IDate("2000-02-29")]
    d[ , age1_num := (d - b1) / 365]
    d[ , age2_num := (d - b2) / 365]
    d[ , age1 := agecalc(b1, d)]
    d[ , age2 := agecalc(b2, d)]
    d
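The age1_num and age2_num columns divide a day count by 365, which drifts away from the calendar age as leap days accumulate. A small base-R sketch of that drift for one of the test dates above (the variable names are only for illustration):

```r
born  <- as.Date("2000-02-29")
today <- as.Date("2004-02-28")      # the day before the fourth birthday

days <- as.numeric(today - born)    # 1460 days elapsed
days / 365                          # exactly 4, although the calendar age is still 3
floor(days / 365)                   # 4: one year too many
```

This is why the calendar comparison of month and day is used instead of a fixed year length.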