The hard part is extracting the actual date of birth from the identifier. The following function does this with three lookup vectors that hold the century prefix ("18", "19" or "20"). Which vector is used depends on whether the two-digit year is 00-36, 37-57 or 58-99, and the seventh digit of the identifier selects the entry within it. It returns dates in the standard format "yyyy-mm-dd":
    # Century prefixes indexed by the 7th digit (0-9) of the identifier
    A <- c(rep("19",4),rep("20",6))       # two-digit year 00-36
    B <- c(rep("19",5),rep("20",4),"19")  # two-digit year 37-57
    C <- c(rep("19",5),rep("18",4),"19")  # two-digit year 58-99

    birthday <- function(code){
      day <- substr(code,1,2)
      month <- substr(code,3,4)
      year <- substr(code,5,6)
      # 7th digit, shifted by one because R vectors are 1-indexed
      snum <- 1+as.numeric(substr(code,7,7))
      prefix <- ifelse(as.numeric(year) <= 36,A[snum],
                       ifelse(as.numeric(year) <= 57,B[snum],C[snum]))
      year <- paste0(prefix,year)
      paste(year,month,day,sep = "-")
    }
For instance:
    df <- data.frame(ID = c("1901912222","0110841111","0404143333","1602024444"))
    df$BD <- birthday(df$ID)
Yielding:
              ID         BD
    1 1901912222 1991-01-19
    2 0110841111 1984-10-01
    3 0404143333 1914-04-04
    4 1602024444 2002-02-16
Once you have the birthday in the standard format with a four-digit year, further steps such as calculating the age are quite simple. See this question.
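As a rough illustration of the age calculation, here is a minimal base-R sketch. It is not taken from the linked question. The helper name `age_on` and the fixed reference date are assumptions made for this example only. It counts completed years, so someone born on 29 February is treated as having their birthday on 1 March in non-leap years.

```r
# Hypothetical helper (not part of the original answer):
# age in completed years on a reference date.
age_on <- function(bd, ref = Sys.Date()) {
  b <- as.POSIXlt(as.Date(bd, format = "%Y-%m-%d"))
  r <- as.POSIXlt(as.Date(ref))
  age <- r$year - b$year
  # Subtract one year if the birthday has not yet occurred in the reference year
  not_yet <- (r$mon < b$mon) | (r$mon == b$mon & r$mday < b$mday)
  age - as.integer(not_yet)
}

# Birthdays as produced by birthday() in the example above
bd <- c("1991-01-19", "1984-10-01", "1914-04-04", "2002-02-16")
age_on(bd, ref = as.Date("2020-06-30"))
# [1]  29  35 106  18
```

A fixed `ref` keeps the result reproducible. Leaving it out uses today's date via `Sys.Date()`.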