Calculate age from a personal ID number when the year of birth has only two digits

Question

I know there are many questions similar to this, but I am not asking the same thing!

My problem is that all the questions I have looked at have birthdays with a four-digit year, e.g. 04/05/1971 (format: %d/%m/%Y).

Birthdays in my data are Danish CPR numbers (personal identification numbers), and they look like this:

    ID
    1901912222
    0110841111
    0404143333
    1602024444

NB: These IDs are examples. I have thousands of rows, covering people of all ages, including some over 100 (though most are no older than 17).

The digits mean:

1st and 2nd digits: day of birth
3rd and 4th digits: month of birth
5th and 6th digits: year of birth
Last four digits: serial number

So this gives me birthdays (and ages):

    ID          birthdate  age
    1901912222  19/01/91    26
    0110841111  01/10/84    33
    0404143333  04/04/14   103
    1602024444  16/02/02    15

Thus, the format is: %d%m%y followed by a 4-digit serial number.

The last four digits (the serial number) also carry information: together with the two-digit year, they determine the century of birth, i.e. whether a person is 3 or 103 years old (which I cannot tell from the two-digit year alone). See image for description:

I don't know if this helps, but I have Excel code:

    =YEAR(NOW())-1-IF(DATE(YEAR(NOW());MID(D12;3;2);LEFT(D12;2))<=NOW();
      MID(D12;5;2)+IF(LEFT(RIGHT(D12;4);1)*1<=3;1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1<=36);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1>=37);1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1<=57);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1>=58);1800;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=9;MID(D12;5;2)*1<=36);2000+MID(D12;5;2);1900))))))-1;
      MID(D12;5;2)+IF(LEFT(RIGHT(D12;4);1)*1<=3;1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1<=36);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1>=37);1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1<=57);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1>=58);1800;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=9;MID(D12;5;2)*1<=36);2000+MID(D12;5;2);1900)))))))

I really hope you can help me with this problem!

r date-of-birth

Louise sørensen Dec 29 '17 at 12:01

1 answer

John coleman · Accepted Answer · 2017-12-29T13:09:10+0000

The hard part is extracting the actual date of birth from the identifier. The following function does this with three lookup vectors that give the century prefix ("18", "19" or "20") for each possible first digit of the serial number. Which vector is used depends on whether the two-digit year is 00-36, 37-57 or 58-99. It returns dates as strings in the standard format "yyyy-mm-dd":

    # Century prefixes indexed by the first serial digit (0-9, shifted to 1-10)
    A <- c(rep("19", 4), rep("20", 6))        # two-digit year 00-36
    B <- c(rep("19", 5), rep("20", 4), "19")  # two-digit year 37-57
    C <- c(rep("19", 5), rep("18", 4), "19")  # two-digit year 58-99

    birthday <- function(code) {
      day   <- substr(code, 1, 2)
      month <- substr(code, 3, 4)
      year  <- substr(code, 5, 6)
      snum  <- 1 + as.numeric(substr(code, 7, 7))  # R vectors are 1-based
      prefix <- ifelse(as.numeric(year) <= 36, A[snum],
                       ifelse(as.numeric(year) <= 57, B[snum], C[snum]))
      year <- paste0(prefix, year)
      paste(year, month, day, sep = "-")
    }

For instance:

    df <- data.frame(ID = c("1901912222", "0110841111", "0404143333", "1602024444"))
    df$BD <- birthday(df$ID)

Yielding:

              ID         BD
    1 1901912222 1991-01-19
    2 0110841111 1984-10-01
    3 0404143333 1914-04-04
    4 1602024444 2002-02-16

Once you have the date of birth with a four-digit year, calculating the age is straightforward. See this question.
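For completeness, here is a minimal base-R sketch of that last step. It is not taken from the linked question; `age_on` is a hypothetical helper name, and it assumes the dates of birth are valid "yyyy-mm-dd" strings. It counts whole years and subtracts one if the birthday has not yet occurred in the reference year:

```r
# Hypothetical helper (an illustration, not part of the linked question):
# age in whole years on a reference date, for dates given as "yyyy-mm-dd".
age_on <- function(bd, ref = Sys.Date()) {
  bd  <- as.POSIXlt(as.Date(bd))
  ref <- as.POSIXlt(as.Date(ref))
  age <- ref$year - bd$year
  # Subtract one year if the birthday has not yet occurred in the reference year
  not_yet <- (ref$mon < bd$mon) | (ref$mon == bd$mon & ref$mday < bd$mday)
  age - as.integer(not_yet)
}

bd <- c("1991-01-19", "1984-10-01", "1914-04-04", "2002-02-16")
age_on(bd, ref = "2017-12-29")
```

With the reference date fixed to the day the question was asked, this gives 26, 33, 103 and 15, matching the ages in the question. Leaving `ref` at its default computes the age as of today.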
