Calculate age from a personal identification number when the year of birth has only two digits

I know there are many similar questions, but I am not asking the same thing.

My problem is that all the questions I looked at deal with birth dates that include a four-digit year, e.g. 04/05/1971 (format: %d/%m/%Y).

Birthdays in my data are Danish CPR numbers (personal identification numbers), and they look like this:

    ID
    1901912222
    0110841111
    0404143333
    1602032444

NB: These IDs are examples. I have thousands of rows covering people of all ages, including some over 100 (though most are 17 or younger).

1st and 2nd digit: day of birth; 3rd and 4th digit: month of birth; 5th and 6th digit: year of birth; last four digits: serial number.

So this gives me birthdays (and ages):

    ID          birthdate  age
    1901912222  19/01/91    26
    0110841111  01/10/84    33
    0404143333  04/04/14   103
    1602024444  16/02/02    15

Thus, the format is: %d%m%y followed by a 4-digit serial number.

The last four digits (the serial number) also carry some information. They tell you whether a person is, say, 3 or 103 years old, which I cannot tell from the two-digit year alone. See the image for a description:

[Image: year of birth and serial number]
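To make the ambiguity concrete, here is a minimal sketch in R. It assumes the IDs from the table above are stored as character strings. Parsing the first six digits with `%d%m%y` alone lets R pick the century (00 to 68 become 20xx, 69 to 99 become 19xx), so the person born in 1914 is read as born in 2014. The names `ids` and `naive` are only illustrative.

```r
# Example IDs from the table above, kept as strings so leading zeros survive
ids <- c("1901912222", "0110841111", "0404143333", "1602024444")

# Naive parse: ignore the serial number and let %y guess the century
naive <- as.Date(substr(ids, 1, 6), format = "%d%m%y")

format(naive)
# "1991-01-19" "1984-10-01" "2014-04-04" "2002-02-16"
# The third ID belongs to someone born in 1914, not 2014.
```

This is why the seventh digit has to be consulted before building the date.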

I don't know if this helps, but I have Excel code:

    =YEAR(NOW())-1-IF(DATE(YEAR(NOW());MID(D12;3;2);LEFT(D12;2))<=NOW();
      MID(D12;5;2)+IF(LEFT(RIGHT(D12;4);1)*1<=3;1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1<=36);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1>=37);1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1<=57);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1>=58);1800;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=9;MID(D12;5;2)*1<=36);2000+MID(D12;5;2);1900))))))-1;
      MID(D12;5;2)+IF(LEFT(RIGHT(D12;4);1)*1<=3;1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1<=36);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=4;MID(D12;5;2)*1>=37);1900;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1<=57);2000;
        IF(AND(LEFT(RIGHT(D12;4);1)*1>=5;LEFT(RIGHT(D12;4);1)*1<=8;MID(D12;5;2)*1>=58);1800;
        IF(AND(LEFT(RIGHT(D12;4);1)*1=9;MID(D12;5;2)*1<=36);2000+MID(D12;5;2);1900)))))))

I really hope you can help me with this problem!

1 answer

The hard part is extracting the actual date of birth from the identifier. The following function does this with three lookup vectors that give the century prefix ("18", "19" or "20") for each value of the seventh digit; which vector is used depends on whether the two-digit year is 00-36, 37-57 or 58-99. It returns dates in the standard "yyyy-mm-dd" format:

    # Century prefix by 7th digit (0-9), one vector per range of the 2-digit year
    A <- c(rep("19",4),rep("20",6))          # years 00-36
    B <- c(rep("19",5),rep("20",4),"19")     # years 37-57
    C <- c(rep("19",5),rep("18",4),"19")     # years 58-99

    birthday <- function(code){
      day <- substr(code,1,2)
      month <- substr(code,3,4)
      year <- substr(code,5,6)
      snum <- 1+as.numeric(substr(code,7,7))  # 1-based index into A, B or C
      prefix <- ifelse(as.numeric(year) <= 36,A[snum],
                       ifelse(as.numeric(year)<=57,B[snum],C[snum]))
      year <- paste0(prefix,year)
      paste(year,month,day,sep = "-")
    }

For instance:

    df <- data.frame(ID = c("1901912222","0110841111","0404143333","1602024444"))
    df$BD <- birthday(df$ID)

Yielding:

              ID         BD
    1 1901912222 1991-01-19
    2 0110841111 1984-10-01
    3 0404143333 1914-04-04
    4 1602024444 2002-02-16

Once you have the birth date in a standard format with a four-digit year, it is straightforward to calculate, for example, the age. See this question.
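For completeness, here is one possible way to get the age in whole years with base R only. It is a sketch, not the method from the linked question. The function name `age_years` is hypothetical, and the reference date 2017-12-01 is an assumption chosen so that the results match the ages listed in the question.

```r
# Age in whole years on a reference date (defaults to today).
# `birth` and `ref` may be Date objects or "yyyy-mm-dd" strings.
age_years <- function(birth, ref = Sys.Date()) {
  birth <- as.POSIXlt(as.Date(birth))
  ref   <- as.POSIXlt(as.Date(ref))
  age <- ref$year - birth$year
  # Subtract one year if the birthday has not yet occurred in the reference year
  not_yet <- (ref$mon < birth$mon) |
             (ref$mon == birth$mon & ref$mday < birth$mday)
  age - as.integer(not_yet)
}

bd <- c("1991-01-19", "1984-10-01", "1914-04-04", "2002-02-16")
ages <- age_years(bd, ref = "2017-12-01")
ages
# 26 33 103 15
```

Comparing month and day directly, rather than dividing a day difference by 365.25, avoids off-by-one results around birthdays.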


Source: https://habr.com/ru/post/1274379/
