R lubridate returns an unwanted century when given a two-digit year

In R, I have a vector of strings representing dates in two different formats:

  • "month / day / year"
  • "month day, year"

The first format has a two-digit year, so my vector looks something like this:

c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979",...)

I want to convert the dates in the vector to a standard format. This should be easy using the function mdy from the package lubridate, except that when I pass it the first format, it returns an unwanted century.

mdy("3/18/75") returns "2075-03-18 UTC"

Does anyone know how to make it return a date in the 20th century, that is, "1975-03-18 UTC"? Any other suggestions on how to standardize the dates would also be greatly appreciated.

I am running lubridate version 1.3.3, if that matters.

4 answers

Lubridate v1.7.1 does not have this problem.


lubridate v1.7.4 does have this problem. I am looking at a parsed year of 2068 as we speak.
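
The two reports above need not conflict. As far as I know, newer lubridate releases follow the POSIX convention for two-digit years (00 to 68 become 20xx, 69 to 99 become 19xx), which would explain why "75" parses as 1975 while "68" still parses as 2068. A minimal sketch of moving that cutoff, assuming a lubridate version in which parse_date_time2() accepts the cutoff_2000 argument (this is not available in 1.3.3, and the cutoff value 20 is only an illustrative choice):

```r
library(lubridate)

# Numeric dates only: parse_date_time2() is a parser for numeric orders,
# so month names such as "March" are not handled here.
numeric_dates <- c("3/18/75", "10/19/15")

# Two-digit years <= 20 are read as 20xx, everything above as 19xx.
# The value 20 is an assumption chosen for this example.
parsed <- parse_date_time2(numeric_dates, orders = "mdy", cutoff_2000 = 20L)
year(parsed)
```

The cutoff has to be picked from knowledge of the data; no single value is right for every vector.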


One option is to subtract 100 years from any parsed date that falls in the future:

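library(lubridate)
some_dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979")
dates <- mdy(some_dates)
# Flag dates whose year is later than the current year
future_dates <- year(dates) > year(Sys.Date())
# Move the flagged dates back one century
year(dates[future_dates]) <- year(dates[future_dates]) - 100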

Here a date parsed as 2075 counts as a future date, so it is moved back to 1975 ;)

library(stringr)
some_dates <- c('3/18/75', '01/09/53')
str_replace(some_dates, '[0-9]+$', '19\\0')

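For the mixed-format vector from the question, anchor the pattern on the slash so that four-digit years are left alone: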

some_dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979")
str_replace(some_dates, '/([0-9]{2}$)', '/19\\1')
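
To show the string fix and the parsing together, here is a minimal end-to-end sketch. It uses base R sub() in place of str_replace() purely to avoid the extra dependency, and it assumes that every two-digit year in the data belongs to the 1900s and that month names are in English:

```r
library(lubridate)

some_dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979")

# Prefix "19" only when the string ends in a slash plus exactly two digits;
# entries such as "March 10, 1994" do not match and pass through unchanged.
expanded <- sub("/([0-9]{2})$", "/19\\1", some_dates)

# mdy() guesses the format of each element, so both layouts parse in one call.
parsed <- mdy(expanded)
format(parsed)
```

Because the century is written into the string before parsing, the result should not depend on how a given lubridate version treats two-digit years.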

Another option is a helper function with a configurable threshold year:

library(lubridate)
dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979", "10/19/15")

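# Works on a single date: if the two-digit part of the year is above the
# threshold's two-digit part, place the date in the 1900s.
adjustCentury <- function(d, threshold=1930){
  y <- year(d) %% 100
  if(y > threshold %% 100) year(d) <- 1900 + y
  d
}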

lapply(lapply(dates, mdy), adjustCentury)

This gives:

[[1]]
[1] "1975-03-18 UTC"

[[2]]
[1] "1994-03-10 UTC"

[[3]]
[1] "1980-10-01 UTC"

[[4]]
[1] "1979-06-15 UTC"

[[5]]
[1] "2015-10-19 UTC"
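
The helper above handles one date at a time, which is why it is wrapped in lapply() and returns a list. A vectorized variant might look like the sketch below; adjust_century_vec is a hypothetical name introduced here, and the logic is meant to mirror adjustCentury, not to replace it authoritatively:

```r
library(lubridate)

# Hypothetical vectorized counterpart of adjustCentury().
adjust_century_vec <- function(d, threshold = 1930) {
  y <- year(d) %% 100                 # two-digit part of each year
  needs_fix <- y > threshold %% 100   # TRUE where the date belongs in the 1900s
  # Replace the year only where needed; keep the parsed year elsewhere.
  year(d) <- ifelse(needs_fix, 1900 + y, year(d))
  d
}

dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979", "10/19/15")
adjusted <- adjust_century_vec(mdy(dates))
format(adjusted)
```

Like the original helper, this looks only at the last two digits of the year, so a date that was genuinely written with a four-digit year such as 2079 would also be moved to 1979.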

Source: https://habr.com/ru/post/1612307/

