Create a new variable as a function of other variables

How can I pass columns as arguments to a function and create a new column that is a function of the other two? For example, take this great function for adding months to a date, and this sample data frame:

df <- structure(
  list(
    date = structure(
      c(17135, 17105, 17105, 17074, 17286,
        17317, 17317, 17347, 17105, 17317),
      class = "Date"
    ),
    monthslater = c(10, 11, 13, 14, 3, 3, 3, 3, 4, NA)
  ),
  .Names = c("date", "monthslater"),
  row.names = c(NA, 10L),
  class = "data.frame"
)

I would like to create a new column by passing the columns date and monthslater to the function add.months. I thought something like this would work:

df$newdate <- add.months(df$date, df$monthslater)

But it does not.

Full code for the function:

add.months <- function(date,n) seq(date, by = paste(n, "months"), length = 2)[2]
3 answers

Using %m+% from the lubridate package:

library(lubridate)
df$newdate <- df$date %m+% months(df$monthslater)

gives (this output was produced with a monthslater value of 4, rather than NA, in row 10):

> df
         date monthslater    newdate
1  2016-11-30          10 2017-09-30
2  2016-10-31          11 2017-09-30
3  2016-10-31          13 2017-11-30
4  2016-09-30          14 2017-11-30
5  2017-04-30           3 2017-07-30
6  2017-05-31           3 2017-08-31
7  2017-05-31           3 2017-08-31
8  2017-06-30           3 2017-09-30
9  2016-10-31           4 2017-02-28
10 2017-05-31           4 2017-09-30

Similarly, you can also add days or years:

df$newdate2 <- df$date %m+% days(df$monthslater)
df$newdate3 <- df$date %m+% years(df$monthslater)

which gives:

> df
         date monthslater    newdate   newdate2   newdate3
1  2016-11-30          10 2017-09-30 2016-12-10 2026-11-30
2  2016-10-31          11 2017-09-30 2016-11-11 2027-10-31
3  2016-10-31          13 2017-11-30 2016-11-13 2029-10-31
4  2016-09-30          14 2017-11-30 2016-10-14 2030-09-30
5  2017-04-30           3 2017-07-30 2017-05-03 2020-04-30
6  2017-05-31           3 2017-08-31 2017-06-03 2020-05-31
7  2017-05-31           3 2017-08-31 2017-06-03 2020-05-31
8  2017-06-30           3 2017-09-30 2017-07-03 2020-06-30
9  2016-10-31           4 2017-02-28 2016-11-04 2020-10-31
10 2017-05-31           4 2017-09-30 2017-06-04 2021-05-31

In base R:

df$newdate <- mapply(add.months, df[[1]], df[[2]], SIMPLIFY = FALSE)

> df
         date monthslater    newdate
1  2016-11-30          10 2017-09-30
2  2016-10-31          11 2017-10-01
3  2016-10-31          13 2017-12-01
4  2016-09-30          14 2017-11-30
5  2017-04-30           3 2017-07-30
6  2017-05-31           3 2017-08-31
7  2017-05-31           3 2017-08-31
8  2017-06-30           3 2017-09-30
9  2016-10-31           4 2017-03-03
10 2017-05-31           4 2017-10-01
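
With SIMPLIFY = FALSE the new column is a list of Date objects rather than a Date vector. A minimal sketch of one way to collapse it, assuming a small illustrative data frame without NA values (the three rows below are taken from the sample data, and length.out is simply the full spelling of the length argument used in the question):

```r
# Function from the question: add n months to a single date.
add.months <- function(date, n) seq(date, by = paste(n, "months"), length.out = 2)[2]

# Illustrative subset of the sample data (no NA in monthslater).
df <- data.frame(
  date = as.Date(c("2016-11-30", "2017-04-30", "2017-05-31")),
  monthslater = c(10, 3, 3)
)

# Map() calls add.months once per row and returns a list of Date objects.
res <- Map(add.months, df$date, df$monthslater)

# c() on Date objects keeps the Date class, so the list collapses to a Date vector.
df$newdate <- do.call(c, res)
```

The resulting newdate column has class Date, so it can be used directly in date arithmetic and comparisons.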

For your immediate, specific problem, consider mapply, which passes the two vectors element by element into the function. And since monthslater includes an NA, add tryCatch to the function.

add.months <- function(date, n) {
  tryCatch(seq(date, by = paste(n, "months"), length = 2)[2],
           warning = function(w) return(NA),
           error = function(e) return(NA))
}

df$newdate <- as.Date(mapply(add.months, df$date, df$monthslater), origin="1970-01-01")
df

#          date monthslater    newdate
# 1  2016-11-30          10 2017-09-30
# 2  2016-10-31          11 2017-10-01
# 3  2016-10-31          13 2017-12-01
# 4  2016-09-30          14 2017-11-30
# 5  2017-04-30           3 2017-07-30
# 6  2017-05-31           3 2017-08-31
# 7  2017-05-31           3 2017-08-31
# 8  2017-06-30           3 2017-09-30
# 9  2016-10-31           4 2017-03-03
# 10 2017-05-31          NA       <NA>

Also note the caveat, raised by the function's author, about the end of February: row 9 overshoots by three days and lands on 2017-03-03.
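
If the overshoot is unwanted and lubridate is not available, the day can be clamped to the last day of the target month in base R. The following is only a sketch: add_months_clamped is a hypothetical helper name, not part of any package, and it assumes n is a whole number of months.

```r
# Hypothetical helper: add n months to one date, clamping to the month's last day.
add_months_clamped <- function(date, n) {
  if (is.na(date) || is.na(n)) return(as.Date(NA))   # propagate missing values
  first <- as.Date(format(date, "%Y-%m-01"))         # first day of the start month
  # First day of the target month (safe, because every month has a day 1).
  target <- seq(first, by = paste(n, "months"), length.out = 2)[2]
  # First day of the following month; the day before it is the target month's last day.
  next_first <- seq(target, by = "1 month", length.out = 2)[2]
  last_day <- as.integer(format(next_first - 1, "%d"))
  day <- min(as.integer(format(date, "%d")), last_day)
  as.Date(sprintf("%s-%02d", format(target, "%Y-%m"), day))
}

# Illustrative input: row 9 and row 1 of the sample data, plus an NA case.
dates <- as.Date(c("2016-10-31", "2016-11-30", "2017-05-31"))
n <- c(4, 10, NA)
out <- do.call(c, Map(add_months_clamped, dates, n))
```

Here 2016-10-31 plus 4 months yields 2017-02-28 instead of 2017-03-03, and the NA input yields NA without needing tryCatch.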


Source: https://habr.com/ru/post/1692591/

