I have a data.table similar to this:
library(data.table)
mydt <- data.table(id = LETTERS[1:6], x = 1:6, y = 2:3)
> mydt
id x y
1: A 1 2
2: B 2 3
3: C 3 2
4: D 4 3
5: E 5 2
6: F 6 3
I would like to replace the value columns by adding the lag and the lead to each observation (i.e. x[i-1] + x[i] + x[i+1]). I can do something similar with the awesome shift() function:
cols <- c('x', 'y')
mydt[
  ,
  (cols) := shift(.SD, 1) + .SD + shift(.SD, 1, type = 'lead'),
  .SDcols = cols
][]
id x y
1: A NA NA
2: B 6 7
3: C 9 8
4: D 12 7
5: E 15 8
6: F NA NA
But this introduces NA for rows where there is no lead / lag value. How can I change the calculation so that it uses only the two available values for these rows (something like na.rm = TRUE)? The desired output is:
id x y
1: A 3 5
2: B 6 7
3: C 9 8
4: D 12 7
5: E 15 8
6: F 11 5
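For reference, the target table above can be reproduced on a fresh copy of the data by padding the missing neighbours with 0 via the `fill` argument of `shift()`. This is only a sketch: `sum3` is a hypothetical helper name, and padding with 0 is not the same as `na.rm = TRUE`, because an NA that is already in the data would still propagate.

```r
library(data.table)

# Rebuild the example table so the sketch does not depend on earlier := calls
dt <- data.table(id = LETTERS[1:6], x = 1:6, y = rep(2:3, 3))
cols <- c("x", "y")

# Hypothetical helper: lag + current + lead, with missing neighbours padded by 0
sum3 <- function(v) {
  shift(v, 1, fill = 0L) + v + shift(v, 1, type = "lead", fill = 0L)
}

# Apply the helper column by column and overwrite x and y by reference
dt[, (cols) := lapply(.SD, sum3), .SDcols = cols]
print(dt)
```

Rows A and F then use only the two values that exist, while the interior rows are unchanged from the NA-producing version.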
I tried to use sum(..., na.rm = TRUE) instead of the + operator, but it gives an error:
Error in sum(shift(.SD, 1), .SD, shift(.SD, 1, type = "lead"), na.rm = TRUE) :
  invalid 'type' (list) of argument
I also tried the following, but it evidently gives a different result: every row gets the same value.
mydt[
  ,
  (cols) := lapply(
    .SD,
    function(x) sum(shift(x, 1), x, shift(x, 1, type = 'lead'), na.rm = TRUE)
  ),
  .SDcols = cols
][]
id x y
1: A 126 90
2: B 126 90
3: C 126 90
4: D 126 90
5: E 126 90
6: F 126 90
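A likely explanation for the constant 126 and 90, sketched below: `sum()` is not element-wise, so it collapses all three vectors into a single number that `:=` then recycles down the column. The specific values also suggest the call ran on the table already overwritten by the first `:=` attempt, since `:=` modifies by reference. `total3` is a hypothetical helper name, and `rep(2:3, 3)` just spells out the recycling of `y` explicitly.

```r
library(data.table)

# Hypothetical helper mirroring the anonymous function in the attempt above
total3 <- function(v) {
  sum(shift(v, 1), v, shift(v, 1, type = "lead"), na.rm = TRUE)
}

dt <- data.table(id = LETTERS[1:6], x = 1:6, y = rep(2:3, 3))
cols <- c("x", "y")

# On the untouched data the grand total is one scalar per column
fresh_x <- total3(dt$x)  # 15 + 21 + 20 = 56

# Re-run the first attempt; := overwrites x and y in place (NA in rows A and F)
dt[, (cols) := lapply(.SD, function(v) shift(v, 1) + v + shift(v, 1, type = "lead")),
   .SDcols = cols]

# The same helper on the modified columns reproduces the numbers shown above
after_x <- total3(dt$x)  # 42 + 42 + 42 = 126
after_y <- total3(dt$y)  # 30 + 30 + 30 = 90
```

Working on `copy(mydt)` or rebuilding the table between attempts would keep the experiments from interfering with each other.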