Calculating differences between consecutive values or with the last non-NA value in a vector in R

I am looking for an R function that calculates the difference between each value in a vector and the previous value or, when the previous value is NA, the last non-NA value. Here is an example:

visit <- c(1,2,3,4)
time <- c(5,10,NA,15)
df <- data.frame(visit ,time)

We are looking for time since the last visit.

Using diff, we get a vector of length 3:

diff <- diff(df$time, lag = 1, differences = 1)

5 NA NA

Required diff vector:

 5 NA 5

Ideally, the result would have the same length as the original vector, so that it could be added to the data frame 'df':

  visit | time | diff
    1   |  5   |  NA
    2   |  10  |  5
    3   |  NA  |  NA
    4   |  15  |  5
2 answers

Here is one way using only base R operations:

First, compute the differences between the non-NA values by dropping the NAs:

> cdiffs = diff(df$time[!is.na(df$time)])

Now work out where those differences belong. They go at the positions of the non-NA values, except the first one:

> cplace = which(!is.na(df$time))[-1]

Then create a column of NAs and fill in the differences at those positions:

> df$diffs = NA
> df$diffs[cplace] = cdiffs
> df
  visit time diffs
1     1    5    NA
2     2   10     5
3     3   NA    NA
4     4   15     5
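
The three steps above can be wrapped in a small helper so that the result always has the same length as the input. This is only a sketch: the name diff_since_last is made up for illustration, and it assumes a numeric vector.

```r
# Hypothetical helper (the name is an assumption, not part of base R):
# difference between each non-NA value and the previous non-NA value.
diff_since_last <- function(x) {
  out <- rep(NA_real_, length(x))   # result has the same length as x
  ok <- which(!is.na(x))            # positions of the non-NA values
  if (length(ok) > 1) {
    out[ok[-1]] <- diff(x[ok])      # every non-NA position except the first
  }
  out
}

df <- data.frame(visit = c(1, 2, 3, 4), time = c(5, 10, NA, 15))
df$diffs <- diff_since_last(df$time)
df
```

A vector with fewer than two non-NA values simply comes back as all NA.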

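Another option, using lag and na.locf:

lag shifts the values by one position, and na.locf fills each NA with the last non-NA value, so subtracting the result from time gives the difference from the last non-NA value: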

library(zoo)     #for na.locf function
library(dplyr)   #for lag function, (had issues with base lag function)

# the data frame from the question is named df (R is case-sensitive)
df$newDiff = df$time - na.locf(dplyr::lag(df$time), na.rm = FALSE)

df
#  visit time newDiff
#1     1    5      NA
#2     2   10       5
#3     3   NA      NA
#4     4   15       5
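
The same idea can be tried without zoo or dplyr. A minimal sketch, assuming base R only; locf is a hypothetical helper name, not a function from any package, and it is meant as a stand-in for na.locf(x, na.rm = FALSE) on a plain vector:

```r
# Hypothetical base R stand-in for zoo::na.locf(x, na.rm = FALSE):
# carry the last non-NA value forward, keeping leading NAs.
locf <- function(x) {
  # index of the most recent non-NA value seen so far (0 if none yet)
  idx <- cummax(seq_along(x) * !is.na(x))
  c(NA, x)[idx + 1]   # index 0 maps to the prepended NA
}

time <- c(5, 10, NA, 15)
previous <- c(NA, head(time, -1))   # shift by one position, like dplyr::lag
time - locf(previous)
# NA 5 NA 5
```

Prepending an NA to the vector lets index 0 (no non-NA value seen yet) resolve to NA without a special case.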

Source: https://habr.com/ru/post/1670443/

