Identify the records in which this sequence of events occurs over x days

I have a big data.table, similar in structure to df:

library("data.table")
df <- data.frame(part = c("A", "B", "A", "C", "A", "D", "B", "D", "E"), 
                 day = c(1, 2, 3, 4, 5, 6, 6, 7, 15), 
                 code = c("S", "S", "P", "X", "P", "S", "P", "P", "P"))
setDT(df)
df
   part day code
1:    A   1    S
2:    B   2    S
3:    A   3    P
4:    C   4    X
5:    A   5    P
6:    D   6    S
7:    B   6    P
8:    D   7    P
9:    E  15    P

How can I add a column that flags the records where code = "S" and the same part has code = "P" within the next 3 days? Expected result:

   part day code  flag
1:    A   1    S  TRUE
2:    B   2    S FALSE
3:    A   3    P FALSE
4:    C   4    X FALSE
5:    A   5    P FALSE
6:    D   6    S  TRUE
7:    B   6    P FALSE
8:    D   7    P FALSE
9:    E  15    P FALSE
2 answers

I think this does it:

df[, v := FALSE ]
df[code == "S", v := !is.na(
  df[code == "P"][df[code == "S"], on=c("part", "day"), roll=-3, which=TRUE]
)]

   part day code     v
1:    A   1    S  TRUE
2:    B   2    S FALSE
3:    A   3    P FALSE
4:    C   4    X FALSE
5:    A   5    P FALSE
6:    D   6    S  TRUE
7:    B   6    P FALSE
8:    D   7    P FALSE
9:    E  15    P FALSE

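How it works: !is.na(x[i, which=TRUE]) tells us, for each row of i, whether it has a match in x. (In a join x[i], the rows of i are looked up in x.) With roll, the last column named in on, here day, is matched by rolling rather than exactly.

Because roll is negative, a later "P" day is rolled backward to the "S" day, by up to 3 days.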
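To see what the inner join returns before it is wrapped in !is.na(), the pieces can be inspected on their own. This is a minimal sketch on the sample data from the question; the names x, i and w are illustrative and not part of the answer.

```r
library(data.table)

df <- data.table(part = c("A", "B", "A", "C", "A", "D", "B", "D", "E"),
                 day  = c(1, 2, 3, 4, 5, 6, 6, 7, 15),
                 code = c("S", "S", "P", "X", "P", "S", "P", "P", "P"))

x <- df[code == "P"]   # lookup table: the "P" records
i <- df[code == "S"]   # query rows: the "S" records

# which = TRUE returns, for each row of i, the row number of the matching
# row in x, or NA when nothing matches. part is matched exactly; day is
# the last column in on, so it is the one that rolls.
w <- x[i, on = c("part", "day"), roll = -3, which = TRUE]
print(w)

# One logical value per "S" record
print(!is.na(w))
```

For the sample data, the "S" records of parts A and D find a "P" record and the one for part B does not, which agrees with the v column shown above.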


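Alternatively, in base R, comparing part and day explicitly rather than relying on row position:

df$v <- df$code == "S" &
  sapply(seq_len(nrow(df)), function(i) {
    # any "P" record for the same part, 1 to 3 days after row i
    any(df$part == df$part[i] & df$code == "P" &
          df$day > df$day[i] & df$day <= df$day[i] + 3)
  })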
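Another option, not taken from either answer, is a non-equi join that states the window explicitly. The sketch below assumes "the next 3 days" means strictly after the "S" day and no more than 3 days later; the names days, lo, hi, s, p and cnt are hypothetical helpers introduced for illustration.

```r
library(data.table)

df <- data.table(part = c("A", "B", "A", "C", "A", "D", "B", "D", "E"),
                 day  = c(1, 2, 3, 4, 5, 6, 6, 7, 15),
                 code = c("S", "S", "P", "X", "P", "S", "P", "P", "P"))

days <- 3  # assumed window length

# One row per "S" record with the bounds of its window
s <- df[code == "S", .(part, lo = day, hi = day + days)]
p <- df[code == "P"]

# Count the "P" records of the same part with lo < day <= hi.
# by = .EACHI yields one count per row of s, 0 when nothing matches.
cnt <- p[s, on = .(part, day > lo, day <= hi), .N, by = .EACHI]$N

df[, flag := FALSE]
df[code == "S", flag := cnt > 0]
print(df)
```

Writing the bounds out makes it easy to change whether the "S" day itself counts, for example by using day >= lo.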

Source: https://habr.com/ru/post/1661757/

