I have a data frame with an id, a time value that gives the order, and a value. For each id group, I would like to delete the rows whose value is smaller than (or, as for id a below, equal to) the value of a row with a smaller time value.
data <- data.frame(id = c(rep(c("a", "b"), each = 3L), "b"),
                   time = c(0, 1, 2, 0, 1, 2, 3),
                   value = c(1, 1, 2, 3, 1, 2, 4))
> data
  id time value
1  a    0     1
2  a    1     1
3  a    2     2
4  b    0     3
5  b    1     1
6  b    2     2
7  b    3     4
The expected result would be:
> data
  id time value
1  a    0     1
2  a    2     2
3  b    0     3
4  b    3     4
(For id == "b", the rows where time %in% c(1, 2) are deleted, because their value is smaller than that of a row with a lower time.)
I thought of using lag:
library(dplyr)

data %>%
  group_by(id) %>%
  filter(time == 0 | lag(value, order_by = time) < value)
Source: local data frame [5 x 3]
Groups: id [2]
      id  time value
  <fctr> <dbl> <dbl>
1      a     0     1
2      a     2     2
3      b     0     3
4      b     2     2
5      b     3     4
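But this does not work as expected. Because lag is vectorized, each row is compared with the previous row of the original data, not with the previous row that was kept: the row with id b and time 2 survives because 2 > 1, even though the earlier value 3 is larger. So the idea would be to use some kind of "recursive lag", or to compare each value against the maximum seen so far. I could do this with a loop, but I'm sure there is a simpler, higher-level way.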
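To make the "maximum seen so far" idea concrete, here is a minimal sketch using cummax from base R together with dplyr. It rests on two assumptions of mine: rows are sorted by time within each id before comparing, and ties are dropped (strict >), which is what the expected result above implies for id a. I have only checked it against the toy data in this post.

```r
library(dplyr)

data <- data.frame(id = c(rep(c("a", "b"), each = 3L), "b"),
                   time = c(0, 1, 2, 0, 1, 2, 3),
                   value = c(1, 1, 2, 3, 1, 2, 4))

result <- data %>%
  arrange(id, time) %>%   # assumption: compare in time order within each id
  group_by(id) %>%
  # lag(cummax(value)) is the largest value among all earlier rows of the
  # group; it is NA for the first row, which row_number() == 1 keeps anyway
  filter(row_number() == 1 | value > lag(cummax(value))) %>%
  ungroup()

print(result)
```

On the example data this keeps times 0 and 2 for id a and times 0 and 3 for id b. If ties should be kept instead, >= in place of > would be the change to try.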
Any help would be appreciated, thanks!