I have a data table. For instance:
   Sim j active cost
1:   1 1      1  100
2:   1 2      1  125
3:   1 3      0  200
4:   1 4      1  250
5:   2 1      1  100
6:   2 2      0   50
7:   2 3      0  125
8:   2 4      1  200
library(data.table)

# Example data: two simulations (Sim), four steps (j) each
dt <- data.table(Sim    = c(1, 1, 1, 1, 2, 2, 2, 2),
                 j      = c(1, 2, 3, 4, 1, 2, 3, 4),
                 active = c(1, 1, 0, 1, 1, 0, 0, 1),
                 cost   = c(100, 125, 200, 250, 100, 50, 125, 200))
I want to add a column "incr_cost" that, for each row i, equals the cost in row i minus the cost in another row, which I will call row k. Row k must meet the following conditions:
- Sim_k = Sim_i
- active_k = 1
- j_k < j_i
- row k has the largest j of all rows satisfying the three conditions above
For rows where j = 1 (so no earlier row exists), incr_cost can just be NA.
In my example, the solution would look like this:
   Sim j active cost incr_cost
1:   1 1      1  100        NA
2:   1 2      1  125        25
3:   1 3      0  200        75
4:   1 4      1  250       125
5:   2 1      1  100        NA
6:   2 2      0   50       -50
7:   2 3      0  125        25
8:   2 4      1  200       100
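To pin down the rule, the four conditions can be written out literally, one row at a time, in base R. This is only a sketch to check the definition: the function name incr_cost_ref is made up for illustration, and the loop compares every row against every other row, so it would be far too slow for a large table.

```r
# Row-by-row reference for the rule (base R only).
# incr_cost_ref is a hypothetical helper name, not part of any package.
incr_cost_ref <- function(Sim, j, active, cost) {
  out <- rep(NA_real_, length(cost))
  for (i in seq_along(cost)) {
    # Candidate rows k: same Sim, active, and strictly smaller j
    k <- which(Sim == Sim[i] & active == 1 & j < j[i])
    if (length(k) > 0) {
      k_best <- k[which.max(j[k])]        # candidate with the largest j
      out[i] <- cost[i] - cost[k_best]    # cost_i minus cost_k
    }
    # If no candidate exists, out[i] stays NA
  }
  out
}

res_ref <- incr_cost_ref(
  Sim    = c(1, 1, 1, 1, 2, 2, 2, 2),
  j      = c(1, 2, 3, 4, 1, 2, 3, 4),
  active = c(1, 1, 0, 1, 1, 0, 0, 1),
  cost   = c(100, 125, 200, 250, 100, 50, 125, 200)
)
print(res_ref)
```

On the example data this gives the same values as the incr_cost column shown above.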
, , , "" data.table, , , , , . , j, ( ).
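Here is my attempt so far. It only subtracts the cost of the immediately preceding row and ignores the conditions on row k: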
dt[, incr_cost := cost - shift(cost, fill=NA), by=Sim]
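One direction that might respect the active condition is to blank out the cost of inactive rows, carry the last active cost forward within each Sim, and only then lag by one row. The sketch below assumes that j is unique within each Sim, that it is acceptable to sort the table by Sim and j (setorder reorders the rows by reference), and that the installed data.table version provides nafill (1.12.4 or later). The intermediate name ref is mine.

```r
library(data.table)

dt <- data.table(Sim    = c(1, 1, 1, 1, 2, 2, 2, 2),
                 j      = c(1, 2, 3, 4, 1, 2, 3, 4),
                 active = c(1, 1, 0, 1, 1, 0, 0, 1),
                 cost   = c(100, 125, 200, 250, 100, 50, 125, 200))

# Sort so that "earlier row" means "smaller j" within each Sim
setorder(dt, Sim, j)

dt[, incr_cost := {
  ref <- ifelse(active == 1, cost, NA_real_)  # keep the cost of active rows only
  ref <- nafill(ref, type = "locf")           # carry the last active cost forward
  cost - shift(ref)                           # lag by one so that j_k < j_i strictly
}, by = Sim]

print(dt)
```

Because everything happens in one grouped pass, this should scale much better than a row-by-row loop, though I have not benchmarked it on large inputs.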
r data.table, non-data.table. !