R: data.table, set the first and last value of the group to NA

I would like to set the first and last values in each group to NA. Here is an example:

    library(data.table)
    DT <- data.table(v = rnorm(12), class=rep(1:3, each=4))
    DT[, v[c(1,.N)] := NA , by=class]

But that does not work. How can I do this?

+6
4 answers

For now, the way to do this is to extract the indexes first and then perform one assignment by reference.

    idx = DT[, .(idx = .I[c(1L, .N)]), by=class]$idx
    DT[idx, v := NA]

I will try to add this example to the Reference semantics vignette.
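As a minimal sketch of the approach above, the snippet below shows the row numbers that `.I` yields for the example table before they are used in `i`. The seed is an arbitrary assumption, added only to make the run reproducible; it is not part of the original answer.

```r
library(data.table)

set.seed(1)  # arbitrary seed (assumption), only for reproducibility
DT <- data.table(v = rnorm(12), class = rep(1:3, each = 4))

# .I holds the row numbers of DT; within each group keep the first and last one
idx <- DT[, .(idx = .I[c(1L, .N)]), by = class]$idx
idx  # 1 4 5 8 9 12 for three groups of four rows

# one assignment by reference on exactly those rows
DT[idx, v := NA]
```

Because the subset is expressed through `i`, the assignment touches only the selected rows and leaves the rest of the column untouched.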

+9

It may not be a one-liner, but the code has "first" and "last" in it :)

    > DT <- data.table(v = rnorm(12), class=rep(1:3, each=4))
    > setkey(DT, class)
    > classes = DT[, .(unique(class))]
    > DT[classes, v := NA, mult='first']
    > DT[classes, v := NA, mult='last']
    > DT
              v class
     1:      NA     1
     2: -1.8191     1
     3: -0.6355     1
     4:      NA     1
     5:      NA     2
     6: -1.1771     2
     7: -0.8125     2
     8:      NA     2
     9:      NA     3
    10:  0.2357     3
    11:  0.3416     3
    12:      NA     3
    >

After setkey, the original row order is also preserved within each key group, so the non-key columns keep their relative order. I believe this is a documented feature.

+3

This is easy with a helper function:

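    # return a copy of x with the positions in y set to NA
    set.na = function(x,y) {x[y] = NA; x}
    # returns a new table; use v := set.na(v, c(1,.N)) to update DT by reference
    DT[, set.na(v,c(1,.N)) , by=class]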
+1

The canonical way to modify a subset of the data is to use i to define the subset. You cannot subset with [ on the left-hand side of :=. Either create a temporary i, as @David Arenburg suggested, or build the result vector yourself with the construction c(NA, v[-c(1, .N)], NA).

 DT[, v := c(NA, v[-c(1, .N)], NA)[1:.N], by = class] 

However, note that the row order can change when you, for example, set a new key or call any of a number of other functions. Which rows count as "first" and "last" depends on the current order, so you must be very careful with this operation.
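To illustrate the caveat about row order, here is a small sketch with made-up data. The columns `id` and `w` are hypothetical and exist only to show that the first and last row of each group differ once the table is re-keyed.

```r
library(data.table)

# hypothetical data: 'id' records the original row position, 'w' is another column
DT <- data.table(id    = 1:6,
                 class = c(2L, 1L, 2L, 1L, 2L, 1L),
                 w     = c(3, 1, 2, 6, 5, 4))

# first and last row of each class in the current row order
before <- DT[, .(first = id[1L], last = id[.N]), by = class]

# setting a key physically reorders the rows by 'w'
setkey(DT, w)
after <- DT[, .(first = id[1L], last = id[.N]), by = class]

before  # class 2: first 1, last 5; class 1: first 2, last 6
after   # class 1: first 2, last 4; class 2: first 3, last 5
```

Running the NA assignment before or after `setkey` would therefore blank out different rows, so it helps to fix the intended order explicitly before picking the first and last row per group.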

0

Source: https://habr.com/ru/post/983371/

