How to calculate deviations from the weighted average in data.table?

I would like to calculate the deviations from the (weighted) group average for many variables in a data.table.

Take this example:

    library(data.table)

    mydt <- data.table(
      id = c(1, 2, 2, 3, 3, 3),
      x = 1:6,
      y = 6:1,
      w = rep(1:2, 3)
    )
    mydt
       id x y w
    1:  1 1 6 1
    2:  2 2 5 2
    3:  2 3 4 1
    4:  3 4 3 2
    5:  3 5 2 1
    6:  3 6 1 2

I can calculate the weighted means of x and y by id as follows:

    mydt[, lapply(as.list(.SD)[c("x", "y")], weighted.mean, w = w), by = id]

(I use the relatively complex as.list(.SD)[...] construct instead of .SDcols because of this error.)

I tried to first create the means for each row, but could not work out how to combine := with lapply().

1 answer

Just adjust the weighted average calculation a bit:

    mydt[, lapply(.SD[, .(x, y)], function(var) var - weighted.mean(var, w = w)), by = id]
       id       x       y
    1:  1  0.0000  0.0000
    2:  2 -0.3333  0.3333
    3:  2  0.6667 -0.6667
    4:  3 -1.0000  1.0000
    5:  3  0.0000  0.0000
    6:  3  1.0000 -1.0000

The solution has been updated to use the notational simplification suggested by @DavidArenburg.
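The question also asks how to combine := with lapply(). A minimal sketch of one way to do that is below. It assumes a recent data.table version in which .SDcols can be used together with a column such as w that lies outside .SDcols (the question reports an error with .SDcols in the version used at the time). The names vars and dev_names and the "_dev" suffix are hypothetical and chosen only for illustration.

```r
library(data.table)

mydt <- data.table(
  id = c(1, 2, 2, 3, 3, 3),
  x  = 1:6,
  y  = 6:1,
  w  = rep(1:2, 3)
)

vars      <- c("x", "y")            # columns to centre (hypothetical helper name)
dev_names <- paste0(vars, "_dev")   # names for the new columns (hypothetical)

# Add the deviation columns by reference: within each id group,
# subtract the w-weighted mean of the column from every value.
mydt[
  ,
  (dev_names) := lapply(.SD, function(v) v - weighted.mean(v, w = w)),
  by = id,
  .SDcols = vars
]
```

Wrapping dev_names in parentheses on the left-hand side makes := treat it as a character vector of column names rather than as a single column called dev_names. Unlike the answer above, this keeps the original x, y and w columns and appends the deviations next to them.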


Source: https://habr.com/ru/post/1237654/

