Computing data.table columns from calculations stored in a variable

I am trying to compute columns in a data.table where the calculations are passed in through a variable. The following produces the result I am after:

dt <- data.table(mpg)
dt[, list(manufacturer, model, mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)]

but I want mpg_cyl_cty=cty/cyl and mpg_cyl_hwy=hwy/cyl to come from a variable:

var <- c('mpg_cyl_cty=cty/cyl', 'mpg_cyl_hwy=hwy/cyl')
dt[, list(manufacturer, model, var)]

I assume there are further problems with this, such as which type var should have (c or list) and how it should be passed to dt, through list or c.

I hope someone has a suggestion, since I have not found anything on the web.

2 answers
library(ggplot2)
library(data.table)

dt <- data.table(mpg)
# The original calculation
dt1 <- dt[, list(manufacturer, model, mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)]

var <- c('mpg_cyl_cty=cty/cyl', 'mpg_cyl_hwy=hwy/cyl')
# create a string to pass for evaluation
expr <- paste0("`:=`(", paste0(var, collapse = ", "), ")")

dt2 <- dt[, 
          .(manufacturer, model, cty, cyl, hwy)
         ][, eval(parse(text = expr))        # evaluate the expression
         ][, c("cty", "cyl", "hwy") := NULL] # delete unnecessary columns

all.equal(dt1, dt2)
#[1] TRUE
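
To make the string-building step above easier to follow, here is a minimal sketch on a small made-up table (the table `toy` and its values are assumptions for illustration, not part of the original data). It shows the string that gets parsed and applies it to a copy, so the input table is left unchanged.

```r
library(data.table)

# Hypothetical toy data standing in for ggplot2::mpg
toy <- data.table(model = c("a", "b"),
                  cty = c(18, 21), hwy = c(28, 30), cyl = c(4, 6))

var <- c("mpg_cyl_cty=cty/cyl", "mpg_cyl_hwy=hwy/cyl")

# Glue the pieces into one call to the functional form of `:=`
expr <- paste0("`:=`(", paste0(var, collapse = ", "), ")")
# expr is now the string: `:=`(mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)

# `:=` modifies by reference, so work on a copy to keep `toy` intact
out <- copy(toy)[, eval(parse(text = expr))]
```

Note that eval(parse(text = ...)) runs whatever code the strings contain, so this pattern is probably best kept to strings you control yourself.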

A slightly different approach that avoids eval(parse(.)) and works with language objects instead.
Instead of c('mpg_cyl_cty=cty/cyl', 'mpg_cyl_hwy=hwy/cyl'), it needs only c("cty", "hwy").

library(data.table)
dt = as.data.table(ggplot2::mpg)
r.expected = dt[, list(manufacturer, model, mpg_cyl_cty=cty/cyl, mpg_cyl_hwy=hwy/cyl)]

cyl.ratio.j = function(var){
    substitute(lhs := rhs, list(
        lhs = as.name(paste0("mpg_cyl_", var)),
        rhs = call("/", as.name(var), as.name("cyl"))
    ))
}

r = dt[, eval(cyl.ratio.j("cty"))
       ][, eval(cyl.ratio.j("hwy"))
         ][, .SD, .SDcols = c("manufacturer", "model", paste0("mpg_cyl_", c("cty","hwy")))]

all.equal(r.expected, r)
#[1] TRUE
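
The same language-object idea can also be used to build one list(...) call for all variables at once, instead of chaining one `:=` per variable. The following is only a sketch under assumptions: the table `toy`, the helper `ratio_call`, and the variable `j_call` are hypothetical names introduced here, not part of the answer above.

```r
library(data.table)

# Hypothetical toy data standing in for ggplot2::mpg
toy <- data.table(model = c("a", "b"),
                  cty = c(18, 21), hwy = c(28, 30), cyl = c(4, 6))

# Build the unevaluated call `<var> / cyl` for one column name
ratio_call <- function(var) call("/", as.name(var), as.name("cyl"))

vars <- c("cty", "hwy")
rhs <- lapply(vars, ratio_call)
names(rhs) <- paste0("mpg_cyl_", vars)

# Assemble list(model = model, mpg_cyl_cty = cty/cyl, mpg_cyl_hwy = hwy/cyl)
j_call <- as.call(c(list(as.name("list"), model = as.name("model")), rhs))

# Evaluate the call as the j argument; `toy` itself is not modified
res <- toy[, eval(j_call)]
```

Because this builds a plain list(...) call rather than `:=`, the result is a new table and no cleanup of helper columns is needed.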

Source: https://habr.com/ru/post/1622204/

