Working with rich objects in data.table columns

Question

Let's say I have a data table in which one column contains linear models:

library(data.table)
set.seed(1014)

dt <- data.table(
  g = c(1, 1, 2, 2, 3, 3, 3),
  x = runif(7),
  y = runif(7)
)

models <- dt[, list(mod = list(lm(y ~ x, data = .SD))), by = g]

Now I want to extract the r-squared value from each model. Can I do better than this?

models[, list(rsq = summary(mod[[1]])$r.squared), by = g]

##    g      rsq
## 1: 1 1.000000
## 2: 2 1.000000
## 3: 3 0.004452

Ideally, I would like to remove the [[1]] and not rely on knowing the previous grouping variable (I know that I want each row to be its own group).

r data.table

hadley Apr 9 '14 at 21:12

4 answers

eddi · Answer 1 · 2014-04-10T15:17:23+0000

It's just that summary is a badly behaved function here: it is not vectorized. So how about vectorizing it manually (this is about the same as @mnel's solution):

r.squared = Vectorize(function(x) summary(x)$r.squared)

models[, rsq := r.squared(mod)]
models
#   g  mod         rsq
#1: 1 <lm> 1.000000000
#2: 2 <lm> 1.000000000
#3: 3 <lm> 0.004451631
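
A variation on the same idea, as a minimal sketch rather than part of the answer above: vapply applies a function to each list element like the Vectorize wrapper does, but also checks that every result is a single numeric value. The helper name get_rsq is hypothetical and is introduced only for illustration.

```r
library(data.table)
set.seed(1014)

dt <- data.table(
  g = c(1, 1, 2, 2, 3, 3, 3),
  x = runif(7),
  y = runif(7)
)

# One fitted model per group, stored in a list column
models <- dt[, list(mod = list(lm(y ~ x, data = .SD))), by = g]

# Hypothetical helper: r-squared of a single fitted model
get_rsq <- function(m) summary(m)$r.squared

# vapply fails loudly if any result is not a length-1 numeric
models[, rsq := vapply(mod, get_rsq, numeric(1))]
```

Groups with only two points fit perfectly, so summary may warn that the fit is essentially perfect; the call still returns a value.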

mnel · Answer 2 · 2014-04-10T05:49:04+0000

This might be a case for rapply with classes='lm'. Or simply use sapply:

library(data.table)
set.seed(1014)

dt <- data.table(
  g = c(1, 1, 2, 2, 3, 3, 3),
  x = runif(7),
  y = runif(7)
)

models <- dt[, list(mod = list(lm(y ~ x, data = .SD))), by = g]
models[, rsq := sapply(mod, function(x) summary(x)$r.squared)]

models
#     g  mod         rsq
#  1: 1 <lm> 1.000000000
#  2: 2 <lm> 1.000000000
#  3: 3 <lm> 0.004451631

" " data.table - , .SD .

. lm data.table ? , . # 2590.

David Arenburg · Answer 3 · 2014-04-09T21:38:55+0000

?

library(data.table)
set.seed(1014)

dt <- data.table(
  g = c(1, 1, 2, 2, 3, 3, 3),
  x = runif(7),
  y = runif(7)
)
models <- dt[, list(rsq = summary(lm(y ~ x))$r.squared), by = g]
#   g         rsq
#1: 1 1.000000000
#2: 2 1.000000000
#3: 3 0.004451631

Jan Kislinger · Answer 4 · 2017-03-10T16:01:36+0000

, , .

require(purrr)
require(broom)
map_df(models$mod, glance)
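
For comparison, a rough data.table-only sketch of the same "one row of statistics per model" idea, assuming only r.squared and adj.r.squared are wanted. It is not a replacement for glance, which returns many more columns; the name stats is hypothetical.

```r
library(data.table)
set.seed(1014)

dt <- data.table(
  g = c(1, 1, 2, 2, 3, 3, 3),
  x = runif(7),
  y = runif(7)
)
models <- dt[, list(mod = list(lm(y ~ x, data = .SD))), by = g]

# Build one small list of statistics per model, then stack them row-wise
stats <- rbindlist(lapply(models$mod, function(m) {
  s <- summary(m)
  list(r.squared = s$r.squared, adj.r.squared = s$adj.r.squared)
}))

# Carry the group identifier over; row order matches models
stats[, g := models$g]
```

With two-point groups there are no residual degrees of freedom, so adj.r.squared is not a finite number for those rows.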
