How to perform operations on list columns in R data.table to output another list column?

I still find it hard to reason about R data.table columns that are lists.

Here is an R data.table:

library(data.table)
dt = data.table(
      numericcol = rep(42, 8),
      listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
  )
> dt
   numericcol        listcol
1:         42        1,22, 3
2:         42              6
3:         42              1
4:         42             12
5:         42    5,   6,1123
6:         42              3
7:         42             42
8:         42              1

I would like to create a column holding the absolute differences between numericcol and the elements of listcol:

> dt
   numericcol        listcol    absvals 
1:         42        1,22, 3    41, 20, 39
2:         42              6    36
3:         42              1    41
4:         42             12    30
5:         42    5,   6,1123    37, 36, 1081
6:         42              3    39
7:         42             42    0
8:         42              1    41

So my first thought was to use sapply() as follows:

dt[, absvals := sapply(listcol, function(x) abs(x-numericcol))]

This outputs the following:

> dt
   numericcol        listcol absvals
1:         42        1,22, 3      41
2:         42              6      20
3:         42              1      39
4:         42             12      41
5:         42    5,   6,1123      20
6:         42              3      39
7:         42             42      41
8:         42              1      20

So absvals is now a column of unlisted elements, with a single element in each row, and its size does not match the data.table.

(1) How can I create absvals so that it keeps the list structure of listcol?

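(2) Also, if I were only interested in a plain vector of these values, how would I get it with R data.table? Something like this?

vec = as.vector(dt[, absvals := sapply(listcol, function(x) abs(x-numericcol))])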

+4
5

You can use mapply:

dt[, absvals := mapply(listcol, numericcol, FUN = function(x, y) abs(x-y))]

#output
dt
   numericcol        listcol        absvals
1:         42        1,22, 3       41,20,39
2:         42              6             36
3:         42              1             41
4:         42             12             30
5:         42    5,   6,1123   37,  36,1081
6:         42              3             39
7:         42             42              0
8:         42              1             41
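
One caveat, offered as a hedged sketch rather than as part of the answer above: mapply simplifies its result by default, so if every element of listcol happened to have length 1, it would return a plain numeric vector instead of a list. Passing SIMPLIFY = FALSE keeps the list structure regardless. The table dt_single below is a made-up example, not data from the question.

```r
library(data.table)

# Hypothetical table in which every listcol element has length 1
dt_single <- data.table(
  numericcol = rep(42, 3),
  listcol = list(6, 1, 12)
)

# Default SIMPLIFY = TRUE collapses equal-length results into an atomic vector
simplified <- mapply(function(x, y) abs(x - y),
                     dt_single$listcol, dt_single$numericcol)

# SIMPLIFY = FALSE always returns a list, one element per row
dt_single[, absvals := mapply(function(x, y) abs(x - y),
                              listcol, numericcol, SIMPLIFY = FALSE)]

is.list(simplified)         # FALSE: an atomic numeric vector
is.list(dt_single$absvals)  # TRUE: a list column
```

With the data from the question the default happens to return a list because the results have different lengths, so this only matters when the lengths can all coincide.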
+5

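Your approach is close, but it misses one detail. When assigning a list column in a data.table, keep in mind that [.data.table treats a list in j as a list of columns, so you need to pass a list of a list for j to produce a list column.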

For example:

dt[ , abs_vals := list(lapply(seq_along(.I), function(ii) 
  abs(listcol[[ii]] - numericcol[ii])))][]
#    numericcol        listcol       abs_vals
# 1:         42        1,22, 3       41,20,39
# 2:         42              6             36
# 3:         42              1             41
# 4:         42             12             30
# 5:         42    5,   6,1123   37,  36,1081
# 6:         42              3             39
# 7:         42             42              0
# 8:         42              1             41

Here seq_along(.I) gives the row positions 1 to N that lapply iterates over.
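
To illustrate why the outer list() can matter, here is a small sketch on a one-row subset, which is the case where a length-1 list on the right-hand side is easiest to misread as "one column" rather than "one list element". The name one_row is introduced only for this illustration, and the checks assume the wrapping idiom shown above.

```r
library(data.table)

dt <- data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)

# Hypothetical one-row subset: listcol holds a single length-3 vector
one_row <- dt[1]

# The outer list() marks the lapply() result as ONE column that is itself a list
one_row[, abs_vals := list(lapply(seq_along(.I), function(ii)
  abs(listcol[[ii]] - numericcol[ii])))]

is.list(one_row$abs_vals)  # the new column is a list column
lengths(one_row$abs_vals)  # the single row keeps all three differences
```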

+2

We can use Map:

dt[, absvals := Map(function(x, y) abs(x-y), listcol, numericcol)]
dt
#    numericcol        listcol        absvals
#1:         42        1,22, 3       41,20,39
#2:         42              6             36
#3:         42              1             41
#4:         42             12             30
#5:         42    5,   6,1123   37,  36,1081
#6:         42              3             39
#7:         42             42              0
#8:         42              1             41
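
For the second part of the question (a plain vector of the values), a minimal sketch, assuming the list column has been built as above: unlist() flattens it. Note that `:=` returns the data.table itself, so wrapping the assignment in as.vector(), as in the question, does not produce a vector. The names vec and vec_direct are chosen for this illustration.

```r
library(data.table)

dt <- data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)

# Build the list column first, then flatten it into one numeric vector
dt[, absvals := Map(function(x, y) abs(x - y), listcol, numericcol)]
vec <- unlist(dt$absvals)

# Or compute the flat vector directly, without storing a list column
vec_direct <- dt[, abs(rep(numericcol, lengths(listcol)) - unlist(listcol))]

length(vec)  # one value per list element across all rows
```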

Or with map2 from purrr:

dt[, absvals := purrr::map2(listcol, numericcol, ~ abs(.x - .y))]

Or unlist 'listcol', subtract it from 'numericcol' replicated by the lengths of 'listcol', and relist the result using 'listcol' as the skeleton:

dt[, absvals := relist(abs(rep(numericcol, lengths(listcol)) - 
                   unlist(listcol)), skeleton = listcol)]

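Here rep repeats each 'numericcol' value once per element of the corresponding 'listcol' entry, so the subtraction lines up element by element.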
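
The one-liner above can be broken into steps to make the intermediate shapes visible. This is only an unpacked sketch of the same idea; the intermediate names (n_per_row, flat, expanded, diffs) are introduced here for illustration.

```r
library(data.table)

dt <- data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)

n_per_row <- lengths(dt$listcol)           # number of elements in each row's list entry
flat      <- unlist(dt$listcol)            # all list elements as one numeric vector
expanded  <- rep(dt$numericcol, n_per_row) # numericcol repeated to match flat
diffs     <- abs(expanded - flat)          # one vectorized subtraction

# relist() cuts the flat result back into the shape of listcol
absvals <- relist(diffs, skeleton = dt$listcol)
```

The appeal of this approach is that the arithmetic runs once over a single vector instead of once per row.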

+2

You can also use apply() over the rows of the data.table, taking numericcol and listcol from each row:

dt[, absvals := apply(.SD, 
                      1, 
                      function(x) abs(x$numericcol - x$listcol))]

Output:

   numericcol        listcol        absvals
1:         42        1,22, 3       41,20,39
2:         42              6             36
3:         42              1             41
4:         42             12             30
5:         42    5,   6,1123   37,  36,1081
6:         42              3             39
7:         42             42              0
8:         42              1             41
+2

Have you considered long format instead? It avoids list columns altogether.

# convert to long format:
dt2 <- dt[, .(var = unlist(listcol)), by = numericcol]
dt2[, absval := abs(var - numericcol)]
dt2
    numericcol  var absval
 1:         42    1     41
 2:         42   22     20
 3:         42    3     39
 4:         42    6     36
 5:         42    1     41
 6:         42   12     30
 7:         42    5     37
 8:         42    6     36
 9:         42 1123   1081
10:         42    3     39
11:         42   42      0
12:         42    1     41

In long format, the usual vectorized data.table operations apply directly.
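
One thing to watch in the long-format version: grouping by numericcol alone cannot tell the original rows apart when several rows share the same value (here all are 42). A hedged variant, assuming a helper column row_id that is not in the original answer, keeps the row identity and lets you collapse back to a list column if needed.

```r
library(data.table)

dt <- data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)

# row_id is a hypothetical helper column that records the original row
dt[, row_id := .I]

# Long format: one row per list element, still tagged with its source row
long <- dt[, .(var = unlist(listcol)), by = .(row_id, numericcol)]
long[, absval := abs(var - numericcol)]

# Optional: collapse back to one list element per original row
back <- long[, .(absvals = list(absval)), by = row_id]
```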

+2

Source: https://habr.com/ru/post/1696352/

