data.table equivalent of tidyr::complete()

tidyr::complete() adds rows to a data.frame for combinations of column values that are not in the data. Example:

library(dplyr)
library(tidyr)

df <- data.frame(person = c(1,2,2),
                 observation_id = c(1,1,2),
                 value = c(1,1,1))
df %>%
  tidyr::complete(person,
                  observation_id,
                  fill = list(value=0))

gives

# A tibble: 4 × 3
  person observation_id value
   <dbl>          <dbl> <dbl>
1      1              1     1
2      1              2     0
3      2              1     1
4      2              2     1

where value for the combination of person == 1 and observation_id == 2, which is absent from df, was filled with 0.

What would be the equivalent of this in data.table?

+8
2 answers

I think the philosophy of data.table entails fewer specially named functions for tasks than you will find in the tidyverse, so some extra coding is required, for example:

res = setDT(df)[
  CJ(person = person, observation_id = observation_id, unique=TRUE), 
  on=.(person, observation_id)
]

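After this, the missing combinations still have NA in value and have to be filled manually. One way, suggested by @thelatemail: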

res[is.na(value), value := 0 ]
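
The two steps above (cross join, then fill) can also be sketched end to end with setnafill, assuming a data.table version recent enough to provide it (1.12.4 or later, to my knowledge). The names df_demo and res_demo are made up for this illustration:

```r
library(data.table)

# Same toy data as in the question (df_demo is a hypothetical name)
df_demo <- data.table(person = c(1, 2, 2),
                      observation_id = c(1, 1, 2),
                      value = c(1, 1, 1))

# Step 1: join onto every person x observation_id combination;
# combinations missing from the data come back with value = NA
res_demo <- df_demo[
  CJ(person = person, observation_id = observation_id, unique = TRUE),
  on = .(person, observation_id)
]

# Step 2: replace the NAs in `value` by reference with a constant
setnafill(res_demo, type = "const", fill = 0, cols = "value")

res_demo
```

Note that setnafill only handles numeric columns, so for character or factor columns the res[is.na(value), value := ...] form is still the one to use.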

See also @Jealie's answer.


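Admittedly, the column names have to be typed several times here. On the other hand, this can be wrapped in a function: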

completeDT <- function(DT, cols, defs = NULL){
  mDT = do.call(CJ, c(DT[, ..cols], list(unique=TRUE)))
  res = DT[mDT, on=names(mDT)]
  if (length(defs)) 
    res[, names(defs) := Map(replace, .SD, lapply(.SD, is.na), defs), .SDcols=names(defs)]
  res[]
} 

completeDT(setDT(df), cols = c("person", "observation_id"), defs = c(value = 0))

   person observation_id value
1:      1              1     1
2:      1              2     0
3:      2              1     1
4:      2              2     1

As a quick way to avoid retyping the column names in the first step, here is another idea from @thelatemail:

vars <- c("person","observation_id")
df[do.call(CJ, c(mget(vars), unique=TRUE)), on=vars]

# or with magrittr...
c("person","observation_id") %>% df[do.call(CJ, c(mget(.), unique=TRUE)), on=.]

Update: the names no longer need to be entered twice in CJ, thanks to an improvement by @MichaelChirico & @MattDowle.
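
With data.table 1.12.2 or later (the version number is from memory, so treat it as an assumption), CJ names its output columns after the bare arguments passed to it, so the cross join can be written without repeating each name. A minimal sketch, where dt_demo and out_demo are hypothetical names:

```r
library(data.table)

# Same toy data as in the question (dt_demo is a hypothetical name)
dt_demo <- data.table(person = c(1, 2, 2),
                      observation_id = c(1, 1, 2),
                      value = c(1, 1, 1))

# CJ takes its column names from the bare arguments, so writing
# person = person, observation_id = observation_id is not needed
out_demo <- dt_demo[CJ(person, observation_id, unique = TRUE),
                    on = .(person, observation_id)]

out_demo
```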

+7

One way to do it:

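library(data.table)
dt <- as.data.table(df)  # df from the question

# join onto all combinations of the unique values of both columns
dt[CJ(person=unique(dt$person), 
      observation_id=unique(dt$observation_id)),
   on=c('person','observation_id')]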

which gives:

   person observation_id value
1:      1              1     1
2:      1              2    NA
3:      2              1     1
4:      2              2     1

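Now, if you want to fill the missing combinations with some other value (instead of NA), that still takes an extra step :)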

+3

Source: https://habr.com/ru/post/1016615/

