rbindlist for factors with missing levels

I have several data.tables that I would like to combine with rbindlist. The tables contain factors with (possibly unused) levels. In that case rbindlist(...) behaves differently from do.call(rbind, ...):

    library(data.table)

    dt1 <- data.table(x=factor(c("a", "b"), levels=letters))

    rbindlist(list(dt1, dt1))[,x]
    ## [1] a b a b
    ## Levels: a b

    do.call(rbind, list(dt1, dt1))[,x]
    ## [1] a b a b
    ## Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

If I want to keep the levels, do I have to fall back to rbind, or is there a data.table way?

2 answers

I guess rbindlist is faster because it does not do the checking that do.call(rbind.data.frame, ...) does.

Why not set the levels after binding?

    Dt <- rbindlist(list(dt1, dt1))
    setattr(Dt$x, "levels", letters)  ## set attribute without a copy

From ?setattr:

setattr() is useful in many situations for setting attributes by reference and can be used for any object or part of an object, not just data.tables.
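One caveat about overwriting the levels attribute: it relabels the existing integer codes by position. In the question's example this happens to be safe, because "a" and "b" are also the first two entries of letters. The sketch below uses base R only and a made-up factor f (not taken from the question) that stands in for a combined column whose unused levels were dropped. It contrasts positional relabelling with remapping by label via factor(f, levels = ...).

```r
# Hypothetical stand-in for a combined column whose unused levels were
# dropped: the only levels left are "x" and "y" (integer codes 1 and 2).
f <- factor(c("x", "y", "x", "y"))

# Remapping by label: each value keeps its own label under the new level set.
by_label <- factor(f, levels = letters)

# Overwriting the attribute: code 1 becomes "a" and code 2 becomes "b",
# because the new levels are matched by position, not by label.
by_position <- f
attr(by_position, "levels") <- letters

as.character(by_label)     # "x" "y" "x" "y"
as.character(by_position)  # "a" "b" "a" "b"
```

If the levels in use might not be a leading subset of the full level set, remapping by label is the safer choice, at the cost of creating a new vector instead of modifying the column by reference.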


Thanks for pointing out this problem. It is fixed as of version 1.8.11:

    dt1 <- data.table(x=factor(c("a", "b"), levels=letters))
    rbindlist(list(dt1, dt1))[,x]
    #[1] a b a b
    #Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

Source: https://habr.com/ru/post/956246/

