I am trying to use the data.table package together with plyr to do parallel computing in R, and I am getting unexpected behavior. I am using Windows 7.
I created the following function, which builds a frequency table using data.table:
library(plyr)
library(data.table)
t_dt_test <- function(x){
  dt <- data.table(x)
  dt[, j = list(freq = .N), by = x]
}
Create some test data
test <- list(letters[1:3],letters[1:3],letters[1:3])
This works fine using llply with .parallel = FALSE
llply(test, t_dt_test, .parallel = FALSE)
[[1]]
x freq
1: a 1
2: b 1
3: c 1
But if I try it in parallel, it does not work.
library(doParallel)
nodes <- detectCores()
cl <- makeCluster(nodes)
registerDoParallel(cl)  # register the cluster as the foreach backend that plyr uses
llply(test, t_dt_test, .parallel = TRUE, .paropts = list(.packages = 'data.table'))
This returns the following error:
Error in do.ply(i) : task 1 failed - "invalid subscript type 'list'"
It seems that [.data.table does not get passed to the nodes as I expected.
I tried changing the function to
t_dt_test <- function(x){
  dt <- data.table(x)
  data.table:::`[.data.table`(x = dt, j = list(freq = .N), by = x)
}
but I still get the same error.
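One way to narrow this down might be to take plyr out of the picture and run the same function on the workers with the base parallel package. The following is only a minimal diagnostic sketch, not a fix: the two-worker cluster size and the names loaded and res are arbitrary choices for illustration, and it assumes data.table is installed on the machine. If this runs cleanly, the problem is more likely in how plyr or foreach call the function than in data.table being unavailable on the nodes.

```r
library(parallel)
library(data.table)

# Same function and test data as in the question.
t_dt_test <- function(x) {
  dt <- data.table(x)
  dt[, j = list(freq = .N), by = x]
}
test <- list(letters[1:3], letters[1:3], letters[1:3])

# Small cluster just for the check (2 workers is an arbitrary choice).
cl <- makeCluster(2)

# Attach data.table on every worker.
invisible(clusterEvalQ(cl, library(data.table)))

# Ask each worker whether data.table is on its search path.
loaded <- unlist(clusterEvalQ(cl, "package:data.table" %in% search()))
print(loaded)

# Run the frequency table on the workers without going through plyr.
res <- parLapply(cl, test, t_dt_test)
print(res[[1]])

stopCluster(cl)
```

Comparing this result with the llply call above should show whether the error is specific to the plyr/foreach route.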