R: storing several randomForest objects in a vector

I am curious whether R can put objects into vectors / lists / arrays / etc. I am using the randomForest package to work on subsets of a larger data set and would like to store each version in a list. It would be similar to this:

    answers <- c()
    for (i in 1:10) {
      x <- round((1/i), 3)
      answers <- (rbind(answers, x))
    }

Ideally, I would like to do something like this:

    answers <- c()
    for (i in 1:10) {
      RF <- randomForest(training, training$data1, sampsize=c(100),
                         do.trace=TRUE, importance=TRUE, ntree=50, forest=TRUE)
      answers <- (rbind(answers, RF))
    }

This kind of works, but here is the output for a single RF object:

    > RF

    Call:
     randomForest(x = training, y = training$data1, ntree = 50, sampsize = c(100), importance = TRUE, do.trace = TRUE, forest = TRUE)
                   Type of random forest: regression
                         Number of trees: 10
    No. of variables tried at each split: 2

              Mean of squared residuals: 0.05343956
                        % Var explained: 14.32

While this is the output for the answers object:

    > answers
       call       type         predicted      mse        rsq        oob.times      importance importanceSD
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
    RF Expression "regression" Numeric,150000 Numeric,10 Numeric,10 Integer,150000 Numeric,16 Numeric,8
       localImportance proximity ntree mtry forest  coefs y              test inbag
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL
    RF NULL            NULL      10    2    List,11 NULL  Integer,150000 NULL NULL

Does anyone know how to store all the RF objects, or how to call them, so that the information is stored the same way as in a single RF object? Thanks for any suggestions.

+6
4 answers

Do not grow vectors or lists one element at a time. Pre-allocate them and assign objects to specific slots:

    answers <- vector("list", 10)
    for (i in 1:10) {
      answers[[i]] <- randomForest(training, training$data1, sampsize=c(100),
                                   do.trace=TRUE, importance=TRUE, ntree=50, forest=TRUE)
    }

As a side note, rbind-ing vectors does not create another vector or list; if you check the result of your first example, you will see that it is a one-column matrix. This explains the strange behavior you observe when you try to rbind the RF objects together.
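
To see the side note in action, here is a minimal sketch. The first part reruns the question's numeric loop and inspects the result. The second part applies the pre-allocation pattern, with lm fits on the built-in mtcars data as a stand-in for random forests (an assumption made only so the code runs without the randomForest package or the asker's training data; the name fits is also made up for the illustration).

```r
# Part 1: growing with rbind() yields a matrix, not a vector or a list.
answers <- c()
for (i in 1:10) {
  x <- round(1 / i, 3)
  answers <- rbind(answers, x)
}
is.matrix(answers)   # TRUE
dim(answers)         # 10 rows, 1 column

# Part 2: pre-allocate a list and fill it slot by slot.
# lm() on mtcars is a stand-in for randomForest() (assumption).
set.seed(1)
fits <- vector("list", 10)
for (i in 1:10) {
  rows <- sample(nrow(mtcars), 20)           # a random subset of the data
  fits[[i]] <- lm(mpg ~ wt, data = mtcars[rows, ])
}
length(fits)         # one slot per model
class(fits[[1]])     # each slot still holds a complete model object
```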

+9

Use lapply:

 lapply(1:10,function(i) randomForest(<your parameters>)) 

You will get a list of random forest objects; you can access the i-th one with the [[ ]] operator.
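
A runnable sketch of the same idea, assuming lm fits on random subsets of the built-in mtcars data in place of randomForest calls (the stand-in model and the name models are assumptions for illustration, not part of the answer):

```r
set.seed(1)

# lapply() builds the list itself, so nothing has to be grown or pre-allocated.
models <- lapply(1:10, function(i) {
  rows <- sample(nrow(mtcars), 20)        # a different subset for each model
  lm(mpg ~ wt, data = mtcars[rows, ])     # stand-in for randomForest(...)
})

length(models)        # one element per iteration
class(models[[3]])    # [[ ]] returns the whole model object
coef(models[[3]])     # so the usual accessor functions work on it
```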

+4

Initialize the list:

 mylist <- vector("list") # technically all objects in R are vectors 

Add to it:

    new_element <- 5
    mylist <- c(mylist, new_element)

@joran's advice to pre-allocate matters when lists are large, but it is not strictly necessary when they are small. You could also have accessed the matrix that you built in your original code. It looks a little strange, but the information is all there. For example, the first element of that list-matrix can be recovered with:

 answers[1, ] 
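
A small sketch of what such a list-matrix looks like and how a row can be pulled back out. An lm fit on mtcars stands in for the random forest (an assumption, so that the code runs with base R only), and the names m and row1 are made up for the illustration.

```r
fit <- lm(mpg ~ wt, data = mtcars)   # stand-in for a randomForest object

m <- NULL
for (i in 1:3) {
  m <- rbind(m, fit)                 # same pattern as in the question
}

is.matrix(m)          # TRUE: one row per model, one column per component
dim(m)

row1 <- m[1, ]        # a plain list with the components of the first model
names(row1)
row1$coefficients     # individual components are still intact
```

Note that row1 has lost its class attribute, so it prints as a bare list until the class is set again.
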
+3

The other answers provide solutions for storing random forest objects in a list, but they do not explain why they work.

As @42- suggests, it is not the pre-allocation step that solves the problem here.

The real problem is that a randomForest object is basically a list (check is.list(randomForest(...))). When you write an expression like:

    list_of_rf = c()   # ... or list_of_rf = NULL
    list_of_rf = rbind(list_of_rf, randomForest(...))
    # ... or list_of_rf = c(list_of_rf, randomForest(...))

you are essentially asking R to combine an empty object with a list. Instead of producing a list of length 1 (holding one random forest model), this statement produces a list containing all the components of the random forest model! You can verify this by typing in the R console:

    > length(list_of_rf)
    [1] 19
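
The same flattening can be reproduced with any list-based model object. The sketch below assumes an lm fit on the built-in mtcars data as a stand-in for the random forest, so the component count differs from the 19 shown above; the names fit and flat are illustrative.

```r
fit <- lm(mpg ~ wt, data = mtcars)   # stand-in for a randomForest object

is.list(fit)                   # TRUE: the model is a list with a class attribute

flat <- c(NULL, fit)           # combine an empty object with the model
class(flat)                    # "list": the model class is gone
length(flat) == length(fit)    # TRUE: one element per component, not one model
names(flat)                    # the component names of the model
```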

There are several ways to get R to perform the required operation:

  • assign explicitly into the list (cf. @joran's answer, although there is no need to pre-allocate):

        list_of_rf = NULL
        list_of_rf[[1]] = randomForest(...)

  • let lapply (or similar) build the list (cf. @mbq's answer):

        list_of_rf = lapply(..., function(i) randomForest(...))

  • wrap the random forest in a list, so that concatenation strips only that outer wrapper:

        list_of_rf = NULL
        list_of_rf = c(list_of_rf, list(randomForest(...)))
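
The last option can be sketched as follows, again assuming lm fits on mtcars as stand-ins for random forests (the names fit1, fit2 and models are made up for the illustration):

```r
fit1 <- lm(mpg ~ wt, data = mtcars)   # stand-ins for two randomForest objects
fit2 <- lm(mpg ~ hp, data = mtcars)

models <- NULL
models <- c(models, list(fit1))   # c() strips the outer list(), not the model
models <- c(models, list(fit2))

length(models)        # 2: one element per model
class(models[[2]])    # each element is still a complete model object
```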

Finally, if you made this mistake with a randomForest model that took 10 hours to fit, don't sweat it; you can recover the model as follows:

    list_of_rf = NULL
    list_of_rf = c(list_of_rf, randomForest(...))   # oops, mistake
    rf = as.vector(list_of_rf)[1:19]
    class(rf) = 'randomForest'
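
A hedged sketch of the same recovery with two models flattened into one list. It assumes lm fits on mtcars as stand-ins, so the slice length is taken from length(fit1) instead of the hard-coded 19, and it assumes every flattened model has the same number of components.

```r
fit1 <- lm(mpg ~ wt, data = mtcars)   # stand-ins for two expensive models
fit2 <- lm(mpg ~ hp, data = mtcars)
n <- length(fit1)                     # components per model

flat <- NULL
flat <- c(flat, fit1)                 # the mistake: components get flattened
flat <- c(flat, fit2)
length(flat) == 2 * n                 # TRUE

# Slice the components of each model back out and restore the class.
rec1 <- flat[1:n]
class(rec1) <- "lm"
rec2 <- flat[(n + 1):(2 * n)]
class(rec2) <- "lm"

coef(rec1)                            # behaves like the original fit again
coef(rec2)
```
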
0

Source: https://habr.com/ru/post/899577/
