As I understand it, a random forest (RF) has a built-in out-of-bag (OOB) mechanism: each tree is trained on a bootstrap sample that covers roughly 2/3 of the data, and the remaining roughly 1/3 is held out (out of the bag) for that tree.
My question is: if the above is true, why is there a need to split the dataset into separate training and test sets and then use the xtest and ytest parameters?
Please consider the issue in terms of regression, not classification.
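To make the 1/3 vs 2/3 figure above concrete, here is a minimal sketch in plain Python of the bootstrap draw behind the out-of-bag split. It is only an illustration under my own assumptions, not the randomForest implementation: the function name oob_fraction, the sample size and the number of trees are all made up for the example.

```python
import random


def oob_fraction(n_samples, n_trees, seed=0):
    """Average share of rows left out of each tree's bootstrap sample.

    Hypothetical helper for illustration only; no model is fitted here.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        # Bootstrap: draw n_samples row indices with replacement.
        in_bag = {rng.randrange(n_samples) for _ in range(n_samples)}
        # Rows that were never drawn are out of bag for this tree.
        total += (n_samples - len(in_bag)) / n_samples
    return total / n_trees


frac = oob_fraction(n_samples=1000, n_trees=200)
print(f"mean OOB fraction per tree: {frac:.3f}")
print(f"(1 - 1/n)^n for n = 1000:   {(1 - 1 / 1000) ** 1000:.3f}")
```

Both printed values should land near 0.37, which is where the "about 1/3" comes from. The out-of-bag rows are a different subset for every tree, so the held-out part is not one fixed block of the data.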
tg110