In the R package rpart, what determines the sizes of the trees listed in the CP table of a fitted decision tree? In the example below, the CP table by default lists only trees with 1, 2, and 5 terminal nodes (nsplit = 0, 1, and 4, respectively).
    library(rpart)
    fit <- rpart(Kyphosis ~ Age + Number + Start, method="class", data=kyphosis)
    > printcp(fit)

    Classification tree:
    rpart(formula = Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

    Variables actually used in tree construction:
    [1] Age   Start

    Root node error: 17/81 = 0.20988

    n= 81

            CP nsplit rel error  xerror    xstd
    1 0.176471      0   1.00000 1.00000 0.21559
    2 0.019608      1   0.82353 0.94118 0.21078
    3 0.010000      4   0.76471 0.94118 0.21078
Is there a built-in rule in rpart() that determines which tree sizes are reported? And is it possible to make rpart() return cross-validation statistics for all possible tree sizes, that is, for the example above, to also include rows for trees with 3 and 4 terminal nodes (nsplit = 2, 3)?
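For reference, here is a minimal sketch of how the table can be inspected programmatically. It assumes that the only setting changed is the complexity threshold `cp` (default 0.01), set through `rpart.control()`, and it uses the `cptable` matrix stored on the fitted object. The name `fit0` is just an illustrative variable. Lowering `cp` lets the tree grow further, but whether rows for nsplit = 2 and 3 then appear depends on the data, so no output is claimed here.

```r
library(rpart)

# Same model as above; the kyphosis data set ships with rpart.
fit <- rpart(Kyphosis ~ Age + Number + Start, method = "class", data = kyphosis)

# The CP table is stored on the fit as a matrix with columns
# CP, nsplit, rel error, xerror, xstd.
nsplit_default <- fit$cptable[, "nsplit"]
print(nsplit_default)

# Refit with cp = 0 so that growing is not stopped by the complexity
# threshold (other stopping rules such as minsplit still apply).
fit0 <- rpart(Kyphosis ~ Age + Number + Start, method = "class", data = kyphosis,
              control = rpart.control(cp = 0))
nsplit_cp0 <- fit0$cptable[, "nsplit"]
print(nsplit_cp0)
```

Comparing `nsplit_default` with `nsplit_cp0` shows which tree sizes are reported under each setting; the `xerror` and `xstd` columns vary between runs because the cross-validation folds are assigned randomly.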