The effect of pruning a decision tree

I build a decision tree A with ID3 from a training and validation set, but A is not pruned. I also have another decision tree B, built with ID3 from the same training and validation set, but B is pruned. If I now test both A and B on a future unlabeled test set, will the pruned tree always perform better? Any ideas are welcome, thanks.

+3
4 answers

I think we need to make the distinction clearer: a pruned tree always performs at least as well on the validation set, but not necessarily on the test set (and in fact it gives equal or worse performance on the training set). I assume the pruning is done after the tree is fully built (i.e., post-pruning).

Remember that the whole reason for using a validation set is to avoid overfitting the training data. The key point here is generalization: we want the model (the decision tree) to generalize beyond the cases it saw at training time to new, unseen examples.
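To make the answer above concrete, here is a minimal sketch comparing an unpruned tree A with a post-pruned tree B. Note the assumptions: it uses scikit-learn's `DecisionTreeClassifier` (which implements CART, not ID3) and its cost-complexity pruning parameter `ccp_alpha` as a stand-in for the pruning the question describes; the dataset is synthetic.

```python
# Sketch: unpruned tree A vs post-pruned tree B (scikit-learn CART,
# cost-complexity pruning via ccp_alpha, standing in for ID3 + pruning).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

A = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)            # unpruned
B = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_train, y_train)  # pruned

# On the training set, the unpruned tree fits at least as well as the pruned one...
assert A.score(X_train, y_train) >= B.score(X_train, y_train)

# ...but on unseen data the pruned tree often (not always!) generalizes better.
print("test accuracy A (unpruned):", A.score(X_test, y_test))
print("test accuracy B (pruned):  ", B.score(X_test, y_test))
```

Running this a few times with different `random_state` values illustrates the answer's point: pruning trades training-set fit for a simpler tree, which may or may not win on a particular test set.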

+3



+1 to @AMRO's answer. Post-pruning is the more common approach: the tree is fully grown first, and branches are then removed. Pre-pruning, by contrast, halts tree construction early, i.e., it decides not to split the training tuples at a given node any further.

That node then becomes a leaf. The leaf may hold the most frequent class among its subset of tuples, or the class probability distribution of those tuples.
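The last step of this answer can be sketched in a few lines of plain Python. This is illustrative only: `make_leaf` is a hypothetical helper showing how a node's subset of class labels turns into a leaf prediction, either the majority class or the class probabilities.

```python
# Sketch: when pre-pruning halts splitting, the node becomes a leaf.
# make_leaf is a hypothetical helper, not part of any library.
from collections import Counter

def make_leaf(labels):
    """Summarize a node's subset of class labels as a leaf prediction."""
    counts = Counter(labels)
    total = len(labels)
    majority_class, _ = counts.most_common(1)[0]
    probabilities = {cls: n / total for cls, n in counts.items()}
    return majority_class, probabilities

labels_at_node = ["yes", "yes", "no", "yes"]
cls, probs = make_leaf(labels_at_node)
print(cls)    # yes
print(probs)  # {'yes': 0.75, 'no': 0.25}
```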

0

Source: https://habr.com/ru/post/1770826/

