Difference between min_samples_split and min_samples_leaf in sklearn DecisionTreeClassifier

I was going through the sklearn DecisionTreeClassifier class.

If you look at the parameters for the class, we have two parameters: min_samples_split and min_samples_leaf. The basic idea behind both looks similar: you specify a minimum number of samples required to decide whether a node will be a leaf or be split further.

Why do we need two parameters when one implies the other? Is there any reason or scenario that sets them apart?

1 answer

From the documentation:

The main difference between the two is that min_samples_leaf guarantees a minimum number of samples in a leaf, while min_samples_split can create arbitrarily small leaves, though min_samples_split is more common in the literature.

To make sense of this piece of documentation, you should distinguish between a leaf (also called an external node) and an internal node. An internal node will have further splits (also called children), while a leaf is by definition a node without children (without any further splits).

min_samples_split specifies the minimum number of samples required to split an internal node, while min_samples_leaf specifies the minimum number of samples required to be at a leaf node.

For instance, if min_samples_split = 5 and there are 7 samples at an internal node, then the split is allowed. But let's say the split results in two leaves, one with 1 sample and another with 6 samples. If min_samples_leaf = 2, then the split won't be allowed (even though the internal node has 7 samples), because one of the resulting leaves would have fewer samples than the minimum required at a leaf node.

In other words, min_samples_leaf guarantees a minimum number of samples in every leaf, no matter the value of min_samples_split.
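You can verify this guarantee directly by inspecting a fitted tree. A small sketch (the dataset from make_classification and the specific random_state are arbitrary choices for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# A toy dataset; any classification data would do.
X, y = make_classification(n_samples=100, random_state=0)

# min_samples_leaf=2 guarantees every leaf ends up with >= 2 samples,
# regardless of what min_samples_split allows.
clf = DecisionTreeClassifier(
    min_samples_split=5, min_samples_leaf=2, random_state=0
).fit(X, y)

tree = clf.tree_
is_leaf = tree.children_left == -1          # leaves have no children
leaf_sizes = tree.n_node_samples[is_leaf]   # training samples per leaf
print(leaf_sizes.min())                     # never smaller than 2
```

With min_samples_leaf left at its default of 1, the same tree could end up with single-sample leaves, since min_samples_split=5 only controls when a node is *eligible* to split, not the size of the children it produces.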


Source: https://habr.com/ru/post/1686603/

