Difference between min_samples_split and min_samples_leaf in sklearn DecisionTreeClassifier

I was going through the sklearn DecisionTreeClassifier class.

If you look at the parameters for the class, we have two parameters: min_samples_split and min_samples_leaf. The basic idea behind both looks similar: you specify a minimum number of samples required to decide whether a node will be a leaf or be split further.

Why do we need two parameters when one implies the other? Is there any reason or scenario that sets them apart?

1 answer

From the documentation:

The main difference between the two is that min_samples_leaf guarantees a minimum number of samples in a leaf, while min_samples_split can create arbitrarily small leaves, though min_samples_split is more common in the literature.

To make sense of this piece of documentation, you should distinguish between a leaf (also called an external node) and an internal node. An internal node will have further splits (also called children), while a leaf is by definition a node without children (without any further splits).

min_samples_split specifies the minimum number of samples required to split an internal node, while min_samples_leaf specifies the minimum number of samples required to be at a leaf node.

For instance, if min_samples_split = 5 and there are 7 samples at an internal node, then the split is allowed. But let's say the split results in two leaves, one with 1 sample and another with 6 samples. If min_samples_leaf = 2, then the split won't be allowed (even though the internal node has 7 samples), because one of the resulting leaves would have fewer samples than the minimum required at a leaf node.

In other words, min_samples_leaf guarantees a minimum number of samples in every leaf, no matter the value of min_samples_split.
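You can verify this guarantee directly by inspecting a fitted tree. A small sketch (the dataset from make_classification and the specific random_state are arbitrary choices for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# A toy dataset; any classification data would do.
X, y = make_classification(n_samples=100, random_state=0)

# min_samples_leaf=2 guarantees every leaf ends up with >= 2 samples,
# regardless of what min_samples_split allows.
clf = DecisionTreeClassifier(
    min_samples_split=5, min_samples_leaf=2, random_state=0
).fit(X, y)

tree = clf.tree_
is_leaf = tree.children_left == -1          # leaves have no children
leaf_sizes = tree.n_node_samples[is_leaf]   # training samples per leaf
print(leaf_sizes.min())                     # never smaller than 2
```

With min_samples_leaf left at its default of 1, the same tree could end up with single-sample leaves, since min_samples_split=5 only controls when a node is *eligible* to split, not the size of the children it produces.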


Source: https://habr.com/ru/post/1686603/

