Mapping scikit-learn DecisionTreeClassifier.tree_.value for the predicted class

Question

I am using scikit-learn's DecisionTreeClassifier on a 3-class dataset. After fitting the classifier, I visit every leaf node through the tree_ attribute to get the number of instances of each class that fall into that node.

    from sklearn import tree

    clf = tree.DecisionTreeClassifier(max_depth=5)
    clf.fit(X, y)
    # let's assume there is a leaf node with id 5
    print(clf.tree_.value[5])

This will print:

 >>> array([[ 0., 1., 68.]])

But how do I know which position in this array belongs to which class? The classifier has the classes_ attribute, which is also an array.

 >>> clf.classes_
 array(['CLASS_1', 'CLASS_2', 'CLASS_3'], dtype=object)

Maybe index 1 in the array of values corresponds to the class in index 1 of the array of classes, etc.?

+5

python scikit-learn decision-tree

nemi Oct 05 '14 at 21:33

2 answers

No, it's not clf.classes_ but clf.tree_.feature, which contains the index of the X column. And if X is a pandas DataFrame, X.columns contains the name. You can find more information in a similar question.

0

Jihun Oct 6 '14 at 4:58

nemi · Accepted Answer · 2014-10-09T12:25:59+0000

I asked about it on the scikit-learn mailing list, and my hunch was correct: index 1 in the array of values corresponds to the class at index 1 of clf.classes_, and likewise for every other position.
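A minimal sketch of checking this mapping yourself. It assumes a single-output classifier and uses the iris dataset with string labels as a stand-in for the data in the question; leaf_class is a hypothetical helper, not part of scikit-learn. Depending on the scikit-learn version, the entries of tree_.value may be per-class counts or normalized fractions, so the sketch only relies on argmax, which gives the same position either way.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data
# Use string labels so the mapping to classes_ is easy to see.
y = iris.target_names[iris.target]

clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)


def leaf_class(clf, node_id):
    """Hypothetical helper: label whose position has the largest value in the node."""
    per_class = clf.tree_.value[node_id][0]  # shape (n_classes,) for a single output
    return clf.classes_[np.argmax(per_class)]


leaf_ids = clf.apply(X)       # id of the leaf each training sample falls into
predicted = clf.predict(X)    # what the classifier itself predicts

# Map each leaf back to a label through classes_ and compare with predict().
mapped = np.array([leaf_class(clf, node_id) for node_id in leaf_ids])
print((mapped == predicted).all())
```

If the positions in tree_.value did not line up with clf.classes_, the labels recovered this way would disagree with clf.predict for at least some leaves.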
