How can I get the relative importance of the features of a logistic regression for a specific prediction?

I use logistic regression (in scikit-learn) for a binary classification problem, and I would like to be able to explain each individual prediction. More precisely, I want to predict the probability of the positive class and assess the importance of each feature for that particular prediction.

Using the raw odds ratios (betas) as an importance measure is usually a bad idea, as mentioned here, but I have yet to find a good alternative.

So far, the best I've found are the following 3 options (a rough sketch of each is shown after the list):

  • Monte Carlo variant: holding all other features fixed, re-run the prediction while replacing the feature we want to evaluate with random samples drawn from the training set. Do this many times; this gives a baseline probability for the positive class. Then compare it with the positive-class probability of the original run. The difference is a measure of the feature's importance.
  • "Leave-one-out" classifiers: to evaluate the importance of a feature, first train a model that uses all features, and then another that uses all features except the one under test. Predict a new observation with both models. The difference between the two predictions is the importance of that feature.
  • Adjusted betas: based on this answer, score the importance of a feature as its coefficient times the standard deviation of the corresponding feature in the data.
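
For concreteness, here is a rough sketch of how I imagine the three options could be implemented with scikit-learn. All function names, arguments and defaults here are my own illustrations, not part of any library:

```python
# Illustrative sketches of the three candidate importance measures.
import numpy as np
from sklearn.linear_model import LogisticRegression

def monte_carlo_importance(model, X_train, x, i, n_samples=1000, rng=None):
    """Replace feature i of observation x with random draws from the training
    set, keeping all other features fixed, and compare the average predicted
    positive-class probability with the original prediction."""
    rng = np.random.default_rng(rng)
    X_rep = np.tile(x, (n_samples, 1))
    X_rep[:, i] = rng.choice(X_train[:, i], size=n_samples)
    baseline = model.predict_proba(X_rep)[:, 1].mean()
    original = model.predict_proba(x.reshape(1, -1))[0, 1]
    return original - baseline

def leave_one_out_importance(X_train, y_train, x, i):
    """Fit one model on all features and one on all features except i;
    importance is the difference in predicted probability for x."""
    full = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    mask = np.arange(X_train.shape[1]) != i
    reduced = LogisticRegression(max_iter=1000).fit(X_train[:, mask], y_train)
    p_full = full.predict_proba(x.reshape(1, -1))[0, 1]
    p_reduced = reduced.predict_proba(x[mask].reshape(1, -1))[0, 1]
    return p_full - p_reduced

def adjusted_betas(model, X_train):
    """Coefficient times the standard deviation of the corresponding feature
    (a global score, the same for every prediction)."""
    return model.coef_[0] * X_train.std(axis=0)
```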

All three options (raw betas, Monte Carlo and leave-one-out) seem like weak solutions to me:

  • Monte Carlo depends on the distribution of the training set, and I cannot find any literature to support it.
  • "Leave-one-out" is easily fooled by two correlated features (when one is absent, the other steps in to compensate, so both end up with an importance close to 0).
  • The adjusted betas sound plausible, but I can't find any literature to support them either.

Actual question: what is the best way to interpret the importance of each feature at the moment a decision is made by a linear classifier?

Quick note #1: this is trivial for random forests, where we can just use the prediction = bias + feature contributions decomposition, as this blog post explains nicely. The problem here is how to do something similar with linear classifiers such as logistic regression.
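
For reference, this decomposition for random forests is available in the third-party treeinterpreter package (which I believe is essentially what that blog post describes); a minimal sketch, assuming the package is installed and using the breast-cancer dataset purely as an example:

```python
# prediction = bias + sum of per-feature contributions, for a random forest.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from treeinterpreter import treeinterpreter as ti  # third-party package

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Decompose the prediction for a single observation.
prediction, bias, contributions = ti.predict(rf, X[:1])

# The class probabilities equal the bias plus the summed contributions.
print(prediction[0])
print(bias[0] + contributions[0].sum(axis=0))
```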

Quick note #2: there are a number of related questions on Stack Overflow (1, 2, 3, 4, 5), but I could not find an answer to this specific question in any of them.

1 answer

If you want feature importances for a particular decision, why not simulate the decision_function (which scikit-learn provides, so you can check whether you get the same value) step by step? The decision function for linear classifiers is simply:

intercept_ + coef_[0]*feature[0] + coef_[1]*feature[1] + ...

The importance of feature i is then just coef_[i]*feature[i]. Of course, this is similar to looking at the coefficients themselves, but since the coefficient is multiplied by the actual feature value, and this is also what happens under the hood, this may be your best bet.
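
A minimal sketch of this idea (the dataset and variable names are just for illustration): reproduce decision_function by hand, split it into per-feature contributions, and check against scikit-learn's own output.

```python
# Per-feature contributions coef_[i] * feature[i] for a single prediction.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
clf = LogisticRegression(max_iter=10000).fit(X, y)

x = X[0]                                   # the observation to explain
contributions = clf.coef_[0] * x           # importance of each feature
score = clf.intercept_[0] + contributions.sum()

# Sanity check: matches scikit-learn's own decision_function.
assert np.allclose(score, clf.decision_function(x.reshape(1, -1)))

# The predicted positive-class probability is the logistic of that score.
prob_positive = 1.0 / (1.0 + np.exp(-score))
print(prob_positive, clf.predict_proba(x.reshape(1, -1))[0, 1])

# Features with the largest |contribution| drove this particular prediction.
top = np.argsort(-np.abs(contributions))[:5]
print(top, contributions[top])
```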


Source: https://habr.com/ru/post/1239483/
