Python scikit-learn (metrics): difference between r2_score and explained_variance_score?

Question

I noticed that r2_score and explained_variance_score are both built-in sklearn.metrics functions for regression tasks.

I had always understood r2_score to be the percentage of variance explained by the model. How is this different from explained_variance_score?

When would you pick one over the other?

Thanks!

python scikit-learn

monkeybiz7 Jun 24 '14 at 4:10

1 answer

CT Zhu · Accepted Answer · 2014-06-24T06:16:08+0000

OK, look at this example:

In [123]:
import numpy as np
from sklearn import metrics
#data
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.957173447537
0.948608137045
In [124]:
#what explained_variance_score really is: 1 - Var(residuals) / Var(y_true)
1-np.cov(np.array(y_true)-np.array(y_pred))/np.cov(y_true)
Out[124]:
0.95717344753747324
In [125]:
#what r^2 really is: 1 - sum(residuals**2) / (n * Var(y_true)), residuals not centered
1-((np.array(y_true)-np.array(y_pred))**2).sum()/(len(y_true)*np.array(y_true).std()**2)
Out[125]:
0.94860813704496794
In [126]:
#Notice that the mean residual is not 0
(np.array(y_true)-np.array(y_pred)).mean()
Out[126]:
-0.25
In [127]:
#if the predicted values are different, such that the mean residual IS 0:
y_pred=[2.5, 0.0, 2, 7]
(np.array(y_true)-np.array(y_pred)).mean()
Out[127]:
0.0
In [128]:
#Now the two scores are identical
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.982869379015
0.982869379015
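
The two hand-written formulas in the session above differ only in whether the residuals are centered before they are squared. Here is a minimal sketch in plain Python that makes the gap explicit. The helper names `mean`, `variance`, `explained_variance` and `r2` are made up for this illustration and are not sklearn functions. Under these definitions the difference between the two scores should equal the squared mean residual divided by the variance of y_true.

```python
# Plain-Python sketch; helper names are illustrative, not part of sklearn.

def mean(values):
    return sum(values) / len(values)

def variance(values):
    # Population variance (divide by n), matching np.std()**2 in the session.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def explained_variance(y_true, y_pred):
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    # Residuals are centered inside variance(), so a constant offset is ignored.
    return 1 - variance(residuals) / variance(y_true)

def r2(y_true, y_pred):
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    # Residuals are NOT centered: mean squared error over the variance of y_true.
    return 1 - mean([e ** 2 for e in residuals]) / variance(y_true)

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
residuals = [t - p for t, p in zip(y_true, y_pred)]

ev = explained_variance(y_true, y_pred)
r2_value = r2(y_true, y_pred)
gap = mean(residuals) ** 2 / variance(y_true)

print(ev, r2_value)        # approximately 0.9572 and 0.9486
print(ev - r2_value, gap)  # the two numbers agree up to rounding
```

Since the gap is a squared quantity divided by a variance, explained variance can never be smaller than R^2 for the same predictions.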

So, when the mean residual is 0, the two scores are the same. Which one to choose depends on your needs, that is, on whether the mean residual can be assumed to be 0.
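
One way to see what is at stake in that choice is to shift every prediction by the same constant. Here is a small sketch in plain Python where each prediction is exactly y_true plus 1. The shifted predictions and the helper names `mean` and `scores` are assumptions for illustration, not anything from sklearn.

```python
# Illustrative sketch: a constant prediction bias and its effect on both scores.

def mean(values):
    return sum(values) / len(values)

def scores(y_true, y_pred):
    """Return (explained variance, R^2), both computed by hand."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    y_bar = mean(y_true)
    e_bar = mean(residuals)
    total = sum((t - y_bar) ** 2 for t in y_true)         # total sum of squares
    centered = sum((e - e_bar) ** 2 for e in residuals)   # residuals minus their mean
    raw = sum(e ** 2 for e in residuals)                  # residuals as they are
    return 1 - centered / total, 1 - raw / total

y_true = [3, -0.5, 2, 7]
y_biased = [t + 1 for t in y_true]  # every prediction is off by exactly +1

ev, r2_value = scores(y_true, y_biased)
print(ev)        # 1.0: the constant bias is invisible to explained variance
print(r2_value)  # below 1: R^2 penalizes the bias
```

If a systematic offset in the predictions matters for your application, r2_score is the more conservative summary. explained_variance_score only reports how well the variation around the mean is tracked.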
