Explained Variance Regression Score
An Explained Variance Regression Score is a Linear Regression Score Function based on Explained Variance.
- Context:
- It is defined as
[math]\displaystyle{ S(y, \hat{y}) = 1 - \frac{Var\{ y - \hat{y}\}}{Var\{y\}} }[/math]
where [math]\displaystyle{ Var }[/math] is the variance, [math]\displaystyle{ y }[/math] is the measured response variable, and [math]\displaystyle{ \hat{y} }[/math] are its predicted values.
- Example(s):
- Counter-Example(s):
- See: Linear Regression, Variance, Explained Sum of Squares.
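The definition above can be sketched directly in pure Python. This is an illustrative helper (not a library function), using the population variance from the standard library:

```python
from statistics import pvariance

def explained_variance(y_true, y_pred):
    """Explained variance score: 1 - Var(y - y_hat) / Var(y).

    Illustrative sketch of the formula above; uses population variance.
    """
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    return 1 - pvariance(residuals) / pvariance(y_true)

# A perfect fit leaves no residual variance, so the score is 1.0:
print(explained_variance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0
```

Note that because only the variance of the residuals enters the formula, a constant offset in the predictions does not lower the score; this is one way the explained variance score can differ from the coefficient of determination.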
References
2017a
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Explained_variation#Linear_regression Retrieved:2017-10-1.
- The fraction of variance unexplained is an established concept in the context of linear regression. The usual definition of the coefficient of determination is based on the fundamental concept of explained variance.
2017b
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Coefficient_of_determination#As_explained_variance Retrieved:2017-10-2.
- Suppose r = 0.7, meaning r2 = 0.49. This implies that 49% of the variability between the two variables has been accounted for, and the remaining 51% of the variability is still unaccounted for.
In some cases the total sum of squares equals the sum of the two other sums of squares defined above, : [math]\displaystyle{ SS_\text{res}+SS_\text{reg}=SS_\text{tot}. \, }[/math] See partitioning in the general OLS model for a derivation of this result for one case where the relation holds. When this relation does hold, the above definition of R2 is equivalent to : [math]\displaystyle{ R^2 = \frac{SS_\text{reg}}{SS_\text{tot}} = \frac{SS_\text{reg}/n}{SS_\text{tot}/n}. }[/math] In this form R2 is expressed as the ratio of the explained variance (variance of the model's predictions, which is SSreg / n) to the total variance (sample variance of the dependent variable, which is SStot / n).
This partition of the sum of squares holds for instance when the model values ƒi have been obtained by linear regression. A milder sufficient condition reads as follows: The model has the form : [math]\displaystyle{ f_i=\alpha+\beta q_i \, }[/math] where the qi are arbitrary values that may or may not depend on i or on other free parameters (the common choice qi = xi is just one special case), and the coefficients α and β are obtained by minimizing the residual sum of squares.
This set of conditions is an important one and it has a number of implications for the properties of the fitted residuals and the modelled values. In particular, under these conditions: : [math]\displaystyle{ \bar{f}=\bar{y}.\, }[/math]
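The partition of the sum of squares quoted above, and the implied equality of the fitted and observed means, can be checked numerically. The sketch below fits a least-squares line (with intercept) to a small hypothetical dataset in pure Python:

```python
# Check SS_res + SS_reg = SS_tot for a least-squares line fit.
# The data values are illustrative, not from the source.
def ols_fit(xs, ys):
    """Fit f_i = alpha + beta * x_i by minimizing the residual sum of squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    alpha = my - beta * mx
    return alpha, beta

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
a, b = ols_fit(xs, ys)
fs = [a + b * x for x in xs]          # modelled values f_i
my = sum(ys) / len(ys)
ss_res = sum((y - f) ** 2 for y, f in zip(ys, fs))
ss_reg = sum((f - my) ** 2 for f in fs)
ss_tot = sum((y - my) ** 2 for y in ys)
print(abs(ss_res + ss_reg - ss_tot) < 1e-9)  # True: partition holds for OLS
```

Under these conditions the mean of the fitted values also equals the mean of the observations, as stated above.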
2017c
- (Scikit-Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/model_evaluation.html#explained-variance-score Retrieved:2017-10-1.
- QUOTE: The explained_variance_score computes the explained variance regression score. If [math]\displaystyle{ \hat{y} }[/math] is the estimated target output, [math]\displaystyle{ y }[/math] the corresponding (correct) target output, and [math]\displaystyle{ Var }[/math] is Variance, the square of the standard deviation, then the explained variance is estimated as follows:
[math]\displaystyle{ \texttt{explained_variance}(y, \hat{y}) = 1 - \frac{Var\{ y - \hat{y}\}}{Var\{y\}} }[/math]
The best possible score is 1.0; lower values are worse.
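The scikit-learn function quoted above can be called directly; the dataset below is the illustrative one from the scikit-learn documentation:

```python
from sklearn.metrics import explained_variance_score

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(explained_variance_score(y_true, y_pred))  # ≈ 0.957
```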