Residual Maximum Likelihood
A Residual Maximum Likelihood (REML) estimation is a maximum likelihood estimation that uses the likelihood function of a transformed dataset, so that nuisance parameters have no effect on the estimates.
References
2016
- (Wikipedia, 2015) ⇒ https://www.wikiwand.com/en/Restricted_maximum_likelihood
- In statistics, the restricted (or residual, or reduced) maximum likelihood (REML) approach is a particular form of maximum likelihood estimation which does not base estimates on a maximum likelihood fit of all the information, but instead uses a likelihood function calculated from a transformed set of data, so that nuisance parameters have no effect.
In the case of variance component estimation, the original data set is replaced by a set of contrasts calculated from the data, and the likelihood function is calculated from the probability distribution of these contrasts, according to the model for the complete data set. In particular, REML is used as a method for fitting linear mixed models. In contrast to the earlier maximum likelihood estimation, REML can produce unbiased estimates of variance and covariance parameters.
The idea underlying REML estimation was put forward by M. S. Bartlett in 1937. The first description of the approach applied to estimating components of variance in unbalanced data was by Desmond Patterson and Robin Thompson of the University of Edinburgh in 1971, although they did not use the term REML. A review of the early literature was given by Harville.
REML estimation is available in a number of general-purpose statistical software packages, including Genstat (the REML directive), SAS (the MIXED procedure), SPSS (the MIXED command), Stata (the mixed command), JMP, and R (especially the lme4 and older nlme packages), as well as in more specialist packages such as MLwiN, HLM, ASReml, Statistical Parametric Mapping and CropStat.
REML estimation is also implemented in SurfStat, a MATLAB toolbox for the statistical analysis of univariate and multivariate surface and volume data.
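The contrast idea in the passage above can be illustrated for the simplest case, an ordinary linear model with a single residual variance. The following is a minimal sketch, not taken from the cited source: the function names `variance_estimates` and `reml_from_contrasts` are hypothetical, and the design matrix is assumed to have full column rank. It compares the maximum likelihood variance estimate (residual sum of squares divided by n) with the REML estimate (divided by n - p), and shows that the latter is what maximum likelihood gives when applied to error contrasts that are orthogonal to the fixed-effects design.

```python
import numpy as np

def variance_estimates(X, y):
    """Return (ml, reml) estimates of the residual variance in y = X b + e.

    Assumes X has full column rank (illustrative assumption).
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    ml = rss / n          # ignores the p degrees of freedom used for beta
    reml = rss / (n - p)  # accounts for them, so it is unbiased
    return ml, reml

def reml_from_contrasts(X, y):
    """Same REML estimate, computed from error contrasts K'y with K'X = 0."""
    n, p = X.shape
    # In a complete QR factorisation, the last n - p columns of Q are
    # orthonormal and orthogonal to the column space of X.
    Q, _ = np.linalg.qr(X, mode="complete")
    K = Q[:, p:]
    w = K.T @ y  # n - p contrasts whose distribution does not involve beta
    # Maximum likelihood for w ~ N(0, sigma^2 I) is the mean of squares.
    return float(w @ w) / (n - p)

# Small illustrative data set (simulated; values are not from the source).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)

ml, reml = variance_estimates(X, y)
print("ML:", ml, "REML:", reml, "REML via contrasts:", reml_from_contrasts(X, y))
```

In mixed models with several variance components there is generally no closed form, and the packages listed above maximise the restricted likelihood numerically.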
2014
- (Oehlert, 2014) ⇒ Oehlert, G. W. (2014). A few words about REML. http://sers.stat.umn.edu/~gary/classes/5303/handouts/REML.pdf
- REML is actually a way to estimate variance components. Once we have estimated variance components, we then assume that the estimated components are “correct” (that is, equal to their estimated values) and compute generalized least squares estimates of the fixed effects parameters. GLS is a version of least squares that allows us to account for covariances among the responses, such as might be present in a mixed effects model. Sometimes we get the same estimates using GLS that we would get using ordinary least squares, but not always. The variances we compute for our fixed effects can also differ between ordinary least squares and GLS. But don’t worry, all the GLS stuff will be done internally to lmer or lme. REML works by first getting regression residuals for the observations modeled by the fixed effects portion of the model, ignoring at this point any variance components (...)
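The generalized least squares step described above can be sketched as follows, assuming a single random effect with design matrix `Z` and variance components that have already been estimated (for example by REML). The function name `gls_fixed_effects` and its arguments are hypothetical, chosen for illustration; this is not the internal code of lmer or lme.

```python
import numpy as np

def gls_fixed_effects(X, y, Z, var_random, var_resid):
    """GLS estimate of fixed effects, treating the variance components as known.

    Assumed model (illustrative): y = X b + Z u + e, with
    u ~ N(0, var_random * I) and e ~ N(0, var_resid * I),
    so Var(y) = V = var_random * Z Z' + var_resid * I.
    """
    n = len(y)
    V = var_random * (Z @ Z.T) + var_resid * np.eye(n)
    Vinv_X = np.linalg.solve(V, X)  # V^{-1} X without forming V^{-1}
    Vinv_y = np.linalg.solve(V, y)  # V^{-1} y
    XtVinvX = X.T @ Vinv_X
    beta = np.linalg.solve(XtVinvX, X.T @ Vinv_y)  # (X'V^{-1}X)^{-1} X'V^{-1}y
    cov_beta = np.linalg.inv(XtVinvX)              # covariance of the GLS estimate
    return beta, cov_beta

# Illustrative data: 3 groups of 2 observations, random intercept per group.
X = np.column_stack([np.ones(6), np.arange(6.0)])
Z = np.kron(np.eye(3), np.ones((2, 1)))
y = np.array([1.1, 1.9, 3.4, 4.1, 4.7, 6.2])

beta, cov_beta = gls_fixed_effects(X, y, Z, var_random=0.5, var_resid=1.0)
print(beta)
print(cov_beta)
```

When `var_random` is zero, V is proportional to the identity and the GLS estimate coincides with ordinary least squares, which is one case of the "same estimates" mentioned in the quotation.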
1996
- (Smyth & Verbyla, 1996) ⇒ Smyth, G. K., & Verbyla, A. P. (1996). A conditional likelihood approach to residual maximum likelihood estimation in generalized linear models. Journal of the Royal Statistical Society. Series B (Methodological), 565-572. http://www.jstor.org/stable/2345894
- Residual maximum likelihood (REML) estimation is often preferred to maximum likelihood estimation as a method of estimating covariance parameters in linear models because it takes account of the loss of degrees of freedom in estimating the mean and produces unbiased estimating equations for the variance parameters. In this paper it is shown that REML has an exact conditional likelihood interpretation, where the conditioning is on an appropriate sufficient statistic to remove dependence on the nuisance parameters. This interpretation clarifies the motivation for REML and generalizes directly to non-normal models in which there is a low dimensional sufficient statistic for the fitted values. The conditional likelihood is shown to be well defined and to satisfy the properties of a likelihood function, even though this is not generally true when conditioning on statistics which depend on parameters of interest. Using the conditional likelihood representation, the concept of REML is extended to generalized linear models with varying dispersion and canonical link. Explicit calculation of the conditional likelihood is given for the one-way lay-out. A saddle-point approximation for the conditional likelihood is also derived.
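For reference, the degrees-of-freedom adjustment mentioned in the abstract above can be seen in the standard form of the restricted log-likelihood for a normal linear model. The notation here is an assumption for illustration and is not taken from the paper: $y \sim N(X\beta, V(\theta))$ with $n$ observations, a full-rank $n \times p$ design matrix $X$, and variance parameters $\theta$.

```latex
\ell_R(\theta) = -\tfrac{1}{2}\Big[ \log\lvert V \rvert
      + \log\lvert X^{\top} V^{-1} X \rvert
      + (y - X\hat\beta)^{\top} V^{-1} (y - X\hat\beta) \Big] + \text{const},
\qquad
\hat\beta = (X^{\top} V^{-1} X)^{-1} X^{\top} V^{-1} y .

% Special case V = \sigma^2 I, with RSS = (y - X\hat\beta)^{\top}(y - X\hat\beta):
\ell_R(\sigma^2) = -\tfrac{1}{2}\Big[ (n - p)\log\sigma^2
      + \frac{\mathrm{RSS}}{\sigma^2} \Big] + \text{const}
\quad\Longrightarrow\quad
\hat\sigma^2_{\mathrm{REML}} = \frac{\mathrm{RSS}}{n - p} .
```

The profile log-likelihood used by ordinary maximum likelihood lacks the $\log\lvert X^{\top} V^{-1} X \rvert$ term, which in the special case gives $\hat\sigma^2_{\mathrm{ML}} = \mathrm{RSS}/n$.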