Link Function
A Link Function is a function that relates the linear predictor of a generalized linear model to the mean of the response distribution.
- Example(s):
- [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\mu^{-1}\,\! }[/math], for an Exponential distribution with distribution mean of [math]\displaystyle{ \mu=(\mathbf{X}\boldsymbol{\beta})^{-1}\,\! }[/math]
- [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\ln{(\mu)}\,\! }[/math], for a Poisson distribution with distribution mean of [math]\displaystyle{ \mu=\exp{(\mathbf{X}\boldsymbol{\beta})}\,\! }[/math]
- See: Variance-Stabilizing Transformation, Sufficiency (Statistics), Domain of a Function, Range (Mathematics), Gibbs Sampling, Normal Distribution, Exponential Distribution, Multiplicative Inverse, Gamma Distribution, Inverse Gaussian Distribution, Poisson Distribution.
References
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Generalized_linear_model#Link_function Retrieved:2017-4-20.
- … The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Generalized_linear_model#Link_function Retrieved:2017-4-20.
- The link function provides the relationship between the linear predictor and the mean of the distribution function. There are many commonly used link functions, and their choice is informed by several considerations. There is always a well-defined canonical link function which is derived from the exponential of the response's density function. However in some cases it makes sense to try to match the domain of the link function to the range of the distribution function's mean, or use a non-canonical link function for algorithmic purposes, for example Bayesian probit regression.
When using a distribution function with a canonical parameter [math]\displaystyle{ \theta }[/math], the canonical link function is the function that expresses [math]\displaystyle{ \theta }[/math] in terms of [math]\displaystyle{ \mu }[/math] , i.e. [math]\displaystyle{ \theta = b(\mu) }[/math] . For the most common distributions, the mean [math]\displaystyle{ \mu }[/math] is one of the parameters in the standard form of the distribution's density function, and then [math]\displaystyle{ b(\mu) }[/math] is the function as defined above that maps the density function into its canonical form. When using the canonical link function, [math]\displaystyle{ b(\mu) = \theta = \mathbf{X}\boldsymbol{\beta} }[/math] , which allows [math]\displaystyle{ \mathbf{X}^{\rm T} \mathbf{Y} }[/math] to be a sufficient statistic for [math]\displaystyle{ \boldsymbol{\beta} }[/math] .
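The sufficiency claim can be checked numerically. The following is a minimal sketch, assuming a Poisson response with the canonical log link; the design matrix, the two response vectors, and the helper names `poisson_loglik` and `xty` are hypothetical and chosen only for illustration. Two response vectors with the same [math]\displaystyle{ \mathbf{X}^{\rm T} \mathbf{Y} }[/math] give log-likelihoods that differ by a constant that does not depend on [math]\displaystyle{ \boldsymbol{\beta} }[/math].

```python
import math

def poisson_loglik(beta, X, y):
    """Poisson log-likelihood under the canonical log link, eta = X beta."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = sum(xj * bj for xj, bj in zip(row, beta))
        # y*eta - exp(eta) - log(y!) is the Poisson log-density at mean exp(eta)
        total += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
    return total

def xty(X, y):
    """Compute X^T y as a plain list."""
    return [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(len(X[0]))]

# Hypothetical design: an intercept column plus one binary covariate.
X = [[1, 0], [1, 1], [1, 0], [1, 1]]
y_a = [1, 2, 3, 4]
y_b = [2, 3, 2, 3]  # different data, same X^T y

# Hypothetical coefficient vectors at which to compare the two likelihoods.
betas = [[0.0, 0.0], [0.3, -0.2], [-1.0, 0.7]]
gaps = [poisson_loglik(b, X, y_a) - poisson_loglik(b, X, y_b) for b in betas]
```

Because the gap is the same at every coefficient vector, both data sets lead to the same maximum-likelihood estimate, which is what sufficiency of [math]\displaystyle{ \mathbf{X}^{\rm T} \mathbf{Y} }[/math] implies.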
Following is a table of several exponential-family distributions in common use and the data they are typically used for, along with the canonical link functions and their inverses (sometimes referred to as the mean function, as done here).
Distribution | Support of distribution | Typical uses | Link name | Link function | Mean function |
---|---|---|---|---|---|
Normal | real: [math]\displaystyle{ (-\infty,+\infty) }[/math] | Linear-response data | Identity | [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\mu\,\! }[/math] | [math]\displaystyle{ \mu=\mathbf{X}\boldsymbol{\beta}\,\! }[/math] |
Exponential | real: [math]\displaystyle{ (0,+\infty) }[/math] | Exponential-response data, scale parameters | Inverse | [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\mu^{-1}\,\! }[/math] | [math]\displaystyle{ \mu=(\mathbf{X}\boldsymbol{\beta})^{-1}\,\! }[/math] |
Gamma | real: [math]\displaystyle{ (0,+\infty) }[/math] | Exponential-response data, scale parameters | Inverse | [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\mu^{-1}\,\! }[/math] | [math]\displaystyle{ \mu=(\mathbf{X}\boldsymbol{\beta})^{-1}\,\! }[/math] |
Inverse Gaussian | real: [math]\displaystyle{ (0, +\infty) }[/math] | | Inverse squared | [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\mu^{-2}\,\! }[/math] | [math]\displaystyle{ \mu=(\mathbf{X}\boldsymbol{\beta})^{-1/2}\,\! }[/math] |
Poisson | integer: [math]\displaystyle{ 0,1,2,\ldots }[/math] | count of occurrences in fixed amount of time/space | Log | [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\ln{(\mu)}\,\! }[/math] | [math]\displaystyle{ \mu=\exp{(\mathbf{X}\boldsymbol{\beta})}\,\! }[/math] |
Bernoulli | integer: [math]\displaystyle{ \{0,1\} }[/math] | outcome of single yes/no occurrence | Logit | [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta}=\ln{\left(\frac{\mu}{1-\mu}\right)}\,\! }[/math] | [math]\displaystyle{ \mu=\frac{\exp{(\mathbf{X}\boldsymbol{\beta})}}{1 + \exp{(\mathbf{X}\boldsymbol{\beta})}} = \frac{1}{1 + \exp{(-\mathbf{X}\boldsymbol{\beta})}}\,\! }[/math] |
Binomial | integer: [math]\displaystyle{ 0,1,\ldots,N }[/math] | count of # of "yes" occurrences out of N yes/no occurrences | Logit | (as for Bernoulli) | (as for Bernoulli) |
Categorical | integer: [math]\displaystyle{ [0,K) }[/math], or K-vector of integer: [math]\displaystyle{ [0,1] }[/math], where exactly one element in the vector has the value 1 | outcome of single K-way occurrence | Logit | (as for Bernoulli) | (as for Bernoulli) |
Multinomial | K-vector of integer: [math]\displaystyle{ [0,N] }[/math] | count of occurrences of different types (1 .. K) out of N total K-way occurrences | Logit | (as for Bernoulli) | (as for Bernoulli) |
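
The link and mean functions in the table are inverses of each other, which can be checked directly. The following is a minimal sketch in plain Python; the dictionary name `CANONICAL_LINKS`, its keys, and the helper `round_trip` are hypothetical names introduced here, and each function acts on a scalar mean or a scalar linear predictor rather than on the full [math]\displaystyle{ \mathbf{X}\boldsymbol{\beta} }[/math].

```python
import math

# Each entry maps a link name to (link function, mean function).
# link: mu -> eta (linear predictor); mean: eta -> mu.
CANONICAL_LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "inverse": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "inverse_squared": (lambda mu: mu ** -2, lambda eta: eta ** -0.5),
    "log": (math.log, math.exp),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
}

def round_trip(name, mu):
    """Apply the named link and then its mean function; should return mu."""
    link, mean = CANONICAL_LINKS[name]
    return mean(link(mu))
```

For example, `round_trip("logit", 0.25)` maps the probability 0.25 to its log-odds and back again, up to floating-point error.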
- In the cases of the exponential and gamma distributions, the domain of the canonical link function is not the same as the permitted range of the mean. In particular, the linear predictor may be negative, which would give an impossible negative mean. When maximizing the likelihood, precautions must be taken to avoid this. An alternative is to use a noncanonical link function.
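The domain mismatch for the inverse link can be seen with a small numeric case. The following is a minimal sketch; the coefficient vector and covariate rows are hypothetical values chosen so that one linear predictor is negative, and the log link stands in as one possible noncanonical alternative.

```python
import math

def inverse_mean(eta):
    """Mean function of the inverse link: mu = 1 / eta."""
    return 1.0 / eta

def log_mean(eta):
    """Mean function of the log link: mu = exp(eta), always positive."""
    return math.exp(eta)

# Hypothetical coefficients and covariate rows (intercept, one covariate).
beta = [2.0, -1.0]
rows = [[1.0, 1.0], [1.0, 3.0]]

# Linear predictor for each row: 2 - 1 = 1 and 2 - 3 = -1.
etas = [sum(x * b for x, b in zip(row, beta)) for row in rows]

inverse_means = [inverse_mean(e) for e in etas]  # second entry is negative
log_means = [log_mean(e) for e in etas]          # both entries are positive
```

Under the inverse link the second row yields a negative mean, which is impossible for an exponential or gamma response, while the log link keeps every mean positive for any real linear predictor.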
Note also that in the case of the Bernoulli, binomial, categorical and multinomial distributions, the support of the distributions is not the same type of data as the parameter being predicted. In all of these cases, the predicted parameter is one or more probabilities, i.e. real numbers in the range [math]\displaystyle{ [0,1] }[/math] . The resulting model is known as logistic regression (or multinomial logistic regression in the case that K-way rather than binary values are being predicted).
For the Bernoulli and binomial distributions, the parameter is a single probability, indicating the likelihood of occurrence of a single event. The Bernoulli still satisfies the basic condition of the generalized linear model in that, even though a single outcome will always be either 0 or 1, the expected value will nonetheless be a real-valued probability, i.e. the probability of occurrence of a "yes" (or 1) outcome. Similarly, in a binomial distribution, the expected value is Np, i.e. the expected proportion of "yes" outcomes will be the probability to be predicted.
For categorical and multinomial distributions, the parameter to be predicted is a K-vector of probabilities, with the further restriction that all probabilities must add up to 1. Each probability indicates the likelihood of occurrence of one of the K possible values. For the multinomial distribution, and for the vector form of the categorical distribution, the expected values of the elements of the vector can be related to the predicted probabilities similarly to the binomial and Bernoulli distributions.
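The constraint that the K probabilities add up to 1 can be illustrated with the softmax function, which is commonly used as the mean function in multinomial logistic regression. The following is a minimal sketch under that assumption; the source does not give this formula, and the function names `softmax` and `expected_counts`, as well as the use of one linear predictor per category, are choices made here for illustration.

```python
import math

def softmax(etas):
    """Map K linear predictors to K probabilities that sum to 1."""
    m = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - m) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

def expected_counts(etas, n):
    """Expected multinomial counts: n trials times each category probability."""
    return [n * p for p in softmax(etas)]
```

With K = 2 and the second linear predictor fixed at 0, the first softmax probability reduces to the logistic mean function listed for the Bernoulli distribution.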