Statistical Population Contrast
A Statistical Population Contrast is a linear combination of statistical parameters whose coefficients sum to zero.
- AKA: Contrast.
- Context:
- It can be defined as [math]\displaystyle{ C =\sum_{i=1}^{k}c_i\mu_i }[/math], where the [math]\displaystyle{ \mu_i }[/math] are statistical parameters (e.g., population means) and the [math]\displaystyle{ c_i }[/math] are the contrast coefficients. [math]\displaystyle{ C }[/math] is a statistical population contrast only if [math]\displaystyle{ \sum_{i=1}^{k}c_i=0 }[/math]; a short code sketch below (after the See line) checks this condition on the examples that follow.
- Example(s):
- [math]\displaystyle{ C=(2)\mu_1+(-1)\mu_2+(-1)\mu_3 }[/math]
- [math]\displaystyle{ C=(2)\mu_1+(4)\mu_2+(-1)\mu_3+(-3)\mu_4+(-2)\mu_5 }[/math]
- Counter-Example(s):
- [math]\displaystyle{ C=(2)\mu_1+(+1)\mu_2+(0)\mu_3 }[/math]
- [math]\displaystyle{ C=(2)\mu_1+(1)\mu_2+(-1)\mu_3+(-3)\mu_4+(-2)\mu_5 }[/math]
- See: Multiple Comparisons Inference Task, Analysis of Variance, Linear Regression.
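The defining condition can be verified mechanically: a coefficient vector specifies a contrast exactly when its entries sum to zero. The following is a minimal Python sketch (not part of any quoted source), applied to the example and counter-example coefficient vectors above.

```python
# Minimal sketch: a coefficient vector defines a contrast iff its entries sum to zero.

def is_contrast(coefficients, tol=1e-12):
    """Return True if the coefficients could define a statistical population contrast."""
    return abs(sum(coefficients)) < tol

# Coefficient vectors taken from the examples and counter-examples above.
print(is_contrast([2, -1, -1]))          # True:  2 - 1 - 1 = 0
print(is_contrast([2, 4, -1, -3, -2]))   # True:  2 + 4 - 1 - 3 - 2 = 0
print(is_contrast([2, 1, 0]))            # False: sums to 3
print(is_contrast([2, 1, -1, -3, -2]))   # False: sums to -3
```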
References
2016
- (Wikipedia, 2016) ⇒ http://en.wikipedia.org/wiki/Contrast_(statistics) Retrieved 2016-08-28
- In statistics, particularly in analysis of variance and linear regression, a contrast is a linear combination of variables (parameters or statistics) whose coefficients add up to zero, allowing comparison of different treatments.
- [...] Let [math]\displaystyle{ \theta_1,\ldots,\theta_t }[/math] be a set of variables, either parameters or statistics, and [math]\displaystyle{ a_1,\ldots,a_t }[/math] be known constants. The quantity [math]\displaystyle{ \sum_{i=1}^t a_i \theta_i }[/math] is a linear combination. It is called a contrast if [math]\displaystyle{ \sum_{i=1}^t a_i = 0 }[/math]. Furthermore, two contrasts, [math]\displaystyle{ \sum_{i=1}^t a_i \theta_i }[/math] and [math]\displaystyle{ \sum_{i=1}^t b_i \theta_i }[/math], are orthogonal if [math]\displaystyle{ \sum_{i=1}^t a_i b_i = 0 }[/math].
2015
- (Seltman, 2015) ⇒ Seltman, H. J. (2015). “Experimental design and analysis", Chapter 13. Online at: http://www.stat.cmu.edu/~hseltman/309/Book/chapter13.pdf
- In the chapter on probability theory, we saw that the sampling distribution of any of the sample means from a (one treatment) sample of size n using the assumptions of Normality, equal variance, and independent errors is [math]\displaystyle{ \bar{y}_i\sim N(\mu_i,\sigma^2/n) }[/math], i.e., across repeated experiments, a sample mean is Normally distributed with the “correct” mean and the variance equal to the common group variance reduced by a factor of n. Now we need to find the sampling distribution for some particular combination of sample means.
- To do this, we need to write the contrast in “standard form”. The standard form involves writing a sum with one term for each population mean ([math]\displaystyle{ \mu }[/math]), whether or not it is in the particular contrast, and with a single number, called a contrast coefficient, in front of each population mean. For our examples we get:
- [math]\displaystyle{ \gamma_1 = (0)\mu_1 + (0)\mu_2 + (1)\mu_3 + (-1)\mu_4 + (0)\mu_5 + (0)\mu_6 }[/math]
- and
- [math]\displaystyle{ \gamma_2 = (1/2)\mu_1 + (1/2)\mu_2 + (-1/3)\mu_3 + (-1/3)\mu_4 + (-1/3)\mu_5 + (0)\mu_6 }[/math]
- In a more general framing of the contrast we would write
- [math]\displaystyle{ \gamma = C_1\mu_1 + \cdots + C_k\mu_k }[/math]
- In other words, each contrast can be summarized by specifying its k coefficients (C values). And it turns out that the k coefficients are what most computer programs want as input when you specify the contrast of a custom null hypothesis.
- In our examples, the coefficients (and computer input) for null hypothesis H01 are [0, 0, 1, -1, 0, 0], and for H02 they are [1/2, 1/2, -1/3, -1/3, -1/3, 0]. Note that the zeros are necessary. For example, if you just entered [1, -1], the computer would not understand which pair of treatment population means you want it to compare. Also, note that any valid set of contrast coefficients must add to zero.
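Tying the two quoted points together: since each sample mean has variance sigma^2/n under the stated assumptions, the estimate of a contrast gamma = C_1 mu_1 + ... + C_k mu_k is the same linear combination of the sample means, and (for equal group sizes n) its variance is sigma^2 times the sum of the squared coefficients divided by n. The sketch below is not from Seltman's text; it uses made-up data and a simple pooled-variance estimate to compute the estimate and standard error for the H02 coefficients [1/2, 1/2, -1/3, -1/3, -1/3, 0].

```python
import numpy as np

# Minimal sketch (made-up data): estimate the contrast with coefficients
# [1/2, 1/2, -1/3, -1/3, -1/3, 0] from six equal-sized treatment groups.
# Under Normality, equal variance, and independent errors, each sample mean
# has variance sigma^2/n, so the contrast estimate has variance
# sigma^2 * sum(c_i^2) / n.

rng = np.random.default_rng(0)                      # hypothetical data for illustration
n, k = 20, 6
true_means = np.array([10.0, 11.0, 9.0, 9.5, 10.5, 12.0])
groups = [rng.normal(true_means[i], 2.0, size=n) for i in range(k)]

c = np.array([1/2, 1/2, -1/3, -1/3, -1/3, 0.0])     # contrast coefficients (sum to 0)
assert abs(c.sum()) < 1e-12

ybar = np.array([g.mean() for g in groups])         # per-group sample means
s2 = np.mean([g.var(ddof=1) for g in groups])       # pooled variance estimate (equal n)

gamma_hat = c @ ybar                                # estimated contrast
se = np.sqrt(s2 * np.sum(c**2) / n)                 # standard error of the estimate
print(gamma_hat, se)
```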