Statistical Population Contrast

From GM-RKB

A Statistical Population Contrast is a linear combination of statistical parameters in which the sum of coefficients is equal to zero.



References

2016

[...] Let [math]\displaystyle{ \theta_1,\ldots,\theta_t }[/math] be a set of variables, either parameters or statistics, and [math]\displaystyle{ a_1,\ldots,a_t }[/math] be known constants. The quantity [math]\displaystyle{ \sum_{i=1}^t a_i \theta_i }[/math] is a linear combination. It is called a contrast if [math]\displaystyle{ \sum_{i=1}^t a_i = 0 }[/math]. Furthermore, two contrasts, [math]\displaystyle{ \sum_{i=1}^t a_i \theta_i }[/math] and [math]\displaystyle{ \sum_{i=1}^t b_i \theta_i }[/math], are orthogonal if [math]\displaystyle{ \sum_{i=1}^t a_i b_i = 0 }[/math].
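
The two conditions in this definition can be checked mechanically. The following is a minimal Python sketch, not part of the quoted source: the function names `is_contrast` and `are_orthogonal` and the three coefficient vectors are hypothetical and chosen only for illustration. Exact fractions are used so that the zero-sum test is not affected by floating-point rounding.

```python
from fractions import Fraction


def is_contrast(coeffs):
    """Return True if the coefficients a_1..a_t sum to exactly zero."""
    return sum(coeffs) == 0


def are_orthogonal(a, b):
    """Return True if sum_i a_i * b_i == 0 for two equal-length coefficient lists."""
    if len(a) != len(b):
        raise ValueError("coefficient lists must have the same length")
    return sum(x * y for x, y in zip(a, b)) == 0


# Illustrative coefficient vectors for t = 3 (assumed values, not from the source).
a = [Fraction(1), Fraction(-1), Fraction(0)]        # theta_1 - theta_2
b = [Fraction(1, 2), Fraction(1, 2), Fraction(-1)]  # mean of theta_1, theta_2 minus theta_3
c = [Fraction(1), Fraction(0), Fraction(-1)]        # theta_1 - theta_3

print(is_contrast(a), is_contrast(b), is_contrast(c))
print(are_orthogonal(a, b), are_orthogonal(a, c))
```

In this sketch all three vectors are contrasts, `a` and `b` are orthogonal, and `a` and `c` are not, because their coefficient products sum to 1 rather than 0.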

2015

  • (Seltman, 2015) ⇒ Seltman, H. J. (2015). “Experimental design and analysis”, Chapter 13. Online at: http://www.stat.cmu.edu/~hseltman/309/Book/chapter13.pdf
    • In the chapter on probability theory, we saw that the sampling distribution of any of the sample means from a (one treatment) sample of size n using the assumptions of Normality, equal variance, and independent errors is [math]\displaystyle{ \bar{y}_i\sim N(\mu_i,\sigma^2/n) }[/math], i.e., across repeated experiments, a sample mean is Normally distributed with the “correct” mean and the variance equal to the common group variance reduced by a factor of n. Now we need to find the sampling distribution for some particular combination of sample means.
To do this, we need to write the contrast in “standard form”. The standard form involves writing a sum with one term for each population mean ([math]\displaystyle{ \mu }[/math]), whether or not it is in the particular contrast, and with a single number, called a contrast coefficient in front of each population mean. For our examples we get:
[math]\displaystyle{ \gamma_1 = (0)\mu_1 + (0)\mu_2 + (1)\mu_3 + (-1)\mu_4 + (0)\mu_5 + (0)\mu_6 }[/math]
and
[math]\displaystyle{ \gamma_2 = (1/2)\mu_1 + (1/2)\mu_2 + (-1/3)\mu_3 + (-1/3)\mu_4 + (-1/3)\mu_5 + (0)\mu_6 }[/math]
In a more general framing of the contrast we would write
[math]\displaystyle{ \gamma = C_1\mu_1 + \cdots + C_k\mu_k }[/math]
In other words, each contrast can be summarized by specifying its k coefficients (C values). And it turns out that the k coefficients are what most computer programs want as input when you specify the contrast of a custom null hypothesis.
In our examples, the coefficients (and computer input) for null hypothesis H01 are [0, 0, 1, -1, 0, 0], and for H02 they are [1/2, 1/2, -1/3, -1/3, -1/3, 0]. Note that the zeros are necessary. For example, if you just entered [1, -1], the computer would not understand which pair of treatment population means you want it to compare. Also, note that any valid set of contrast coefficients must add to zero.
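
As a rough illustration of how such coefficient vectors are used, the Python sketch below estimates a contrast by plugging sample means into the standard form. The coefficient vectors are the ones quoted above for H01 and H02. The sample means, the common variance `sigma2`, the group size `n`, and the function names are all assumed values made up for this example. The standard error uses the usual result that, for independent group means each with variance σ²/n, the variance of the estimated contrast is σ² Σ C_i² / n.

```python
import math


def contrast_estimate(coeffs, sample_means):
    """Estimate gamma = sum_i C_i * mu_i by plugging in the sample means."""
    if len(coeffs) != len(sample_means):
        # One coefficient per group is required, zeros included.
        raise ValueError("need exactly one coefficient per group mean")
    if not math.isclose(sum(coeffs), 0.0, abs_tol=1e-12):
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coeffs, sample_means))


def contrast_standard_error(coeffs, sigma2, n):
    """Standard error assuming independent groups of equal size n and common variance sigma2."""
    return math.sqrt(sigma2 * sum(c * c for c in coeffs) / n)


# Coefficient vectors quoted in the text for H01 and H02.
h01 = [0, 0, 1, -1, 0, 0]
h02 = [1/2, 1/2, -1/3, -1/3, -1/3, 0]

# Hypothetical data: six group means, common variance, per-group sample size.
means = [10.0, 12.0, 15.0, 11.0, 13.0, 9.0]
sigma2 = 4.0
n = 8

g1 = contrast_estimate(h01, means)
g2 = contrast_estimate(h02, means)
se1 = contrast_standard_error(h01, sigma2, n)
se2 = contrast_standard_error(h02, sigma2, n)
print(g1, se1)
print(g2, se2)
```

Passing a two-element vector such as `[1, -1]` with six group means raises a `ValueError` in this sketch, which mirrors the point that the zero coefficients cannot be omitted.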