Kruskal-Wallis Test
A Kruskal-Wallis Test is a statistical test of the null hypothesis that the population medians of all groups are equal.
- AKA: Kruskal–Wallis H test, One-way ANOVA on ranks.
- Context:
- It is a non-parametric version of ANOVA.
- See: ANOVA, F-test, Analysis of variance, Statistical Test, Bartlett's Test, Levene's Test, Brown–Forsythe Test.
References
2016
- (Wikipedia, 2016) ⇒ http://en.wikipedia.org/wiki/Kruskal–Wallis_one-way_analysis_of_variance Retrieved 2016-08-07
- The Kruskal–Wallis test by ranks, Kruskal–Wallis H test (named after William Kruskal and W. Allen Wallis), or One-way ANOVA on ranks is a non-parametric method for testing whether samples originate from the same distribution. It is used for comparing two or more independent samples of equal or different sample sizes. It extends the Mann–Whitney U test when there are more than two groups. The parametric equivalent of the Kruskal-Wallis test is the one-way analysis of variance (ANOVA). A significant Kruskal-Wallis test indicates that at least one sample stochastically dominates one other sample. The test does not identify where this stochastic dominance occurs or for how many pairs of groups stochastic dominance obtains. Dunn's test would help analyze the specific sample pairs for stochastic dominance.
Since it is a non-parametric method, the Kruskal–Wallis test does not assume a normal distribution of the residuals, unlike the analogous one-way analysis of variance. If the researcher can make the less stringent assumptions of an identically shaped and scaled distribution for all groups, except for any difference in medians, then the null hypothesis is that the medians of all groups are equal, and the alternative hypothesis is that at least one population median of one group is different from the population median of at least one other group.
- Method
- Rank all data from all groups together; i.e., rank the data from 1 to N ignoring group membership. Assign any tied values the average of the ranks they would have received had they not been tied.
- The test statistic is given by:
- [math]\displaystyle{ H = (N-1)\frac{\sum_{i=1}^g n_i(\bar{r}_{i\cdot} - \bar{r})^2}{\sum_{i=1}^g\sum_{j=1}^{n_i}(r_{ij} - \bar{r})^2}, }[/math] where:
- [math]\displaystyle{ n_i }[/math] is the number of observations in group [math]\displaystyle{ i }[/math]
- [math]\displaystyle{ r_{ij} }[/math] is the rank (among all observations) of observation [math]\displaystyle{ j }[/math] from group [math]\displaystyle{ i }[/math]
- [math]\displaystyle{ N }[/math] is the total number of observations across all groups
- [math]\displaystyle{ \bar{r}_{i\cdot} = \frac{\sum_{j=1}^{n_i}{r_{ij}}}{n_i} }[/math] is the average rank of all observations in group [math]\displaystyle{ i }[/math]
- [math]\displaystyle{ \bar{r} =\tfrac 12 (N+1) }[/math] is the average of all the [math]\displaystyle{ r_{ij} }[/math].
- If the data contain no ties, the denominator of the expression for [math]\displaystyle{ H }[/math] is exactly [math]\displaystyle{ (N-1)N(N+1)/12 }[/math] and [math]\displaystyle{ \bar{r}=\tfrac{N+1}{2} }[/math]. Thus
- [math]\displaystyle{
\begin{align}
H & = \frac{12}{N(N+1)}\sum_{i=1}^g n_i \left(\bar{r}_{i\cdot} - \frac{N+1}{2}\right)^2 \\ & = \frac{12}{N(N+1)}\sum_{i=1}^g n_i \bar{r}_{i\cdot }^2 -\ 3(N+1).
\end{align}
}[/math]
The last formula only contains the squares of the average ranks.
- When the short-cut formula described in the previous point is used, a correction for ties can be made by dividing [math]\displaystyle{ H }[/math] by [math]\displaystyle{ 1 - \frac{\sum_{i=1}^G (t_i^3 - t_i)}{N^3-N} }[/math], where [math]\displaystyle{ G }[/math] is the number of groupings of different tied ranks, and [math]\displaystyle{ t_i }[/math] is the number of observations tied at a particular value within grouping [math]\displaystyle{ i }[/math]. This correction usually makes little difference in the value of [math]\displaystyle{ H }[/math] unless there are a large number of ties.
- Finally, the p-value is approximated by [math]\displaystyle{ \Pr(\chi^2_{g-1} \ge H) }[/math]. If some [math]\displaystyle{ n_i }[/math] values are small (i.e., less than 5) the probability distribution of H can be quite different from this chi-squared distribution. If a table of the chi-squared probability distribution is available, the critical value of chi-squared, [math]\displaystyle{ \chi^2_{\alpha: g-1} }[/math], can be found by entering the table at g − 1 degrees of freedom and looking under the desired significance or alpha level.
- If the statistic is not significant, then there is no evidence of stochastic dominance between the samples. However, if the test is significant then at least one sample stochastically dominates another sample. Therefore, a researcher might use sample contrasts between individual sample pairs, or post hoc tests using Dunn's test, which (1) properly employs the same rankings as the Kruskal-Wallis test, and (2) properly employs the pooled variance implied by the null hypothesis of the Kruskal-Wallis test in order to determine which of the sample pairs are significantly different. When performing multiple sample contrasts or tests, the Type I error rate tends to become inflated, raising concerns about multiple comparisons.
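The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names (`average_ranks`, `h_general`, `h_shortcut_tie_corrected`, `chi2_sf`) are hypothetical, and the chi-squared tail probability is computed with a simple closed-form recurrence chosen here for self-containment, which is an assumption of this sketch rather than part of the method. It assumes integer degrees of freedom and at least two distinct values in the pooled data.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values 1..N; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 0-based positions -> 1-based ranks
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def group_mean_ranks(groups):
    """Return the pooled ranks and a list of (n_i, mean rank of group i)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    stats, pos = [], 0
    for g in groups:
        stats.append((len(g), sum(ranks[pos:pos + len(g)]) / len(g)))
        pos += len(g)
    return ranks, stats


def h_general(groups):
    """H from the general formula; usable with or without ties."""
    ranks, stats = group_mean_ranks(groups)
    n = len(ranks)
    r_bar = (n + 1) / 2
    between = sum(n_i * (r_i - r_bar) ** 2 for n_i, r_i in stats)
    total = sum((r - r_bar) ** 2 for r in ranks)
    return (n - 1) * between / total


def h_shortcut_tie_corrected(groups):
    """Short-cut H divided by the tie correction factor."""
    ranks, stats = group_mean_ranks(groups)
    n = len(ranks)
    h = 12 / (n * (n + 1)) * sum(n_i * r_i ** 2 for n_i, r_i in stats) - 3 * (n + 1)
    tie_sizes = Counter(v for g in groups for v in g).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    return h / correction


def chi2_sf(x, df):
    """Pr(chi-squared with integer df >= x), via a regularized-gamma recurrence."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)              # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))   # exact tail for df = 1
    while a < df / 2:                            # each step raises df by 2
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # made-up data, no ties
h = h_general(groups)
p = chi2_sf(h, len(groups) - 1)
print(round(h, 4), round(p, 4))  # 7.2 0.0273
```

On data with ties, `h_general` and `h_shortcut_tie_corrected` return the same value, because the denominator of the general formula already accounts for averaged ranks. With group sizes this small the chi-squared p-value is only a rough approximation, as noted above.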