Sign Test

From GM-RKB

A Sign Test is a non-parametric paired difference test that uses only the signs of the differences and serves as a distribution-free counterpart to the parametric paired t-test.



References

2017a

  • (STAT 415, 2017) ⇒ STAT 415 Intro Mathematical Statistics. Penn State University. "The Sign Test for a Median". Retrieved 2017-01-08 from https://onlinecourses.science.psu.edu/stat414/node/318
    • Recall that for a continuous random variable [math]\displaystyle{ X }[/math], the median is the value [math]\displaystyle{ m }[/math] such that 50% of the time [math]\displaystyle{ X }[/math] lies below [math]\displaystyle{ m }[/math] and 50% of the time [math]\displaystyle{ X }[/math] lies above [math]\displaystyle{ m }[/math] (...) we'll assume that our random variable [math]\displaystyle{ X }[/math] is a continuous random variable with unknown median [math]\displaystyle{ m }[/math]. Upon taking a random sample [math]\displaystyle{ X_1, X_2, \cdots, X_n }[/math], we'll be interested in testing whether the median [math]\displaystyle{ m }[/math] takes on a particular value [math]\displaystyle{ m_0 }[/math]. That is, we'll be interested in testing the null hypothesis:
[math]\displaystyle{ H_0:\; m=m_0 }[/math]
against any of the possible alternative hypotheses:
[math]\displaystyle{ H_A:\; m \gt m_0 \quad \text{or} \quad m \lt m_0 \quad \text{or} \quad m \ne m_0 }[/math]
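The hypotheses above can be tested with a short sketch. Under the null hypothesis each observation falls above [math]\displaystyle{ m_0 }[/math] with probability 0.5, so the count of observations above [math]\displaystyle{ m_0 }[/math] is binomial with p = 0.5. The function name sign_test_median, the sample data, and the convention of dropping observations equal to [math]\displaystyle{ m_0 }[/math] are assumptions made for illustration and are not taken from the quoted source.

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """Return P(B <= k) for B ~ Binomial(n, p); an empty sum gives 0 when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def sign_test_median(sample, m0):
    """Exact sign test of H0: m = m0 (hypothetical helper, illustration only)."""
    above = sum(1 for x in sample if x > m0)
    below = sum(1 for x in sample if x < m0)
    n = above + below  # assumption: values equal to m0 are dropped
    p_greater = 1.0 - binom_cdf(above - 1, n)  # P(B >= above), for H_A: m > m0
    p_less = binom_cdf(above, n)               # P(B <= above), for H_A: m < m0
    p_two_sided = min(1.0, 2.0 * min(p_greater, p_less))  # for H_A: m != m0
    return {
        "above": above,
        "below": below,
        "n": n,
        "p_greater": p_greater,
        "p_less": p_less,
        "p_two_sided": p_two_sided,
    }


# Made-up sample, tested against a hypothesized median of 75.
sample = [72, 81, 90, 68, 77, 85, 79, 93, 88, 74]
result = sign_test_median(sample, 75)
print(result["above"], result["below"])  # 7 3
print(result["p_greater"])               # 0.171875
print(result["p_two_sided"])             # 0.34375
```

With 7 of 10 observations above 75, none of the three alternatives is supported at the 0.05 level in this made-up sample.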

2017b

To form the sign test, compute [math]\displaystyle{ d_i = X_i - Y_i }[/math] where [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are the two samples. Count the number of times [math]\displaystyle{ d_i }[/math] is positive, R+, and the number of times it is negative, R-. If the samples have equal medians and the populations are symmetric, then R+ and R- should be similar. If there are too many positives (R+) or negatives (R-), then we reject the hypothesis of equality. Ties are excluded from the analysis. Since there are only two choices (+ or -) for [math]\displaystyle{ d_i }[/math] the test statistic for the sign test follows a binomial distribution with p=0.5.
Note that the binomial distribution is discrete, so the significance level will typically not be exact.
More formally, the hypothesis test is defined as follows.
[math]\displaystyle{ H_0:\quad u_1 = u_2 }[/math]
[math]\displaystyle{ H_a:\quad u_1 \ne u_2 \quad \text{or} \quad u_1 \lt u_2 \quad \text{or} \quad u_1 \gt u_2 }[/math]
Test Statistic: S- = BINCDF(R-,0.5,N) or S+ = BINCDF(R+,0.5,N)
where BINCDF is the cumulative distribution for the binomial distribution, R- is the number of minus signs (i.e., [math]\displaystyle{ d_i \lt 0 }[/math]), R+ is the number of plus signs (i.e., [math]\displaystyle{ d_i \gt 0 }[/math]), and N is the sample size excluding ties between the samples.
Significance Level: [math]\displaystyle{ \alpha }[/math] (typically set to 0.05). Because the binomial distribution is discrete, the attained significance level will usually differ from the nominal [math]\displaystyle{ \alpha }[/math].
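Critical Region:
  • S+ < α: one-sided test, alternative [math]\displaystyle{ u_1 \lt u_2 }[/math]
  • S- < α: one-sided test, alternative [math]\displaystyle{ u_1 \gt u_2 }[/math]
  • Two-sided test: the null hypothesis [math]\displaystyle{ u_1 = u_2 }[/math] is retained when α/2 < S+ < 1 - α/2 and rejected otherwise.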
Conclusion: Reject the null hypothesis (or, equivalently, accept the alternative hypothesis) if the test statistic is in the critical region.
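The procedure above can be sketched in a few lines. The helper bincdf mirrors the BINCDF(R, 0.5, N) notation of the test statistic, while the function name paired_sign_test and the paired data are hypothetical. The one-sided decisions follow the critical region as stated. For the two-sided decision the sketch assumes the common equal-tailed rule min(S+, S-) < α/2, which is a swapped-in convention and not the interval on S+ given in the text.

```python
from math import comb


def bincdf(k, p, n):
    """Cumulative binomial probability P(B <= k) for B ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def paired_sign_test(x, y, alpha=0.05):
    """Paired sign test on samples x and y (hypothetical helper, illustration only)."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    r_plus = sum(d > 0 for d in diffs)   # number of plus signs, R+
    r_minus = sum(d < 0 for d in diffs)  # number of minus signs, R-
    n = r_plus + r_minus                 # sample size excluding ties
    s_plus = bincdf(r_plus, 0.5, n)
    s_minus = bincdf(r_minus, 0.5, n)
    return {
        "r_plus": r_plus,
        "r_minus": r_minus,
        "n": n,
        "s_plus": s_plus,
        "s_minus": s_minus,
        "reject_for_u1_less": s_plus < alpha,      # one-sided, alternative u1 < u2
        "reject_for_u1_greater": s_minus < alpha,  # one-sided, alternative u1 > u2
        # Assumption: equal-tailed two-sided rule, not the source's S+ interval.
        "reject_two_sided": min(s_plus, s_minus) < alpha / 2,
    }


# Made-up paired measurements; the fifth pair is a tie and is excluded.
x = [85, 90, 78, 92, 88, 76, 95, 89, 84, 91, 80, 87]
y = [80, 85, 75, 90, 88, 79, 90, 85, 80, 86, 83, 82]
out = paired_sign_test(x, y)
print(out["r_plus"], out["r_minus"], out["n"])  # 9 2 11
print(out["s_minus"])                           # 0.03271484375
print(out["reject_for_u1_greater"])             # True
```

In this made-up data S- falls below 0.05, so the one-sided test favors [math]\displaystyle{ u_1 \gt u_2 }[/math], while the equal-tailed two-sided rule does not reject.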

2016

  • (Wikipedia, 2016) ⇒ https://en.wikipedia.org/wiki/Sign_test Retrieved:2016-12-17.
    • The sign test is a statistical method to test for consistent differences between pairs of observations, such as the weight of subjects before and after treatment. Given pairs of observations (such as weight pre- and post-treatment) for each subject, the sign test determines if one member of the pair (such as pre-treatment) tends to be greater than (or less than) the other member of the pair (such as post-treatment).

      The paired observations may be designated x and y. For comparisons of paired observations (x,y), the sign test is most useful if comparisons can only be expressed as x > y, x = y, or x < y. If, instead, the observations can be expressed as numeric quantities (x = 7, y = 18), or as ranks (rank of x = 1st, rank of y = 8th), then the paired t-test or the Wilcoxon signed-rank test will usually have greater power than the sign test to detect consistent differences. If X and Y are quantitative variables, the sign test can be used to test the hypothesis that the difference between X and Y has zero median, assuming continuous distributions of the two random variables X and Y, in the situation when we can draw paired samples from X and Y. [1]

      The sign test can also test if the median of a collection of numbers is significantly greater than or less than a specified value. For example, given a list of student grades in a class, the sign test can determine if the median grade is significantly different from, say, 75 out of 100.

      The sign test is a non-parametric test which makes very few assumptions about the nature of the distributions under test – this means that it has very general applicability but may lack the statistical power of the alternative tests.
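      When each pair can only be recorded as x > y, x = y, or x < y, the test needs nothing beyond the counts of each outcome. A minimal sketch, assuming the outcomes are stored as the strings ">", "=" and "<" (a hypothetical encoding) and that "=" outcomes are dropped:

```python
from math import comb


def sign_test_from_comparisons(outcomes):
    """Two-sided exact sign test from ordinal outcomes (illustration only)."""
    wins = outcomes.count(">")
    losses = outcomes.count("<")
    n = wins + losses  # assumption: "=" outcomes carry no sign and are dropped
    k = min(wins, losses)
    # P(B <= k) for B ~ Binomial(n, 0.5), doubled for the two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return wins, losses, min(1.0, 2 * tail)


# Made-up outcomes for 11 pairs: eight ">", two "<", one "=".
outcomes = list(">>>>>>>><<=")
wins, losses, p_value = sign_test_from_comparisons(outcomes)
print(wins, losses, p_value)  # 8 2 0.109375
```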

  1. The Sign Test for a Median // STAT 415 Intro Mathematical Statistics. Penn State University.