Wilson Score Interval
A Wilson Score Interval is an approximate Binomial proportion confidence interval that gives a range of plausible values for a proportion in a statistical population.
- Context:
- It can (often) be considered a preferred method for estimating a binomial proportion confidence interval, particularly for small samples or proportions near 0 or 1.
- It can be derived from the Wilson Score Test (which is a part of the Rao Score Tests class).
- It can be frequently used in Bernoulli Trials to estimate the probability of success.
- It can serve as an Asymmetric Measure more robust to deviations from normality.
- It can depend on the Asymptotic Normality of an Estimator.
- It can be interpreted as a 95% confidence interval corresponding to values not rejected at the 5% level.
- It can be applicable in various scenarios, such as:
- Calculating success rates in Success–Failure Experiments.
- Estimating voter support in political surveys.
- Assessing the effectiveness of medical treatments in clinical trials.
- ...
- Example(s):
- As proposed in Wilson, 1927.
- ...
- Counter-Example(s):
- Adjusted Wald Intervals.
- Normal Approximation Interval, a symmetric and less robust alternative.
- See: Confidence Interval, Statistical Test, Rao Score Test, Asymptotic Normality, Score Test, Pearson's Chi-Squared Test, Clopper-Pearson Interval, Yates's Correction for Continuity.
References
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Wilson_score_interval Retrieved:2023-11-26.
- The Wilson score interval is an improvement over the normal approximation interval in multiple respects. It was developed by Edwin Bidwell Wilson (1927).[1] Unlike the symmetric normal approximation interval (above), the Wilson score interval is asymmetric. It does not suffer from problems of overshoot and zero-width intervals that afflict the normal interval, and it may be safely employed with small samples and skewed observations.[2] The observed coverage probability is consistently closer to the nominal value, [math]\displaystyle{ 1 - \alpha }[/math] .[3]
Like the normal interval, the interval can be computed directly from a formula.
Wilson started with the normal approximation to the binomial: [math]\displaystyle{ z \approx \frac{~\left(\,p - \hat{p}\,\right)~}{\sigma_n} }[/math] with the analytic formula for the sample standard deviation given by
[math]\displaystyle{ \sigma_n = \sqrt{\,\frac{\,p\left(1-p\right)\,}{n}~}~. }[/math]
Combining the two, and squaring out the radical, gives an equation that is quadratic in [math]\displaystyle{ p }[/math]: [math]\displaystyle{ \left(\, \hat{p} - p \,\right)^{2} = z^{2}\cdot\frac{\,p\left(1-p\right)\,}{n} }[/math] Transforming the relation into a standard-form quadratic equation for [math]\displaystyle{ p }[/math], treating [math]\displaystyle{ \hat p }[/math] and [math]\displaystyle{ n }[/math] as known values from the sample (see prior section), and using the value of [math]\displaystyle{ z }[/math] that corresponds to the desired confidence for the estimate of [math]\displaystyle{ p }[/math] gives this:
[math]\displaystyle{ \left( 1 + \frac{\,z^2\,}{n} \right) p^2 + \left( - 2 {\hat p} - \frac{\,z^2\,}{n} \right) p + \biggl( {\hat p}^2 \biggr) = 0 ~, }[/math]
where all of the values in parentheses are known quantities.
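As a minimal sketch of this step, the quadratic above can be solved with the ordinary quadratic formula. The function name `wilson_bounds_from_quadratic` and the default `z = 1.96` are illustrative assumptions, not part of the quoted source.

```python
import math

def wilson_bounds_from_quadratic(p_hat, n, z=1.96):
    """Roots of (1 + z^2/n) p^2 + (-2 p_hat - z^2/n) p + p_hat^2 = 0.

    Hypothetical helper for illustration; z = 1.96 is an assumed default
    (roughly 95% confidence).
    """
    a = 1.0 + z * z / n
    b = -2.0 * p_hat - z * z / n
    c = p_hat * p_hat
    # The discriminant equals 4 z^2 p_hat (1 - p_hat) / n + z^4 / n^2,
    # which is non-negative for 0 <= p_hat <= 1; max() guards rounding.
    disc = max(b * b - 4.0 * a * c, 0.0)
    root = math.sqrt(disc)
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)

# Example: 6 successes in 20 trials, so p_hat = 0.3.
lower, upper = wilson_bounds_from_quadratic(p_hat=0.3, n=20)
print(lower, upper)
```

The two roots are the lower and upper limits of the interval; the smaller root always lies at or below the observed proportion and the larger one at or above it.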
The solution for [math]\displaystyle{ p }[/math] estimates the upper and lower limits of the confidence interval for [math]\displaystyle{ p }[/math]. Hence the probability of success is estimated by: [math]\displaystyle{ p \approx (w^- , w^+ ) = \frac{1}{~1+\frac{\,z^2\,}{n}~}\left( \hat p+\frac{\,z^2\,}{2n} \right) ~ \pm ~ \frac{z}{~1+\frac{z^2}{n}~}\sqrt{\frac{\,\hat p(1-\hat p)\,}{n}+\frac{\,z^2\,}{4n^2}~} ~ }[/math] or the equivalent: [math]\displaystyle{ p \approx \frac{~ n_S + \tfrac{1}{2} z^2 ~}{ n + z^2 } ~ \pm ~ \frac{z}{n + z^2} \sqrt{ \frac{~n_S \, n_F~}{n} + \frac{z^2}{4} ~ }~. }[/math] The practical observation from using this interval is that it has good properties even for a small number of trials and/or an extreme probability.
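The closed-form expression above translates directly into code. The following is a sketch under stated assumptions: the function name `wilson_interval`, its argument names, and the default `z = 1.96` are chosen here for illustration only.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Closed-form Wilson score interval (w-, w+) for successes out of n.

    Hypothetical helper; z is the normal quantile for the desired
    confidence level (1.96 is assumed here for roughly 95%).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)
    )
    return center - half_width, center + half_width

# Zero successes in 10 trials: the interval still has positive width,
# unlike the normal approximation interval, which collapses to a point.
print(wilson_interval(0, 10))
print(wilson_interval(6, 20))
```

With zero observed successes the lower limit is 0 and the upper limit reduces to z²/(n + z²), which illustrates the "good properties for an extreme probability" mentioned in the text.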
Intuitively, the center value of this interval is the weighted average of [math]\displaystyle{ \hat{p} }[/math] and [math]\displaystyle{ \tfrac{1}{2} }[/math], with [math]\displaystyle{ \hat{p} }[/math] receiving greater weight as the sample size increases. Formally, the center value corresponds to using a pseudocount of [math]\displaystyle{ \tfrac{1}{2} z^2 }[/math], where [math]\displaystyle{ z }[/math] is the number of standard deviations of the confidence interval: add this number to both the count of successes and of failures to yield the estimate of the ratio. For the common interval of two standard deviations in each direction (approximately 95% coverage, which itself corresponds to approximately 1.96 standard deviations), this yields the estimate [math]\displaystyle{ (n_S+2)/(n+4) }[/math], which is known as the "plus four rule".
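The two descriptions of the center value can be checked against each other numerically. In this sketch the function names are hypothetical, and the weight n/(n + z²) on the observed proportion is derived from the interval formula rather than stated in the quoted text.

```python
def wilson_center(successes, n, z=1.96):
    """Center as a pseudocount estimate: add z^2/2 successes and z^2/2 failures."""
    return (successes + 0.5 * z * z) / (n + z * z)

def weighted_average_center(successes, n, z=1.96):
    """Center as a weighted average of p_hat and 1/2.

    The weight on p_hat, n / (n + z^2), grows toward 1 as n increases.
    """
    weight = n / (n + z * z)
    return weight * (successes / n) + (1.0 - weight) * 0.5

# With z = 2 the pseudocount form is the "plus four rule": (n_S + 2) / (n + 4).
print(wilson_center(6, 20, z=2.0))            # same value as (6 + 2) / (20 + 4)
print(weighted_average_center(6, 20, z=2.0))
```

Both functions compute the same quantity; the second form makes explicit how the estimate is pulled toward 1/2 for small samples.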
Although the quadratic can be solved explicitly, in most cases Wilson's equations can also be solved numerically using the fixed-point iteration: [math]\displaystyle{ p_{k+1}=\hat{p} \pm z\cdot\sqrt{\frac{ p_k \cdot \left( 1 - p_k \right)}{n}} }[/math] with [math]\displaystyle{ p_0 = \hat{p} }[/math].
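A minimal sketch of the fixed-point iteration follows. The function name, the `sign` argument, the tolerance, the iteration cap, and the clamping of iterates to [0, 1] are all assumptions added for illustration. The iteration is not guaranteed to converge in every case (for example, it stays at 0 when the observed proportion is 0), so the sketch raises an error rather than returning a doubtful value.

```python
import math

def wilson_fixed_point(p_hat, n, z=1.96, sign=+1, tol=1e-12, max_iter=500):
    """Iterate p_{k+1} = p_hat + sign * z * sqrt(p_k (1 - p_k) / n), p_0 = p_hat.

    sign=+1 targets the upper limit, sign=-1 the lower limit.
    Hypothetical helper; tol and max_iter are arbitrary illustrative choices.
    """
    p = p_hat
    for _ in range(max_iter):
        # Clamp so the square root stays real if an iterate leaves [0, 1].
        q = min(max(p, 0.0), 1.0)
        p_next = p_hat + sign * z * math.sqrt(q * (1.0 - q) / n)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("fixed-point iteration did not converge")

lower = wilson_fixed_point(0.3, 20, sign=-1)
upper = wilson_fixed_point(0.3, 20, sign=+1)
print(lower, upper)
```

For 6 successes in 20 trials both iterations converge to the same limits as the explicit solution of the quadratic.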
The Wilson interval can also be derived from the single sample z-test or Pearson's chi-squared test with two categories. The resulting interval, [math]\displaystyle{ \left\{ \theta \,\,\bigg|\,\, -z \le \frac{\hat{p} - \theta}{\sqrt{\tfrac{1}{n} \theta(1 - \theta)}} \le z \right\}, }[/math] can then be solved for [math]\displaystyle{ \theta }[/math] to produce the Wilson score interval. The test in the middle of the inequality is a score test.
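The test-inversion view can be illustrated by brute force: scan candidate values of θ and keep those the score test does not reject. This is only a sketch; the function names, the grid resolution, and the use of a symmetric cutoff ±z are assumptions made for the example, and a grid scan is far less efficient than the closed-form solution.

```python
import math

def score_statistic(theta, p_hat, n):
    """Score test statistic for H0: p = theta, using the null standard error."""
    return (p_hat - theta) / math.sqrt(theta * (1.0 - theta) / n)

def invert_score_test(p_hat, n, z=1.96, grid_size=10_000):
    """Approximate the Wilson interval as the set of theta not rejected at cutoff z.

    Hypothetical helper; accuracy is limited to about 1 / grid_size.
    """
    accepted = [
        i / grid_size
        for i in range(1, grid_size)  # skip theta = 0 and 1 (zero standard error)
        if -z <= score_statistic(i / grid_size, p_hat, n) <= z
    ]
    return min(accepted), max(accepted)

print(invert_score_test(0.3, 20))
```

The smallest and largest accepted values agree with the Wilson limits to within the grid spacing.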
2021
- (O'Neill, 2021) ⇒ Barry O'Neill. (2021). “Mathematical properties and finite-population correction for the Wilson score interval.” In: arXiv preprint arXiv:2109.12464. arXiv.org.
- NOTE: This paper focuses on the mathematical properties of the Wilson score interval, specifically its application to finite populations. It discusses the generalization of the Wilson score interval, aiming to maintain its core properties while adapting it for broader statistical use.
1998
- (Newcombe, 1998) ⇒ Robert G. Newcombe. (1998). “Interval estimation for the difference between independent proportions: comparison of eleven methods.” In: Statistics in Medicine. Wiley Online Library.
- NOTE: This study addresses the construction of interval estimates for differences between independent proportions, highlighting the effectiveness of Wilson score intervals. It provides a comparative analysis of eleven different methods, underscoring the advantages and potential applications of Wilson score intervals in various statistical scenarios.
1998
- (Agresti & Coull, 1998) ⇒ Alan Agresti, and Bradley A. Coull. (1998). “Approximate is better than ‘exact’ for interval estimation of binomial proportions.” In: The American Statistician. Taylor & Francis.
- NOTE: Agresti and Coull's paper argues that approximate methods are superior to exact methods for interval estimation of binomial proportions. It emphasizes the effectiveness of the score interval, especially the 95% score interval, in comparison to other methods such as the adjusted Wald intervals. The paper advocates for the use of the score interval in both teaching and practical statistical applications.
1927
- (Wilson, 1927) ⇒ Edwin Bidwell Wilson. (1927). “Probable Inference, the Law of Succession, and Statistical Inference.” In: Journal of the American Statistical Association. https://doi.org/10.1080/01621459.1927.10502953