Central Limit Theorem
A Central Limit Theorem is a probability theorem which states that, for a sample [math]\displaystyle{ X }[/math] of [math]\displaystyle{ n }[/math] independent and identically distributed random variables (each with expected value [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \sigma^2 }[/math]), with sufficiently large sample size [math]\displaystyle{ n }[/math], the probability distribution of the sample mean [math]\displaystyle{ \bar X }[/math] is approximately normal (with mean [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \frac{1}{n} \sigma^2 }[/math]), and the distribution of the sample total is approximately normal with mean [math]\displaystyle{ n\mu }[/math] and variance [math]\displaystyle{ n\sigma^2 }[/math].
- AKA: CLT.
- Context:
- It can be used to determine the shape (normal) of the Sampling Distribution, along with its center (mean) and spread (standard error).
- …
- Example(s):
- In an experiment of throwing a die, let us say 1000 samples [math]\displaystyle{ S_1, S_2, S_3,\dots\, S_{1000} }[/math] have been taken, each sample [math]\displaystyle{ S_i }[/math] of sample size [math]\displaystyle{ n=5 }[/math]. That is, if a die is thrown five times and the outcomes are 1, 3, 3, 4, 2, then [math]\displaystyle{ S_1=[1,3,3,4,2] }[/math]. Similarly, if the next five throws come up 1, 1, 2, 6, 6, then [math]\displaystyle{ S_2=[1,1,2,6,6] }[/math], and so on. Writing out the samples and their respective means, we get
[math]\displaystyle{ S_1=[1,3,3,4,2]; \mu_1=2.6 }[/math](mean of [math]\displaystyle{ S_1 }[/math])
[math]\displaystyle{ S_2=[1,1,2,6,6]; \mu_2=3.2 }[/math](mean of [math]\displaystyle{ S_2 }[/math])
[math]\displaystyle{ S_3=[1,6,5,2,4]; \mu_3=3.6 }[/math](mean of [math]\displaystyle{ S_3 }[/math])
…
[math]\displaystyle{ S_{1000}=[1,1,4,6,6]; \mu_{1000}=3.6 }[/math](mean of [math]\displaystyle{ S_{1000} }[/math])
By plotting all the sample means, with the mean values ([math]\displaystyle{ \mu_i }[/math]) along the x-axis and their frequencies along the y-axis, it can be observed that the sampling distribution looks somewhat normal. If the size of the samples increases from [math]\displaystyle{ n=5 }[/math] to [math]\displaystyle{ n=20 }[/math], the distribution will be closer to a normal distribution. If the sample size increases to [math]\displaystyle{ n=100 }[/math], the sampling distribution will be closer still to a normal distribution than in the previous two cases. So as the sample size [math]\displaystyle{ n\to\infty }[/math], the sampling distribution becomes a perfect normal distribution (mean, median and mode all the same). This is what is called the Central Limit Theorem.
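The die experiment above can be sketched in a few lines of Python using only the standard library (the function name `sample_means` and the sample counts are illustrative choices, not from the source). A fair die has mean [math]\displaystyle{ \mu = 3.5 }[/math] and variance [math]\displaystyle{ \sigma^2 = 35/12 \approx 2.92 }[/math], so the sample means should center on 3.5 and their spread should shrink like [math]\displaystyle{ \sigma^2/n }[/math] as [math]\displaystyle{ n }[/math] grows:

```python
import random
import statistics

random.seed(42)

def sample_means(num_samples, n):
    """Simulate `num_samples` samples of `n` fair-die throws; return their means."""
    return [statistics.mean(random.randint(1, 6) for _ in range(n))
            for _ in range(num_samples)]

means_n5 = sample_means(1000, 5)
means_n100 = sample_means(1000, 100)

# Both sampling distributions center on mu = 3.5 ...
print(statistics.mean(means_n5), statistics.mean(means_n100))
# ... but the spread shrinks with n, so n=100 is far more concentrated than n=5.
print(statistics.pstdev(means_n5) > statistics.pstdev(means_n100))
```

Plotting a histogram of `means_n100` would show the increasingly bell-shaped distribution described above.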
- Counter-Example(s):
- See: Statistical Independence, Random Variate, Probability Distribution, Identically Distributed, Weak Convergence of Measures, Independent And Identically Distributed Random Variables, Attractor.
References
2014
- (Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/Central_limit_theorem Retrieved:2014-5-14.
- In probability theory, the central limit theorem (CLT) states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed. That is, suppose that a sample is obtained containing a large number of observations, each observation being randomly generated in a way that does not depend on the values of the other observations, and that the arithmetic average of the observed values is computed. If this procedure is performed many times, the central limit theorem says that the computed values of the average will be distributed according to the normal distribution (commonly known as a "bell curve").
The central limit theorem has a number of variants. In its common form, the random variables must be identically distributed. In variants, convergence of the mean to the normal distribution also occurs for non-identical distributions, given that they comply with certain conditions.
In more general probability theory, a central limit theorem is any of a set of weak-convergence theorems. They all express the fact that a sum of many independent and identically distributed (i.i.d.) random variables, or alternatively, random variables with specific types of dependence, will tend to be distributed according to one of a small set of attractor distributions. When the variance of the i.i.d. variables is finite, the attractor distribution is the normal distribution. In contrast, the sum of a number of i.i.d. random variables with power law tail distributions decreasing as |x|−α−1 where 0 < α < 2 (and therefore having infinite variance) will tend to an alpha-stable distribution with stability parameter (or index of stability) of α as the number of variables grows.
2014
- (Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/Probability_theory#Central_limit_theorem Retrieved:2014-5-14.
- "The central limit theorem (CLT) is one of the great results of mathematics." (Chapter 18 in [1] )
It explains the ubiquitous occurrence of the normal distribution in nature.
The theorem states that the average of many independent and identically distributed random variables with finite variance tends towards a normal distribution irrespective of the distribution followed by the original random variables. Formally, let [math]\displaystyle{ X_1,X_2,\dots\, }[/math] be independent random variables with mean [math]\displaystyle{ \mu\, }[/math] and variance [math]\displaystyle{ \sigma^2 \gt 0.\, }[/math] Then the sequence of random variables :[math]\displaystyle{ Z_n=\frac{\sum_{i=1}^n (X_i - \mu)}{\sigma\sqrt{n}}\, }[/math]
converges in distribution to a standard normal random variable.
Notice that for some classes of random variables the classic central limit theorem works rather fast (Berry–Esseen theorem), for example the distributions with finite first, second and third moment from the exponential family. On the other hand, for some random variables of the heavy tail and fat tail variety it works very slowly, or may not work at all: in such cases one may use the [[Stable distribution#A generalized central limit theorem|Generalized Central Limit Theorem]] (GCLT).
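The standardized sum [math]\displaystyle{ Z_n }[/math] above can be checked numerically. As an illustrative sketch (not from the source), the following uses Exponential(1) variables, which are strongly skewed but have finite variance with [math]\displaystyle{ \mu = \sigma = 1 }[/math]; the function name `z_n` and the trial counts are assumptions for the demo:

```python
import random
import statistics

random.seed(0)

def z_n(n, trials=2000):
    """Draw `trials` realizations of Z_n = (sum(X_i) - n*mu) / (sigma*sqrt(n))
    for Exponential(1) variables, which have mu = 1 and sigma = 1."""
    mu, sigma = 1.0, 1.0
    out = []
    for _ in range(trials):
        s = sum(random.expovariate(1.0) for _ in range(n))
        out.append((s - n * mu) / (sigma * n ** 0.5))
    return out

z = z_n(200)
# Z_n should be close to standard normal: mean near 0, standard deviation near 1,
# even though the underlying exponential distribution is far from bell-shaped.
print(statistics.mean(z), statistics.pstdev(z))
```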
- "The central limit theorem (CLT) is one of the great results of mathematics." (Chapter 18 in [1] )
- ↑ David Williams, "Probability with martingales", Cambridge 1991/2008
2008
- (Upton & Cook, 2008) ⇒ Graham Upton, and Ian Cook. (2008). “A Dictionary of Statistics, 2nd edition revised." Oxford University Press. ISBN:0199541450
- QUOTE: A theorem, proposed by Laplace, explaining the importance of the normal distribution in Statistics. Let [math]\displaystyle{ X_1, X_2, \dots, X_n }[/math] be independent random variables each having the same distribution, with mean [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \sigma^2 }[/math]. Let [math]\displaystyle{ \bar X }[/math], given by [math]\displaystyle{ \bar X = \frac{1}{n}(X_1 + X_2 + \dots + X_n) }[/math], denote the sample mean. The central limit theorem states that, for large [math]\displaystyle{ n }[/math], the distribution of [math]\displaystyle{ \bar X }[/math] is approximately a normal distribution with mean [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \frac{1}{n} \sigma^2 }[/math]. Thus, for a large random sample of observations from a distribution with mean [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \sigma^2 }[/math], the distribution of the sample mean is approximately normal with mean [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \frac{1}{n} \sigma^2 }[/math], and the distribution of the sample total is approximately normal with mean [math]\displaystyle{ n\mu }[/math] and variance [math]\displaystyle{ n\sigma^2 }[/math]. The phrase “central limit theorem” appears in a 1919 article by von Mises.
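The sample-total claim in the definition above (total approximately normal with mean [math]\displaystyle{ n\mu }[/math] and variance [math]\displaystyle{ n\sigma^2 }[/math]) can also be checked by simulation. A minimal sketch, again using die throws (mean 3.5, variance 35/12), with the trial count chosen arbitrarily for the demo:

```python
import random
import statistics

random.seed(7)

# Totals of n=50 draws from Uniform{1..6}: mu = 3.5, sigma^2 = 35/12.
n, mu, var = 50, 3.5, 35 / 12
totals = [sum(random.randint(1, 6) for _ in range(n)) for _ in range(5000)]

# The totals center on n*mu = 175 with variance near n*sigma^2 ~ 145.8.
print(statistics.mean(totals), statistics.pvariance(totals))
```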