In probability theory, the central limit theorem states that, under certain circumstances, the probability distribution of the scaled mean of a random sample converges to a normal distribution as the sample size increases to infinity. Under stronger assumptions, the Berry–Esseen theorem, or Berry–Esseen inequality, gives a more quantitative result: it also specifies the rate at which this convergence takes place by bounding the maximal error of approximation between the normal distribution and the true distribution of the scaled sample mean. The approximation is measured by the Kolmogorov–Smirnov distance. In the case of independent samples, the convergence rate is $n^{-1/2}$, where $n$ is the sample size, and the constant is estimated in terms of the third absolute normalized moment.
Statement of the theorem
Statements of the theorem vary, as it was independently discovered by two mathematicians, Andrew C. Berry (in 1941) and Carl-Gustav Esseen (1942), who then, along with other authors, refined it repeatedly over subsequent decades.
Identically distributed summands
One version, sacrificing generality somewhat for the sake of clarity, is the following:
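There exists a positive constant $C$ such that if $X_1, X_2, \ldots$ are i.i.d. random variables with $\operatorname{E}(X_1) = 0$, $\operatorname{E}(X_1^2) = \sigma^2 > 0$, and $\operatorname{E}(|X_1|^3) = \rho < \infty$,[note 1] and if we define
$$Y_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
the sample mean, with $F_n$ the cumulative distribution function of $Y_n \sqrt{n} / \sigma$ and $\Phi$ the cumulative distribution function of the standard normal distribution, then for all $x$ and $n$,
$$|F_n(x) - \Phi(x)| \le \frac{C \rho}{\sigma^3 \sqrt{n}}. \qquad (1)$$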
That is: given a sequence of independent and identically distributed random variables, each having mean zero and positive variance, if additionally the third absolute moment is finite, then the cumulative distribution functions of the standardized sample mean and the standard normal distribution differ (vertically, on a graph) by no more than the specified amount. Note that the approximation error for every $n$ (and hence the limiting rate of convergence as $n$ grows) is bounded by the order of $n^{-1/2}$.
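The statement can be checked numerically in a case where the distribution of the sum is known exactly. The following is a minimal sketch, assuming centered Bernoulli($p$) summands, the usual form of the bound $C\rho/(\sigma^3\sqrt{n})$, and the value $C = 0.4748$ quoted in this article; the function names are illustrative and not taken from any library.

```python
import math


def normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binomial_pmf(n: int, k: int, p: float) -> float:
    """P(Bin(n, p) = k), computed in log space to avoid overflow."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def kolmogorov_distance(n: int, p: float) -> float:
    """sup_x |F_n(x) - Phi(x)| for the standardized sum of n Bernoulli(p) draws.

    F_n is a step function and Phi is continuous and increasing, so the
    supremum is attained at a jump point, either just before or at the jump.
    """
    sd = math.sqrt(n * p * (1.0 - p))
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - n * p) / sd)
        worst = max(worst, abs(cdf - phi))   # left limit at the jump
        cdf += binomial_pmf(n, k, p)
        worst = max(worst, abs(cdf - phi))   # value at the jump
    return worst


def berry_esseen_bound(n: int, p: float, c: float = 0.4748) -> float:
    """C * rho / (sigma^3 * sqrt(n)) for centered Bernoulli(p) summands."""
    sigma = math.sqrt(p * (1.0 - p))
    # Third absolute moment of X - p, where X ~ Bernoulli(p).
    rho = p * (1.0 - p) * ((1.0 - p) ** 2 + p ** 2)
    return c * rho / (sigma ** 3 * math.sqrt(n))


for n in (10, 100, 1000):
    print(n, kolmogorov_distance(n, 0.5), berry_esseen_bound(n, 0.5))
```

In this sketch the exact distance stays below the bound for each $n$, and both shrink at the rate $n^{-1/2}$.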
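Calculated upper bounds on the constant $C$ have decreased markedly over the years, from the original value of 7.59 by Esseen in 1942.[1] The estimate $C < 0.4748$ follows from the inequality
$$\sup_{x} |F_n(x) - \Phi(x)| \le \frac{0.33554\,(\rho + 0.415\,\sigma^3)}{\sigma^3 \sqrt{n}},$$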
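since $\sigma^3 \le \rho$ and $0.33554 \cdot 1.415 < 0.4748$. However, if $\rho \ge 1.286\,\sigma^3$, then the estimate
$$\sup_{x} |F_n(x) - \Phi(x)| \le \frac{0.3328\,(\rho + 0.429\,\sigma^3)}{\sigma^3 \sqrt{n}}$$
gives an even tighter upper bound.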
Esseen (1956) proved that the constant also satisfies the lower bound
$$C \ge \frac{\sqrt{10} + 3}{6\sqrt{2\pi}} \approx 0.40973.$$
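The arithmetic behind these constants is easy to reproduce. The sketch below assumes the upper estimate has the form $0.33554\,(\rho + 0.415\,\sigma^3)/(\sigma^3\sqrt{n})$, as implied by the product $0.33554 \cdot 1.415$ in the text, and takes Esseen's lower bound to be $(\sqrt{10}+3)/(6\sqrt{2\pi})$; the helper `effective_constant` is a hypothetical name introduced only for illustration.

```python
import math


def effective_constant(ratio: float) -> float:
    """Multiplier of rho / (sigma^3 * sqrt(n)) implied by the upper estimate.

    `ratio` is rho / sigma^3, which is always at least 1.
    """
    if ratio < 1.0:
        raise ValueError("rho / sigma^3 cannot be smaller than 1")
    return 0.33554 * (ratio + 0.415) / ratio


# Worst case rho = sigma^3 gives the universal upper estimate for C.
upper_estimate = effective_constant(1.0)

# Esseen's (1956) lower bound on C.
lower_bound = (math.sqrt(10.0) + 3.0) / (6.0 * math.sqrt(2.0 * math.pi))

print(f"upper estimate: {upper_estimate:.7f}")
print(f"lower bound:    {lower_bound:.7f}")
```

The multiplier is largest at $\rho = \sigma^3$ and decreases toward 0.33554 as $\rho/\sigma^3$ grows, which is why the worst case determines the estimate for $C$.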
Non-identically distributed summands
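Let $X_1, X_2, \ldots$ be independent random variables with $\operatorname{E}(X_i) = 0$, $\operatorname{E}(X_i^2) = \sigma_i^2 > 0$, and $\operatorname{E}(|X_i|^3) = \rho_i < \infty$. Also, let
$$S_n = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2}}$$
be the normalized $n$-th partial sum. Denote by $F_n$ the cdf of $S_n$, and by $\Phi$ the cdf of the standard normal distribution. For the sake of convenience denote
$$\vec{\sigma} = (\sigma_1, \ldots, \sigma_n), \qquad \vec{\rho} = (\rho_1, \ldots, \rho_n).$$
In 1941, Andrew C. Berry proved that there exists an absolute constant $C_1$ such that, for all $n$,
$$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le C_1 \cdot \psi_1, \qquad (2)$$
where
$$\psi_1 = \psi_1(\vec{\sigma}, \vec{\rho}) = \Big( \sum_{i=1}^{n} \sigma_i^2 \Big)^{-1/2} \cdot \max_{1 \le i \le n} \frac{\rho_i}{\sigma_i^2}.$$
Independently, in 1942, Carl-Gustav Esseen proved that there exists an absolute constant $C_0$ such that, for all $n$,
$$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le C_0 \cdot \psi_0, \qquad (3)$$
where
$$\psi_0 = \psi_0(\vec{\sigma}, \vec{\rho}) = \Big( \sum_{i=1}^{n} \sigma_i^2 \Big)^{-3/2} \cdot \sum_{i=1}^{n} \rho_i.$$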
It is easy to verify that $\psi_0 \le \psi_1$. For this reason, inequality (3) is conventionally called the Berry–Esseen inequality, and the quantity $\psi_0$ is called the Lyapunov fraction of the third order. Moreover, in the case where the summands $X_1, \ldots, X_n$ have identical distributions,
$$\psi_0 = \psi_1 = \frac{\rho}{\sigma^3 \sqrt{n}},$$
and thus the bounds stated by inequalities (1), (2) and (3) coincide apart from the constant.
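The relation between the two quantities can be illustrated with a short computation. This is a sketch under the assumption that $\psi_0 = (\sum_i \sigma_i^2)^{-3/2} \sum_i \rho_i$ (the Lyapunov fraction) and $\psi_1 = (\sum_i \sigma_i^2)^{-1/2} \max_i \rho_i/\sigma_i^2$; the function names and the sample moments are made up for illustration.

```python
import math
from typing import Sequence


def psi0(sigmas: Sequence[float], rhos: Sequence[float]) -> float:
    """Lyapunov fraction of the third order (Esseen's quantity)."""
    total_variance = sum(s * s for s in sigmas)
    return sum(rhos) / total_variance ** 1.5


def psi1(sigmas: Sequence[float], rhos: Sequence[float]) -> float:
    """Berry's quantity: largest ratio rho_i / sigma_i^2, suitably normalized."""
    total_variance = sum(s * s for s in sigmas)
    largest_ratio = max(r / (s * s) for s, r in zip(sigmas, rhos))
    return largest_ratio / math.sqrt(total_variance)


# Non-identical summands (illustrative values with rho_i >= sigma_i^3).
sigmas = [1.0, 2.0, 0.5]
rhos = [1.5, 9.0, 0.2]
print(psi0(sigmas, rhos), psi1(sigmas, rhos))

# Identical summands: both quantities reduce to rho / (sigma^3 * sqrt(n)).
n, sigma, rho = 4, 2.0, 10.0
print(psi0([sigma] * n, [rho] * n), psi1([sigma] * n, [rho] * n))
```

The inequality $\psi_0 \le \psi_1$ follows from $\sum_i \rho_i = \sum_i \sigma_i^2 \cdot (\rho_i/\sigma_i^2) \le \max_i (\rho_i/\sigma_i^2) \cdot \sum_i \sigma_i^2$, which is what the first pair of printed values reflects.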
Regarding $C_0$, the lower bound established by Esseen (1956) remains valid:
$$C_0 \ge \frac{\sqrt{10} + 3}{6\sqrt{2\pi}} = 0.4097\ldots$$
The lower bound is exactly reached only for certain Bernoulli distributions (see Esseen (1956) for their explicit expressions).
The upper bounds for C0 were subsequently lowered from Esseen's original estimate 7.59 to 0.5600.[3]
Sum of a random number of random variables
Berry–Esseen theorems exist for the sum of a random number of random variables. The following is Theorem 1 from Korolev (1989), substituting in the constants from Remark 3.[4] It is only a portion of the results that they established:
Let be independent, identically distributed random variables with , , . Let be a non-negative integer-valued random variable, independent from . Let , and define
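Multidimensional version
Let $X_1, \ldots, X_n$ be independent $\mathbb{R}^d$-valued random vectors each having mean zero. Write $S_n = \sum_{i=1}^{n} X_i$ and assume $\Sigma_n = \operatorname{Cov}[S_n]$ is invertible. Let $Z_n \sim \mathcal{N}(0, \Sigma_n)$ be a $d$-dimensional Gaussian with the same mean and covariance matrix as $S_n$. Then for all convex sets $U \subseteq \mathbb{R}^d$,
$$\big| \Pr[S_n \in U] - \Pr[Z_n \in U] \big| \le C d^{1/4} \gamma_n,$$
where $C$ is a universal constant and $\gamma_n = \sum_{i=1}^{n} \operatorname{E}\big[ \| \Sigma_n^{-1/2} X_i \|_2^3 \big]$ (the third power of the $L^2$ norm).
The dependency on $d^{1/4}$ is conjectured to be optimal, but might not be.[6]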
Durrett, Richard (1991). Probability: Theory and Examples. Pacific Grove, CA: Wadsworth & Brooks/Cole. ISBN 0-534-13206-5.
Esseen, Carl-Gustav (1942). "On the Liapunoff limit of error in the theory of probability". Arkiv för Matematik, Astronomi och Fysik. A28: 1–19. ISSN 0365-4133.
Esseen, Carl-Gustav (1956). "A moment inequality with an application to the central limit theorem". Skand. Aktuarietidskr. 39: 160–170.
Feller, William (1972). An Introduction to Probability Theory and Its Applications, Volume II (2nd ed.). New York: John Wiley & Sons. ISBN 0-471-25709-5.
Korolev, V. Yu.; Shevtsova, I. G. (2010a). "On the upper bound for the absolute constant in the Berry–Esseen inequality". Theory of Probability and Its Applications. 54 (4): 638–658. doi:10.1137/S0040585X97984449.
Korolev, Victor; Shevtsova, Irina (2010b). "An improvement of the Berry–Esseen inequality with applications to Poisson and mixed Poisson random sums". Scandinavian Actuarial Journal. 2012 (2): 1–25. arXiv:0912.2795. doi:10.1080/03461238.2010.485370. S2CID 115164568.
Manoukian, Edward B. (1986). Modern Concepts and Theorems of Mathematical Statistics. New York: Springer-Verlag. ISBN 0-387-96186-0.
Serfling, Robert J. (1980). Approximation Theorems of Mathematical Statistics. New York: John Wiley & Sons. ISBN 0-471-02403-1.
Shevtsova, I. G. (2008). "On the absolute constant in the Berry–Esseen inequality". The Collection of Papers of Young Scientists of the Faculty of Computational Mathematics and Cybernetics (5): 101–110.
Shevtsova, Irina (2007). "Sharpening of the upper bound of the absolute constant in the Berry–Esseen inequality". Theory of Probability and Its Applications. 51 (3): 549–553. doi:10.1137/S0040585X97982591.
Shevtsova, Irina (2010). "An Improvement of Convergence Rate Estimates in the Lyapunov Theorem". Doklady Mathematics. 82 (3): 862–864. doi:10.1134/S1064562410060062. S2CID 122973032.
Shevtsova, Irina (2011). "On the absolute constants in the Berry–Esseen type inequalities for identically distributed summands". arXiv:1111.6554 [math.PR].
Tyurin, I. S. (2010). "An improvement of upper estimates of the constants in the Lyapunov theorem". Russian Mathematical Surveys. 65 (3(393)): 201–202. doi:10.1070/RM2010v065n03ABEH004688. S2CID 118771013.