Let $X_l$ denote a sequence of independent,
identically distributed, random variables. Assuming they have
zero means and finite variances (equaling $\sigma^2$), the Central Limit Theorem states that the sum
$\sum_{l=1}^{L} X_l / \sqrt{L}$
converges in distribution to a Gaussian random variable.
$$\frac{1}{\sqrt{L}} \sum_{l=1}^{L} X_l \xrightarrow{L \to \infty} \mathcal{N}(0, \sigma^2)$$
Because of its generality, this theorem is often used to
simplify calculations involving finite sums
of non-Gaussian random variables. However, attention is seldom
paid to the convergence rate of the Central Limit
Theorem. Kolmogorov, the famous twentieth-century
mathematician, is reputed to have said, "The Central Limit
Theorem is a dangerous tool in the hands of amateurs." Let's
see what he meant.
Taking $\sigma^2 = 1$, the key result is that the magnitude of the
difference between $P(x)$, defined to be the probability that the sum given
above exceeds $x$, and $Q(x)$, the probability that a unit-variance Gaussian random
variable exceeds $x$, is bounded by
a quantity inversely related to the square root of
$L$ (Cramér: Theorem 24).
$$|P(x) - Q(x)| \le c \, \frac{E\left[|X|^3\right]}{\sigma^3} \, \frac{1}{\sqrt{L}}$$
The constant of proportionality $c$ is a number known to be about 0.8 (Hall: p. 6).
The ratio of the absolute third moment of $X_l$ to the cube of its standard
deviation, known as the skew and denoted by $\gamma_X$, depends only on the
distribution of $X_l$ and is independent of scale. This
bound on the absolute error has been shown to be tight (Cramér: pp. 79ff). Using our lower bound for
$Q(\cdot)$ (see (Reference)), we find
that the relative error in the Central Limit Theorem
approximation to the distribution of finite sums is bounded for
$x > 0$ as
$$\frac{|P(x) - Q(x)|}{Q(x)} \le c \, \gamma_X \sqrt{\frac{2\pi}{L}} \, e^{+x^2/2} \begin{cases} 2 & \text{if } x \le 1 \\ \dfrac{1 + x^2}{x} & \text{if } x > 1 \end{cases} \tag{1}$$
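As a rough numerical check of the two bounds, here is a minimal Python sketch. It rests on assumptions that are not in the text: $c$ is fixed at 0.8, and the $X_l$ are fair $\pm 1$ variables (so $\sigma = 1$ and $\gamma_X = 1$), chosen only because $P(x)$ can then be computed exactly from the binomial distribution. All function names are hypothetical.

```python
import math

C_ASSUMED = 0.8  # assumption: the "about 0.8" value of c quoted in the text


def q_function(x):
    """Q(x): probability that a zero-mean, unit-variance Gaussian exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def rademacher_tail(x, L):
    """Exact P(x) for a sum of L fair +/-1 variables scaled by 1/sqrt(L)."""
    # With k of the L terms equal to +1, the unscaled sum is 2k - L,
    # and k follows a Binomial(L, 1/2) distribution.
    threshold = x * math.sqrt(L)
    favorable = sum(math.comb(L, k) for k in range(L + 1) if 2 * k - L > threshold)
    return favorable / 2 ** L


def absolute_bound(gamma_x, L, c=C_ASSUMED):
    """Bound on |P(x) - Q(x)|: c * gamma_X / sqrt(L)."""
    return c * gamma_x / math.sqrt(L)


def relative_bound(x, gamma_x, L, c=C_ASSUMED):
    """Right side of equation (1); only meaningful for x > 0."""
    shape = 2.0 if x <= 1.0 else (1.0 + x * x) / x
    return c * gamma_x * math.sqrt(2.0 * math.pi / L) * math.exp(x * x / 2.0) * shape


if __name__ == "__main__":
    L = 100
    gamma_x = 1.0  # E|X|^3 / sigma^3 for a fair +/-1 variable
    for x in (0.5, 1.0, 2.0, 3.0):
        p, q = rademacher_tail(x, L), q_function(x)
        print(f"x={x:.1f}  |P-Q|={abs(p - q):.4f}  "
              f"abs bound={absolute_bound(gamma_x, L):.4f}  "
              f"rel err={abs(p - q) / q:.3f}  "
              f"rel bound={relative_bound(x, gamma_x, L):.3f}")
```

For this particular distribution the observed errors should sit well inside both bounds. The point to notice is how quickly the relative bound grows with $x$ while the absolute bound stays fixed.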
Suppose we require that the relative error not exceed some
specific value
$\varepsilon$. The
normalized (by the standard deviation) boundary
$x$ at which the approximation is
evaluated must not violate
$$\frac{L \varepsilon^2}{2\pi c^2 \gamma_X^2} \ge e^{x^2} \begin{cases} 4 & \text{if } x \le 1 \\ \left( \dfrac{1 + x^2}{x} \right)^2 & \text{if } x > 1 \end{cases}$$
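The intermediate step can be sketched as follows. The shorthand $f(x)$ is introduced here only for illustration: it equals $2$ for $x \le 1$ and $(1 + x^2)/x$ for $x > 1$. Requiring the bound (1) to be no larger than $\varepsilon$, isolating the $x$-dependent terms, and squaring (every quantity is positive, so the direction of the inequality is preserved) gives

$$\varepsilon \ge c \, \gamma_X \sqrt{\frac{2\pi}{L}} \, e^{x^2/2} f(x) \;\Longrightarrow\; \frac{\varepsilon}{c \, \gamma_X} \sqrt{\frac{L}{2\pi}} \ge e^{x^2/2} f(x) \;\Longrightarrow\; \frac{L \varepsilon^2}{2\pi c^2 \gamma_X^2} \ge e^{x^2} f(x)^2$$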
As shown in
Figure 1, the right side of this inequality is
a monotonically increasing function of $x$.
Taking $\varepsilon = 0.1$ and setting $c \gamma_X$ arbitrarily to unity (a reasonable value), the
left side of the preceding inequality becomes
$1.6 \times 10^{-3} L$. Examining Figure 1, we find that for
$L = 10000$, $x$ must not exceed
1.17. Because we have normalized to unit variance, this
example suggests that the Gaussian approximates the
distribution of a ten-thousand term sum only over a range
corresponding to a 76% area about the mean. Consequently, the
Central Limit Theorem, as a finite-sample distributional
approximation, is only guaranteed to hold near the mode of the
Gaussian, with huge numbers of
observations needed to specify the tail behavior. Realizing
this fact will keep us from being ignorant amateurs.
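The worked example can be reproduced numerically without reading values off Figure 1. The sketch below is one way to do it, not the author's method: it solves the constraint by bisection, which is valid because the right side increases monotonically. The function names, the search limit `x_hi`, and the tolerance are all illustrative choices.

```python
import math


def constraint_rhs(x):
    """Right side of the constraint: exp(x^2) times 4 (x <= 1) or ((1 + x^2)/x)^2 (x > 1)."""
    shape = 2.0 if x <= 1.0 else (1.0 + x * x) / x
    return math.exp(x * x) * shape ** 2


def max_boundary(L, eps, c_gamma=1.0, x_hi=10.0, tol=1e-9):
    """Largest x in [0, x_hi] satisfying the constraint, or None if no x does."""
    target = L * eps ** 2 / (2.0 * math.pi * c_gamma ** 2)
    if target < constraint_rhs(0.0):  # the right side never drops below 4
        return None
    if constraint_rhs(x_hi) <= target:
        return x_hi
    lo, hi = 0.0, x_hi
    # Bisection works because the right side increases monotonically with x.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if constraint_rhs(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo


def required_terms(x, eps, c_gamma=1.0):
    """Smallest L for which the constraint holds at boundary x."""
    return math.ceil(2.0 * math.pi * c_gamma ** 2 * constraint_rhs(x) / eps ** 2)


if __name__ == "__main__":
    x_max = max_boundary(L=10_000, eps=0.1)
    area = math.erf(x_max / math.sqrt(2.0))  # Gaussian probability of |value| <= x_max
    print(f"x_max = {x_max:.3f}, central area = {area:.3f}")
    print(f"terms needed for x = 3: {required_terms(3.0, 0.1):,}")
```

With $L = 10000$ and $\varepsilon = 0.1$, the computed boundary should land close to the 1.17 quoted above (the text rounds the left side to $1.6 \times 10^{-3} L$) and the central area near 76%. Guaranteeing the same relative error out to $x = 3$ takes tens of millions of terms under these assumptions.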
- H. Cramér. (1970). Random Variables and Probability Distributions. (Third Edition). Cambridge University Press.
- P. Hall. (1982). Rates of Convergence in the Central Limit Theorem. In Research Notes in Mathematics. (Vol. 62, p. 6). Pitman Advanced Publishing Program.