The *central limit theorem* (CLT) asserts that if random variable *X*
is the sum of a large number of independent random variables, each with a reasonable
distribution, then *X* is approximately normally distributed. This celebrated theorem
has been the object of extensive theoretical research directed toward the discovery
of the most general conditions under which it is valid. On the other hand,
this theorem serves as the basis of an extraordinary amount of applied work.
In the statistics of large samples, the sample average is a constant times the
sum of the random variables in the sampling process. Thus, for large samples,
the sample average is approximately normal—whether or not the population
distribution is normal. In much of the theory of errors of measurement,
the observed error is the sum of a large number of independent random quantities which
contribute additively to the result. Similarly, in the theory of noise, the
noise signal is the sum of a large number of random components, independently produced.
In such situations, the assumption of a normal population distribution is frequently
quite appropriate.

We consider a form of the CLT under hypotheses which are reasonable assumptions in many practical situations. We sketch a proof of this version of the CLT, known as the Lindeberg-Lévy theorem, which utilizes the limit theorem on characteristic functions, above, along with certain elementary facts from analysis. It illustrates the kind of argument used in more sophisticated proofs required for more general cases.

Consider an independent sequence $\{X_n : 1 \le n\}$ of random variables with common mean $E[X_i] = \mu$ and common variance $\mathrm{Var}[X_i] = \sigma^2$. Form the sums

$$S_n = \sum_{i=1}^{n} X_i \qquad \text{and} \qquad S_n^* = \frac{S_n - n\mu}{\sigma \sqrt{n}}$$

Let *F _{n}* be the distribution function for $S_n^*$, and let $\Phi$ be the standardized normal distribution function, evaluated at *t*. We sketch a proof of the theorem under the condition that the $X_i$ form an iid class.

*Central Limit Theorem* (Lindeberg-Lévy form)

If $\{X_n : 1 \le n\}$ is iid, with $E[X_i] = \mu$ and $\mathrm{Var}[X_i] = \sigma^2$ for all $i$,

then

$$F_n(t) \to \Phi(t) \quad \text{as } n \to \infty, \text{ for all } t$$

IDEAS OF A PROOF

There is no loss of generality in assuming $\mu = 0$. Let $\varphi$ be the common
characteristic function for the $X_i$, and for each $n$ let $\varphi_n$ be the characteristic function for $S_n^*$. Then

$$\varphi(t) = E[e^{itX}] \qquad \text{and} \qquad \varphi_n(t) = E[e^{itS_n^*}] = \varphi^n\!\left(t/\sigma\sqrt{n}\right)$$

Using the power series expansion of $\varphi$ about the origin noted above, we have

$$\varphi(t) = 1 - \frac{\sigma^2 t^2}{2} + \beta(t) \qquad \text{where} \qquad \frac{\beta(t)}{t^2} \to 0 \text{ as } t \to 0$$

This implies

$$\left|\varphi\!\left(t/\sigma\sqrt{n}\right) - \left(1 - \frac{t^2}{2n}\right)\right| = \left|\beta\!\left(t/\sigma\sqrt{n}\right)\right|$$

so that

$$n\left|\varphi\!\left(t/\sigma\sqrt{n}\right) - \left(1 - \frac{t^2}{2n}\right)\right| \to 0 \quad \text{as } n \to \infty$$

A standard lemma of analysis ensures

$$\left|\varphi^n\!\left(t/\sigma\sqrt{n}\right) - \left(1 - \frac{t^2}{2n}\right)^n\right| \le n\left|\varphi\!\left(t/\sigma\sqrt{n}\right) - \left(1 - \frac{t^2}{2n}\right)\right| \to 0 \quad \text{as } n \to \infty$$

It is a well known property of the exponential that

$$\left(1 - \frac{t^2}{2n}\right)^n \to e^{-t^2/2} \quad \text{as } n \to \infty$$

so that

$$\varphi_n(t) \to e^{-t^2/2} \quad \text{as } n \to \infty, \text{ for all } t$$

Since $e^{-t^2/2}$ is the characteristic function of the standardized normal distribution, the convergence theorem on characteristic functions, above, gives $F_n(t) \to \Phi(t)$ for all $t$.
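The convergence $\varphi_n(t) \to e^{-t^2/2}$ at the heart of the proof is easy to check numerically. The following Python sketch (our own illustration, independent of the m-functions) takes the $X_i$ uniform on $[-1/2, 1/2]$, so that $\mu = 0$, $\sigma^2 = 1/12$, and $\varphi(t) = \sin(t/2)/(t/2)$:

```python
from math import sin, sqrt, exp

def phi(t):
    # Characteristic function of X uniform on [-1/2, 1/2]: sin(t/2)/(t/2)
    return sin(t / 2) / (t / 2) if t != 0 else 1.0

sigma = 1 / sqrt(12)   # standard deviation of X

def phi_n(t, n):
    # Characteristic function of the standardized sum S_n/(sigma*sqrt(n))
    return phi(t / (sigma * sqrt(n))) ** n

for n in [1, 10, 100, 1000]:
    print(n, phi_n(1.0, n), exp(-1 / 2))
```

Even for moderate $n$, the values at $t = 1$ are close to the limit $e^{-1/2}$.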

The theorem says that the distribution functions for sums of increasing numbers of the $X_i$
converge to the normal distribution function, but it does not tell how fast. It is instructive
to consider some examples, which are easily worked out with the aid of our m-functions.

**Demonstration of the central limit theorem**

*Discrete examples*

We first examine the gaussian approximation in two cases. We take the sum of five iid simple random variables in each case. The first variable has six distinct values; the second has only three. The discrete character of the sum is more evident in the second case. Here we use not only the gaussian approximation, but also the gaussian approximation shifted one half unit (the so-called continuity correction for integer-valued random variables). The fit is remarkably good in either case with only five terms.

A principal tool is the m-function *diidsum* (sum of discrete iid random variables).
It uses a designated number of iterations of the m-function *mgsum*.
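For readers working outside MATLAB, the convolution these m-functions perform can be sketched in plain Python. The names `mg_sum` and `iid_sum` below are our own, chosen to parallel *mgsum* and *diidsum*; this is an illustration, not the text's library code.

```python
def mg_sum(X1, P1, X2, P2):
    # Distribution for the sum of two independent simple random variables
    d = {}
    for x1, p1 in zip(X1, P1):
        for x2, p2 in zip(X2, P2):
            s = round(x1 + x2, 10)        # guard against floating-point fuzz
            d[s] = d.get(s, 0.0) + p1 * p2
    vals = sorted(d)
    return vals, [d[v] for v in vals]

def iid_sum(X, PX, n):
    # Distribution for the sum of n iid copies of a simple random variable
    Z, PZ = list(X), list(PX)
    for _ in range(n - 1):
        Z, PZ = mg_sum(Z, PZ, X, PX)
    return Z, PZ

# Data from Example 2 below: sum of five iid variables on {1, 2, 3}
z, pz = iid_sum([1, 2, 3], [0.3, 0.5, 0.2], 5)
print(z[0], z[-1], sum(pz))
```

The sum of five variables on $\{1, 2, 3\}$ takes values $5$ through $15$, and the probabilities sum to one.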

### Example 1: **First random variable**

```
X = [-3.2 -1.05 2.1 4.6 5.3 7.2];
PX = 0.1*[2 2 1 3 1 1];
EX = X*PX'
EX = 1.9900
VX = dot(X.^2,PX) - EX^2
VX = 13.0904
[x,px] = diidsum(X,PX,5); % Distribution for the sum of 5 iid rv
F = cumsum(px); % Distribution function for the sum
stairs(x,F) % Stair step plot
hold on
plot(x,gaussian(5*EX,5*VX,x),'-.') % Plot of gaussian distribution function
% Plotting details (see Figure 1)
```

### Example 2: **Second random variable**

```
X = 1:3;
PX = [0.3 0.5 0.2];
EX = X*PX'
EX = 1.9000
EX2 = X.^2*PX'
EX2 = 4.1000
VX = EX2 - EX^2
VX = 0.4900
[x,px] = diidsum(X,PX,5); % Distribution for the sum of 5 iid rv
F = cumsum(px); % Distribution function for the sum
stairs(x,F) % Stair step plot
hold on
plot(x,gaussian(5*EX,5*VX,x),'-.') % Plot of gaussian distribution function
plot(x,gaussian(5*EX,5*VX,x+0.5),'o') % Plot with continuity correction
% Plotting details (see Figure 2)
```
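The benefit of the continuity correction in Example 2 can be quantified directly. The following Python sketch (our own check, not part of the m-function library) enumerates all $3^5 = 243$ outcomes to get the exact value of $P(S \le 9)$, then compares the plain and shifted gaussian approximations:

```python
from itertools import product
from math import erf, sqrt

X, PX = [1, 2, 3], [0.3, 0.5, 0.2]   # Example 2 data
n, mu, var = 5, 1.9, 0.49            # E[X] and Var[X] from the example

def Phi(x):
    # Standard normal distribution function
    return 0.5 * (1 + erf(x / sqrt(2)))

def exact_cdf(k):
    # Exact P(S <= k) by enumerating all 3^5 outcomes of the five draws
    total = 0.0
    for combo in product(range(len(X)), repeat=n):
        if sum(X[i] for i in combo) <= k:
            p = 1.0
            for i in combo:
                p *= PX[i]
            total += p
    return total

k = 9
plain = Phi((k - n * mu) / sqrt(n * var))
corrected = Phi((k + 0.5 - n * mu) / sqrt(n * var))   # continuity correction
print(exact_cdf(k), plain, corrected)
```

The corrected value is much closer to the exact probability than the uncorrected one, which is what the shifted plot in Figure 2 shows graphically.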

As another example, we take the sum of twenty-one iid simple random variables with integer values. We examine only the part of the distribution function where most of the probability is concentrated. This effectively enlarges the x-scale, so that the nature of the approximation is more readily apparent.

### Example 3: **Sum of twenty-one iid random variables**

```
X = [0 1 3 5 6];
PX = 0.1*[1 2 3 2 2];
EX = dot(X,PX)
EX = 3.3000
VX = dot(X.^2,PX) - EX^2
VX = 4.2100
[x,px] = diidsum(X,PX,21);
F = cumsum(px);
FG = gaussian(21*EX,21*VX,x);
stairs(40:90,F(40:90))
hold on
plot(40:90,FG(40:90))
% Plotting details (see Figure 3)
```

*Absolutely continuous examples*

By use of the discrete approximation, we may obtain approximations to the sums of
absolutely continuous random variables. The results on discrete variables indicate
that the more values the variable takes on, the more quickly the convergence seems
to occur. In our next example, we start with a random variable uniform on the
interval $[0, 1]$.

### Example 4: **Sum of three iid, uniform random variables.**

Suppose $X$ is uniform on $[0, 1]$, so that $E[X] = 0.5$ and $\mathrm{Var}[X] = 1/12$.

```
tappr
Enter matrix [a b] of x-range endpoints [0 1]
Enter number of x approximation points 100
Enter density as a function of t t<=1
Use row matrices X and PX as in the simple case
EX = 0.5;
VX = 1/12;
[z,pz] = diidsum(X,PX,3);
F = cumsum(pz);
FG = gaussian(3*EX,3*VX,z);
length(z)
ans = 298
a = 1:5:296; % Plot every fifth point
plot(z(a),F(a),z(a),FG(a),'o')
% Plotting details (see Figure 4)
```

For the sum of only three random variables, the fit is remarkably good. This is
not entirely surprising, since the sum of two gives a symmetric triangular
distribution on $[0, 2]$.
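For sums of uniform variables the comparison can also be made exactly: the sum of $n$ iid variables uniform on $[0, 1]$ has the Irwin-Hall distribution, whose distribution function has a closed form. The following Python sketch (our own, using only the standard library) compares it with the gaussian for $n = 3$:

```python
from math import comb, erf, factorial, sqrt

def irwin_hall_cdf(x, n):
    # Exact distribution function of the sum of n iid uniform(0,1)
    # variables, valid for 0 <= x <= n
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n
                for k in range(int(x) + 1))
    return total / factorial(n)

def Phi(x):
    # Standard normal distribution function
    return 0.5 * (1 + erf(x / sqrt(2)))

n, EX, VX = 3, 0.5, 1 / 12
for x in [0.5, 1.0, 1.5, 2.0, 2.5]:
    print(x, irwin_hall_cdf(x, n), Phi((x - n * EX) / sqrt(n * VX)))
```

Even with only three terms, the two distribution functions differ by less than about $0.01$ at these points, consistent with the plot in Figure 4.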

### Example 5: **Sum of eight iid random variables**

Suppose the density is one on the intervals $[-1, -0.5]$ and $[0.5, 1]$, so that $E[X] = 0$ and $\mathrm{Var}[X] = 7/12$.

```
tappr
Enter matrix [a b] of x-range endpoints [-1 1]
Enter number of x approximation points 200
Enter density as a function of t (t<=-0.5)|(t>=0.5)
Use row matrices X and PX as in the simple case
[z,pz] = diidsum(X,PX,8);
VX = 7/12;
F = cumsum(pz);
FG = gaussian(0,8*VX,z);
plot(z,F,z,FG)
% Plotting details (see Figure 5)
```
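A simple Monte Carlo check of Example 5 is easy to carry out. The Python sketch below (our own; the seed and sample size are arbitrary choices) draws the sum of eight variables repeatedly and measures the largest discrepancy between the empirical distribution function and the gaussian:

```python
import random
from math import erf, sqrt

random.seed(17)

def draw():
    # One draw from the density that is one on [-1, -0.5] and [0.5, 1]
    u = 0.5 + 0.5 * random.random()          # uniform on [0.5, 1]
    return u if random.random() < 0.5 else -u

N, n = 20000, 8
sums = sorted(sum(draw() for _ in range(n)) for _ in range(N))

def Phi(x):
    # Standard normal distribution function
    return 0.5 * (1 + erf(x / sqrt(2)))

sd = sqrt(n * 7 / 12)                        # E[X] = 0, Var[X] = 7/12
# Largest gap between the empirical CDF and the gaussian CDF
max_gap = max(abs((i + 1) / N - Phi(s / sd)) for i, s in enumerate(sums))
print(max_gap)
```

The maximum gap is small, on the order of the Monte Carlo sampling error itself, which agrees with the close fit seen in Figure 5.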

Although the sum of eight random variables is used, the fit to the gaussian is not as good as that for the sum of three in Example 4. In either case, the convergence is remarkably fast; only a few terms are needed for good approximation.