Central Limit Theorem: Using the Central Limit Theorem

It is important for you to understand when to use the CLT. If you are being asked to find the probability of an average or mean, use the CLT for means or averages. If you are being asked to find the probability of a sum or total, use the CLT for sums. This also applies to percentiles for averages and sums.

Note:

If you are being asked to find the probability of an individual value, do not use the CLT. Use the distribution of its random variable.

Law of Large Numbers

The Law of Large Numbers says that if you take samples of larger and larger size from any population, then the mean $\bar{x}$ of the sample tends to get closer and closer to $\mu$. From the Central Limit Theorem, we know that as $n$ gets larger and larger, the sample averages more nearly follow a normal distribution. The larger $n$ gets, the smaller the standard deviation gets. (Remember that the standard deviation for $\bar{X}$ is $\frac{\sigma}{\sqrt{n}}$.) This means that the sample mean $\bar{x}$ must be close to the population mean $\mu$. We can say that $\mu$ is the value that the sample averages approach as $n$ gets larger. The Central Limit Theorem illustrates the Law of Large Numbers.
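The shrinking gap between $\bar{x}$ and $\mu$ can be seen in a small simulation. This is only a sketch: the uniform population U(1, 5), the seed, and the sample sizes are illustrative assumptions, not part of the statement above.

```python
import random
import statistics

rng = random.Random(0)  # fixed, arbitrary seed so the run is repeatable
MU = 3.0                # population mean of the assumed U(1, 5) population


def sample_mean(n):
    """Mean of n independent draws from U(1, 5)."""
    return statistics.fmean(rng.uniform(1, 5) for _ in range(n))


# Larger samples should give sample means that sit closer to MU.
for n in (10, 100, 10_000, 100_000):
    xbar = sample_mean(n)
    print(f"n = {n:>7}: sample mean = {xbar:.4f}, |xbar - mu| = {abs(xbar - MU):.4f}")
```

Any single run can wobble, but the distance from $\mu$ should generally shrink as $n$ grows, in line with a standard deviation of $\frac{\sigma}{\sqrt{n}}$ for the sample mean.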

Example 1

A study involving stress is done on a college campus among the students. The stress scores follow a uniform distribution with the lowest stress score equal to 1 and the highest equal to 5. Using a sample of 75 students, find:

• a. The probability that the average stress score for the 75 students is less than 2.
• b. The 90th percentile for the average stress score for the 75 students.
• c. The probability that the total of the 75 stress scores is less than 200.
• d. The 90th percentile for the total stress score for the 75 students.

Let $X$ = one stress score.

Problems a and b ask you to find a probability or a percentile for an average or mean. Problems c and d ask you to find a probability or a percentile for a total or sum. The sample size, $n$, is equal to 75.

Since the individual stress scores follow a uniform distribution, $X \sim U(1, 5)$ where $a = 1$ and $b = 5$ (see the chapter on Continuous Random Variables).

$\mu_X = \frac{a + b}{2} = \frac{1 + 5}{2} = 3$

$\sigma_X = \sqrt{\frac{(b - a)^2}{12}} = \sqrt{\frac{(5 - 1)^2}{12}} = 1.15$

For problems a and b, let $\bar{X}$ = the average stress score for the 75 students. Then,

$\bar{X} \sim N\left(3, \frac{1.15}{\sqrt{75}}\right)$ where $n = 75$.
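The setup values for Example 1 can be reproduced with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative choices.

```python
import math

a, b, n = 1, 5, 75  # lowest score, highest score, sample size

mu_x = (a + b) / 2                       # mean of U(a, b)
sigma_x = math.sqrt((b - a) ** 2 / 12)   # standard deviation of U(a, b)
se_mean = sigma_x / math.sqrt(n)         # standard deviation of the sample mean

print(f"mu_X = {mu_x}")
print(f"sigma_X = {sigma_x:.4f}")
print(f"sigma_X / sqrt(n) = {se_mean:.4f}")
```

The unrounded standard deviation is about 1.1547; the text rounds it to 1.15 before dividing by $\sqrt{75}$, so later results can differ slightly in the last digit.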

Problem 1

Find $P(\bar{X} < 2)$. Draw the graph.

Solution

$P(\bar{X} < 2) \approx 0$

The probability that the average stress score is less than 2 is about 0.

normalcdf$\left(1, 2, 3, \frac{1.15}{\sqrt{75}}\right) = 0$

Reminder:
The smallest stress score is 1. Therefore, the smallest average for 75 stress scores is 1.
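As a rough cross-check of Problem 1 without a calculator, the same probability can be computed with Python's standard-library `statistics.NormalDist`. The sketch also contrasts it with the probability for one individual score, which comes from the uniform distribution rather than the CLT; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 3, 1.15, 75

# Distribution of the sample mean under the CLT: N(mu, sigma / sqrt(n))
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

# P(1 < Xbar < 2), mirroring normalcdf(1, 2, 3, 1.15 / sqrt(75))
p_mean_below_2 = xbar_dist.cdf(2) - xbar_dist.cdf(1)

# One individual score X ~ U(1, 5): P(X < 2) = (2 - 1) / (5 - 1), no CLT involved
p_single_below_2 = (2 - 1) / (5 - 1)

print(f"P(average < 2) = {p_mean_below_2:.6f}")
print(f"P(single score < 2) = {p_single_below_2}")
```

A single score falls below 2 a quarter of the time, yet an average of 75 scores almost never does, because 2 lies more than seven standard deviations below the mean of $\bar{X}$.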

Problem 2

Find the 90th percentile for the average of 75 stress scores. Draw a graph.

Solution

Let $k$ = the 90th percentile.

Find $k$ where $P(\bar{X} < k) = 0.90$.

$k = 3.2$

The 90th percentile for the average of 75 scores is about 3.2. This means that 90% of all the averages of 75 stress scores are at most 3.2 and 10% are at least 3.2.

invNorm$\left(0.90, 3, \frac{1.15}{\sqrt{75}}\right) = 3.2$
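The percentile can also be written as the mean plus a z-score times the standard deviation of $\bar{X}$. The sketch below, using the standard-library `NormalDist.inv_cdf` as a stand-in for invNorm, computes it both ways; the names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 3, 1.15, 75
se = sigma / math.sqrt(n)           # standard deviation of the sample mean

# Direct inverse-CDF lookup, analogous to invNorm(0.90, 3, 1.15 / sqrt(75))
k = NormalDist(mu, se).inv_cdf(0.90)

# Same value from the standard normal: k = mu + z * se
z = NormalDist().inv_cdf(0.90)      # z is roughly 1.28
k_from_z = mu + z * se

print(f"k = {k:.2f}")
print(f"k from z-score = {k_from_z:.2f}")
```

Both routes give about 3.17, which the text reports to one decimal place as 3.2.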

For problems c and d, let $\Sigma X$ = the sum of the 75 stress scores. Then, $\Sigma X \sim N\left[(75)(3), \sqrt{75} \cdot 1.15\right]$

Problem 3

Find $P(\Sigma X < 200)$. Draw the graph.

Solution

The mean of the sum of 75 stress scores is $75 \cdot 3 = 225$.

The standard deviation of the sum of 75 stress scores is $\sqrt{75} \cdot 1.15 = 9.96$.

$P(\Sigma X < 200) \approx 0.006$

The probability that the total of 75 scores is less than 200 is about 0.006, which is close to 0.

normalcdf$\left(75, 200, 75 \cdot 3, \sqrt{75} \cdot 1.15\right) \approx 0.006$

Reminder:
The smallest total of 75 stress scores is 75 since the smallest single score is 1.
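The calculation for the total can be sketched the same way, using the CLT for sums with mean $n\mu$ and standard deviation $\sqrt{n}\,\sigma$. This is an illustrative check with the standard-library `NormalDist`, not the calculator procedure itself.

```python
import math
from statistics import NormalDist

n, mu, sigma = 75, 3, 1.15

# Distribution of the sum under the CLT: N(n * mu, sqrt(n) * sigma)
sum_dist = NormalDist(n * mu, math.sqrt(n) * sigma)

# P(75 < sum < 200); 75 is the smallest possible total
p_total_below_200 = sum_dist.cdf(200) - sum_dist.cdf(75)

print(f"mean of the sum = {sum_dist.mean}")
print(f"standard deviation of the sum = {sum_dist.stdev:.2f}")
print(f"P(total < 200) = {p_total_below_200:.4f}")
```

The result is small, roughly 0.006, because 200 sits about 2.5 standard deviations below the mean total of 225.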

Problem 4

Find the 90th percentile for the total of 75 stress scores. Draw a graph.

Solution

Let $k$ = the 90th percentile.

Find $k$ where $P(\Sigma X < k) = 0.90$.

$k = 237.8$

The 90th percentile for the sum of 75 scores is about 237.8. This means that 90% of all the sums of 75 scores are no more than 237.8 and 10% are no less than 237.8.

invNorm$\left(0.90, 75 \cdot 3, \sqrt{75} \cdot 1.15\right) = 237.8$
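Because the sum is just $n$ times the average, the 90th percentile for the total should equal 75 times the 90th percentile for the average. The following sketch checks that relationship numerically; the names are illustrative.

```python
import math
from statistics import NormalDist

n, mu, sigma = 75, 3, 1.15

sum_dist = NormalDist(n * mu, math.sqrt(n) * sigma)   # CLT for sums
mean_dist = NormalDist(mu, sigma / math.sqrt(n))      # CLT for means

k_sum = sum_dist.inv_cdf(0.90)    # analogous to invNorm(0.90, 75 * 3, sqrt(75) * 1.15)
k_mean = mean_dist.inv_cdf(0.90)  # analogous to invNorm(0.90, 3, 1.15 / sqrt(75))

print(f"90th percentile of the total   = {k_sum:.1f}")
print(f"75 x 90th percentile of the mean = {n * k_mean:.1f}")
```

Both lines should agree at about 237.8, which shows that problems about sums and problems about averages are two views of the same normal approximation.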

Example 2

The ages of statistics students at a certain college follow an exponential distribution with a mean age of 22 years. Eighty statistics students are randomly selected. Find:

• a. The probability that the average age of the 80 statistics students is more than 20.
• b. The 95th percentile for the average age of the 80 statistics students.

Let $X$ = the age of one statistics student. Then $X \sim Exp\left(\frac{1}{22}\right)$ (Chapter 5). $\mu = 22$ and $\sigma = 22$. $n = 80$.

Let $\bar{X}$ = the average age of the 80 statistics students. Then

$\bar{X} \sim N\left(22, \frac{22}{\sqrt{80}}\right)$ by the CLT for Sample Means or Averages

Problem 1

Find $P(\bar{X} > 20)$. Draw the graph.

Solution

$P(\bar{X} > 20) = 0.7919$

The probability that the average age is more than 20 is 0.7919.

normalcdf$\left(20, 1\text{E}99, 22, \frac{22}{\sqrt{80}}\right) = 0.7919$

Reminder:
1E99 = $10^{99}$ and -1E99 = $-10^{99}$. Press the EE key for E.
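In software, an upper-tail probability does not need a stand-in for infinity such as 1E99; it can be taken as one minus the cumulative probability. A minimal sketch with the standard-library `NormalDist`, using illustrative names:

```python
import math
from statistics import NormalDist

mu, sigma, n = 22, 22, 80  # for an exponential distribution, sigma equals the mean

# Distribution of the average age under the CLT: N(22, 22 / sqrt(80))
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

# Upper tail: P(Xbar > 20) = 1 - P(Xbar <= 20)
p_average_over_20 = 1 - xbar_dist.cdf(20)

print(f"P(average age > 20) = {p_average_over_20:.4f}")
```

The value should agree with the 0.7919 found with normalcdf.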

Problem 2

Find the 95th percentile for the average age of the 80 statistics students. Draw a graph.

Solution

Let $k$ = the 95th percentile.

Find $k$ where $P(\bar{X} < k) = 0.95$.

$k = 26.0$

The 95th percentile for the average age of 80 statistics students at a certain college is about 26.0. This means that 95% of the average ages of 80 statistics students are at most 26.0 and 5% are at least 26.0.

invNorm$\left(0.95, 22, \frac{22}{\sqrt{80}}\right) = 26.0$
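One way to see how well the normal approximation works here is to simulate many samples of 80 exponential ages and count how often the average falls below the CLT-based percentile. This is only a sketch: the seed and the number of repetitions are arbitrary assumptions, and a simulation gives an estimate rather than an exact answer.

```python
import math
import random
import statistics
from statistics import NormalDist

MEAN_AGE, N, REPS = 22, 80, 5000
rng = random.Random(1)  # fixed, arbitrary seed so the run is repeatable

# CLT-based 95th percentile of the average age (sigma = mean for an exponential)
k = NormalDist(MEAN_AGE, MEAN_AGE / math.sqrt(N)).inv_cdf(0.95)


def simulated_average_age():
    """Average of N ages drawn from an exponential distribution with mean 22."""
    # random.expovariate expects the rate, which is 1 / mean
    return statistics.fmean(rng.expovariate(1 / MEAN_AGE) for _ in range(N))


fraction_below = sum(simulated_average_age() < k for _ in range(REPS)) / REPS

print(f"k = {k:.2f}")
print(f"fraction of simulated averages below k: {fraction_below:.3f}")
```

The simulated fraction should land close to 0.95. It may come out slightly lower, since averages of 80 exponential values are still a little right-skewed and the normal curve is only an approximation.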

Glossary

Average:
A number that describes the central tendency of the data. There are a number of specialized averages, including the arithmetic mean, weighted mean, median, mode, and geometric mean.
Central Limit Theorem:
Given a random variable (RV) with known mean $\mu$ and known variance $\sigma^2$, we are sampling with size $n$ and we are interested in two new RVs: the sample mean, $\bar{X}$, and the sample sum, $\Sigma X$. If the size $n$ of the sample is sufficiently large, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ and $\Sigma X \sim N\left(n\mu, n\sigma^2\right)$, where the second parameter is the variance (the examples above give the standard deviation instead). In words, if the size $n$ of the sample is sufficiently large, then the distribution of the sample means and the distribution of the sample sums will approximate a normal distribution regardless of the shape of the population. Moreover, the mean of the distribution of sample means will equal the population mean, and the mean of the distribution of sample sums will equal $n$ times the population mean. The standard deviation of the distribution of the sample means, $\frac{\sigma}{\sqrt{n}}$, is called the standard error of the mean.
Exponential Distribution:
Continuous random variable (RV) that appears when we are interested in intervals of time between some random events, for example, the length of time between emergency arrivals at a hospital. Notation: $X \sim Exp(m)$; the mean is $\mu = \frac{1}{m}$, the variance is $\sigma^2 = \frac{1}{m^2}$, the probability density function is $f(x) = m e^{-mx}$, $x \geq 0$, and the cumulative distribution is $P(X \leq x) = 1 - e^{-mx}$.
Mean:
A number to measure the central tendency (average), shortened from arithmetic mean. By definition, the mean for a sample (usually denoted by $\bar{X}$) is $\bar{X} = \frac{\text{Sum of all values in the sample}}{\text{Number of values in the sample}}$, and the mean for a population (usually denoted by $\mu$) is $\mu = \frac{\text{Sum of all values in the population}}{\text{Number of values in the population}}$.
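The exponential-distribution formulas in this glossary can be tried out directly. The sketch below assumes $m = \frac{1}{22}$, as in Example 2, and computes the probability that one individual student is older than 20, a question about a single value that is answered from the exponential distribution itself rather than from the CLT. The function names are illustrative.

```python
import math

m = 1 / 22  # decay parameter, so the mean is 1 / m = 22


def pdf(x):
    """Density f(x) = m * e^(-m x) for x >= 0, and 0 otherwise."""
    return m * math.exp(-m * x) if x >= 0 else 0.0


def cdf(x):
    """Cumulative probability P(X <= x) = 1 - e^(-m x) for x >= 0."""
    return 1 - math.exp(-m * x) if x >= 0 else 0.0


p_single_over_20 = 1 - cdf(20)  # P(X > 20) for one student

print(f"f(0) = {pdf(0):.4f}")
print(f"P(X <= 22) = {cdf(22):.4f}")
print(f"P(X > 20) = {p_single_over_20:.4f}")
```

For a single student the probability of being older than 20 is only about 0.40, compared with 0.7919 for the average of 80 students.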
