Central Limit Theorem: Using the Central Limit Theorem

Module by: Susan Dean, Barbara Illowsky, Ph.D.

Summary: Central Limit Theorem: Using the Central Limit Theorem is part of the collection col10555 written by Barbara Illowsky and Susan Dean. It covers how and when to use the Central Limit Theorem and has contributions from Roberta Bloom.

It is important for you to understand when to use the CLT. If you are being asked to find the probability of the mean, use the CLT for the mean. If you are being asked to find the probability of a sum or total, use the CLT for sums. This also applies to percentiles for means and sums.

Note:

If you are being asked to find the probability of an individual value, do not use the CLT. Use the distribution of its random variable.

Examples of the Central Limit Theorem

Law of Large Numbers

The Law of Large Numbers says that if you take samples of larger and larger size from any population, then the mean $\bar{x}$ of the sample tends to get closer and closer to $\mu$. From the Central Limit Theorem, we know that as $n$ gets larger and larger, the sample means follow a normal distribution. The larger $n$ gets, the smaller the standard deviation gets. (Remember that the standard deviation for $\bar{X}$ is $\frac{\sigma}{\sqrt{n}}$.) This means that the sample mean $\bar{x}$ must be close to the population mean $\mu$. We can say that $\mu$ is the value that the sample means approach as $n$ gets larger. The Central Limit Theorem illustrates the Law of Large Numbers.
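The behavior described above can be watched in a small simulation. This is only a sketch: the uniform population on [1, 5] is borrowed from Example 1 for illustration, and the seed, the sample sizes, and the helper name `sample_mean` are arbitrary choices, not part of the text.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers


def sample_mean(n):
    """Mean of n draws from a uniform population on [1, 5] (population mean 3)."""
    return statistics.fmean(random.uniform(1, 5) for _ in range(n))


# As n grows, the printed sample means should settle near the population mean 3.
for n in (10, 100, 10_000, 100_000):
    print(n, round(sample_mean(n), 4))
```

Small samples can land noticeably away from 3, while the largest samples should be very close to it, which is what the shrinking standard deviation $\frac{\sigma}{\sqrt{n}}$ predicts.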

Central Limit Theorem for the Mean and Sum Examples

Example 1

A study involving stress is done on a college campus among the students. The stress scores follow a uniform distribution with the lowest stress score equal to 1 and the highest equal to 5. Using a sample of 75 students, find:

  1. The probability that the mean stress score for the 75 students is less than 2.
  2. The 90th percentile for the mean stress score for the 75 students.
  3. The probability that the total of the 75 stress scores is less than 200.
  4. The 90th percentile for the total stress score for the 75 students.

Let $X$ = one stress score.

Problems 1 and 2 ask you to find a probability or a percentile for a mean. Problems 3 and 4 ask you to find a probability or a percentile for a total or sum. The sample size, $n$, is equal to 75.

Since the individual stress scores follow a uniform distribution, $X \sim U(1, 5)$ where $a = 1$ and $b = 5$ (See Continuous Random Variables for the uniform).

$\mu_X = \frac{a + b}{2} = \frac{1 + 5}{2} = 3$

$\sigma_X = \sqrt{\frac{(b - a)^2}{12}} = \sqrt{\frac{(5 - 1)^2}{12}} = 1.15$

For problems 1 and 2, let $\bar{X}$ = the mean stress score for the 75 students. Then,

$\bar{X} \sim N\left(3, \frac{1.15}{\sqrt{75}}\right)$ where $n = 75$.
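As a check on the arithmetic, the uniform mean, the uniform standard deviation, and the standard deviation of the sample mean can be computed directly. This is a sketch with variable names chosen here for illustration. It keeps full precision, so the last value differs slightly from what you get when $\sigma_X$ is first rounded to 1.15 as in the text.

```python
import math

a, b, n = 1, 5, 75  # lowest score, highest score, sample size from Example 1

mu_x = (a + b) / 2                      # mean of U(a, b)
sigma_x = math.sqrt((b - a) ** 2 / 12)  # standard deviation of U(a, b)
se_mean = sigma_x / math.sqrt(n)        # standard deviation of the sample mean

print(f"mu_X = {mu_x}")
print(f"sigma_X = {sigma_x:.4f}")
print(f"sigma_X / sqrt(n) = {se_mean:.4f}")
```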

Problem 1

Find $P(\bar{x} < 2)$. Draw the graph.

Solution

$P(\bar{x} < 2) \approx 0$

The probability that the mean stress score is less than 2 is about 0.

Normal distribution curve for the average with values of 2 and 3 on the x-axis. A vertical upward line extends from point 2 up to the curve. The probability area occurs from the beginning of the curve to point 2.

normalcdf(1, 2, 3, 1.15/√75) = 0

Reminder:
The smallest stress score is 1. Therefore, the smallest mean for 75 stress scores is 1.
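For readers without a TI calculator, the same area can be sketched with Python's standard library. The helper name `normalcdf` is defined here only to mirror the calculator call; it is not a built-in Python function.

```python
import math
from statistics import NormalDist


def normalcdf(lower, upper, mu, sigma):
    """Area under the N(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


# Mirrors normalcdf(1, 2, 3, 1.15/sqrt(75)) from Problem 1.
p_mean_below_2 = normalcdf(1, 2, 3, 1.15 / math.sqrt(75))
print(f"P(xbar < 2) = {p_mean_below_2:.2e}")
```

The result is a tiny positive number rather than exactly zero, which is why the text describes the probability as "about 0".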

Problem 2

Find the 90th percentile for the mean of 75 stress scores. Draw a graph.

Solution

Let $k$ = the 90th percentile.

Find $k$ where $P(\bar{x} < k) = 0.90$.

$k = 3.2$

Normal distribution curve graph with a vertical upward line at point k on the x-axis. The probability area under the curve before k is equal to 0.90. k is equal to the 90th percentile.

The 90th percentile for the mean of 75 scores is about 3.2. This tells us that 90% of all the means of 75 stress scores are at most 3.2 and 10% are at least 3.2.

invNorm(.90, 3, 1.15/√75) = 3.2
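The percentile lookup can be sketched the same way. In Python's standard library, `NormalDist.inv_cdf` plays the role of the calculator's invNorm. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# Distribution of the mean of 75 stress scores, using the rounded sigma 1.15.
xbar = NormalDist(mu=3, sigma=1.15 / math.sqrt(75))

k_90 = xbar.inv_cdf(0.90)  # value with 90% of the area to its left
print(f"90th percentile of the mean = {k_90:.2f}")
```

Rounded to one decimal place, the result agrees with the 3.2 reported above.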

For problems 3 and 4, let $\Sigma X$ = the sum of the 75 stress scores. Then, $\Sigma X \sim N\left[(75)(3), \sqrt{75} \cdot 1.15\right]$

Problem 3

Find $P(\Sigma x < 200)$. Draw the graph.

Solution

The mean of the sum of 75 stress scores is $75 \cdot 3 = 225$.

The standard deviation of the sum of 75 stress scores is $\sqrt{75} \cdot 1.15 = 9.96$.

$P(\Sigma x < 200) \approx 0.006$

Normal distribution curve of the sum x with values of 200 and 225 on the x-axis. A vertical upward line extends from point 200 to the curve. The probability area begins from the beginning of the curve to point 200.

The probability that the total of 75 scores is less than 200 is about 0.006, which is close to 0.

normalcdf(75, 200, 75·3, √75·1.15) ≈ 0.006

Reminder:
The smallest total of 75 stress scores is 75 since the smallest single score is 1.

Problem 4

Find the 90th percentile for the total of 75 stress scores. Draw a graph.

Solution

Let $k$ = the 90th percentile.

Find $k$ where $P(\Sigma x < k) = 0.90$.

$k = 237.8$

Normal distribution curve of sum x with k on the x-axis. Vertical upward line extends from k to the curve. The probability area under the curve from the beginning of the curve to k is equal to 0.90.

The 90th percentile for the sum of 75 scores is about 237.8. This tells us that 90% of all the sums of 75 scores are no more than 237.8 and 10% are no less than 237.8.

invNorm(.90, 75·3, √75·1.15) = 237.8
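Problems 3 and 4 use the distribution of the sum rather than the mean, so only the center and spread change. A minimal sketch, assuming the rounded $\sigma_X = 1.15$ used in the text and variable names chosen here:

```python
import math
from statistics import NormalDist

n, mu_x, sigma_x = 75, 3, 1.15

# CLT for sums: mean n * mu, standard deviation sqrt(n) * sigma.
total = NormalDist(mu=n * mu_x, sigma=math.sqrt(n) * sigma_x)

p_total_below_200 = total.cdf(200) - total.cdf(75)  # lower bound 75, as in the text
k_90_total = total.inv_cdf(0.90)

print(f"P(sum < 200) = {p_total_below_200:.4f}")
print(f"90th percentile of the sum = {k_90_total:.1f}")
```

The probability is small but not exactly zero, and the percentile rounds to the 237.8 given above.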

Example 2

Suppose that a market research analyst for a cell phone company conducts a study of their customers who exceed the time allowance included on their basic cell phone contract; the analyst finds that for those people who exceed the time included in their basic contract, the excess time used follows an exponential distribution with a mean of 22 minutes.

Consider a random sample of 80 customers who exceed the time allowance included in their basic cell phone contract.

Let $X$ = the excess time used by one INDIVIDUAL cell phone customer who exceeds his contracted time allowance.

$X \sim Exp\left(\frac{1}{22}\right)$. From Chapter 5, we know that $\mu = 22$ and $\sigma = 22$.

Let $\bar{X}$ = the mean excess time used by a sample of $n = 80$ customers who exceed their contracted time allowance.

$\bar{X} \sim N\left(22, \frac{22}{\sqrt{80}}\right)$ by the CLT for Sample Means

Problem 1

Using the CLT to find Probability:
  • a. Find the probability that the mean excess time used by the 80 customers in the sample is longer than 20 minutes. This is asking us to find $P(\bar{x} > 20)$. Draw the graph.
  • b. Suppose that one customer who exceeds the time limit for his cell phone contract is randomly selected. Find the probability that this individual customer's excess time is longer than 20 minutes. This is asking us to find $P(x > 20)$.
  • c. Explain why the probabilities in (a) and (b) are different.
Solution
Part a.

Find: $P(\bar{x} > 20)$

$P(\bar{x} > 20) = 0.7919$ using normalcdf(20, 1E99, 22, 22/√80)

The probability is 0.7919 that the mean excess time used is more than 20 minutes, for a sample of 80 customers who exceed their contracted time allowance.

Normal distribution curve with values of 20 and 22 on the x-axis. Vertical upward line extends from point 20 to curve. The probability area begins from point 20 to the end of the curve.

Reminder:
$1E99 = 10^{99}$ and $-1E99 = -10^{99}$. Press the EE key for E. Or just use 10^99 instead of 1E99.
Part b.

Find $P(x > 20)$. Remember to use the exponential distribution for an individual: $X \sim Exp(1/22)$.

P(X>20) = e^(–(1/22)*20) or e^(–.04545*20) = 0.4029

Part c. Explain why the probabilities in (a) and (b) are different.
  • $P(x > 20) = 0.4029$ but $P(\bar{x} > 20) = 0.7919$
  • The probabilities are not equal because we use different distributions to calculate the probability for individuals and for means.
  • When asked to find the probability of an individual value, use the stated distribution of its random variable; do not use the CLT. Use the CLT with the normal distribution when you are being asked to find the probability for a mean.
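The contrast between parts (a) and (b) can be reproduced side by side. This sketch uses the normal distribution from the CLT for the sample mean and the exponential tail formula $e^{-x/22}$ for an individual. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mean_excess, n = 22, 80

# Part a: the sample mean is approximately N(22, 22/sqrt(80)) by the CLT.
xbar = NormalDist(mu=mean_excess, sigma=mean_excess / math.sqrt(n))
p_mean_over_20 = 1 - xbar.cdf(20)

# Part b: one individual follows Exp(1/22), so P(X > x) = e^(-x/22).
p_individual_over_20 = math.exp(-20 / mean_excess)

print(f"P(xbar > 20) = {p_mean_over_20:.4f}")
print(f"P(x > 20)    = {p_individual_over_20:.4f}")
```

The two numbers answer different questions, so there is no reason to expect them to match.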

Problem 2

Using the CLT to find Percentiles:

Find the 95th percentile for the sample mean excess time for samples of 80 customers who exceed their basic contract time allowances. Draw a graph.

Solution

Let $k$ = the 95th percentile. Find $k$ where $P(\bar{x} < k) = 0.95$

$k = 26.0$ using invNorm(.95, 22, 22/√80) = 26.0

Normal distribution curve with value of k on x-axis. Vertical upward line extends from k to curve. Probability area from the beginning of the curve to point k is equal to 0.95.

The 95th percentile for the sample mean excess time used is about 26.0 minutes for random samples of 80 customers who exceed their contractual allowed time.

95% of such samples would have means under 26 minutes; only 5% of such samples would have means above 26 minutes.
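A quick sketch of the same percentile in Python, again with `NormalDist.inv_cdf` standing in for invNorm and with a variable name chosen here:

```python
import math
from statistics import NormalDist

# Sample mean of 80 excess times: approximately N(22, 22/sqrt(80)) by the CLT.
xbar = NormalDist(mu=22, sigma=22 / math.sqrt(80))

k_95 = xbar.inv_cdf(0.95)  # value with 95% of sample means below it
print(f"95th percentile of the sample mean = {k_95:.1f} minutes")
```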

Note:

(HISTORICAL): Normal Approximation to the Binomial

Historically, being able to compute binomial probabilities was one of the most important applications of the Central Limit Theorem. Binomial probabilities were displayed in a table in a book with a small value for $n$ (say, 20). To calculate the probabilities with large values of $n$, you had to use the binomial formula, which could be very complicated. Using the Normal Approximation to the Binomial simplified the process. To compute the Normal Approximation to the Binomial, take a simple random sample from a population. You must meet the conditions for a binomial distribution:

  • there are a certain number $n$ of independent trials
  • the outcomes of any trial are success or failure
  • each trial has the same probability of a success $p$

Recall that if $X$ is the binomial random variable, then $X \sim B(n, p)$. The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities $np$ and $nq$ must both be greater than five ($np > 5$ and $nq > 5$; the approximation is better if they are both greater than or equal to 10). Then the binomial can be approximated by the normal distribution with mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$. Remember that $q = 1 - p$. In order to get the best approximation, add 0.5 to $x$ or subtract 0.5 from $x$ (use $x + 0.5$ or $x - 0.5$). The number 0.5 is called the continuity correction factor.

Example 3

Suppose in a local Kindergarten through 12th grade (K - 12) school district, 53 percent of the population favor a charter school for grades K - 5. A simple random sample of 300 is surveyed.

  1. Find the probability that at least 150 favor a charter school.
  2. Find the probability that at most 160 favor a charter school.
  3. Find the probability that more than 155 favor a charter school.
  4. Find the probability that less than 147 favor a charter school.
  5. Find the probability that exactly 175 favor a charter school.

Let $X$ = the number that favor a charter school for grades K - 5. $X \sim B(n, p)$ where $n = 300$ and $p = 0.53$. Since $np > 5$ and $nq > 5$, use the normal approximation to the binomial. The formulas for the mean and standard deviation are $\mu = np$ and $\sigma = \sqrt{npq}$. The mean is 159 and the standard deviation is 8.6447. The random variable for the normal distribution is $Y$. $Y \sim N(159, 8.6447)$. See The Normal Distribution for help with calculator instructions.

For Problem 1, you include 150 so $P(x \geq 150)$ has normal approximation $P(Y \geq 149.5) = 0.8641$.

normalcdf(149.5, 10^99, 159, 8.6447) = 0.8641

For Problem 2, you include 160 so $P(x \leq 160)$ has normal approximation $P(Y \leq 160.5) = 0.5689$.

normalcdf(0, 160.5, 159, 8.6447) = 0.5689

For Problem 3, you exclude 155 so $P(x > 155)$ has normal approximation $P(Y > 155.5) = 0.6572$.

normalcdf(155.5, 10^99, 159, 8.6447) = 0.6572

For Problem 4, you exclude 147 so $P(x < 147)$ has normal approximation $P(Y < 146.5) = 0.0741$.

normalcdf(0, 146.5, 159, 8.6447) = 0.0741

For Problem 5, $P(x = 175)$ has normal approximation $P(174.5 < Y < 175.5) = 0.0083$.

normalcdf(174.5, 175.5, 159, 8.6447) = 0.0083
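The five normal approximations above follow one pattern: shift the boundary by 0.5 in the direction that keeps or drops the endpoint, then take a normal area. A sketch of that pattern, with variable names chosen here for illustration:

```python
import math
from statistics import NormalDist

n, p = 300, 0.53
q = 1 - p

# Normal approximation to B(n, p): mean np, standard deviation sqrt(npq).
y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))

at_least_150 = 1 - y.cdf(149.5)             # include 150, so start at 149.5
at_most_160 = y.cdf(160.5)                  # include 160, so stop at 160.5
more_than_155 = 1 - y.cdf(155.5)            # exclude 155, so start at 155.5
less_than_147 = y.cdf(146.5)                # exclude 147, so stop at 146.5
exactly_175 = y.cdf(175.5) - y.cdf(174.5)   # a strip of width 1 around 175

for label, value in [
    ("P(x >= 150)", at_least_150),
    ("P(x <= 160)", at_most_160),
    ("P(x > 155)", more_than_155),
    ("P(x < 147)", less_than_147),
    ("P(x = 175)", exactly_175),
]:
    print(f"{label} ~ {value:.4f}")
```

The calculator calls use 0 or 10^99 as the far bound; here the unbounded tail is taken directly from the cumulative distribution, which gives the same area to four decimal places.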

Because of calculators and computer software that easily let you calculate binomial probabilities for large values of $n$, it is not necessary to use the Normal Approximation to the Binomial provided you have access to these technology tools. Most school labs have Microsoft Excel, an example of computer software that calculates binomial probabilities. Many students have access to the TI-83 or 84 series calculators and they easily calculate probabilities for the binomial. In an Internet browser, if you type in "binomial probability distribution calculation," you can find at least one online calculator for the binomial.

For Example 3, the probabilities are calculated using the binomial ($n = 300$ and $p = 0.53$) below. Compare the binomial and normal distribution answers. See Discrete Random Variables for help with calculator instructions for the binomial.

$P(x \geq 150)$: 1 - binomialcdf(300, 0.53, 149) = 0.8641

$P(x \leq 160)$: binomialcdf(300, 0.53, 160) = 0.5684

$P(x > 155)$: 1 - binomialcdf(300, 0.53, 155) = 0.6576

$P(x < 147)$: binomialcdf(300, 0.53, 146) = 0.0742

$P(x = 175)$: (You use the binomial pdf.) binomialpdf(300, 0.53, 175) = 0.0083
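The exact binomial values can also be sketched without a calculator by summing the binomial formula directly. The helper names `binom_pmf` and `binom_cdf` are defined here for illustration and are not library functions.

```python
import math


def binom_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p), from the binomial formula."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p), summing the pmf from 0 through k."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))


n, p = 300, 0.53
exact = {
    "P(x >= 150)": 1 - binom_cdf(n, p, 149),
    "P(x <= 160)": binom_cdf(n, p, 160),
    "P(x > 155)": 1 - binom_cdf(n, p, 155),
    "P(x < 147)": binom_cdf(n, p, 146),
    "P(x = 175)": binom_pmf(n, p, 175),
}
for label, value in exact.items():
    print(f"{label} = {value:.4f}")
```

Comparing this output with the normal approximations shows how little is lost by the approximation when $np$ and $nq$ are both large.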

**Contributions made to Example 2 by Roberta Bloom

Glossary

Average:
A number that describes the central tendency of the data. There are a number of specialized averages, including the arithmetic mean, weighted mean, median, mode, and geometric mean.
Central Limit Theorem:
Given a random variable (RV) with known mean $\mu$ and known standard deviation $\sigma$, we sample with size $n$ and are interested in two new RVs: the sample mean, $\bar{X}$, and the sample sum, $\Sigma X$. If the size $n$ of the sample is sufficiently large, then $\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ and $\Sigma X \sim N\left(n\mu, \sqrt{n}\,\sigma\right)$. If the size $n$ of the sample is sufficiently large, then the distribution of the sample means and the distribution of the sample sums will approximate a normal distribution regardless of the shape of the population. The mean of the sample means will equal the population mean and the mean of the sample sums will equal $n$ times the population mean. The standard deviation of the distribution of the sample means, $\frac{\sigma}{\sqrt{n}}$, is called the standard error of the mean.
Exponential Distribution:
A continuous random variable (RV) that appears when we are interested in the intervals of time between some random events, for example, the length of time between emergency arrivals at a hospital. Notation: $X \sim Exp(m)$. The mean is $\mu = \frac{1}{m}$ and the standard deviation is $\sigma = \frac{1}{m}$. The probability density function is $f(x) = m e^{-mx}$, $x \geq 0$, and the cumulative distribution function is $P(X \leq x) = 1 - e^{-mx}$.
Mean:
A number that measures the central tendency. A common name for mean is 'average.' The term 'mean' is a shortened form of 'arithmetic mean.' By definition, the mean for a sample (denoted by $\bar{x}$) is $\bar{x} = \frac{\text{Sum of all values in the sample}}{\text{Number of values in the sample}}$, and the mean for a population (denoted by $\mu$) is $\mu = \frac{\text{Sum of all values in the population}}{\text{Number of values in the population}}$.
Uniform Distribution:
A continuous random variable (RV) that has equally likely outcomes over the domain, $a < x < b$. Often referred to as the Rectangular distribution because the graph of the pdf has the form of a rectangle. Notation: $X \sim U(a, b)$. The mean is $\mu = \frac{a + b}{2}$ and the standard deviation is $\sigma = \sqrt{\frac{(b - a)^2}{12}}$. The probability density function is $f(x) = \frac{1}{b - a}$ for $a < x < b$ or $a \leq x \leq b$. The cumulative distribution is $P(X \leq x) = \frac{x - a}{b - a}$.
