This module is part of the collection "Collaborative Statistics".

Confidence Intervals: Confidence Interval for a Population Proportion

Module by: Dr. Barbara Illowsky, Susan Dean

During an election year, we see articles in the newspaper that state confidence intervals in terms of proportions or percentages. For example, a poll for a particular candidate running for president might show that the candidate has 40% of the vote within 3 percentage points. Often, election polls are calculated with 95% confidence. So, the pollsters would be 95% confident that the true proportion of voters who favored the candidate would be between 0.37 and 0.43: $(0.40 - 0.03,\ 0.40 + 0.03)$.

Investors in the stock market are interested in the true proportion of stocks that go up and down each week. Businesses that sell personal computers are interested in the proportion of households in the United States that own personal computers. Confidence intervals can be calculated for the true proportion of stocks that go up or down each week and for the true proportion of households in the United States that own personal computers.

The procedure to find the confidence interval, the sample size, the error bound, and the confidence level for a proportion is similar to that for the population mean, but the formulas are different.

How do you know you are dealing with a proportion problem? First, the underlying distribution is binomial. (There is no mention of a mean or average.) If $X$ is a binomial random variable, then $X \sim B(n, p)$, where $n$ = the number of trials and $p$ = the probability of a success. To form a proportion, take $X$, the random variable for the number of successes, and divide it by $n$, the number of trials (or the sample size). The random variable $P'$ (read "P prime") is that proportion,

$P' = \frac{X}{n}$

(Sometimes the random variable is written $\hat{P}$, read "P hat".)

When $n$ is large, we can use the normal distribution to approximate the binomial. With $q = 1 - p$,

$X \sim N\left(np,\ \sqrt{npq}\right)$

If we divide all values of the random variable by $n$, the mean by $n$, and the standard deviation by $n$, we get a normal distribution of proportions with $P'$, called the estimated proportion, as the random variable. (Recall that a proportion = the number of successes divided by $n$.)

$\frac{X}{n} = P' \sim N\left(\frac{np}{n},\ \frac{\sqrt{npq}}{n}\right)$

By algebra, $\frac{\sqrt{npq}}{n} = \sqrt{\frac{pq}{n}}$

$P'$ follows a normal distribution for proportions: $P' \sim N\left(p,\ \sqrt{\frac{pq}{n}}\right)$
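
The claim that $P'$ is centered at $p$ with standard deviation $\sqrt{pq/n}$ can be checked with a small simulation. The following is only an illustrative sketch: the values of `n`, `p`, the number of samples, and the seed are assumptions chosen for the demonstration and do not come from the module.

```python
import math
import random


def simulate_sample_proportions(n, p, num_samples, seed=0):
    """Draw num_samples binomial samples of size n and return each P' = X / n."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # X = number of successes in n independent trials
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


# Hypothetical values chosen only for illustration.
n, p = 200, 0.4
q = 1 - p
props = simulate_sample_proportions(n, p, num_samples=5000)

sim_mean = sum(props) / len(props)
sim_sd = math.sqrt(sum((x - sim_mean) ** 2 for x in props) / (len(props) - 1))
theory_sd = math.sqrt(p * q / n)

print(f"simulated mean of P': {sim_mean:.4f} (theory: p = {p})")
print(f"simulated sd of P':   {sim_sd:.4f} (theory: sqrt(pq/n) = {theory_sd:.4f})")
```

With a few thousand samples, the simulated mean and standard deviation should land close to $p$ and $\sqrt{pq/n}$, though the exact digits depend on the seed.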

The confidence interval has the form $(p' - EBP,\ p' + EBP)$.

$p' = \frac{x}{n}$

$p'$ = the estimated proportion of successes ($p'$ is a point estimate for $p$, the true proportion)

$x$ = the number of successes.

$n$ = the size of the sample

The error bound for a proportion is

$EBP = z_{\frac{\alpha}{2}} \sqrt{\frac{p'q'}{n}}$, where $q' = 1 - p'$

This formula is actually very similar to the error bound formula for a mean. The difference is the standard deviation. For a mean where the population standard deviation is known, the standard deviation is $\frac{\sigma}{\sqrt{n}}$.

For a proportion, the standard deviation is $\sqrt{\frac{pq}{n}}$.

However, in the error bound formula, the standard deviation is $\sqrt{\frac{p'q'}{n}}$.

In the error bound formula, $p'$ and $q'$ are estimates of $p$ and $q$. The estimated proportions $p'$ and $q'$ are used because $p$ and $q$ are not known. $p'$ and $q'$ are calculated from the data. $p'$ is the estimated proportion of successes. $q'$ is the estimated proportion of failures.

When a study gives a margin of error of "+ or - 3 percentage points", this is determined before the survey is done. Since $p'$ and $q'$ are unknown at that point, the most conservative choice is $p' = 0.5$ and $q' = 0.5$, because these values give the largest standard deviation, the largest error bound, and the widest confidence interval.
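
To see why $p' = 0.5$ is the conservative choice, the error bound can be evaluated for several candidate proportions at a fixed sample size. This is a minimal sketch: the sample size `n = 1000`, the list of candidate proportions, and the helper name `error_bound` are assumptions made for illustration, and $z = 1.96$ corresponds to 95% confidence.

```python
import math


def error_bound(p_prime, n, z=1.96):
    """EBP = z * sqrt(p' q' / n), with q' = 1 - p'."""
    q_prime = 1 - p_prime
    return z * math.sqrt(p_prime * q_prime / n)


n = 1000  # hypothetical sample size, not taken from the module
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
bounds = {pp: error_bound(pp, n) for pp in candidates}

for pp, ebp in bounds.items():
    print(f"p' = {pp:.1f}: EBP = {ebp:.4f}")

# The candidate with the largest error bound is the conservative choice.
worst_case = max(bounds, key=bounds.get)
print(f"largest error bound occurs at p' = {worst_case}")
```

Because the product $p'q' = p'(1 - p')$ peaks at $p' = 0.5$, planning with that value cannot understate the margin of error, whatever the survey eventually shows.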

Note:

For the normal distribution of proportions, the z-score formula is as follows.

If $P' \sim N\left(p,\ \sqrt{\frac{pq}{n}}\right)$, then the z-score formula is $z = \frac{p' - p}{\sqrt{\frac{pq}{n}}}$
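
As a quick numerical illustration of the z-score formula, the sketch below standardizes an observed sample proportion. The function name `proportion_z_score` and the input values are hypothetical and serve only to show the arithmetic.

```python
import math


def proportion_z_score(p_prime, p, n):
    """z = (p' - p) / sqrt(p q / n), with q = 1 - p."""
    q = 1 - p
    return (p_prime - p) / math.sqrt(p * q / n)


# Hypothetical inputs: true proportion 0.40, observed 0.45 in a sample of 400.
z = proportion_z_score(p_prime=0.45, p=0.40, n=400)
print(f"z = {z:.4f}")
```

Here the observed proportion sits about two standard deviations above the true proportion.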

Example 1

Problem 1

Suppose that a sample of 500 households in Phoenix was taken last May to determine whether the oldest child had given his/her mother a Mother's Day card. Of the 500 households, 421 responded yes. Compute a 95% confidence interval for the true proportion of all Phoenix households whose oldest child gave his/her mother a Mother's Day card.

Note:

  • The first solution is step-by-step.
  • The second solution uses the TI-83+ and TI-84 calculators.

Solution 1.1

Let $X$ = the number of oldest children who gave their mothers a Mother's Day card last May. $X$ is binomial. $X \sim B\left(500,\ \frac{421}{500}\right)$.

To calculate the confidence interval, you must find $p'$, $q'$, and $EBP$.

$n = 500$ and $x$ = the number of successes $= 421$

$p' = \frac{x}{n} = \frac{421}{500} = 0.842$

$q' = 1 - p' = 1 - 0.842 = 0.158$

Since $CL = 0.95$, then $\alpha = 1 - CL = 1 - 0.95 = 0.05$ and $\frac{\alpha}{2} = 0.025$.

Then $z_{\frac{\alpha}{2}} = z_{0.025} = 1.96$ using a calculator, computer, or standard normal table.

Remember that the area to the right = 0.025 and therefore, area to the left is 0.975.

The z-score that corresponds to 0.975 is 1.96.
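
The table lookup can also be done in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `critical_z` is an assumption introduced here for illustration.

```python
from statistics import NormalDist


def critical_z(confidence_level):
    """Return z_(alpha/2): the z-score with area alpha/2 to its right."""
    alpha = 1 - confidence_level
    # Area to the left of the critical value is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for cl in (0.90, 0.95, 0.99):
    print(f"CL = {cl:.2f}: z = {critical_z(cl):.3f}")
# CL = 0.90: z = 1.645
# CL = 0.95: z = 1.960
# CL = 0.99: z = 2.576
```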

$EBP = z_{\frac{\alpha}{2}} \sqrt{\frac{p'q'}{n}} = 1.96 \sqrt{\frac{(0.842)(0.158)}{500}} = 0.032$

$p' - EBP = 0.842 - 0.032 = 0.81$

$p' + EBP = 0.842 + 0.032 = 0.874$

The confidence interval for the true binomial population proportion is $(p' - EBP,\ p' + EBP) = (0.810,\ 0.874)$.

We are 95% confident that between 81% and 87.4% of the oldest children in households in Phoenix gave their mothers a Mother's Day card last May.

We can also say that 95% of the confidence intervals constructed in this way contain the true proportion of oldest children in Phoenix who gave their mothers a Mother's Day card last May.
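
The step-by-step solution can be reproduced in a few lines of Python. This is a sketch of the same arithmetic, not a replacement for the calculator method in Solution 1.2; the variable names are chosen here for readability, and $z = 1.96$ is taken from the solution.

```python
import math

x, n = 421, 500   # successes and sample size from the problem
z = 1.96          # z_(alpha/2) for 95% confidence, as found in the solution

p_prime = x / n            # estimated proportion of successes
q_prime = 1 - p_prime      # estimated proportion of failures
ebp = z * math.sqrt(p_prime * q_prime / n)

lower, upper = p_prime - ebp, p_prime + ebp
print(f"p' = {p_prime:.3f}, q' = {q_prime:.3f}, EBP = {ebp:.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimal places, the endpoints agree with the interval (0.810, 0.874) found in the solution.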

Solution 1.2

TI-83+ and TI-84: Press STAT and arrow over to TESTS. Arrow down to A:1-PropZInt. Press ENTER. Enter 421 for $x$, 500 for $n$, and .95 for C-Level. Arrow down to Calculate and press ENTER. The confidence interval is (0.81003, 0.87397).

Example 2

Problem 1

For a class project, a political science student at a large university wants to determine the percent of students that are registered voters. He surveys 500 students and finds that 300 are registered voters. Compute a 90% confidence interval for the true percent of students that are registered voters and interpret the confidence interval.

Solution 1

$x = 300$ and $n = 500$. Using a TI-83+ or 84 calculator, the 90% confidence interval for the true percent of students that are registered voters is (0.564, 0.636).

Interpretation:

  • We are 90% confident that the true percent of students that are registered voters is between 56.4% and 63.6%.
  • Ninety percent (90 %) of all confidence intervals constructed in this way contain the true percent of students that are registered voters.
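
For readers without a TI calculator, the same interval can be obtained with a short general-purpose function. This is a sketch under the normal approximation used throughout this module; the function name `proportion_confidence_interval` is an assumption introduced for illustration, and the critical value comes from `statistics.NormalDist` (Python 3.8 or later).

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence_level):
    """Return (p' - EBP, p' + EBP) using the normal approximation."""
    p_prime = x / n
    q_prime = 1 - p_prime
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_(alpha/2)
    ebp = z * math.sqrt(p_prime * q_prime / n)
    return p_prime - ebp, p_prime + ebp


lower, upper = proportion_confidence_interval(x=300, n=500, confidence_level=0.90)
print(f"90% confidence interval: ({lower:.3f}, {upper:.3f})")
# 90% confidence interval: (0.564, 0.636)
```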

Glossary

Binomial Distribution:
A discrete random variable (RV) that arises from Bernoulli trials with the following additional requirements. There is a fixed number, $n$, of independent trials. "Independent" means that the result of any trial (for example, trial 1) in no way affects the results of the following trials, and all trials are conducted under the same conditions. Under these circumstances the binomial RV $X$ is defined as the number of successes in $n$ trials. The notation is $X \sim B(n, p)$; the domain is $x = 0, 1, \ldots, n$, the mean is $\mu = np$, and the variance is $\sigma^2 = npq$. The probability of exactly $x$ successes in $n$ trials is $P(X = x) = \binom{n}{x} p^x q^{n-x}$.
Confidence Interval:
An interval estimate for an unknown population parameter. This depends on:
  • The desired confidence level.
  • What is known about the distribution (for example, a known variance).
  • The information gathered from the sample.
Confidence Level:
The percent expression for the probability that the confidence interval contains the true population parameter. For example, if CL = 90%, then in about 90 out of 100 samples the interval estimate will enclose the true population parameter.
Error Bound for a Population Mean (EBM):
The margin of error. Depends on the confidence level, sample size, and known or estimated population standard deviation.
Normal Distribution:
A continuous random variable (RV) with $\text{pdf} = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$, where $\mu$ is the mean of the distribution and $\sigma$ is its standard deviation. Notation: $X \sim N(\mu, \sigma^2)$. (The body of this module instead writes the standard deviation as the second parameter.) If $\mu = 0$ and $\sigma = 1$, the RV is called the standard normal distribution, or z-score.
