# Connexions

This module is included in the lens "Mathieu Plourde's Lens" (by Mathieu Plourde), as a part of the collection "Introduction to Statistics".


# SAMPLE SIZE

Module by: Ewa Paszek.

Summary: This course is a short series of lectures on Introductory Statistics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek and Marek Kimmel. The development of this course has been supported by NSF 0203396 grant.

## Sample Size

A very frequently asked question in statistical consulting is: how large should the sample size be to estimate a mean?

The answer depends on the variation associated with the random variable under observation. The statistician could correctly respond that only one item is needed, provided that the standard deviation of the distribution is zero. That is, if $\sigma$ is equal to zero, then the value of that one item would necessarily equal the unknown mean of the distribution. This is the extreme case and one that is not met in practice. However, the smaller the variance, the smaller the sample size needed to achieve a given degree of accuracy.

### Example 1

A mathematics department wishes to evaluate a new method of teaching calculus in which the mathematics is done using a computer. At the end of the course, the evaluation will be made on the basis of the scores of the participating students on a standard test. Because the quantity of interest is the mean score $\mu$ for students taking calculus with a computer, we need to determine the number of students, $n$, who are to be selected at random from a larger group. So, let us find the sample size $n$ such that we are fairly confident that $\bar{x} \pm 1$ contains the unknown test mean $\mu$. From past experience it is believed that the standard deviation associated with this type of test is 15. Accordingly, using the fact that the sample mean of the test scores, $\bar{X}$, is approximately $N(\mu, \sigma^2/n)$, it is seen that the interval given by $\bar{x} \pm 1.96(15/\sqrt{n})$ will serve as an approximate 95% confidence interval for $\mu$.

That is, we require $1.96\left(\frac{15}{\sqrt{n}}\right) = 1$, or equivalently $\sqrt{n} = 29.4$, and thus $n \approx 864.36$, or $n = 865$ because $n$ must be an integer. It is quite likely that it had not been anticipated that as many as 865 students would be needed in this study. If that is the case, the statistician must discuss with those involved in the experiment whether or not the accuracy and the confidence level could be relaxed somewhat. For illustration, rather than requiring $\bar{x} \pm 1$ to be a 95% confidence interval for $\mu$, perhaps $\bar{x} \pm 2$ would be satisfactory as an 80% confidence interval. If this modification is acceptable, we now have $1.282\left(\frac{15}{\sqrt{n}}\right) = 2$, or equivalently $\sqrt{n} = 9.615$, and thus $n \approx 92.4$. Since $n$ must be an integer, $n = 93$ is used in practice.
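The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the helper name `required_n` is hypothetical, and the critical values 1.96 and 1.282 are taken from the text rather than computed.

```python
import math

SIGMA = 15.0  # standard deviation believed from past experience (from the text)


def required_n(z, sigma, eps):
    """Smallest integer n such that z * sigma / sqrt(n) <= eps.

    `required_n` is a hypothetical helper name used only for this sketch.
    """
    return math.ceil((z * sigma / eps) ** 2)


# 95% interval of the form x_bar +/- 1, using z = 1.96 as in the text
n_95 = required_n(1.96, SIGMA, 1.0)

# 80% interval of the form x_bar +/- 2, using z = 1.282 as in the text
n_80 = required_n(1.282, SIGMA, 2.0)

print(n_95, n_80)
```

Rounding up rather than to the nearest integer guarantees that the half-width of the interval does not exceed the requested value.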

Most likely, the persons involved in this project would find this a more reasonable sample size. Of course, any sample size greater than 93 could be used. Then either the length of the confidence interval could be decreased from that of $\bar{x} \pm 2$, or the confidence coefficient could be increased from 80%, or a combination of both. Also, since there might be some question of whether the standard deviation $\sigma$ actually equals 15, the sample standard deviation $s$ would no doubt be used in the construction of the interval.

For example, suppose that the sample characteristics observed are $n = 145$, $\bar{x} = 77.2$, $s = 13.2$; then $\bar{x} \pm \frac{1.282\,s}{\sqrt{n}}$, or $77.2 \pm 1.41$, provides an approximate 80% confidence interval for $\mu$.
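As a quick check of this interval, the following sketch recomputes the half-width from the sample characteristics quoted in the text. The variable names are illustrative, and 1.282 is the 80% critical value used earlier in the example.

```python
import math

# Sample characteristics quoted in the text
n = 145
x_bar = 77.2
s = 13.2

z_80 = 1.282  # critical value for an 80% two-sided interval, from the text

# Half-width of the approximate confidence interval: z * s / sqrt(n)
half_width = z_80 * s / math.sqrt(n)
lower = x_bar - half_width
upper = x_bar + half_width

print(f"{x_bar} +/- {half_width:.2f}  ->  ({lower:.2f}, {upper:.2f})")
```

The half-width rounds to 1.41, in agreement with the interval stated above.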

In general, if we want the $100(1-\alpha)\%$ confidence interval for $\mu$, $\bar{x} \pm z_{\alpha/2}(\sigma/\sqrt{n})$, to be no longer than that given by $\bar{x} \pm \varepsilon$, the sample size $n$ is the solution of $\varepsilon = \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}}$, where $\Phi(z_{\alpha/2}) = 1 - \frac{\alpha}{2}$.

That is, $n = \frac{z_{\alpha/2}^2\,\sigma^2}{\varepsilon^2}$, where it is assumed that $\sigma^2$ is known.
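The general formula can be sketched as a small function. This is an illustration under stated assumptions, not part of the original notes: the function name `sample_size_for_mean` is hypothetical, and the critical value is computed with `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8+) instead of being read from a table.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, eps, confidence):
    """Smallest integer n with z_{alpha/2} * sigma / sqrt(n) <= eps.

    sigma      : standard deviation, assumed known
    eps        : desired maximum error of the estimate
    confidence : confidence coefficient 1 - alpha, e.g. 0.95
    (hypothetical helper, named only for this sketch)
    """
    alpha = 1.0 - confidence
    # z_{alpha/2} satisfies Phi(z_{alpha/2}) = 1 - alpha/2
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * sigma / eps) ** 2)


# The two scenarios from Example 1 (sigma = 15)
print(sample_size_for_mean(15, 1, 0.95))
print(sample_size_for_mean(15, 2, 0.80))

# Halving eps roughly quadruples n, since n is proportional to 1 / eps**2
print(sample_size_for_mean(15, 0.5, 0.95))
```

Because $n$ grows with $1/\varepsilon^2$, tightening the interval is far more expensive than relaxing the confidence level by a comparable amount.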

Sometimes $\varepsilon = z_{\alpha/2}\,\sigma/\sqrt{n}$ is called the maximum error of the estimate. If the experimenter has no idea about the value of $\sigma^2$, it may be necessary to first take a preliminary sample to estimate $\sigma^2$.

The type of statistic we see most often in newspapers and magazines is an estimate of a proportion $p$. We might, for example, want to know the percentage of the labor force that is unemployed or the percentage of voters favoring a certain candidate. Sometimes extremely important decisions are made on the basis of these estimates. If this is the case, we would most certainly desire short confidence intervals for $p$ with large confidence coefficients. We recognize that these conditions will require a large sample size. On the other hand, if the fraction $p$ being estimated is not too important, an estimate associated with a longer confidence interval and a smaller confidence coefficient is satisfactory, and thus a smaller sample size can be used.

In general, to find the required sample size to estimate $p$, recall that the point estimate of $p$ is $\hat{p} = y/n$ and that an approximate $100(1-\alpha)\%$ confidence interval for $p$ is $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.

Suppose we want an estimate of $p$ that is within $\varepsilon$ of the unknown $p$ with $100(1-\alpha)\%$ confidence, where $\varepsilon = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ is the maximum error of the point estimate $\hat{p} = y/n$. Since $\hat{p}$ is unknown before the experiment is run, we cannot use the value of $\hat{p}$ in our determination of $n$. However, if it is known that $p$ is about equal to $p^*$, the necessary sample size $n$ is the solution of $\varepsilon = z_{\alpha/2}\sqrt{\frac{p^*(1-p^*)}{n}}$. That is, $n = \frac{z_{\alpha/2}^2\,p^*(1-p^*)}{\varepsilon^2}$.
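The sample-size formula for a proportion can be sketched in the same way. The function name `sample_size_for_proportion` and the numerical inputs below ($p^* = 0.5$ or $0.2$, $\varepsilon = 0.03$, 95% confidence) are hypothetical values chosen only for illustration; they do not come from the notes.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p_star, eps, confidence):
    """Smallest integer n with z_{alpha/2} * sqrt(p*(1 - p*) / n) <= eps.

    p_star     : prior guess for the proportion p
    eps        : desired maximum error of the estimate
    confidence : confidence coefficient 1 - alpha, e.g. 0.95
    (hypothetical helper, named only for this sketch)
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil(z ** 2 * p_star * (1.0 - p_star) / eps ** 2)


# Hypothetical inputs: estimate p to within 0.03 with 95% confidence.
# p*(1 - p*) is largest at p* = 0.5, so that guess gives the largest n.
n_half = sample_size_for_proportion(0.5, 0.03, 0.95)
n_fifth = sample_size_for_proportion(0.2, 0.03, 0.95)
print(n_half, n_fifth)
```

A guess of $p^*$ closer to 0 or 1 lowers the required sample size, so the quality of the prior guess directly affects the cost of the study.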
