TEST ABOUT PROPORTIONS

Module by: Ewa Paszek

Summary: This course is a short series of lectures on Introductory Statistics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek and Marek Kimmel. The development of this course has been supported by NSF grant 0203396.


Tests of statistical hypotheses are a very important topic; let us introduce them through an illustration.

Suppose a manufacturer of a certain printed circuit observes that about $p = 0.05$ of the circuits fail. An engineer and a statistician, working together, suggest some changes that might improve the design of the product. To test this new procedure, it was agreed that $n = 200$ circuits would be produced using the proposed method and then checked. Let $Y$ equal the number of these 200 circuits that fail. Clearly, if the number of failures, $Y$, is such that $Y/200$ is about 0.05, then it seems that the new procedure has not resulted in an improvement. If $Y$ is small, so that $Y/200$ is about 0.01 or 0.02, we might believe that the new method is better than the old one. On the other hand, if $Y/200$ is 0.08 or 0.09, the proposed method has perhaps caused a greater proportion of failures. What is needed is a formal rule that tells us when to accept the new procedure as an improvement. For example, we could accept the new procedure as an improvement if $Y \le 5$, or equivalently if $Y/n \le 0.025$. We note, however, that the probability of failure could still be about $p = 0.05$ even with the new procedure, and yet we could observe 5 or fewer failures in $n = 200$ trials.

That is, we would accept the new method as being an improvement when, in fact, it was not. This mistake is what we call a Type I error. On the other hand, the new procedure might actually improve the product so that $p$ is much smaller, say $p = 0.02$, and yet we could observe $y = 7$ failures, so that $y/200 = 0.035$. Then we would not accept the new method as resulting in an improvement when, in fact, it had. This mistake is what we call a Type II error.

If we believe that these trials, using the new procedure, are independent and have about the same probability of failure on each trial, then $Y$ is binomial $b(200, p)$. We wish to make a statistical inference about $p$ using the unbiased estimator $\hat{p} = Y/200$. We could also construct a confidence interval, say one that has 95% confidence, obtaining

$$\hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1 - \hat{p})}{200}}.$$

This inference is quite appropriate, and many statisticians simply do this. If the limits of this confidence interval contain 0.05, they would not say the new procedure is necessarily better, at least until more data are taken. If, on the other hand, the upper limit of this confidence interval is less than 0.05, then they feel 95% confident that the true $p$ is now less than 0.05. Here, in this illustration, we are testing whether or not the probability of failure has decreased from 0.05 when the new manufacturing procedure is used.
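To make the interval concrete, here is a minimal Python sketch (not part of the original module) that computes this 95% confidence interval, assuming a hypothetical observation of $y = 7$ failures out of $n = 200$:

```python
# Minimal sketch (assumed data: y = 7 failures in n = 200 trials).
import math

n = 200
y = 7                      # hypothetical count, for illustration only
p_hat = y / n              # unbiased estimator p_hat = Y/n
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"95% CI for p: ({p_hat - half_width:.4f}, {p_hat + half_width:.4f})")
```

With these assumed data the interval comes out to roughly (0.010, 0.060), which contains 0.05, so by the reasoning above we would not yet claim an improvement.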

The no-change hypothesis, $H_0: p = 0.05$, is called the null hypothesis. Since $H_0: p = 0.05$ completely specifies the distribution, it is called a simple hypothesis; thus $H_0: p = 0.05$ is a simple null hypothesis.

The research worker’s hypothesis, $H_1: p < 0.05$, is called the alternative hypothesis. Since $H_1: p < 0.05$ does not completely specify the distribution, it is a composite hypothesis: it is composed of many simple hypotheses.

The rule of rejecting $H_0$ and accepting $H_1$ if $Y \le 5$, and accepting $H_0$ otherwise, is called a test of a statistical hypothesis.

Clearly, two types of errors can occur:

  • Type I error: rejecting $H_0$ and accepting $H_1$ when $H_0$ is true;
  • Type II error: accepting $H_0$ when $H_1$ is true, that is, when $H_0$ is false.

Since, in the example above, we make a Type I error if $Y \le 5$ when in fact $p = 0.05$, we can calculate the probability of this error, which we denote by $\alpha$ and call the significance level of the test. Under our assumptions, it is

$$\alpha = P(Y \le 5;\ p = 0.05) = \sum_{y=0}^{5} \binom{200}{y} (0.05)^y (0.95)^{200-y}.$$

Since $n$ is rather large and $p$ is small, these binomial probabilities can be approximated extremely well by Poisson probabilities with $\lambda = 200(0.05) = 10$. That is, from the Poisson table, the probability of the Type I error is

$$\alpha \approx \sum_{y=0}^{5} \frac{10^y e^{-10}}{y!} = 0.067.$$
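As a check on the table value, a short Python sketch (mine, not the module’s) computes both the exact binomial probability and its Poisson approximation:

```python
# Exact Type I error alpha = P(Y <= 5; p = 0.05) and its Poisson approximation.
import math

n, p, c = 200, 0.05, 5
alpha_exact = sum(math.comb(n, y) * p**y * (1 - p)**(n - y) for y in range(c + 1))

lam = n * p                # lambda = 200 * 0.05 = 10
alpha_poisson = sum(lam**y * math.exp(-lam) / math.factorial(y) for y in range(c + 1))

print(f"exact binomial:  {alpha_exact:.4f}")   # about 0.062
print(f"Poisson approx.: {alpha_poisson:.4f}") # about 0.067, matching the text
```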

Thus, the approximate significance level of this test is $\alpha = 0.067$. This value is reasonably small. However, what about the probability of a Type II error in case $p$ has been improved to, say, 0.02? This error occurs if $Y > 5$ when, in fact, $p = 0.02$; hence its probability, denoted by $\beta$, is

$$\beta = P(Y > 5;\ p = 0.02) = \sum_{y=6}^{200} \binom{200}{y} (0.02)^y (0.98)^{200-y}.$$

Again we use the Poisson approximation, here with $\lambda = 200(0.02) = 4$, to obtain

$$\beta \approx 1 - \sum_{y=0}^{5} \frac{4^y e^{-4}}{y!} = 1 - 0.785 = 0.215.$$
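The same kind of sketch recovers $\beta$ (again, this code is illustrative, not from the module):

```python
# Type II error beta = P(Y > 5; p = 0.02), exact and Poisson-approximated.
import math

n, p, c = 200, 0.02, 5
beta_exact = 1 - sum(math.comb(n, y) * p**y * (1 - p)**(n - y) for y in range(c + 1))

lam = n * p                # lambda = 200 * 0.02 = 4
beta_poisson = 1 - sum(lam**y * math.exp(-lam) / math.factorial(y) for y in range(c + 1))

print(f"exact binomial:  {beta_exact:.4f}")   # about 0.213
print(f"Poisson approx.: {beta_poisson:.4f}") # 0.215, matching the text
```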

The engineers and statisticians who created this new procedure are probably not too pleased with this answer. That is, they note that if their new procedure of manufacturing circuits has actually decreased the probability of failure to 0.02 from 0.05 (a big improvement), there is still a good chance, 0.215, that $H_0: p = 0.05$ is accepted and their improvement rejected. Thus, this test of $H_0: p = 0.05$ against $H_1: p = 0.02$ is unsatisfactory. Without worrying more about the probability of a Type II error here, we have presented a frequently used procedure for testing $H_0: p = p_0$, where $p_0$ is some specified probability of success. This test is based upon the fact that the number of successes, $Y$, in $n$ independent Bernoulli trials is such that $Y/n$ has an approximate normal distribution $N[p_0,\ p_0(1 - p_0)/n]$, provided $H_0: p = p_0$ is true and $n$ is large. Suppose the alternative hypothesis is $H_1: p > p_0$; that is, it has been hypothesized by a research worker that something has been done to increase the probability of success. Consider the test of $H_0: p = p_0$ against $H_1: p > p_0$ that rejects $H_0$ and accepts $H_1$ if and only if

$$Z = \frac{Y/n - p_0}{\sqrt{p_0(1 - p_0)/n}} \ge z_\alpha.$$

That is, if $Y/n$ exceeds $p_0$ by $z_\alpha$ standard deviations of $Y/n$, we reject $H_0$ and accept the hypothesis $H_1: p > p_0$. Since, under $H_0$, $Z$ is approximately $N(0, 1)$, the approximate probability of this occurring when $H_0: p = p_0$ is true is $\alpha$. That is, the significance level of this test is approximately $\alpha$. If the alternative is $H_1: p < p_0$ instead of $H_1: p > p_0$, then the appropriate $\alpha$-level test is given by $Z \le -z_\alpha$. That is, if $Y/n$ is smaller than $p_0$ by $z_\alpha$ standard deviations of $Y/n$, we accept $H_1: p < p_0$.
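A minimal sketch of this approximate test (the helper name and sample numbers are my own, for illustration):

```python
# One-sided z test for a proportion: reject H0: p = p0 in favor of
# H1: p > p0 when Z >= z_alpha.
import math

def z_statistic(y: int, n: int, p0: float) -> float:
    """Z = (Y/n - p0) / sqrt(p0 (1 - p0) / n), approximately N(0,1) under H0."""
    return (y / n - p0) / math.sqrt(p0 * (1 - p0) / n)

# Hypothetical data: 17 successes in 200 trials, p0 = 0.05, alpha = 0.05
# (so z_alpha = 1.645 from the normal table).
z = z_statistic(17, 200, 0.05)
print(f"Z = {z:.3f}, reject H0: {z >= 1.645}")
```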

In general, without changing the sample size or the type of the test of the hypothesis, a decrease in $\alpha$ causes an increase in $\beta$, and a decrease in $\beta$ causes an increase in $\alpha$. Both probabilities $\alpha$ and $\beta$ of the two types of errors can be decreased only by increasing the sample size or, in some way, constructing a better test of the hypothesis.

EXAMPLE

Consider testing $H_0: \mu = 60$ against $H_1: \mu > 60$, where $\bar{X}$ is the sample mean of a random sample from a normal distribution with known variance $\sigma^2 = 100$, and where we reject $H_0$ if $\bar{X} \ge c$. If $n = 100$ and we desire a test with significance level $\alpha = 0.05$, then $\alpha = P(\bar{X} \ge c;\ \mu = 60) = 0.05$ means, since $\bar{X}$ is $N(\mu,\ 100/100 = 1)$,

$$P\!\left(\frac{\bar{X} - 60}{1} \ge \frac{c - 60}{1};\ \mu = 60\right) = 0.05$$

and $c - 60 = 1.645$. Thus $c = 61.645$. The power function is

$$K(\mu) = P(\bar{X} \ge 61.645;\ \mu) = P\!\left(\frac{\bar{X} - \mu}{1} \ge \frac{61.645 - \mu}{1};\ \mu\right) = 1 - \Phi(61.645 - \mu).$$

In particular, this means that $\beta$ at $\mu = 65$ is

$$\beta = 1 - K(65) = \Phi(61.645 - 65) = \Phi(-3.355) \approx 0;$$

so, with $n = 100$, both $\alpha$ and $\beta$ have decreased from their respective original values of 0.1587 and 0.0668 obtained when $n = 25$. Rather than guessing at the value of $n$, an ideal power function can be used to determine the sample size. Let us use a critical region of the form $\bar{x} \ge c$. Further, suppose that we want $\alpha = 0.025$ and, when $\mu = 65$, $\beta = 0.05$. Thus, since $\bar{X}$ is $N(\mu,\ 100/n)$,

$$0.025 = P(\bar{X} \ge c;\ \mu = 60) = 1 - \Phi\!\left(\frac{c - 60}{10/\sqrt{n}}\right)$$

and

$$0.05 = 1 - P(\bar{X} \ge c;\ \mu = 65) = \Phi\!\left(\frac{c - 65}{10/\sqrt{n}}\right).$$

That is,

$$\frac{c - 60}{10/\sqrt{n}} = 1.96 \quad \text{and} \quad \frac{c - 65}{10/\sqrt{n}} = -1.645.$$

Solving these equations simultaneously for $c$ and $10/\sqrt{n}$, we obtain

$$c = 60 + 1.96\,\frac{5}{3.605} = 62.718; \qquad \frac{10}{\sqrt{n}} = \frac{5}{3.605}.$$

Thus, $\sqrt{n} = 7.21$ and $n = 51.98$. Since $n$ must be an integer, we would use $n = 52$ and obtain $\alpha = 0.025$ and $\beta = 0.05$, approximately.
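These two equations can be solved in a few lines; this sketch (not from the module) simply replays the algebra above:

```python
# Solve (c - 60)/(10/sqrt(n)) = 1.96 and (c - 65)/(10/sqrt(n)) = -1.645
# for c and n, as in the text.
z_a, z_b = 1.96, 1.645                       # z_{0.025} and z_{0.05}
mu0, mu1, sigma = 60.0, 65.0, 10.0

sqrt_n = (z_a + z_b) * sigma / (mu1 - mu0)   # subtract the two equations
c = mu0 + z_a * sigma / sqrt_n
print(f"sqrt(n) = {sqrt_n:.2f}, n = {sqrt_n**2:.2f}, c = {c:.3f}")
# sqrt(n) = 7.21, n = 51.98, c = 62.718 -- so take n = 52
```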

For a number of years there has been another value associated with a statistical test, and most statistical computer programs automatically print it out; it is called the probability value or, for brevity, the p-value. The p-value associated with a test is the probability that we obtain the observed value of the test statistic, or a value that is more extreme in the direction of the alternative hypothesis, calculated when $H_0$ is true. Rather than select the critical region ahead of time, the p-value of a test can be reported, and the reader can then make a decision.

Say we are testing $H_0: \mu = 60$ against $H_1: \mu > 60$ with a sample mean $\bar{X}$ based on $n = 52$ observations. Suppose that we obtain the observed sample mean $\bar{x} = 62.75$. If we compute the probability of obtaining an $\bar{x}$ of 62.75 or greater when $\mu = 60$, then we obtain the p-value associated with $\bar{x} = 62.75$. That is,

$$\text{p-value} = P(\bar{X} \ge 62.75;\ \mu = 60) = P\!\left(\frac{\bar{X} - 60}{10/\sqrt{52}} \ge \frac{62.75 - 60}{10/\sqrt{52}};\ \mu = 60\right) = 1 - \Phi\!\left(\frac{62.75 - 60}{10/\sqrt{52}}\right) = 1 - \Phi(1.983) = 0.0237.$$

If this p-value is small, we tend to reject the hypothesis $H_0: \mu = 60$. For example, rejecting $H_0: \mu = 60$ if the p-value is less than or equal to 0.025 is exactly the same as rejecting if $\bar{x} \ge 62.718$; that is, $\bar{x} = 62.718$ has a p-value of 0.025. To help keep the definition of p-value in mind, note that it can be thought of as the tail-end probability, under $H_0$, of the distribution of the statistic (here $\bar{X}$) beyond the observed value of the statistic. See Figure 1 for the p-value associated with $\bar{x} = 62.75$.

Figure 1: The p-value associated with $\bar{x} = 62.75$. [figure omitted]
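This tail probability is easy to recompute; here is a small sketch (mine, not the module’s) using the standard normal CDF built from math.erf:

```python
# p-value = P(X_bar >= 62.75; mu = 60) with sigma = 10 and n = 52.
import math

def Phi(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

z = (62.75 - 60) / (10 / math.sqrt(52))
print(f"z = {z:.3f}, p-value = {1 - Phi(z):.4f}")   # z = 1.983, p-value = 0.0237
```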
Example 1

Suppose that in the past a golfer’s scores have been (approximately) normally distributed with mean $\mu = 90$ and variance $\sigma^2 = 9$. After taking some lessons, the golfer has reason to believe that the mean $\mu$ has decreased. (We assume that $\sigma^2$ is still about 9.) To test the null hypothesis $H_0: \mu = 90$ against the alternative hypothesis $H_1: \mu < 90$, the golfer plays 16 games, computing the sample mean $\bar{x}$. If $\bar{x}$ is small, say $\bar{x} \le c$, then $H_0$ is rejected and $H_1$ accepted; that is, it seems as if the mean $\mu$ has actually decreased after the lessons. If $c = 88.5$, then the power function of the test is

$$K(\mu) = P(\bar{X} \le 88.5;\ \mu) = P\!\left(\frac{\bar{X} - \mu}{3/4} \le \frac{88.5 - \mu}{3/4};\ \mu\right) = \Phi\!\left(\frac{88.5 - \mu}{3/4}\right),$$

because $9/16$ is the variance of $\bar{X}$. In particular,

$$\alpha = K(90) = \Phi(-2) = 1 - 0.9772 = 0.0228.$$

If, in fact, the true mean is $\mu = 88$ after the lessons, the power is $K(88) = \Phi(2/3) = 0.7475$. If $\mu = 87$, then $K(87) = \Phi(2) = 0.9772$. An observed sample mean of $\bar{x} = 88.25$ has a

$$\text{p-value} = P(\bar{X} \le 88.25;\ \mu = 90) = \Phi\!\left(\frac{88.25 - 90}{3/4}\right) = \Phi\!\left(-\frac{7}{3}\right) = 0.0098,$$

and this would lead to a rejection at $\alpha = 0.0228$ (or even at $\alpha = 0.01$).
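All of the numbers in Example 1 can be reproduced with the same ingredients; a brief sketch (illustrative, not the author’s code):

```python
# Example 1: power K(mu) = Phi((88.5 - mu)/(3/4)) and the p-value of
# x_bar = 88.25 under H0: mu = 90, with sd(X_bar) = sqrt(9/16) = 3/4.
import math

def Phi(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def K(mu: float) -> float:
    return Phi((88.5 - mu) / 0.75)

print(f"alpha = K(90) = {K(90):.4f}")               # 0.0228
print(f"K(88) = {K(88):.4f}, K(87) = {K(87):.4f}")  # 0.7475, 0.9772
print(f"p-value = {Phi((88.25 - 90) / 0.75):.4f}")  # 0.0098
```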
