Introductory Business Statistics

6.3 Estimating the Binomial with the Normal Distribution


We found earlier that various probability density functions are the limiting distributions of others; thus, we can estimate one with another under certain circumstances. We will find here that the normal distribution can be used to estimate a binomial process. The Poisson was used to estimate the binomial previously, and the binomial was used to estimate the hypergeometric distribution.

In the case of the relationship between the hypergeometric distribution and the binomial, we had to recognize that a binomial process assumes that the probability of a success remains constant from trial to trial: a head on the last flip cannot have an effect on the probability of a head on the next flip. In the hypergeometric distribution this assumption is exactly what is at issue, because the experiment assumes that each "draw" is made without replacement. If one draws without replacement, then all subsequent "draws" are conditional probabilities. We found that if the hypergeometric experiment draws only a small percentage of the total objects, then we can ignore the impact on the probability from draw to draw.

Imagine that there are 312 cards in a deck made up of 6 normal decks. If the experiment calls for drawing only 10 cards, less than 5% of the total, then we will accept the binomial estimate of the probability, even though this is actually a hypergeometric distribution because the cards are presumably drawn without replacement.
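To see this rule of thumb in action, here is a minimal sketch in Python. It uses scipy.stats, which is an assumed tool rather than part of the text, and treats the 24 aces in six decks as a hypothetical choice of "success" card; it compares the exact hypergeometric probabilities with the binomial estimate for drawing 10 cards from the 312-card deck.

from scipy.stats import hypergeom, binom

N_deck = 312      # six standard decks combined
K_success = 24    # hypothetical "success" cards, e.g. the 24 aces in six decks
n_draws = 10      # fewer than 5% of the deck
p = K_success / N_deck

for k in range(4):
    exact = hypergeom.pmf(k, N_deck, K_success, n_draws)   # draws without replacement
    approx = binom.pmf(k, n_draws, p)                       # assumes constant p from draw to draw
    print(f"P(X = {k}): hypergeometric = {exact:.4f}, binomial = {approx:.4f}")

The printed probabilities differ only in the later decimal places, which is why the binomial estimate is acceptable when so few cards are drawn.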

The Poisson likewise was considered an appropriate estimate of the binomial under certain circumstances. In Chapter 4 we found that if the number of trials of interest is large and the probability of success is small, such that μ = np < 7, the Poisson can be used to estimate the binomial with good results. Again, these rules of thumb do not in any way claim that the actual probability is what the estimate determines, only that the difference is in the third or fourth decimal and is thus de minimis.
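A short sketch, again assuming scipy.stats and using hypothetical values of n and p, illustrates how close the Poisson estimate comes to the binomial when the number of trials is large and the probability of success is small enough that μ = np < 7.

from scipy.stats import binom, poisson

n, p = 500, 0.01          # hypothetical: many trials, small probability of success
mu = n * p                # mu = 5, which satisfies the rule of thumb mu < 7

for k in range(6):
    b = binom.pmf(k, n, p)
    po = poisson.pmf(k, mu)
    print(f"P(X = {k}): binomial = {b:.4f}, Poisson = {po:.4f}, difference = {abs(b - po):.4f}")

The differences land in the third or fourth decimal place, consistent with the rule of thumb described above.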

Here, again, we find that the normal distribution makes particularly accurate estimates of a binomial process under certain circumstances. Figure 6.10 is a frequency distribution of a binomial process for the experiment of flipping three coins, where the random variable is the number of heads. The sample space is listed below the distribution. The experiment assumed that the probability of a success is 0.5; the probability of a failure, a tail, is thus also 0.5. In observing Figure 6.10 we are struck by the fact that the distribution is symmetrical. The root of this result is that the probabilities of success and failure are the same, 0.5. If the probability of success is smaller than 0.5, the distribution becomes skewed to the right. Indeed, as the probability of success diminishes, the degree of skewness increases. If the probability of success increases from 0.5, then the skewness increases in the lower tail, resulting in a left-skewed distribution.

Figure 6.10 Frequency distribution of the number of heads, x, in three coin flips: P(0) = 1/8, P(1) = 3/8, P(2) = 3/8, P(3) = 1/8. The sample space is S = {HHH, HHT, HTH, THH, TTT, TTH, THT, HTT}.
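The probabilities in Figure 6.10 come directly from the binomial formula with n = 3 and p = 0.5; a minimal sketch in Python (standard library only) reproduces them.

from math import comb

n, p = 3, 0.5
for x in range(n + 1):
    prob = comb(n, x) * p**x * (1 - p)**(n - x)
    print(f"P(X = {x} heads) = {prob}")     # prints 0.125, 0.375, 0.375, 0.125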

The reason the skewness of the binomial distribution is important is that if it is to be estimated with a normal distribution, we need to recognize that the normal distribution is symmetrical. The closer the underlying binomial distribution is to being symmetrical, the better the estimate produced by the normal distribution. Figure 6.11 shows a symmetrical normal distribution superimposed on a graph of a binomial distribution where p = 0.2 and n = 5. The discrepancy between the estimated probability using a normal distribution and the probability of the original binomial distribution is apparent. The criterion for using a normal distribution to estimate a binomial thus addresses this problem by requiring that BOTH np AND n(1 − p) be greater than five. Again, this is a rule of thumb, but it is effective and results in acceptable estimates of the binomial probability.

Figure 6.11 Binomial distribution with p = 0.2 and n = 5 (P(0) = 0.3277, P(1) = 0.4096, P(2) = 0.2048, P(3) = 0.0512, P(4) = 0.0064, P(5) = 0.0003), with a normal curve of mean μ = 1 superimposed.
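A brief sketch, assuming scipy.stats, makes the rule of thumb concrete: the n = 5, p = 0.2 case from Figure 6.11 fails the criterion and the normal tail estimate is far from the exact binomial tail, while a larger n that satisfies both conditions brings the estimate much closer. The choice of n = 50 and of a cutoff roughly two standard deviations above the mean is illustrative, not from the text.

from scipy.stats import binom, norm

for n, p in [(5, 0.2), (50, 0.2)]:
    rule_met = n * p > 5 and n * (1 - p) > 5
    mu = n * p
    sigma = (n * p * (1 - p)) ** 0.5
    k = int(mu + 2 * sigma)                  # a tail cutoff roughly two sd above the mean
    exact = 1 - binom.cdf(k, n, p)           # exact binomial tail P(X > k)
    approx = 1 - norm.cdf(k, mu, sigma)      # normal estimate of the same tail
    print(f"n = {n}, p = {p}: np = {n*p:g}, n(1-p) = {n*(1-p):g}, criterion met: {rule_met}")
    print(f"  P(X > {k}): binomial = {exact:.4f}, normal = {approx:.4f}")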

Example 6.7

Imagine that it is known that only 10% of Australian Shepherd puppies are born with what is called "perfect symmetry" in their three colors: black, white, and copper. Perfect symmetry is defined as equal coverage on all parts of the dog when viewed from the front and measured left and right of the centerline. A kennel would earn a good reputation for breeding Australian Shepherds if it had a high percentage of dogs that met this criterion. During the past 5 years, 16 of the 100 dogs born to Dundee Kennels were born with this coloring characteristic.

Problem

What is the probability that, in 100 births, more than 16 would have this characteristic?
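One way to approach this (a sketch, not the textbook's worked solution) is to check the rule of thumb, np = 10 and n(1 − p) = 90, both greater than five, and then use the normal distribution with μ = np = 10 and σ = √(np(1 − p)) = 3 to estimate the tail probability. The sketch below, which assumes scipy.stats, compares that normal estimate with the exact binomial probability.

from scipy.stats import binom, norm

n, p = 100, 0.10
mu = n * p                        # 10
sigma = (n * p * (1 - p)) ** 0.5  # 3

# Rule of thumb: np = 10 and n(1 - p) = 90, so both exceed five.
exact = 1 - binom.cdf(16, n, p)         # exact binomial P(X > 16)
approx = 1 - norm.cdf(16, mu, sigma)    # normal estimate; z = (16 - 10)/3 = 2, upper tail ≈ 0.0228
print(f"exact binomial P(X > 16)     = {exact:.4f}")
print(f"normal estimate of P(X > 16) = {approx:.4f}")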
