Base Rates

Module by: David Lane. E-mail the author

Summary: Finding the true probability of an event taking into account misses, false positives, and base rates. Also, using Bayes' Theorem.

Suppose that at your regular physical exam you test positive for Disease X. Although Disease X has only mild symptoms, you are concerned and ask your doctor about the accuracy of the test. It turns out that the test is 95% accurate. It would appear that the probability that you have Disease X is therefore 0.95. However, the situation is not that simple.

For one thing, more information about the accuracy of the test is needed because there are two kinds of errors the test can make: misses and false positives. If you actually had Disease X and the test failed to detect it, that would be a miss. If you did not have Disease X and the test indicated you did, that would be a false positive. The miss and false positive rates are not necessarily the same.

Example 1

Let's say that the test accurately indicates the disease in 99% of the people who have it and accurately indicates no disease in 91% of the people who do not have it. This would mean that the test has a miss rate of 0.01 and a false positive rate of 0.09. This might lead you to revise your judgment and conclude that your chance of having the disease is 0.91 rather than 0.95. This would be roughly true (the exact value is about 0.92) if half the people in your situation (people who show up for a regular physical exam) had Disease X.

The analysis becomes more complicated if more or less than half the people in your situation have Disease X. The proportion of people having the disease is called the base rate. It is very important to consider the base rate when classifying people. As the saying goes, "if you hear hoofs, think horse, not zebra," since you are more likely to encounter a horse than a zebra (at least in most places).

Assume that Disease X is a rare disease, and only 2% of people in your situation have it. How does that affect the probability that you have it? Or, more generally, what is the probability that someone who tests positive actually has the disease? Let's consider what would happen if one million people were tested. Out of these one million people, 2% or 20,000 people would have the disease. Of these 20,000 with the disease, the test would accurately detect it in 99% of them. This means that 19,800 cases would be accurately identified. Now let's consider the 98% of the one million people (980,000) who do not have the disease. Since the false positive rate is 0.09, 9% of these 980,000 people will test positive for the disease. This is a total of 88,200 people incorrectly diagnosed.

To sum up, 19,800 people who tested positive would actually have the disease and 88,200 people who tested positive would not have the disease. This means that of all those who tested positive, only $\frac{19800}{19800 + 88200} = 0.1833$ of them would actually have the disease. So the probability that you have the disease is not 0.95, or 0.91, but only 0.1833.
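The counting argument above can be retraced in a few lines of Python. This is only an illustrative sketch: the variable names are invented here, and the inputs are the figures given in the text (one million people, a 2% base rate, a 99% detection rate, and a 9% false positive rate).

```python
# Natural-frequency version of the worked example.
# All variable names are illustrative; the numbers come from the text.
population = 1_000_000
base_rate = 0.02             # proportion of people with Disease X
detection_rate = 0.99        # P(test positive | disease), i.e. 1 - miss rate
false_positive_rate = 0.09   # P(test positive | no disease)

with_disease = round(population * base_rate)                     # 20,000
without_disease = population - with_disease                      # 980,000
true_positives = round(with_disease * detection_rate)            # 19,800
false_positives = round(without_disease * false_positive_rate)   # 88,200

# Of everyone who tests positive, what share actually has the disease?
prob_disease_given_positive = true_positives / (true_positives + false_positives)
print(f"{prob_disease_given_positive:.4f}")  # prints 0.1833
```

Changing `base_rate` while leaving the two test rates fixed shows how strongly the answer depends on how common the disease is.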

These results are summarized in Table 1. The people diagnosed with the disease are those listed as testing positive. Of the one million people tested, the test was correct for 891,800 of those without the disease and for 19,800 of those with the disease; overall, the test was correct about 91% of the time. However, if you look only at the people testing positive, only 19,800 (a proportion of 0.1833) of the 108,000 testing positive actually have the disease.

Table 1. Diagnosing Disease X

True Condition          Test Result   Number of People
No Disease (980,000)    Positive      88,200
No Disease (980,000)    Negative      891,800
Disease (20,000)        Positive      19,800
Disease (20,000)        Negative      200
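The contrast between overall accuracy and the proportion of positive results that are correct can be checked directly from the four counts in Table 1. The sketch below is illustrative only; the dictionary layout and variable names are assumptions made for this example.

```python
# The four cells of Table 1, keyed by (true condition, test result).
# The data structure and names are illustrative, not part of the original module.
counts = {
    ("no disease", "positive"): 88_200,
    ("no disease", "negative"): 891_800,
    ("disease", "positive"): 19_800,
    ("disease", "negative"): 200,
}

total = sum(counts.values())

# The test is correct for diseased people who test positive
# and for healthy people who test negative.
correct = counts[("disease", "positive")] + counts[("no disease", "negative")]
all_positives = counts[("disease", "positive")] + counts[("no disease", "positive")]

overall_accuracy = correct / total
share_of_positives_with_disease = counts[("disease", "positive")] / all_positives

print(f"{overall_accuracy:.4f}")                 # prints 0.9116
print(f"{share_of_positives_with_disease:.4f}")  # prints 0.1833
```

The two numbers answer different questions: the first is how often the test is right across everyone tested, the second is how often a positive result is right.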

Bayes' Theorem

This same result can be obtained using Bayes' theorem. Bayes' theorem considers both the prior probability of an event and the diagnostic value of a test to determine the posterior probability of the event. For the current example, the event is that you have Disease X. Let's call this Event $D$. Since only 2% of people in your situation have Disease X, the prior probability of Event $D$ is 0.02. Or, more formally, $\Pr(D) = 0.02$. If $D'$ denotes the event that $D$ is false (you do not have Disease X), then $\Pr(D') = 1 - \Pr(D) = 0.98$.

To define the diagnostic value of the test, we need to define another event: that you test positive for Disease X. Let's call this Event $T$. The diagnostic value of the test depends on the probability that you will test positive given that you actually have the disease, written as $\Pr(T \mid D)$, and the probability that you test positive given that you do not have the disease, written as $\Pr(T \mid D')$. Bayes' theorem, shown below, allows you to calculate $\Pr(D \mid T)$, the probability that you have the disease given that you test positive for it.

Theorem 1: Bayes' Theorem

$$\Pr(D \mid T) = \frac{\Pr(T \mid D)\,\Pr(D)}{\Pr(T \mid D)\,\Pr(D) + \Pr(T \mid D')\,\Pr(D')}$$
(1)

The various terms are:

  • $\Pr(T \mid D) = 0.99$
  • $\Pr(T \mid D') = 0.09$
  • $\Pr(D) = 0.02$
  • $\Pr(D') = 0.98$

Therefore, $\Pr(D \mid T) = \frac{0.99 \times 0.02}{0.99 \times 0.02 + 0.09 \times 0.98} = 0.1833$, which is the same value computed previously.
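The same calculation can be written as a small Python function. This is a minimal sketch, not part of the original module: the function name `posterior` and its parameter names are hypothetical, chosen to mirror the terms of the theorem.

```python
def posterior(prior, p_pos_given_event, p_pos_given_not_event):
    """Bayes' theorem for a positive test result.

    prior                 -- Pr(D), the base rate of the event
    p_pos_given_event     -- Pr(T | D), positive test given the event is true
    p_pos_given_not_event -- Pr(T | D'), positive test given the event is false
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability between 0 and 1")
    numerator = p_pos_given_event * prior
    denominator = numerator + p_pos_given_not_event * (1.0 - prior)
    return numerator / denominator


# Disease X with a 2% base rate, as in the example.
print(round(posterior(0.02, 0.99, 0.09), 4))  # prints 0.1833

# The same test if half the people in your situation had the disease.
print(round(posterior(0.50, 0.99, 0.09), 4))  # prints 0.9167
```

Holding the test's error rates fixed and changing only `prior` makes the role of the base rate explicit: the same positive result is far more convincing when the disease is common than when it is rare.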

Glossary

Misses:
Occur when a diagnostic test returns a negative result, but the true state of the subject is positive. For example, if a person has strep throat and the diagnostic test fails to indicate it, then a miss has occurred. The concept is similar to a Type II error in significance testing.
False Positive:
Occurs when a diagnostic procedure returns a positive result while the true state of the subject is negative. For example, if a test for strep says the patient has strep when in fact he or she does not, then the error in diagnosis would be called a false positive. In some contexts, a false positive is called a false alarm. The concept is similar to a Type I error in significance testing.
Base Rate:
The true proportion of a population having some condition, attribute or disease. For example, the proportion of people with schizophrenia is about 0.01.
Prior Probability:
The prior probability of an event is the probability of the event computed before the collection of new data. One begins with a prior probability of an event and revises it in the light of new data. For example, if 0.01 of a population has schizophrenia, then the probability that a person drawn at random would have schizophrenia is 0.01. This is the prior probability. If you then learn that the person's score on a personality test suggests schizophrenia, you would adjust your probability accordingly. The adjusted probability is the posterior probability.
Posterior Probability:
The posterior probability of an event is the probability of the event computed following the collection of new data. One begins with a prior probability of an event and revises it in the light of new data. For example, if 0.01 of a population has schizophrenia, then the probability that a person drawn at random would have schizophrenia is 0.01. This is the prior probability. If you then learn that the person's score on a personality test suggests schizophrenia, you would adjust your probability accordingly. The adjusted probability is the posterior probability.
