If you flip a coin, what is the chance of getting heads? That’s easy: 50/50. In the language of probability, we say that the probability is $\frac{1}{2}$. That is to say, half the time you flip coins, you will get heads.

So here is a harder question: if you flip *two* coins, what is the chance that you will get heads *both times*? I asked this question of my son, who has good mathematical intuition but no training in probability. His immediate answer: $\frac{1}{3}$. There are three possibilities: two heads, one heads and one tails, and two tails. So there is a $\frac{1}{3}$ chance of getting each possibility, including two heads. Makes sense, right?

But it is not right. If you try this experiment 100 times, you will not find about 33 “both heads” results, 33 “both tails,” and 33 “one heads and one tails.” Instead, you will find something much closer to: 25 “both heads,” 25 “both tails,” and 50 “one of each” results. Why?

Because hidden inside this experiment are actually *four different results*, each as likely as the others. These results are: heads-heads, heads-tails, tails-heads, and tails-tails. Even if you don’t keep track of what “order” the coins flipped in, heads-tails is still a different result from tails-heads, and each must be counted.
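The four equally likely results can be checked by listing them. The following is a minimal Python sketch added for illustration; the names `outcomes` and `tally` are chosen here and do not come from the original text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every ordered result of flipping two coins: HH, HT, TH, TT.
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# Group the ordered results by how many heads they contain.
tally = Counter(result.count("H") for result in outcomes)

# Each ordered result is equally likely, so divide by the total count.
for heads in (2, 1, 0):
    print(heads, "heads:", Fraction(tally[heads], len(outcomes)))
```

Two heads and two tails each come out to 1/4, while “one of each” comes out to 1/2, because it covers both heads-tails and tails-heads.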

And what if you flip a coin *three* times? In this case, there are actually eight results. In case this is getting hard to keep track of, here is a systematic way of listing all eight results.

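Table 1

| First Coin | Second Coin | Third Coin | End Result |
|---|---|---|---|
| Heads | Heads | Heads | HHH |
| Heads | Heads | Tails | HHT |
| Heads | Tails | Heads | HTH |
| Heads | Tails | Tails | HTT |
| Tails | Heads | Heads | THH |
| Tails | Heads | Tails | THT |
| Tails | Tails | Heads | TTH |
| Tails | Tails | Tails | TTT |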

When you make a table like this, the pattern becomes apparent: each new coin doubles the number of possibilities. The chance of three heads in a row is $\frac{1}{8}$. What would be the chance of four heads in a row?
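The systematic listing can also be generated rather than written out by hand. This is a rough sketch, assuming a fair coin; the helper names `coin_results` and `chance_all_heads` are hypothetical and introduced only for this example.

```python
from fractions import Fraction
from itertools import product


def coin_results(n):
    """Return every ordered result of flipping n coins, e.g. 'HHT'."""
    return ["".join(flips) for flips in product("HT", repeat=n)]


def chance_all_heads(n):
    """Chance of n heads in a row: matching results over all results."""
    results = coin_results(n)
    matching = [r for r in results if r == "H" * n]
    return Fraction(len(matching), len(results))


# Reproduces the End Result column of Table 1, in the same order.
print(coin_results(3))
# Each new coin doubles the number of results.
print([len(coin_results(n)) for n in range(1, 5)])
```

Calling `chance_all_heads(4)` is one way to check your answer to the question above.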

Let’s take a slightly more complicated, and more interesting, example. You are the proud inventor of the SongWriter 2000™.

The user sets the song speed (“fast,” “medium,” or “slow”); the volume (“loud” or “quiet”); and the style (“rock” or “country”). Then, the SongWriter automatically writes a song to match.

How many possible settings are there? You might suspect that the answer is $3+2+2=7$, but in fact there are many more than that. We can see them all on the following “tree diagram.”

If you start at the top of a tree like this and follow all the way down, you end up with one particular kind of song: for instance, “fast loud country song.” There are 12 different song types in all. This comes from *multiplying* the number of settings for each knob: $3 \times 2 \times 2 = 12$.
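The leaves of the tree can be listed with a short Python sketch. The variable names (`speeds`, `volumes`, `styles`, `songs`) are illustrative choices, not part of the original text.

```python
from itertools import product

# The three knobs and their settings, as described in the text.
speeds = ["fast", "medium", "slow"]
volumes = ["loud", "quiet"]
styles = ["rock", "country"]

# Each (speed, volume, style) combination is one leaf of the tree.
songs = list(product(speeds, volumes, styles))

for speed, volume, style in songs:
    print(speed, volume, style)

print(len(songs))  # 3 * 2 * 2 = 12
```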

Now, suppose the machine has a “Randomize” setting that randomly chooses the speed, volume, and style. What is the probability that you will end up with a loud rock song that is *not* slow? To answer a question like this, you can use the following process.

- Count the number of results (the “leaves” in the tree) that match your criterion. In this case there are 2: the “fast-loud-rock” and “medium-loud-rock” paths.
- Count the total number of results: as we said previously, there are 12.
- Divide. The probability of a non-slow loud rock song is 2/12, or 1/6.
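The three steps above can be sketched in Python as follows. This is one possible way to write it, assuming every leaf is equally likely; the helpers `probability` and `is_non_slow_loud_rock` are hypothetical names made up for this example.

```python
from fractions import Fraction
from itertools import product

speeds = ["fast", "medium", "slow"]
volumes = ["loud", "quiet"]
styles = ["rock", "country"]
songs = list(product(speeds, volumes, styles))


def probability(matches, results):
    """Matching results divided by total results (all equally likely)."""
    matching = [r for r in results if matches(r)]
    return Fraction(len(matching), len(results))


def is_non_slow_loud_rock(song):
    """True for the 'fast-loud-rock' and 'medium-loud-rock' leaves."""
    speed, volume, style = song
    return speed != "slow" and volume == "loud" and style == "rock"


print(probability(is_non_slow_loud_rock, songs))  # 2/12 reduces to 1/6
```

`Fraction` is used instead of floating-point division so that the result stays exact and is reduced automatically.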

Note that this process will always give you a number between 0 (no results match) and 1 (all results match). Probabilities are always between 0 (for something that never happens) and 1 (for something that is guaranteed to happen).

But what does it really mean to say that “the probability is 1/6”? You aren’t going to get 1/6 of a song. One way to make this result more concrete is to imagine that you run the machine on its “Randomize” setting 100 times. You should expect to get non-slow loud rock songs about 1 out of every 6 times; roughly 17 songs will match that description. This gives us another way to express the answer: there is roughly a 17% probability of any given song matching this description.
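The “run it 100 times” idea can be tried with a small simulation. This sketch assumes that “Randomize” picks each setting uniformly and independently, which is how the text describes the machine; the function names and the seed value are arbitrary choices for illustration.

```python
import random

speeds = ["fast", "medium", "slow"]
volumes = ["loud", "quiet"]
styles = ["rock", "country"]


def randomize(rng):
    """Pick each setting independently and uniformly at random (assumed)."""
    return rng.choice(speeds), rng.choice(volumes), rng.choice(styles)


def count_matches(trials, rng):
    """Count how many random songs are non-slow, loud, and rock."""
    hits = 0
    for _ in range(trials):
        speed, volume, style = randomize(rng)
        if speed != "slow" and volume == "loud" and style == "rock":
            hits += 1
    return hits


rng = random.Random(2000)  # fixed seed so a run can be repeated
print(count_matches(100, rng), "of 100 songs matched")
print(count_matches(60_000, rng) / 60_000)  # long-run fraction
```

With only 100 songs the count will usually land near 17 but rarely exactly on it; with many more songs the fraction tends to settle close to 1/6.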

We can look at the above problem another way.

What is the chance that any given, randomly selected song will be non-slow? $\frac{2}{3}$. That is to say, 2 out of every 3 randomly chosen songs will be non-slow.

Now...*out of those $\frac{2}{3}$*, how many will be loud? Half of them. The probability that a randomly selected song is both non-slow and loud is half of $\frac{2}{3}$, or $\frac{1}{2} \times \frac{2}{3}$, or $\frac{1}{3}$.

And now, *out of that $\frac{1}{3}$*, how many will be rock? Again, half of them: $\frac{1}{2} \times \frac{1}{3}$. This leads us back to the conclusion we came to earlier: 1/6 of randomly chosen songs will be non-slow, loud, rock songs. But it also gives us an example of a very general principle that is at the heart of all probability calculations:

*When two events are independent, the probability that they will both occur is the probability of one, multiplied by the probability of the other.*
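As a quick check of this rule on the SongWriter example, the sketch below compares the product of the three separate probabilities with a direct count of matching leaves. The helper `probability` and the variable names are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction
from itertools import product

speeds = ["fast", "medium", "slow"]
volumes = ["loud", "quiet"]
styles = ["rock", "country"]
songs = list(product(speeds, volumes, styles))


def probability(matches):
    """Fraction of the 12 equally likely songs that satisfy `matches`."""
    return Fraction(sum(1 for s in songs if matches(s)), len(songs))


p_non_slow = probability(lambda s: s[0] != "slow")  # 2/3
p_loud = probability(lambda s: s[1] == "loud")      # 1/2
p_rock = probability(lambda s: s[2] == "rock")      # 1/2

# Probability of all three at once, found by counting leaves directly.
p_all = probability(
    lambda s: s[0] != "slow" and s[1] == "loud" and s[2] == "rock"
)

# For these independent settings, the product matches the direct count.
print(p_non_slow * p_loud * p_rock == p_all)
```

The comparison holds here because the machine treats the three knobs independently; for events that influence each other, the simple product would not be expected to match.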

What does it mean to describe two events as “independent?” It means that they have no effect on each other. In real life, we know that rock songs are more likely to be fast and loud than slow and quiet. Our machine, however, keeps all three categories independent: choosing “Rock” does not make a song more likely to be fast or slow, loud or quiet.

In some cases, applying the multiplication rule is very straightforward. Suppose you generate two different songs: what is the chance that they will *both* be fast songs? The two songs are independent of each other, so the chance is $\frac{1}{3} \times \frac{1}{3} = \frac{1}{9}$.

Now, suppose you generate five different songs. What is the chance that they will *all* be fast? $\frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3}$, or $\left(\frac{1}{3}\right)^5$, or 1 chance in 243. Not very likely, as you might suspect!
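The repeated multiplication can be confirmed two ways in a few lines of Python: by raising 1/3 to the fifth power, and by listing every possible sequence of five speeds. This is only an illustrative sketch; the variable names are chosen here.

```python
from fractions import Fraction
from itertools import product

p_fast = Fraction(1, 3)  # one of three equally likely speeds

# Multiplication rule applied five times.
p_all_fast = p_fast ** 5
print(p_all_fast)

# Cross-check by listing the speeds of all five songs.
speeds = ["fast", "medium", "slow"]
runs = list(product(speeds, repeat=5))
all_fast = [run for run in runs if all(s == "fast" for s in run)]
print(len(all_fast), "of", len(runs))
```

Both approaches agree: exactly one of the 243 equally likely sequences consists of five fast songs.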

Other cases are less obvious. Suppose you generate five different songs. What is the probability that *none* of them will be a fast song? The multiplication rule tells us only how to find the probability of “this *and* that”; how can we apply it to this question?

The key is to reword the question, as follows. What is the chance that the first song will not be fast, *and* the second song will not be fast, *and* the third song will not be fast, and so on? Expressed in this way, the question is a perfect candidate for the multiplication rule. The probability of the first song being non-fast is $\frac{2}{3}$. Same for the second, and so on. So the probability is $\left(\frac{2}{3}\right)^5$, or 32/243, or roughly 13%.

Based on this, we can easily answer another question: if you generate five different songs, what is the probability that *at least one* of them will be fast? Once again, the multiplication rule does not apply directly here: it tells us “this and that,” not “this or that.” But we can recognize that this is the opposite of the previous question. We said that 13% of the time, none of the songs will be fast. That means that the *other* 87% of the time, at least one of them will!
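The “none” and “at least one” calculations fit together as shown in this short sketch. It uses exact fractions so the rounding to 13% and 87% can be seen separately from the arithmetic; the variable names are illustrative.

```python
from fractions import Fraction

p_not_fast = Fraction(2, 3)  # medium or slow

# "None of the five is fast" is "not fast" five times in a row.
p_none_fast = p_not_fast ** 5

# "At least one is fast" is everything else.
p_at_least_one_fast = 1 - p_none_fast

print(p_none_fast, float(p_none_fast))
print(p_at_least_one_fast, float(p_at_least_one_fast))
```

The exact values are 32/243 and 211/243, which always add up to 1.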
