We consider in this chapter real random variables (i.e., real-valued random variables).
In the chapter "Random Vectors and Joint Distributions", we extend the notion to vector-valued random quantities. The fundamental idea of
a real random variable is the assignment of a real number to
each elementary outcome ω in the basic space Ω. Such an assignment
amounts to determining a function X, whose domain is Ω and whose range is
a subset of the real line R. Recall that a real-valued function on a domain (say an
interval I on the real line) is characterized by the assignment of a real number y to
each element x (argument) in the domain. For a real-valued function of a real variable,
it is often possible to write a formula or otherwise state a rule describing the assignment
of the value to each argument. Except in special cases, we cannot write a formula for a random
variable X. However, random variables share some important general properties of
functions which play an essential role in determining their usefulness.
Mappings and inverse mappings
There are various ways of characterizing a function. Probably the most useful for our
purposes is as a mapping from the domain Ω to the codomain R. We find the mapping
diagram of Figure 1 extremely useful in visualizing the essential patterns. Random
variable X, as a mapping from basic space Ω to the real line R, assigns to
each element ω a value t = X(ω). The object point ω is mapped, or carried,
into the image point t. Each ω is mapped into exactly one t, although
several ω may have the same image point.
Associated with a function X as a mapping are the inverse mapping $X^{-1}$ and the
inverse images it produces. Let M be a set of numbers on the real line. By the
inverse image of M under the mapping X, we mean the set of all those ω ∈ Ω which are mapped into M by X (see Figure 2). If X does not take a value
in M, the inverse image is the empty set (impossible event). If M includes the range of
X (the set of all possible values of X), the inverse image is the entire basic space Ω.
Formally we write
\[ X^{-1}(M) = \{\omega : X(\omega) \in M\} \tag{1} \]
Now we assume the set $X^{-1}(M)$, a subset of Ω, is an event for each M. A
detailed examination of that assertion is a topic in measure theory. Fortunately,
the results of measure theory ensure that we may make the assumption for any X and any
subset M of the real line likely to be encountered in practice. The set $X^{-1}(M)$ is
the event that X takes a value in M. As an event, it may be assigned a probability.
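The definition of the inverse image can be made concrete on a small finite basic space. The following is a minimal sketch only: the six-point space, the particular mapping X, and the helper names inverse_image and prob are hypothetical choices for illustration, and the equally likely outcomes are an assumption that the general theory does not require.

```python
from fractions import Fraction

# Hypothetical basic space: the six faces of a fair die (an assumption).
omega = {1, 2, 3, 4, 5, 6}

def X(w):
    """Illustrative random variable: remainder of the face value on division by 3."""
    return w % 3

def inverse_image(X, M, omega):
    """Return X^{-1}(M) = {w in omega : X(w) in M}."""
    return {w for w in omega if X(w) in M}

def prob(event, omega):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event), len(omega))

# The event that X takes the value 0, and its probability.
E = inverse_image(X, {0}, omega)
print(E, prob(E, omega))

# X takes no value in {5, 7}, so the inverse image is the empty set.
print(inverse_image(X, {5, 7}, omega))

# M includes the range {0, 1, 2} of X, so the inverse image is all of omega.
print(inverse_image(X, {0, 1, 2, 9}, omega) == omega)
```

Several outcomes share an image point here (faces 3 and 6 both map to 0), which is why the inverse image of a single value can contain more than one ω.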
- $X = I_E$, where E is an event with probability p. Now X takes on only two
values, 0 and 1. The event that X takes on the value 1 is the set
\[ \{\omega : X(\omega) = 1\} = X^{-1}(\{1\}) = E \tag{2} \]
so that $P(\{\omega : X(\omega) = 1\}) = p$. This rather ungainly notation is shortened to
$P(X = 1) = p$. Similarly, $P(X = 0) = 1 - p$. Consider any set M.
If neither 1 nor 0 is in M, then $X^{-1}(M) = \emptyset$.
If 0 is in M, but 1 is not, then $X^{-1}(M) = E^c$.
If 1 is in M, but 0 is not, then $X^{-1}(M) = E$.
If both 1 and 0 are in M, then $X^{-1}(M) = \Omega$.
In this case the class of all events $X^{-1}(M)$ consists of event E,
its complement $E^c$, the impossible event ∅, and the sure event Ω.
- Consider a sequence of n Bernoulli trials, with probability p of success. Let $S_n$
be the random variable whose value is the number of successes in the sequence of n
component trials. Then, according to the analysis in the section "Bernoulli Trials and the Binomial Distribution",
\[ P(S_n = k) = C(n,k)\, p^k (1-p)^{n-k}, \quad 0 \le k \le n \tag{3} \]
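The binomial formula in (3) is easy to evaluate numerically. The sketch below assumes Python 3.8 or later (for math.comb); the function name binomial_pmf and the values n = 4, p = 0.5 are illustrative choices, not taken from the text.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(S_n = k) = C(n, k) p^k (1 - p)^(n - k) for 0 <= k <= n, else 0."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.5  # illustrative values (an assumption, not from the text)
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]

print(pmf)       # probabilities of 0, 1, ..., n successes
print(sum(pmf))  # the events {S_n = k} exhaust the possibilities, so this is 1
```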
Before considering further examples, we note a general property of inverse images. We state it
in terms of a random variable, which maps Ω to the real line (see Figure 3).
Preservation of set operations
Let X be a mapping from Ω to the real line R. If $M, M_i, i \in J$, are sets of real numbers, with respective inverse images $E, E_i$, then
\[ X^{-1}(M^c) = E^c, \quad X^{-1}\Bigl(\bigcup_{i \in J} M_i\Bigr) = \bigcup_{i \in J} E_i \quad \text{and} \quad X^{-1}\Bigl(\bigcap_{i \in J} M_i\Bigr) = \bigcap_{i \in J} E_i \tag{4} \]
Examination of simple graphical examples exhibits the plausibility of these patterns. Formal
proofs amount to careful reading of the notation. Central to the structure are the facts that
each element ω is mapped into only one image point t and that the inverse image of M is the set
of all those ω which are mapped into image points in M.
An easy, but important, consequence of the general patterns is that the inverse images of disjoint sets M, N
are also disjoint. This implies that the inverse image of a disjoint union of the $M_i$ is a disjoint
union of the separate inverse images.
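The preservation of set operations can be checked mechanically on a finite example. This is a sketch under stated assumptions: the twelve-point basic space, the mapping X, and the sets M1, M2, N1, N2 are hypothetical, and complements of sets of values are taken within the range of X, which is enough because X takes no other values.

```python
# Hypothetical finite basic space and mapping, chosen only for illustration.
omega = set(range(12))

def X(w):
    return w % 4  # takes exactly the values 0, 1, 2, 3

value_range = {0, 1, 2, 3}  # complements of M are taken within this set

def inverse_image(M):
    """Return X^{-1}(M) = {w in omega : X(w) in M}."""
    return {w for w in omega if X(w) in M}

M1, M2 = {0, 1}, {1, 2}
E1, E2 = inverse_image(M1), inverse_image(M2)

# Complement: the inverse image of M^c is E^c.
assert inverse_image(value_range - M1) == omega - E1

# Union and intersection carry over to the inverse images.
assert inverse_image(M1 | M2) == E1 | E2
assert inverse_image(M1 & M2) == E1 & E2

# Disjoint sets of values have disjoint inverse images.
N1, N2 = {0}, {3}
assert inverse_image(N1).isdisjoint(inverse_image(N2))

print("all set-operation checks passed")
```

A check of this kind is not a proof, but it mirrors the argument in the text: each ω has exactly one image point, so it lands in exactly the inverse images of the sets that contain that point.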
Consider, again, the random variable $S_n$ which counts the number of successes
in a sequence of n Bernoulli trials. Let n = 10 and p = 0.33. Suppose we want to
determine the probability $P(2 < S_{10} \le 8)$.
Let $A_k = \{\omega : S_{10}(\omega) = k\}$, which
we usually shorten to $A_k = \{S_{10} = k\}$. Now the $A_k$, $0 \le k \le 10$, form a partition, since every ω belongs to some $A_k$ and
we cannot have $\omega \in A_k$ and $\omega \in A_j$, $j \ne k$ (i.e., for any ω, we cannot have
two values for $S_{10}(\omega)$). Now,
\[ \{2 < S_{10} \le 8\} = A_3 \bigvee A_4 \bigvee A_5 \bigvee A_6 \bigvee A_7 \bigvee A_8 \tag{5} \]
since $S_{10}$ takes on a value greater than 2 but no greater than 8 iff it takes one
of the integer values from 3 to 8. By the additivity of probability,
\[ P(2 < S_{10} \le 8) = \sum_{k=3}^{8} P(S_{10} = k) = 0.6927 \tag{6} \]
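The same probability can be obtained directly from the mapping point of view, by listing the elementary outcomes and collecting those that the random variable carries into the interval (2, 8]. The sketch below is one possible model, not the only one: it assumes each ω is represented as a tuple of ten 0s and 1s (1 for success) with independent trials, and the names S, P_elementary, and event are illustrative.

```python
from itertools import product

n, p = 10, 0.33  # values from the example in the text

def S(w):
    """Number of successes in the outcome w, a tuple of 0s and 1s."""
    return sum(w)

def P_elementary(w):
    """Probability of one elementary outcome, assuming independent trials."""
    k = S(w)
    return p**k * (1 - p) ** (n - k)

# Basic space: all 2**10 = 1024 sequences of failures (0) and successes (1).
omega = list(product((0, 1), repeat=n))

# Inverse image of M = (2, 8] under S, i.e. the event {2 < S_10 <= 8}.
event = [w for w in omega if 2 < S(w) <= 8]

# Additivity over the disjoint elementary outcomes in the event.
prob = sum(P_elementary(w) for w in event)
print(round(prob, 4))
```

Up to rounding, the result should agree with the value 0.6927 obtained from the binomial formula. Enumeration is practical only for small n, since the basic space doubles in size with each added trial.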