The concept of independence for classes of events is developed in terms of a product rule. In this unit, we extend the concept to classes of random variables.
Summary: The concept of independence for classes of events is developed in terms of a product rule. Recall that for a real random variable X, the inverse image of each reasonable subset M of the real line (i.e., the set of all outcomes which are mapped into M by X) is an event. Similarly, the inverse image of N by random variable Y is an event. We extend the notion of independence to a pair of random variables by requiring independence of the events they determine in this fashion. This condition may be stated in terms of the product rule P(X in M, Y in N) = P(X in M)P(Y in N) for all Borel sets M, N. The product rule then holds for the distribution functions, F_XY(t,u) = F_X(t)F_Y(u) for all t, u, and similarly for density functions when they exist. This condition puts restrictions on the nature of the probability mass distribution on the plane: for a rectangle with sides M, N, the probability mass in M x N is P(X in M)P(Y in N). Extension to general classes is simple and immediate.
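Recall that for a random variable X, the inverse image $X^{-1}(M)$ (i.e., the set of all outcomes which are mapped into M by X) is an event for each reasonable subset M of the real line. Similarly, the inverse image $Y^{-1}(N)$ is an event determined by random variable Y for each reasonable set N. We extend the notion of independence to a pair of random variables by requiring independence of the events they determine.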
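Definition
A pair $\{X, Y\}$ of random variables is (stochastically) independent iff each pair of events $\{X^{-1}(M), Y^{-1}(N)\}$ is independent.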
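This condition may be stated in terms of the product rule
$P(X \in M, Y \in N) = P(X \in M)P(Y \in N)$ for all (Borel) sets $M, N$
Independence implies
$F_{XY}(t,u) = P(X \in (-\infty,t], Y \in (-\infty,u]) = P(X \in (-\infty,t])P(Y \in (-\infty,u]) = F_X(t)F_Y(u)$ for all $t, u$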
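Note that the product rule on the distribution function is equivalent to the condition that the product rule holds for the inverse images of a special class of sets $\{M, N\}$ of the form $M = (-\infty, t]$ and $N = (-\infty, u]$. A standard theorem from measure theory ensures that if the product rule holds for this special class, it holds for all Borel sets $M, N$. Thus we may assert:
The pair $\{X, Y\}$ is independent iff the product rule $F_{XY}(t,u) = F_X(t)F_Y(u)$ holds for all $t, u$.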
Suppose
so that the product rule
If there is a joint density function, then the relationship to the joint distribution function makes it clear that the pair is independent iff the product rule holds for the density. That is, the pair is independent iff $f_{XY}(t,u) = f_X(t)f_Y(u)$ for all $t, u$.
Suppose the joint probability mass distributions induced by the pair
Thus it follows that X is uniform on
It should be apparent that the independence condition puts restrictions on the character of the joint mass distribution on the plane. In order to describe this more succinctly, we employ the following terminology.
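Definition
If M is a subset of the horizontal axis and N is a subset of the vertical axis, then the Cartesian product $M \times N$ is the (generalized) rectangle consisting of those points $(t, u)$ on the plane such that $t \in M$ and $u \in N$.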
The rectangle in Example 2 is the Cartesian product
[Figure 1]
We restate the product rule for independence in terms of Cartesian product sets: $P(X \in M, Y \in N) = P((X, Y) \in M \times N) = P(X \in M)P(Y \in N)$.
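Reference to Figure 1 illustrates the basic pattern. If M, N are intervals on the horizontal and vertical axes, respectively, then the rectangle $M \times N$ is the intersection of the vertical strip meeting the horizontal axis in M with the horizontal strip meeting the vertical axis in N. The probability $P(X \in M)$ is the portion of the joint probability mass in the vertical strip, and $P(Y \in N)$ is the portion in the horizontal strip. For an independent pair, the mass in the rectangle is the product of these two marginal probabilities.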
This suggests a useful test for nonindependence which we call the rectangle test. We illustrate with a simple example.
[Figure 2]
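Suppose probability mass is uniformly distributed over the square with vertices at
(1,0), (2,1), (1,2), (0,1). It is evident from Figure 2 that a value of X determines
the possible values of Y and vice versa, so that we would not expect independence of
the pair. To establish this, consider a small rectangle $M \times N$ lying in a corner of the region $[0,2] \times [0,2]$ outside the square, for example $M = N = [0, 0.25]$. There is no probability mass in this rectangle, yet $P(X \in M) > 0$ and $P(Y \in N) > 0$, so that $P(X \in M)P(Y \in N) > 0$ while $P((X, Y) \in M \times N) = 0$. The product rule fails; hence the pair cannot be independent.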
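The rectangle test for this distribution can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the grid size `n` and the test rectangle $M \times N = [0, 0.25] \times [0, 0.25]$ are choices made here, not values taken from the text, and the uniform mass is approximated by equal point masses at cell midpoints.

```python
# Rectangle test, sketched numerically for mass spread uniformly over the
# square with vertices (1,0), (2,1), (1,2), (0,1), i.e. |t-1| + |u-1| <= 1.
# ASSUMPTIONS (not from the text): the grid size n and the test rectangle
# M x N = [0, 0.25] x [0, 0.25].

def in_square(t, u):
    """True when (t, u) lies in the tilted square."""
    return abs(t - 1.0) + abs(u - 1.0) <= 1.0

n = 200                                   # cells per axis over [0, 2] x [0, 2]
h = 2.0 / n                               # cell width
mid = [(k + 0.5) * h for k in range(n)]   # cell midpoints

# Put equal mass on every cell whose midpoint falls inside the square.
cells = [(t, u) for t in mid for u in mid if in_square(t, u)]
mass = 1.0 / len(cells)

def prob(pred):
    """Approximate probability that (X, Y) satisfies pred(t, u)."""
    return mass * sum(1 for (t, u) in cells if pred(t, u))

def in_M(t):
    return 0.0 <= t <= 0.25

def in_N(u):
    return 0.0 <= u <= 0.25

p_M = prob(lambda t, u: in_M(t))                 # P(X in M)
p_N = prob(lambda t, u: in_N(u))                 # P(Y in N)
p_MN = prob(lambda t, u: in_M(t) and in_N(u))    # P((X, Y) in M x N)

print("P(X in M)           =", round(p_M, 4))
print("P(Y in N)           =", round(p_N, 4))
print("P((X,Y) in M x N)   =", round(p_MN, 4))
print("P(X in M) P(Y in N) =", round(p_M * p_N, 6))
```

Both marginal probabilities come out positive while the rectangle itself carries no mass, which is exactly the failure of the product rule that the test looks for.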
Remark. There are nonindependent cases for which this test does not work. And it does not provide a test for independence. In spite of these limitations, it is frequently useful. Because of the information contained in the independence condition, in many cases the complete joint and marginal distributions may be obtained with appropriate partial information. The following is a simple example.
Suppose the pair
These values are shown in bold type in Figure 3. A combination of the product rule
and the fact that the total probability mass is one is used to calculate each of
the marginal and joint probabilities. For example
[Figure 3]
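A pair $\{X, Y\}$ has the joint normal distribution iff the joint density is
$$f_{XY}(t,u) = \frac{1}{2\pi\sigma_X\sigma_Y(1-\rho^2)^{1/2}}\, e^{-Q(t,u)/2}$$
where
$$Q(t,u) = \frac{1}{1-\rho^2}\left[\left(\frac{t-\mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{t-\mu_X}{\sigma_X}\right)\left(\frac{u-\mu_Y}{\sigma_Y}\right) + \left(\frac{u-\mu_Y}{\sigma_Y}\right)^2\right]$$
The marginal densities are obtained with the aid of some algebraic tricks to integrate
the joint density. The result is that $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$. The joint density factors as $f_{XY}(t,u) = f_X(t)f_Y(u)$ exactly when the parameter $\rho$ is zero,
so that the pair is independent iff $\rho = 0$.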
Remark. While it is true that every independent pair of normally distributed random variables is joint normal, not every pair of normally distributed random variables has the joint normal distribution.
We start with the distribution for a joint normal pair and derive a joint distribution for a normal pair which is not joint normal. The function
is the joint normal density for an independent pair (
Both
Since independence of random variables is independence of the events determined by the random variables, extension to general classes is simple and immediate.
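Definition
A class $\{X_i : i \in J\}$ of random variables is (stochastically) independent iff the product rule holds for every finite subclass of two or more.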
Remark. The index set J in the definition may be finite or infinite.
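For a finite class $\{X_i : 1 \le i \le n\}$, independence is equivalent to the product rule $F_{X_1 X_2 \cdots X_n}(t_1, t_2, \ldots, t_n) = \prod_{i=1}^{n} F_{X_i}(t_i)$ for all $(t_1, t_2, \ldots, t_n)$.
Since we may obtain the joint distribution function for any finite subclass by letting the arguments for the others be ∞ (i.e., by taking the limits as the appropriate $t_i$ increase without bound), the single product rule suffices to account for all finite subclasses.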
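As a worked instance of this limiting argument, consider a class of three random variables, chosen here purely for illustration. Assuming the product rule holds for the joint distribution function of all three, letting $t_3$ increase without bound recovers the product rule for the subclass $\{X_1, X_2\}$, since $F_{X_3}(t_3) \to 1$:

```latex
\begin{aligned}
F_{X_1 X_2}(t_1, t_2)
  &= \lim_{t_3 \to \infty} F_{X_1 X_2 X_3}(t_1, t_2, t_3) \\
  &= F_{X_1}(t_1)\, F_{X_2}(t_2) \lim_{t_3 \to \infty} F_{X_3}(t_3) \\
  &= F_{X_1}(t_1)\, F_{X_2}(t_2)
\end{aligned}
```

The same step, applied to any set of arguments, handles every other finite subclass.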
Absolutely continuous random variables
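If a class $\{X_i : i \in J\}$ is independent and the individual variables are absolutely continuous (i.e., have densities), then any finite subclass is jointly absolutely continuous and the product rule holds for the densities of such subclasses.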
Similarly, if each finite subclass is jointly absolutely continuous, then each individual variable is absolutely continuous and the product rule holds for the densities.
Frequently we deal with independent classes in which each random variable has the same marginal distribution. Such classes are referred to as iid classes (an acronym for independent, identically distributed). Examples are simple random samples from a given population, or the results of repetitive trials with the same distribution on the outcome of each component trial. A Bernoulli sequence is a simple example.
Consider a pair
Since
According to the rectangle test, no gridpoint having one of the $t_i$
or $u_j$ as a coordinate has zero probability mass. The marginal distributions
determine the joint distribution. If X has n distinct values and Y has m distinct
values, then the $n + m$ marginal probabilities suffice to determine the $m \cdot n$ joint probabilities.
Suppose X and Y are in affine form. That is,
Since
Calculations in the joint simple case are readily handled by appropriate m-functions and m-procedures.
MATLAB and independent simple random variables
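In the general case of pairs of joint simple random variables we have the m-procedure jcalc,
which uses information in matrices X, Y, and P to determine the marginal probabilities and the calculation matrices t and u. In the independent case, only the marginal distributions in X, PX, Y, and PY are needed.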
Once we have both marginal distributions, we use an m-procedure we call icalc. Formation of the joint probability matrix is simply a matter of determining all the joint probabilities $p(i,j) = P(X = t_i, Y = u_j) = P(X = t_i)P(Y = u_j)$.
Once these are calculated, formation of the calculation matrices t and u is achieved exactly as in jcalc.
X = [-4 -2 0 1 3];
Y = [0 1 2 4];
PX = 0.01*[12 18 27 19 24];
PY = 0.01*[15 43 31 11];
icalc
Enter row matrix of X-values X
Enter row matrix of Y-values Y
Enter X probabilities PX
Enter Y probabilities PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
disp(P) % Optional display of the joint matrix
0.0132 0.0198 0.0297 0.0209 0.0264
0.0372 0.0558 0.0837 0.0589 0.0744
0.0516 0.0774 0.1161 0.0817 0.1032
0.0180 0.0270 0.0405 0.0285 0.0360
disp(t) % Calculation matrix t
-4 -2 0 1 3
-4 -2 0 1 3
-4 -2 0 1 3
-4 -2 0 1 3
disp(u) % Calculation matrix u
4 4 4 4 4
2 2 2 2 2
1 1 1 1 1
0 0 0 0 0
M = (t>=-3)&(t<=2); % M = [-3, 2]
PM = total(M.*P) % P(X in M)
PM = 0.6400
N = (u>0)&(u.^2<=15); % N = {u: u > 0, u^2 <= 15}
PN = total(N.*P) % P(Y in N)
PN = 0.7400
Q = M&N; % Rectangle MxN
PQ = total(Q.*P) % P((X,Y) in MxN)
PQ = 0.4736
p = PM*PN
p = 0.4736 % P((X,Y) in MxN) = P(X in M)P(Y in N)
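For readers without the MATLAB toolbox, the setup performed in the session above can be sketched in plain Python. This is an illustrative reconstruction based only on the displayed matrices, not the icalc source: the helper `total` and the row ordering (largest Y value in the top row) are assumptions inferred from the output shown.

```python
# Sketch of what icalc sets up, in plain Python (no MATLAB toolbox needed).
X = [-4, -2, 0, 1, 3]
Y = [0, 1, 2, 4]
PX = [0.12, 0.18, 0.27, 0.19, 0.24]
PY = [0.15, 0.43, 0.31, 0.11]

# Rows run from the largest Y value down to the smallest, matching the
# displayed matrices; columns follow X in increasing order.
order = sorted(range(len(Y)), key=lambda j: Y[j], reverse=True)
P = [[PX[i] * PY[j] for i in range(len(X))] for j in order]  # joint probabilities
t = [list(X) for _ in order]                                 # X value in each cell
u = [[Y[j]] * len(X) for j in order]                         # Y value in each cell

def total(pred):
    """Sum the joint probabilities over cells where pred(t, u) holds."""
    return sum(P[r][c]
               for r in range(len(Y)) for c in range(len(X))
               if pred(t[r][c], u[r][c]))

def in_M(tv):          # M = [-3, 2]
    return -3 <= tv <= 2

def in_N(uv):          # N = {u: u > 0, u^2 <= 15}
    return uv > 0 and uv ** 2 <= 15

PM = total(lambda tv, uv: in_M(tv))                 # P(X in M)
PN = total(lambda tv, uv: in_N(uv))                 # P(Y in N)
PQ = total(lambda tv, uv: in_M(tv) and in_N(uv))    # P((X,Y) in M x N)
print(round(PM, 4), round(PN, 4), round(PQ, 4), round(PM * PN, 4))
```

Because each joint probability is a product of marginals, the mass in the rectangle equals the product of the two strip probabilities, as in the session.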
As an example, consider again the problem of joint Bernoulli trials described in the treatment of Composite trials.
Bill and Mary take ten basketball free throws each. We assume the two sequences of trials are independent of each other, and each is a Bernoulli sequence.
Mary: Has probability 0.80 of success on each trial.
Bill: Has probability 0.85 of success on each trial.
What is the probability Mary makes more free throws than Bill?
SOLUTION
Let X be the number of goals that Mary makes and Y be the number that Bill makes. Then the pair $\{X, Y\}$ is independent, with X ~ binomial (10, 0.80) and Y ~ binomial (10, 0.85).
X = 0:10;
Y = 0:10;
PX = ibinom(10,0.8,X);
PY = ibinom(10,0.85,Y);
icalc
Enter row matrix of X-values X % Could enter 0:10
Enter row matrix of Y-values Y % Could enter 0:10
Enter X probabilities PX % Could enter ibinom(10,0.8,X)
Enter Y probabilities PY % Could enter ibinom(10,0.85,Y)
Use array operations on matrices X, Y, PX, PY, t, u, and P
PM = total((t>u).*P)
PM = 0.2738 % Agrees with solution in Example 9 from "Composite Trials".
Pe = total((u==t).*P) % Additional information is more easily
Pe = 0.2276 % obtained than in the event formulation
Pm = total((t>=u).*P) % of Example 9 from "Composite Trials".
Pm = 0.5014
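The same three probabilities can be checked without the toolbox. The sketch below is a plain Python stand-in, assuming only that the two counts are independent binomials; `binom_pmf` is a helper written here for illustration and is not the m-function ibinom.

```python
from math import comb

def binom_pmf(n, p):
    """Probabilities P(S = k), k = 0..n, for S ~ binomial(n, p)."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

PX = binom_pmf(10, 0.80)   # Mary: X = number of successes
PY = binom_pmf(10, 0.85)   # Bill: Y = number of successes

# Independence: each joint probability is the product of the marginals.
joint = {(x, y): PX[x] * PY[y] for x in range(11) for y in range(11)}

p_more = sum(p for (x, y), p in joint.items() if x > y)    # P(X > Y)
p_equal = sum(p for (x, y), p in joint.items() if x == y)  # P(X = Y)
p_at_least = p_more + p_equal                              # P(X >= Y)

print(round(p_more, 4), round(p_equal, 4), round(p_at_least, 4))
```

The dictionary of joint probabilities plays the role of the matrix P, and each sum over a condition on (x, y) corresponds to one `total(...)` call in the session.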
Twelve world class sprinters in a meet are running in two heats of six persons each. Each runner has a reasonable chance of breaking the track record. We suppose results for individuals are independent.
First heat probabilities: 0.61 0.73 0.55 0.81 0.66 0.43
Second heat probabilities: 0.75 0.48 0.62 0.58 0.77 0.51
Compare the two heats for numbers who break the track record.
SOLUTION
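Let X be the number of successes in the first heat and Y be the number
who are successful in the second heat. Then the pair $\{X, Y\}$ is independent. We use the m-function canonicf to obtain the distributions for X and for Y, then icalc to obtain the joint distribution.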
c1 = [ones(1,6) 0];
c2 = [ones(1,6) 0];
P1 = [0.61 0.73 0.55 0.81 0.66 0.43];
P2 = [0.75 0.48 0.62 0.58 0.77 0.51];
[X,PX] = canonicf(c1,minprob(P1));
[Y,PY] = canonicf(c2,minprob(P2));
icalc
Enter row matrix of X-values X
Enter row matrix of Y-values Y
Enter X probabilities PX
Enter Y probabilities PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
Pm1 = total((t>u).*P) % Prob first heat has most
Pm1 = 0.3986
Pm2 = total((u>t).*P) % Prob second heat has most
Pm2 = 0.3606
Peq = total((t==u).*P) % Prob both have the same
Peq = 0.2408
Px3 = (X>=3)*PX' % Prob first has 3 or more
Px3 = 0.8708
Py3 = (Y>=3)*PY' % Prob second has 3 or more
Py3 = 0.8525
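The distribution of a count of independent successes with unequal probabilities can also be built by repeated convolution. The Python sketch below takes that route instead of the minterm approach used by canonicf and minprob; it is an alternative technique swapped in for illustration, and `count_distribution` is a hypothetical helper, not part of the toolbox.

```python
def count_distribution(probs):
    """Distribution of the number of successes among independent trials
    with success probabilities probs; returns [P(S=0), ..., P(S=n)]."""
    dist = [1.0]                       # zero trials: S = 0 with probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # this trial fails
            new[k + 1] += q * p        # this trial succeeds
        dist = new
    return dist

P1 = [0.61, 0.73, 0.55, 0.81, 0.66, 0.43]   # first heat
P2 = [0.75, 0.48, 0.62, 0.58, 0.77, 0.51]   # second heat
PX = count_distribution(P1)                 # X = successes in first heat
PY = count_distribution(P2)                 # Y = successes in second heat

# The heats are independent, so joint probabilities are products.
pairs = [(x, y, PX[x] * PY[y]) for x in range(7) for y in range(7)]
Pm1 = sum(p for x, y, p in pairs if x > y)    # first heat has more
Pm2 = sum(p for x, y, p in pairs if y > x)    # second heat has more
Peq = sum(p for x, y, p in pairs if x == y)   # both the same
Px3 = sum(PX[3:])                             # first heat has 3 or more

print(round(Pm1, 4), round(Pm2, 4), round(Peq, 4), round(Px3, 4))
```

The printed values can be compared with those in the session above.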
As in the case of jcalc, we have an m-function version icalcf
We have a related m-function idbn for obtaining the joint probability matrix from the marginal probabilities. Its formation of the joint matrix utilizes the same operations as icalc.
PX = 0.1*[3 5 2];
PY = 0.01*[20 15 40 25];
P = idbn(PX,PY)
P =
0.0750 0.1250 0.0500
0.1200 0.2000 0.0800
0.0450 0.0750 0.0300
0.0600 0.1000 0.0400
An m-procedure itest checks a joint distribution for independence. It does this by calculating the marginals, then forming an independent joint test matrix, which is compared with the original. We do not ordinarily exhibit the matrix P to be tested. However, this is a case in which the product rule holds for most of the minterms, and it would be very difficult to pick out those for which it fails. The m-procedure simply checks all of them.
idemo1 % Joint matrix in datafile idemo1
P = 0.0091 0.0147 0.0035 0.0049 0.0105 0.0161 0.0112
0.0117 0.0189 0.0045 0.0063 0.0135 0.0207 0.0144
0.0104 0.0168 0.0040 0.0056 0.0120 0.0184 0.0128
0.0169 0.0273 0.0065 0.0091 0.0095 0.0299 0.0208
0.0052 0.0084 0.0020 0.0028 0.0060 0.0092 0.0064
0.0169 0.0273 0.0065 0.0091 0.0195 0.0299 0.0208
0.0104 0.0168 0.0040 0.0056 0.0120 0.0184 0.0128
0.0078 0.0126 0.0030 0.0042 0.0190 0.0138 0.0096
0.0117 0.0189 0.0045 0.0063 0.0135 0.0207 0.0144
0.0091 0.0147 0.0035 0.0049 0.0105 0.0161 0.0112
0.0065 0.0105 0.0025 0.0035 0.0075 0.0115 0.0080
0.0143 0.0231 0.0055 0.0077 0.0165 0.0253 0.0176
itest
Enter matrix of joint probabilities P
The pair {X,Y} is NOT independent % Result of test
To see where the product rule fails, call for D
disp(D) % Optional call for D
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
1 1 1 1 1 1 1
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
1 1 1 1 1 1 1
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
Next, we consider an example in which the pair is known to be independent.
jdemo3 % call for data in m-file
disp(P) % call to display P
0.0132 0.0198 0.0297 0.0209 0.0264
0.0372 0.0558 0.0837 0.0589 0.0744
0.0516 0.0774 0.1161 0.0817 0.1032
0.0180 0.0270 0.0405 0.0285 0.0360
itest
Enter matrix of joint probabilities P
The pair {X,Y} is independent % Result of test
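The check described for itest (compute the marginals, form the product matrix, compare entry by entry) can be sketched in Python as follows. This is a minimal illustration rather than the toolbox code: the function names and the comparison tolerance `tol` are assumptions, and the nonindependent matrix is made up here by moving a small amount of mass between two cells of the independent one.

```python
def marginals(P):
    """Column sums give the X marginal, row sums give the Y marginal."""
    px = [sum(row[i] for row in P) for i in range(len(P[0]))]
    py = [sum(row) for row in P]
    return px, py

def independence_check(P, tol=1e-9):
    """Return (is_independent, D), where D marks with 1 each cell at which
    the product rule fails by more than tol (tol is an assumed value)."""
    px, py = marginals(P)
    D = [[int(abs(P[j][i] - px[i] * py[j]) > tol) for i in range(len(px))]
         for j in range(len(py))]
    return not any(any(row) for row in D), D

# Joint matrix displayed in the independent example above.
P_ind = [[0.0132, 0.0198, 0.0297, 0.0209, 0.0264],
         [0.0372, 0.0558, 0.0837, 0.0589, 0.0744],
         [0.0516, 0.0774, 0.1161, 0.0817, 0.1032],
         [0.0180, 0.0270, 0.0405, 0.0285, 0.0360]]

# Hypothetical perturbation: shift 0.01 of mass within the first column.
P_dep = [row[:] for row in P_ind]
P_dep[0][0] -= 0.01
P_dep[2][0] += 0.01

ok_ind, D_ind = independence_check(P_ind)
ok_dep, D_dep = independence_check(P_dep)
print(ok_ind, ok_dep)
for row in D_dep:
    print(row)
```

Moving mass within one column leaves the column sums unchanged but alters two row sums, so every cell in those two rows is flagged. This is the same pattern of full rows of ones seen in the matrix D for the idemo1 data.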
The procedure icalc can be extended to deal with an independent class of three random variables. We call the m-procedure icalc3. The following is a simple example of its use.
X = 0:4;
Y = 1:2:7;
Z = 0:3:12;
PX = 0.1*[1 3 2 3 1];
PY = 0.1*[2 2 3 3];
PZ = 0.1*[2 2 1 3 2];
icalc3
Enter row matrix of X-values X
Enter row matrix of Y-values Y
Enter row matrix of Z-values Z
Enter X probabilities PX
Enter Y probabilities PY
Enter Z probabilities PZ
Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
G = 3*t + 2*u - 4*v; % W = 3X + 2Y -4Z
[W,PW] = csort(G,P); % Distribution for W
PG = total((G>0).*P) % P(g(X,Y,Z) > 0)
PG = 0.3370
Pg = (W>0)*PW' % P(W > 0)
Pg = 0.3370
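The three-variable calculation can be reproduced in a few lines of Python. In this sketch the dictionary accumulation stands in for the role csort plays in the session (collecting the probability attached to each distinct value of W); it is an illustration under that assumption, not the toolbox implementation.

```python
from collections import defaultdict
from itertools import product

X, PX = [0, 1, 2, 3, 4], [0.1, 0.3, 0.2, 0.3, 0.1]
Y, PY = [1, 3, 5, 7], [0.2, 0.2, 0.3, 0.3]
Z, PZ = [0, 3, 6, 9, 12], [0.2, 0.2, 0.1, 0.3, 0.2]

# Independence of {X, Y, Z}: P(X=x, Y=y, Z=z) = P(X=x) P(Y=y) P(Z=z).
dist = defaultdict(float)
for (x, px), (y, py), (z, pz) in product(zip(X, PX), zip(Y, PY), zip(Z, PZ)):
    w = 3 * x + 2 * y - 4 * z          # W = 3X + 2Y - 4Z
    dist[w] += px * py * pz            # accumulate mass on each value of W

W = sorted(dist)                       # distinct values of W
PW = [dist[w] for w in W]              # their probabilities
PG = sum(p for w, p in zip(W, PW) if w > 0)   # P(W > 0)
print(round(PG, 4))
```

Summing over the joint distribution and summing over the distribution of W give the same probability, which is the point of the two calculations PG and Pg in the session.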
An m-procedure icalc4 to handle an independent class of four variables is also available. Also several variations of the m-function mgsum and the m-function diidsum are used for obtaining distributions for sums of independent random variables. We consider them in various contexts in other units.
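In the study of functions of random variables, we show that an approximating simple
random variable $X_s$ of the type we use is a function of the random variable
X which is approximated. Also, we show that if $\{X, Y\}$ is an independent pair, so is $\{g(X), h(Y)\}$ for any reasonable functions g and h. Thus if $\{X, Y\}$ is an independent pair, so is any pair of approximating simple random variables $\{X_s, Y_s\}$ of the type considered.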
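Suppose X ~ exponential (3) and Y ~ exponential (2), with joint density $f_{XY}(t,u) = 6e^{-3t}e^{-2u} = 6e^{-(3t+2u)}$ for $t \ge 0$, $u \ge 0$.
Since $e^{-12} \approx 6 \times 10^{-6}$, we approximate X for values up to 4 and Y for values up to 6.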
tuappr
Enter matrix [a b] of X-range endpoints [0 4]
Enter matrix [c d] of Y-range endpoints [0 6]
Enter number of X approximation points 200
Enter number of Y approximation points 300
Enter expression for joint density 6*exp(-(3*t + 2*u))
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities P
The pair {X,Y} is independent
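The pair $\{X, Y\}$ has joint density $f_{XY}(t,u) = 4tu$ for $0 \le t \le 1$, $0 \le u \le 1$. The marginals are easily determined in this case: $f_X(t) = 4t\int_0^1 u\,du = 2t$ for $0 \le t \le 1$, and by symmetry $f_Y(u) = 2u$ for $0 \le u \le 1$,
so that $f_{XY}(t,u) = f_X(t)f_Y(u)$, which ensures the pair is independent.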
tuappr
Enter matrix [a b] of X-range endpoints [0 1]
Enter matrix [c d] of Y-range endpoints [0 1]
Enter number of X approximation points 100
Enter number of Y approximation points 100
Enter expression for joint density 4*t.*u
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities P
The pair {X,Y} is independent
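The approximation step followed by the independence check can be sketched in Python. The fragment below is an illustration under stated assumptions, not the tuappr source: it assumes the approximating points are cell midpoints and that the matrix is normalized to total mass one, and the second density, $t + u$, is a made-up comparison case that does not factor.

```python
def approximate_joint(density, a, b, c, d, n, m):
    """Discretize a joint density on [a,b] x [c,d] into an m-by-n matrix of
    probabilities at cell midpoints (midpoints are an assumption here)."""
    dt, du = (b - a) / n, (d - c) / m
    X = [a + (i + 0.5) * dt for i in range(n)]
    Y = [c + (j + 0.5) * du for j in range(m)]
    P = [[density(t, u) * dt * du for t in X] for u in Y]
    s = sum(sum(row) for row in P)
    return X, Y, [[p / s for p in row] for row in P]   # normalize to mass 1

def max_product_rule_gap(P):
    """Largest |P(i,j) - PX(i) PY(j)| over the grid."""
    px = [sum(row[i] for row in P) for i in range(len(P[0]))]
    py = [sum(row) for row in P]
    return max(abs(P[j][i] - px[i] * py[j])
               for j in range(len(py)) for i in range(len(px)))

# Density from the example: f(t,u) = 4tu on the unit square.
_, _, P_prod = approximate_joint(lambda t, u: 4 * t * u, 0, 1, 0, 1, 100, 100)
# Hypothetical comparison density that does not factor: f(t,u) = t + u.
_, _, P_sum = approximate_joint(lambda t, u: t + u, 0, 1, 0, 1, 100, 100)

gap_prod = max_product_rule_gap(P_prod)
gap_sum = max_product_rule_gap(P_sum)
print(gap_prod, gap_sum)
```

A density of product form yields a discretized matrix that satisfies the product rule up to rounding error, while the non-factoring density leaves a gap far above any reasonable tolerance.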