We examine, first, calculations on a pair of simple random variables $X, Y$, considered jointly.
These are, in effect, two components of a random vector $W = (X, Y)$, which maps from
the basic space Ω to the plane. The induced distribution is on the
$(t, u)$-plane. Values on the horizontal axis (t-axis) correspond to values of
the first coordinate random variable X and values on the vertical axis (u-axis)
correspond to values of Y. We extend the computational strategy used for a single
random variable.
First, let us review the one-variable strategy. In this case, data consist of values $t_i$
and corresponding probabilities $P(X = t_i)$ arranged in matrices
$$X = [t_1, t_2, \cdots, t_n] \quad \text{and} \quad PX = [P(X = t_1), P(X = t_2), \cdots, P(X = t_n)] \tag{1}$$
To perform calculations on $Z = g(X)$, we use array operations on X to form a matrix
$$G = [g(t_1) \;\; g(t_2) \;\; \cdots \;\; g(t_n)] \tag{2}$$
which has $g(t_i)$ in a position corresponding to
$P(X = t_i)$ in matrix PX.
Basic problem. Determine $P(g(X) \in M)$, where M is some prescribed set of
values.
- Use relational operations to determine the positions for which $g(t_i) \in M$.
These will be in a zero-one matrix N, with ones in the desired positions.
- Select the $P(X = t_i)$ in the corresponding positions and sum. This is
accomplished by one of the MATLAB operations to determine the inner product of N and PX.
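The single-variable strategy above can be sketched in a few lines. The values, probabilities, function g, and set M below are made up for illustration; only built-in array and relational operations are used.

```matlab
% Hypothetical data: values of X and their probabilities (they sum to 1)
X  = [-2 -1 0 1 3];
PX = [0.1 0.2 0.3 0.25 0.15];

G = X.^2 - 1;              % g(t_i) = t_i^2 - 1 in the position of each t_i
N = (G >= 0) & (G <= 3);   % zero-one matrix: ones where g(t_i) is in M = [0, 3]
p = N*PX';                 % inner product selects and sums the P(X = t_i)
```

Because N holds only zeros and ones, the inner product adds exactly those probabilities whose positions satisfy the condition.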
We extend these techniques and strategies to a pair of simple random variables, considered
jointly.
- The data for a pair $\{X, Y\}$ of random variables are the values of X
and Y, which we may put in row matrices
$$X = [t_1 \; t_2 \; \cdots \; t_n] \quad \text{and} \quad Y = [u_1 \; u_2 \; \cdots \; u_m] \tag{3}$$
and the joint probabilities $P(X = t_i, Y = u_j)$ in a matrix P.
We usually represent the distribution graphically by putting probability mass
$P(X = t_i, Y = u_j)$ at the point $(t_i, u_j)$ on the plane. This joint probability
is represented by the matrix P with elements arranged corresponding to the mass points
on the plane. Thus
$$P \text{ has element } P(X = t_i, Y = u_j) \text{ at the } (t_i, u_j) \text{ position} \tag{4}$$
- To perform calculations, we form computational matrices t and u such that
  - t has element $t_i$ at each $(t_i, u_j)$ position (i.e., at each point on
the ith column from the left)
  - u has element $u_j$ at each $(t_i, u_j)$ position (i.e., at each point on
the jth row from the bottom)
MATLAB array and logical operations on t, u, P perform the specified operations
on $t_i$, $u_j$, and $P(X = t_i, Y = u_j)$ at each $(t_i, u_j)$ position, in a manner analogous
to the operations in the single-variable case.
- Formation of the t and u matrices is achieved by a basic setup m-procedure
called jcalc. The data for this procedure are in three matrices:
$X = [t_1, t_2, \cdots, t_n]$ is the set of values for random variable X,
$Y = [u_1, u_2, \cdots, u_m]$ is the set of values for random variable Y, and
$P = [p_{ij}]$, where $p_{ij} = P(X = t_i, Y = u_j)$.
We arrange the joint probabilities as on the plane, with X-values increasing to the right
and Y-values increasing upward. This is different from the usual arrangement in a matrix, in
which values of the second variable increase downward. The m-procedure takes care of this
inversion.
The m-procedure forms the matrices t and u, utilizing the MATLAB function meshgrid,
and computes the marginal distributions for X and Y.
In the following example, we display the various steps utilized in the setup procedure.
Ordinarily, these intermediate steps would not be displayed.
>> jdemo4 % Call for data in file jdemo4.m
>> jcalc % Call for setup procedure
Enter JOINT PROBABILITIES (as on the plane) P
Enter row matrix of VALUES of X X
Enter row matrix of VALUES of Y Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
>> disp(P) % Optional call for display of P
0.0360 0.0198 0.0297 0.0209 0.0180
0.0372 0.0558 0.0837 0.0589 0.0744
0.0516 0.0774 0.1161 0.0817 0.1032
0.0264 0.0270 0.0405 0.0285 0.0132
>> PX % Optional call for display of PX
PX = 0.1512 0.1800 0.2700 0.1900 0.2088
>> PY % Optional call for display of PY
PY = 0.1356 0.4300 0.3100 0.1244
- - - - - - - - - - % Steps performed by jcalc
>> PX = sum(P) % Calculation of PX as performed by jcalc
PX = 0.1512 0.1800 0.2700 0.1900 0.2088
>> PY = fliplr(sum(P')) % Calculation of PY (note reversal)
PY = 0.1356 0.4300 0.3100 0.1244
>> [t,u] = meshgrid(X,fliplr(Y)); % Formation of t, u matrices (note reversal)
>> disp(t) % Display of calculating matrix t
-3 0 1 3 5 % A row of X-values for each value of Y
-3 0 1 3 5
-3 0 1 3 5
-3 0 1 3 5
>> disp(u) % Display of calculating matrix u
2 2 2 2 2 % A column of Y-values (increasing
1 1 1 1 1 % upward) for each value of X
0 0 0 0 0
-2 -2 -2 -2 -2
Suppose we wish to determine the probability $P(X^2 - 3Y \ge 1)$.
Using array operations on t and u, we obtain the matrix $G = [g(t_i, u_j)]$.
>> G = t.^2 - 3*u % Formation of G = [g(t_i,u_j)] matrix
G = 3 -6 -5 3 19
6 -3 -2 6 22
9 0 1 9 25
15 6 7 15 31
>> M = G >= 1 % Positions where G >= 1
M = 1 0 0 1 1
1 0 0 1 1
1 0 1 1 1
1 1 1 1 1
>> pM = M.*P % Selection of probabilities
pM =
0.0360 0 0 0.0209 0.0180
0.0372 0 0 0.0589 0.0744
0.0516 0 0.1161 0.0817 0.1032
0.0264 0.0270 0.0405 0.0285 0.0132
>> PM = total(pM) % Total of selected probabilities
PM = 0.7336 % P(g(X,Y) >= 1)
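The same calculation can be sketched with built-in functions only, which may help when the m-procedures are not on the path. The X and Y values are read off the displayed t and u matrices; the assumption that total simply adds all elements of its argument is ours, and sum(P(M)) is used in its place.

```matlab
% Data from the example: X and Y values as they appear in t and u
X = [-3 0 1 3 5];
Y = [-2 0 1 2];
P = [0.0360 0.0198 0.0297 0.0209 0.0180;
     0.0372 0.0558 0.0837 0.0589 0.0744;
     0.0516 0.0774 0.1161 0.0817 0.1032;
     0.0264 0.0270 0.0405 0.0285 0.0132];

[t,u] = meshgrid(X,fliplr(Y));   % Y reversed so that values increase upward
M  = (t.^2 - 3*u >= 1);          % logical matrix marking the event
PM = sum(P(M));                  % adds the selected probabilities
```

Logical indexing with M picks out the same entries that M.*P leaves nonzero, so the sum agrees with the value 0.7336 shown in the session.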
- In Example 3 from "Random Vectors and Joint Distributions" we note that the joint distribution function $F_{XY}$
is constant over any grid cell, including the left-hand and lower boundaries, at
the value taken on at the lower left-hand corner of the cell. These lower left-hand
corner values may be obtained systematically from the joint probability matrix P by
a two-step operation.
- Take cumulative sums upward of the columns of P.
- Take cumulative sums of the rows of the resultant matrix.
This can be done with the MATLAB function cumsum, which takes column cumulative
sums downward. By flipping the matrix and transposing, we can achieve the desired
results.
>> P = 0.1*[3 0 0; 0 6 0; 0 0 1];
>> FXY = flipud(cumsum(flipud(P))) % Cumulative column sums upward
FXY =
0.3000 0.6000 0.1000
0 0.6000 0.1000
0 0 0.1000
>> FXY = cumsum(FXY')' % Cumulative row sums
FXY =
0.3000 0.9000 1.0000
0 0.6000 0.7000
0 0 0.1000
Comparison with Example 3 from "Random Vectors and Joint Distributions" shows agreement with values obtained by
hand.
The two-step procedure has been incorporated into an m-procedure jddbn. As
an example, return to the distribution in Example 1.
>> jddbn
Enter joint probability matrix (as on the plane) P
To view joint distribution function, call for FXY
>> disp(FXY)
0.1512 0.3312 0.6012 0.7912 1.0000
0.1152 0.2754 0.5157 0.6848 0.8756
0.0780 0.1824 0.3390 0.4492 0.5656
0.0264 0.0534 0.0939 0.1224 0.1356
These values may be put on a grid, in the same manner as in Figure 2 for Example 3 in "Random Vectors and Joint Distributions".
- As in the case of canonic for a single random variable, it is often useful
to have a function version of the procedure jcalc to provide the freedom to name the
outputs conveniently.
function [x,y,t,u,px,py,p] = jcalcf(X,Y,P)
The quantities x, y, t, u, px, py, and p may be given any desired names.
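For readers without the m-file, a function with the same calling pattern might look like the following sketch. The name jcalcf_sketch and the body are our assumptions, built from the setup steps displayed earlier; this is not the distributed jcalcf.

```matlab
function [x,y,t,u,px,py,p] = jcalcf_sketch(X,Y,P)
% JCALCF_SKETCH  Hypothetical stand-in for jcalcf (not the distributed m-file).
% X, Y are row matrices of values; P holds the joint probabilities arranged
% as on the plane (X increasing to the right, Y increasing upward).
x  = X;
y  = Y;
p  = P;
px = sum(P,1);                   % column sums give P(X = t_i)
py = fliplr(sum(P,2)');          % row sums, reversed to match increasing Y
[t,u] = meshgrid(X,fliplr(Y));   % calculating matrices, with Y reversed
end
```

A call such as [x,y,t,u,px,py,p] = jcalcf_sketch(X,Y,P) then lets the caller choose any output names, for example [x1,y1,t1,u1,px1,py1,p1].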
In the single-variable case, the condition that there are no point mass concentrations on the line
ensures the existence of a probability density function, useful in probability calculations.
A similar situation exists for a joint distribution for two (or more) variables. For any joint
mapping to the plane which assigns zero probability to each
set with zero area (discrete points, line or curve segments, and countable unions of these)
there is a density function.
Definition
If the joint probability distribution for the pair $\{X, Y\}$ assigns
zero probability to every set of points with zero area, then there exists a joint density
function $f_{XY}$ with the property
$$P[(X, Y) \in Q] = \iint_{Q} f_{XY} \tag{5}$$
We have three properties analogous to those for the single-variable case:
$$\text{(f1)} \;\; f_{XY} \ge 0 \qquad \text{(f2)} \;\; \iint_{\mathbb{R}^2} f_{XY} = 1 \qquad \text{(f3)} \;\; F_{XY}(t, u) = \int_{-\infty}^{t} \int_{-\infty}^{u} f_{XY} \tag{6}$$
At every continuity point for $f_{XY}$, the density is the second partial
$$f_{XY}(t, u) = \frac{\partial^2 F_{XY}(t, u)}{\partial t \, \partial u} \tag{7}$$
Now
$$F_X(t) = F_{XY}(t, \infty) = \int_{-\infty}^{t} \int_{-\infty}^{\infty} f_{XY}(r, s) \, ds \, dr \tag{8}$$
A similar expression holds for $F_Y(u)$. Use of the fundamental theorem of calculus to
obtain the derivatives gives the result
$$f_X(t) = \int_{-\infty}^{\infty} f_{XY}(t, s) \, ds \quad \text{and} \quad f_Y(u) = \int_{-\infty}^{\infty} f_{XY}(r, u) \, dr \tag{9}$$
Marginal densities. Thus, to obtain the marginal density for the first variable,
integrate out the second variable
in the joint density, and similarly for the marginal for the second variable.
Let $f_{XY}(t, u) = 8tu$ for $0 \le u \le t \le 1$. This region is the triangle
bounded by $u = 0$, $u = t$, and $t = 1$ (see Figure 2).
$$f_X(t) = \int f_{XY}(t, u) \, du = 8t \int_{0}^{t} u \, du = 4t^3, \quad 0 \le t \le 1 \tag{10}$$
$$f_Y(u) = \int f_{XY}(t, u) \, dt = 8u \int_{u}^{1} t \, dt = 4u(1 - u^2), \quad 0 \le u \le 1 \tag{11}$$
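A quick numerical check of the marginals in equations (10) and (11) can be sketched with the built-in function integral. The variable names below are ours, and the check point t0 = 0.6 is arbitrary.

```matlab
% Joint density 8tu on the triangle 0 <= u <= t <= 1, zero elsewhere
fX = @(t) 4*t.^3;              % marginal density of X from equation (10)
fY = @(u) 4*u.*(1 - u.^2);     % marginal density of Y from equation (11)

areaX = integral(fX,0,1);      % each marginal should integrate to about 1
areaY = integral(fY,0,1);

% Spot check: integrate out u numerically at one value of t
t0 = 0.6;
fX_num = integral(@(u) 8*t0*u, 0, t0);   % compare with fX(t0)
```

Both areas should come out close to one, and fX_num should be close to fX(t0), which is what integrating out the second variable requires.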
$P(0.5 \le X \le 0.75, Y > 0.5) = P[(X, Y) \in Q]$, where Q is the common part of the
triangle with the strip between $t = 0.5$ and $t = 0.75$ and above the line $u = 0.5$. This is the small triangle bounded by $u = 0.5$, $u = t$, and $t = 0.75$. Thus
$$p = 8 \int_{1/2}^{3/4} \int_{1/2}^{t} t u \, du \, dt = 25/256 \approx 0.0977 \tag{12}$$
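The double integral in equation (12) can be checked numerically. The sketch below assumes the built-in function integral2 is available and uses its option of giving the inner limit as a function handle of the outer variable.

```matlab
f = @(t,u) 8*t.*u;                          % joint density inside the triangle
p = integral2(f, 0.5, 0.75, 0.5, @(t) t);   % t from 0.5 to 0.75, u from 0.5 up to t
exact = 25/256;                             % value obtained by hand
```

The numerical value p should agree with 25/256 to within the default tolerance of the integrator.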
The pair $\{X, Y\}$ has joint density $f_{XY}(t, u) = \frac{6}{37}(t + 2u)$
on the region bounded by $t = 0$, $t = 2$, $u = 0$, and $u = \max\{1, t\}$ (see Figure 3).
Determine the marginal density $f_X$.
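SOLUTION
Examination of the figure shows that we have different limits for the integral with respect
to u for $0 \le t \le 1$ and for $1 < t \le 2$.
- For $0 \le t \le 1$,
$$f_X(t) = \frac{6}{37} \int_{0}^{1} (t + 2u) \, du = \frac{6}{37}(t + 1) \tag{13}$$
- For $1 < t \le 2$,
$$f_X(t) = \frac{6}{37} \int_{0}^{t} (t + 2u) \, du = \frac{12}{37} t^2 \tag{14}$$
We may combine these into a single expression in a manner used extensively in subsequent
treatments. Suppose $M = [0, 1]$ and $N = (1, 2]$. Then $I_M(t) = 1$ for $t \in M$
(i.e., $0 \le t \le 1$) and zero elsewhere. Likewise, $I_N(t) = 1$ for $t \in N$ and
zero elsewhere. We can, therefore, express $f_X$ by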
$$f_X(t) = I_M(t) \frac{6}{37}(t + 1) + I_N(t) \frac{12}{37} t^2 \tag{15}$$
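The indicator-function form translates directly into relational operations. The following is a minimal sketch with names of our choosing (IM, IN, fX, mass) that checks the marginal in equation (15) integrates to one.

```matlab
IM = @(t) (t >= 0) & (t <= 1);   % indicator of M = [0, 1]
IN = @(t) (t > 1)  & (t <= 2);   % indicator of N = (1, 2]
fX = @(t) IM(t).*(6/37).*(t + 1) + IN(t).*(12/37).*t.^2;

% Integrate each piece over its own interval and add
mass = integral(fX,0,1) + integral(fX,1,2);
```

Multiplying by a zero-one array switches each formula on only over its own interval, so a single expression covers the whole range of t.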
For a pair $\{X, Y\}$ with joint density $f_{XY}$, we approximate the distribution
in a manner similar to that for a single random variable. We then utilize the techniques
developed for a pair of simple random variables. If we have n approximating values
$t_i$ for X and m approximating values $u_j$ for Y, we then have $n \cdot m$ pairs
$(t_i, u_j)$, corresponding to points on the plane. If we subdivide the horizontal axis
for values of X, with constant increments $dx$, as in the single-variable case, and the
vertical axis for values of Y, with constant increments $dy$, we have a grid structure
consisting of rectangles of size $dx \cdot dy$. We select $t_i$ and $u_j$ each at the
midpoint of its increment, so that the point $(t_i, u_j)$ is at the midpoint of the
rectangle. If we let the approximating pair be $\{X^*, Y^*\}$, we assign
$$p_{ij} = P\big((X^*, Y^*) = (t_i, u_j)\big) = P(X^* = t_i, Y^* = u_j) = P\big((X, Y) \text{ in } ij\text{th rectangle}\big) \tag{16}$$
As in the one-variable case, if the increments are small enough,
$$P\big((X, Y) \in ij\text{th rectangle}\big) \approx dx \cdot dy \cdot f_{XY}(t_i, u_j) \tag{17}$$
The m-procedure tuappr calls for endpoints of intervals which include the
ranges of X and Y and for the numbers of subintervals on each. It then
prompts for an expression for $f_{XY}(t, u)$, from which it determines
the joint probability distribution. It calculates the marginal approximate
distributions and sets up the calculating matrices t and u as does the
m-procedure jcalc for simple random variables. Calculations are then carried out
as for any joint simple pair.
$$f_{XY}(t, u) = 3 \text{ on } 0 \le u \le t^2 \le 1 \tag{18}$$
Determine $P(X \le 0.8, Y > 0.1)$.
>> tuappr
Enter matrix [a b] of X-range endpoints [0 1]
Enter matrix [c d] of Y-range endpoints [0 1]
Enter number of X approximation points 200
Enter number of Y approximation points 200
Enter expression for joint density 3*(u <= t.^2)
Use array operations on X, Y, PX, PY, t, u, and P
>> M = (t <= 0.8)&(u > 0.1);
>> p = total(M.*P) % Evaluation of the integral with
p = 0.3355 % Maple gives 0.3352455531
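The approximation in this session can be sketched without tuappr, using the midpoint grid and the assignment of probability dx·dy·f to each rectangle. This is an illustration under our own assumptions: tuappr itself may differ in detail (for instance, it may rescale P), so the last digits need not match the session.

```matlab
a = 0; b = 1; c = 0; d = 1;        % endpoints of the X and Y ranges
n = 200; m = 200;                  % numbers of approximation points
dx = (b - a)/n;  dy = (d - c)/m;   % constant increments
X = a + dx*((1:n) - 0.5);          % midpoints of the X subintervals
Y = c + dy*((1:m) - 0.5);          % midpoints of the Y subintervals
[t,u] = meshgrid(X,fliplr(Y));     % calculating matrices, as in the simple case
P = dx*dy*3*(u <= t.^2);           % probability assigned to each rectangle
M = (t <= 0.8) & (u > 0.1);        % event X <= 0.8, Y > 0.1
p = sum(P(M));                     % approximate probability
```

The result should be close to the exact value 0.33525 quoted in the session, with the accuracy governed by the grid size.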
The discrete approximation may be used to obtain approximate plots of marginal
distribution and density functions.
$f_{XY}(t, u) = 3u$ on the triangle bounded by $u = 0$, $u \le 1 + t$, and
$u \le 1 - t$.
>> tuappr
Enter matrix [a b] of X-range endpoints [-1 1]
Enter matrix [c d] of Y-range endpoints [0 1]
Enter number of X approximation points 400
Enter number of Y approximation points 200
Enter expression for joint density 3*u.*(u<=min(1+t,1-t))
Use array operations on X, Y, PX, PY, t, u, and P
>> fx = PX/dx; % Density for X (see Figure 4)
% Theoretical (3/2)(1 - |t|)^2
>> fy = PY/dy; % Density for Y
>> FX = cumsum(PX); % Distribution function for X (Figure 4)
>> FY = cumsum(PY); % Distribution function for Y
>> plot(X,fx,X,FX) % Plotting details omitted
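To see how well the approximate marginal density tracks the theoretical one, the grid can be built directly and the two compared. The sketch below does not call tuappr; the grid construction and the names fx_theory and maxerr are our assumptions.

```matlab
n = 400; m = 200;                      % approximation points for X and Y
dx = 2/n;  dy = 1/m;                   % increments on [-1, 1] and [0, 1]
X = -1 + dx*((1:n) - 0.5);             % midpoints for X
Y = dy*((1:m) - 0.5);                  % midpoints for Y
[t,u] = meshgrid(X,fliplr(Y));
P  = dx*dy*3*u.*(u <= min(1+t,1-t));   % probability assigned to each rectangle
PX = sum(P,1);                         % approximate marginal distribution for X
fx = PX/dx;                            % approximate density for X
fx_theory = 1.5*(1 - abs(X)).^2;       % theoretical (3/2)(1 - |t|)^2
maxerr = max(abs(fx - fx_theory));     % largest discrepancy over the grid
```

The discrepancy comes from rectangles cut by the sloping sides of the triangle and shrinks as the number of approximation points grows.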
These approximation techniques are useful in dealing with functions of random variables,
expectations, and conditional expectation and regression.