In the unit on Random Selection, we develop some general theoretical results
and computational procedures using MATLAB. In this unit, we extend the treatment to a
variety of problems. We establish some useful theoretical results and in some cases use
MATLAB procedures, including those in the unit on random selection.
In many problems, the individual demands may be categorized in one of m types. If the
random variable Ti is the type of the ith arrival and the class {Ti:1≤i}{Ti:1≤i} is iid,
we have multinomial trials. For m=2m=2 we have the Bernoulli or binomial case, in
which one type is called a success and the other a failure.
Multinomial trials
We analyze such a sequence of trials as follows. Suppose there are m types, which we
number 1 through m. Let EkiEki be the event that type k occurs on the ith component
trial. For each i, the class {Eki:1≤k≤m}{Eki:1≤k≤m} is a partition, since on each
component trial exactly one of the types will occur. The type on the ith trial may be
represented by the type random variable
T
i
=
∑
k
=
1
m
k
I
E
k
i
T
i
=
∑
k
=
1
m
k
I
E
k
i
(1)
We assume
{
T
i
:
1
≤
i
}
is
iid,
with
P
(
T
i
=
k
)
=
P
(
E
k
i
)
=
p
k
invariant
with
i
{
T
i
:
1
≤
i
}
is
iid,
with
P
(
T
i
=
k
)
=
P
(
E
k
i
)
=
p
k
invariant
with
i
(2)
In a sequence of n trials, we let NknNkn be the number of occurrences of type k. Then
N
k
n
=
∑
i
=
1
n
I
E
k
i
with
∑
k
=
1
m
N
k
n
=
n
N
k
n
=
∑
i
=
1
n
I
E
k
i
with
∑
k
=
1
m
N
k
n
=
n
(3)
Now each Nkn∼Nkn∼ binomial (n,pk)(n,pk). The class {Nkn:1≤k≤m}{Nkn:1≤k≤m} cannot
be independent, since it sums to n. If the values of m-1m-1 of them are known, the
value of the other is determined. If n1+n2+⋯+nm=nn1+n2+⋯+nm=n, the event
{
N
1
n
=
n
1
,
N
2
n
=
n
2
,
⋯
,
N
m
n
=
n
m
}
{
N
1
n
=
n
1
,
N
2
n
=
n
2
,
⋯
,
N
m
n
=
n
m
}
(4)
is one of the
C
(
n
;
n
1
,
n
2
,
⋯
,
n
m
)
=
n
!
/
(
n
1
!
n
2
!
⋯
n
m
!
)
C
(
n
;
n
1
,
n
2
,
⋯
,
n
m
)
=
n
!
/
(
n
1
!
n
2
!
⋯
n
m
!
)
(5)
ways of arranging n1 of the E1iE1i, n2 of the E2i,⋯E2i,⋯ , nm of the
EmiEmi. Each such arrangement has probability p1n1p2n2⋯pmnmp1n1p2n2⋯pmnm, so that
P
(
N
1
n
=
n
1
,
N
2
n
=
n
2
,
⋯
N
m
n
=
n
m
)
=
n
!
∏
k
=
1
m
p
k
n
k
n
k
!
P
(
N
1
n
=
n
1
,
N
2
n
=
n
2
,
⋯
N
m
n
=
n
m
)
=
n
!
∏
k
=
1
m
p
k
n
k
n
k
!
(6)
This set of joint probabilities constitutes the multinomial distribution. For m=2m=2,
and type 1 a success, this is the binomial distribution with parameter (n,p1)(n,p1).
A random number of multinomial trials
We consider, in particular, the case of a random number N of multinomial trials, where N∼N∼
Poisson (μ)(μ). Let Nk be the number of results of type k in a random number N of multinomial trials.
N
k
=
∑
i
=
1
N
I
E
k
i
=
∑
n
=
1
∞
I
{
N
=
n
}
N
k
n
with
∑
k
=
1
m
N
k
=
N
N
k
=
∑
i
=
1
N
I
E
k
i
=
∑
n
=
1
∞
I
{
N
=
n
}
N
k
n
with
∑
k
=
1
m
N
k
=
N
(7)
Poisson decomposition
Suppose
- N∼N∼ Poisson (μ)(μ)
- {Ti:1≤i}{Ti:1≤i} is iid with P(Ti=k)=pk,1≤k≤mP(Ti=k)=pk,1≤k≤m
- {N,Ti:1≤i}{N,Ti:1≤i} is independent
Then
- Each Nk∼Nk∼ Poisson (μpk)(μpk)
- {Nk:1≤k≤m}{Nk:1≤k≤m} is independent.
— □□
The usefulness of this remarkable result is enhanced by the fact that the sum of
independent Poisson random variables is also Poisson, with μ for the sum
the sum of the μi for the variables added. This is readily established
with the aid of the generating function. Before verifying the propositions above,
we consider some examples.
The number N of orders per day received by a mail order house is Poisson (300). Orders
are shipped by next day express, by second day priority, or by regular parcel mail. Suppose
4/10 of the customers want next day express, 5/10 want second day priority, and 1/10
require regular mail. Make the usual assumptions on compound demand. What is the probability
that fewer than 150 want next day express? What is the probability that fewer than 300 want
one or the other of the two faster deliveries?
SOLUTION
Model as a random number of multinomial trials, with three outcome types: Type 1 is next
day express, Type 2 is second day priority, and Type 3 is regular mail, with respective
probabilities p1=0.4p1=0.4, p2=0.5p2=0.5, and p3=0.1p3=0.1. Then N1∼N1∼ Poisson
(0.4·300=120)(0.4·300=120), N2∼N2∼ Poisson (0.5·300=150)(0.5·300=150), and N3∼N3∼ Poisson
(0.1·300=30)(0.1·300=30). Also N1+N2∼N1+N2∼ Poisson (120 + 150 = 270).
P1 = 1 - cpoisson(120,150)
P1 = 0.9954
P12 = 1 - cpoisson(270,300)
P12 = 0.9620
A junction point in a network has two incoming lines and two outgoing lines.
The number of incoming messages N1 on line one in one hour is Poisson (50); on line 2 the
number is N2∼N2∼ Poisson (45). On incoming line 1 the messages have probability p1a=0.33p1a=0.33 of leaving on outgoing line a and 1-p1a1-p1a of leaving on line b. The messages
coming in on line 2 have probability p2a=0.47p2a=0.47 of leaving on line a. Under the
usual independence assumptions, what is the distribution of outgoing messages on line a?
What are the probabilities of at least 30, 35, 40 outgoing messages on line a?
SOLUTION
By the Poisson decomposition, Na∼Na∼ Poisson (50·0.33+45·0.47=37.65)(50·0.33+45·0.47=37.65).
ma = 50*0.33 + 45*0.47
ma = 37.6500
Pa = cpoisson(ma,30:5:40)
Pa = 0.9119 0.6890 0.3722
VERIFICATION of the Poisson decomposition
- Nk=∑i=1NIEkiNk=∑i=1NIEki.
This is composite demand with Yk=IEkiYk=IEki, so that gYk(s)=qk+spk=1+pk(s-1)gYk(s)=qk+spk=1+pk(s-1).
Therefore,
gNk(s)=gN[gYk(s)]=eμ(1+pk(s-1)-1)=eμpk(s-1)gNk(s)=gN[gYk(s)]=eμ(1+pk(s-1)-1)=eμpk(s-1)(8)
which is the generating function for Nk∼Nk∼ Poisson (μpk)(μpk).
- For any n1,n2,⋯,nmn1,n2,⋯,nm, let n=n1+n2+⋯+nmn=n1+n2+⋯+nm,
and consider
A={N1=n1,N2=n2,⋯,Nm=nm}={N=n}∩{N1n=n1,N2n=n2,⋯,Nmn=nm}A={N1=n1,N2=n2,⋯,Nm=nm}={N=n}∩{N1n=n1,N2n=n2,⋯,Nmn=nm}(9)
Since N is independent of the class of IEkiIEki, the class
{{N=n},{N1n=n1,N2n=n2,⋯,Nmn=nm}}{{N=n},{N1n=n1,N2n=n2,⋯,Nmn=nm}}(10)
is independent. By the product rule and the multinomial distribution
P(A)=e-μμnn!·n!∏k=1mpknk(nk)!=∏k=1me-μpkpknknk!=∏k=1mP(Nk=nk)P(A)=e-μμnn!·n!∏k=1mpknk(nk)!=∏k=1me-μpkpknknk!=∏k=1mP(Nk=nk)(11)
The second product uses the fact that
eμ=eμ(p1+p2+⋯+pm)=∏k=1meμpkeμ=eμ(p1+p2+⋯+pm)=∏k=1meμpk(12)
Thus, the product rule holds for the class {Nk:1≤k≤m}{Nk:1≤k≤m}, so that
it is independent.
Consider an iid class {Yi:1≤i}{Yi:1≤i} of nonnegative random variables. For any
positive integer n we let
V
n
=
min
{
Y
1
,
Y
2
,
⋯
,
Y
n
}
and
W
n
=
max
{
Y
1
,
Y
2
,
⋯
,
Y
n
}
V
n
=
min
{
Y
1
,
Y
2
,
⋯
,
Y
n
}
and
W
n
=
max
{
Y
1
,
Y
2
,
⋯
,
Y
n
}
(13)
Then
P
(
V
n
>
t
)
=
P
n
(
Y
>
t
)
and
P
(
W
n
≤
t
)
=
P
n
(
Y
≤
t
)
P
(
V
n
>
t
)
=
P
n
(
Y
>
t
)
and
P
(
W
n
≤
t
)
=
P
n
(
Y
≤
t
)
(14)
Now consider a random number N of the Yi. The minimum and maximum random variables are
V
N
=
∑
n
=
0
∞
I
{
N
=
n
}
V
n
and
W
N
=
∑
n
=
0
∞
I
{
N
=
n
}
W
n
V
N
=
∑
n
=
0
∞
I
{
N
=
n
}
V
n
and
W
N
=
∑
n
=
0
∞
I
{
N
=
n
}
W
n
(15)
— □□
Computational formulas
If we set V0=W0=0V0=W0=0, then
-
FV(t)=P(V≤t)=1+P(N=0)-gN[P(Y>t)]FV(t)=P(V≤t)=1+P(N=0)-gN[P(Y>t)]
- FW(t)=gN[P(Y≤t)]FW(t)=gN[P(Y≤t)]
These results are easily established as follows.
{VN>t}=⋁n=0∞{N=n}{Vn>t}{VN>t}=⋁n=0∞{N=n}{Vn>t}. By additivity and independence of {N,Vn}{N,Vn} for each n
P
(
V
N
>
t
)
=
∑
n
=
0
∞
P
(
N
=
n
)
P
(
V
n
>
t
)
=
∑
n
=
1
∞
P
(
N
=
n
)
P
n
(
Y
>
t
)
,
since
P
(
V
0
>
t
)
=
0
P
(
V
N
>
t
)
=
∑
n
=
0
∞
P
(
N
=
n
)
P
(
V
n
>
t
)
=
∑
n
=
1
∞
P
(
N
=
n
)
P
n
(
Y
>
t
)
,
since
P
(
V
0
>
t
)
=
0
(16)
If we add into the last sum the term P(N=0)P0(Y>t)=P(N=0)P(N=0)P0(Y>t)=P(N=0) then subtract it, we have
P
(
V
N
>
t
)
=
∑
n
=
0
∞
P
(
N
=
n
)
P
n
(
Y
>
t
)
-
P
(
N
=
0
)
=
g
N
[
P
(
Y
>
t
)
]
-
P
(
N
=
0
)
P
(
V
N
>
t
)
=
∑
n
=
0
∞
P
(
N
=
n
)
P
n
(
Y
>
t
)
-
P
(
N
=
0
)
=
g
N
[
P
(
Y
>
t
)
]
-
P
(
N
=
0
)
(17)
A similar argument holds for proposition (b). In this case, we do not have
the extra term for {N=0}{N=0}, since P(W0≤t)=1P(W0≤t)=1.
Special case. In some cases, N=0N=0 does not correspond to
an admissible outcome (see Example 4, below, on lowest bidder and Example 6).
In that case
F
V
(
t
)
=
∑
n
=
1
∞
P
(
V
n
≤
t
)
P
(
N
=
n
)
=
∑
n
=
1
∞
[
1
-
P
n
(
Y
>
t
)
]
P
(
N
=
n
)
=
∑
n
=
1
∞
P
(
N
=
n
)
-
∑
n
=
1
∞
P
n
(
Y
>
t
)
P
(
N
=
n
)
F
V
(
t
)
=
∑
n
=
1
∞
P
(
V
n
≤
t
)
P
(
N
=
n
)
=
∑
n
=
1
∞
[
1
-
P
n
(
Y
>
t
)
]
P
(
N
=
n
)
=
∑
n
=
1
∞
P
(
N
=
n
)
-
∑
n
=
1
∞
P
n
(
Y
>
t
)
P
(
N
=
n
)
(18)
Add P(N=0)=P0(Y>t)P(N=0)P(N=0)=P0(Y>t)P(N=0) to each of the sums to get
F
V
(
t
)
=
1
-
∑
n
=
0
∞
P
n
(
Y
>
t
)
P
(
N
=
n
)
=
1
-
g
N
[
P
(
Y
>
t
)
]
F
V
(
t
)
=
1
-
∑
n
=
0
∞
P
n
(
Y
>
t
)
P
(
N
=
n
)
=
1
-
g
N
[
P
(
Y
>
t
)
]
(19)
— □□
The number N of jobs coming into a service center in a week is a random quantity
having a Poisson (20) distribution. Suppose the service times (in hours) for individual
units are iid, with common distribution exponential (1/3). What is the probability
the maximum service time for the units is no greater than 6, 9, 12, 15, 18 hours?
SOLUTION
P
(
W
N
≤
t
)
=
g
N
[
P
(
Y
≤
t
)
]
=
e
20
[
F
Y
(
t
)
-
1
]
=
exp
(
-
20
e
-
t
/
3
)
P
(
W
N
≤
t
)
=
g
N
[
P
(
Y
≤
t
)
]
=
e
20
[
F
Y
(
t
)
-
1
]
=
exp
(
-
20
e
-
t
/
3
)
(20)
t = 6:3:18;
PW = exp(-20*exp(-t/3));
disp([t;PW]')
6.0000 0.0668
9.0000 0.3694
12.0000 0.6933
15.0000 0.8739
18.0000 0.9516
A manufacturer seeks bids on a modification of one of his processing units. Twenty
contractors are invited to bid. They bid with probability 0.3, so that the number
of bids N∼N∼ binomial (20,0.3). Assume the bids Yi (in thousands of dollars)
form an iid class. The market is such that the bids have a common distribution
symmetric triangular on (150,250). What is the probability of at least one bid
no greater than 170, 180, 190, 200, 210? Note that no bid is not a low bid of
zero, hence we must use the special case.
P
(
V
≤
t
)
=
1
-
g
N
[
P
(
Y
>
t
)
]
=
1
-
(
0
.
7
+
0
.
3
p
)
20
where
p
=
P
(
Y
>
t
)
P
(
V
≤
t
)
=
1
-
g
N
[
P
(
Y
>
t
)
]
=
1
-
(
0
.
7
+
0
.
3
p
)
20
where
p
=
P
(
Y
>
t
)
Solving graphically for p=P(V>t)p=P(V>t) , we get
p
=
[
23
/
25
41
/
50
17
/
25
1
/
2
8
/
25
]
for
t
=
[
170
180
190
200
210
]
p
=
[
23
/
25
41
/
50
17
/
25
1
/
2
8
/
25
]
for
t
=
[
170
180
190
200
210
]
Now gN(s)=(0.7+0.3s)20gN(s)=(0.7+0.3s)20. We use MATLAB to obtain
t = [170 180 190 200 210];
p = [23/25 41/50 17/25 1/2 8/25];
PV = 1 - (0.7 + 0.3*p).^20;
disp([t;p;PV]')
170.0000 0.9200 0.3848
180.0000 0.8200 0.6705
190.0000 0.6800 0.8671
200.0000 0.5000 0.9612
210.0000 0.3200 0.9896
Suppose the number of bids is 1, 2 or 3 with probabilities 0.3, 0.5, 0.2, respectively.
Determine P(V≤t)P(V≤t) in each case.
SOLUTION.
The minimum of the selected Y'sY's is no greater than t if and only if there is at
least one Y less than or equal to t.
We determine in each case probabilities for the number of bids satisfying Y≤tY≤t. For each t, we are
interested in the probability of one or more occurrences of the event Y≤tY≤t. This is
essentially the problem in Example 7 from "Random Selection", with probability p=P(Y≤t)p=P(Y≤t).
t = [170 180 190 200 210];
p = [23/25 41/50 17/25 1/2 8/25]; % Probabilities Y <= t are 1 - p
gN = [0 0.3 0.5 0.2]; % Zero for missing value
PV = zeros(1,length(t));
for i=1:length(t)
gY = [p(i),1 - p(i)];
[d,pd] = gendf(gN,gY);
PV(i) = (d>0)*pd'; % Selects positions for d > 0 and
end % adds corresponding probabilities
disp([t;PV]')
170.0000 0.1451
180.0000 0.3075
190.0000 0.5019
200.0000 0.7000
210.0000 0.8462
Example 4 may be worked in this manner by using
gN = ibinom(20,0.3,0:20). The results, of course, are the same as in the previous solution.
The fact that the probabilities in this example are lower for each t than in Example 4 reflects the fact that
there are probably fewer bids in each case.
Electrical units from a production line are first inspected for operability. However,
experience indicates that a fraction p of those passing the initial operability
test are defective. All operable units are subsequenly tested in a batch under continuous
operation ( a “burn in” test). Statistical data indicate the defective units have times
to failure Yi iid, exponential (λ)(λ), whereas good units have very long life
(infinite from the point of view of the test). A batch of n units is tested. Let V
be the time of the first failure and N be the number of defective units in the batch.
If the test goes t units of time with no failure (i.e., V>tV>t), what is the probability
of no defective units?
SOLUTION
Since no defective units implies no failures in any reasonable test time, we have
{
N
=
0
}
⊂
{
V
>
t
}
so
that
P
(
N
=
0
|
V
>
t
)
=
P
(
N
=
0
)
P
(
V
>
t
)
{
N
=
0
}
⊂
{
V
>
t
}
so
that
P
(
N
=
0
|
V
>
t
)
=
P
(
N
=
0
)
P
(
V
>
t
)
(21)
Since N=0N=0 does not yield a minimum value, we have P(V>t)=gN[P(Y>t)]P(V>t)=gN[P(Y>t)]. Now under the condition above, the number of defective units N∼N∼ binomial
(n,p)(n,p), so that gN(s)=(q+ps)ngN(s)=(q+ps)n. If N is large and p is reasonably
small, N is approximately Poisson (np)(np) with gN(s)=enp(s-1)gN(s)=enp(s-1) and
P(N=0)=e-npP(N=0)=e-np. Now P(Y>t)=e-λtP(Y>t)=e-λt; for large n
P
(
N
=
0
|
V
>
t
)
=
e
-
n
p
e
n
p
[
P
(
Y
>
t
)
-
1
]
=
e
-
n
p
P
(
Y
>
t
)
=
e
-
n
p
e
-
λ
t
P
(
N
=
0
|
V
>
t
)
=
e
-
n
p
e
n
p
[
P
(
Y
>
t
)
-
1
]
=
e
-
n
p
P
(
Y
>
t
)
=
e
-
n
p
e
-
λ
t
(22)
For n=5000,p=0.001,λ=2n=5000,p=0.001,λ=2, and t=1,2,3,4,5t=1,2,3,4,5, MATLAB
calculations give
t = 1:5;
n = 5000;
p = 0.001;
lambda = 2;
P = exp(-n*p*exp(-lambda*t));
disp([t;P]')
1.0000 0.5083
2.0000 0.9125
3.0000 0.9877
4.0000 0.9983
5.0000 0.9998
It appears that a test of three to five hours should give reliable results. In actually
designing the test, one should probably make calculations with a number of different
assumptions on the fraction of defective units and the life duration of defective units.
These calculations are relatively easy to make with MATLAB.
Consider a Bernoulli sequence with probability p of success on any component trial.
Let N be the number of the trial on which the first success occurs. Let Yi be
the time (or cost) to execute the ith trial. Then the total time (or cost) from
the beginning to the completion of the first success is
T
=
∑
i
=
1
N
Y
i
(composite
``demand''
with
N
-
1
∼
geometric
p
)
T
=
∑
i
=
1
N
Y
i
(composite
``demand''
with
N
-
1
∼
geometric
p
)
(23)
We suppose the Yi form an iid class, independent of N. Now N-1∼N-1∼ geometric (p)(p) implies
gN(s)=ps/(1-qs)gN(s)=ps/(1-qs), so that
M
T
(
s
)
=
g
N
[
M
Y
(
s
)
]
=
p
M
Y
(
s
)
1
-
q
M
Y
(
s
)
M
T
(
s
)
=
g
N
[
M
Y
(
s
)
]
=
p
M
Y
(
s
)
1
-
q
M
Y
(
s
)
(24)
There are two useful special cases:
- Yi∼Yi∼ exponential (λ)(λ), so that MY(s)=λλ-sMY(s)=λλ-s.
MT(s)=pλ/(λ-s)1-qλ/(λ-s)=pλpλ-sMT(s)=pλ/(λ-s)1-qλ/(λ-s)=pλpλ-s(25)
which implies T∼T∼ exponential (pλ)(pλ).
- Yi-1∼Yi-1∼ geometric (p0)(p0), so that gY(s)=p0s1-q0sgY(s)=p0s1-q0s
gT(s)=pp0s/(1-q0s)1-pp0s/(1-q0s)=pp0s1-(1-pp0)sgT(s)=pp0s/(1-q0s)1-pp0s/(1-q0s)=pp0s1-(1-pp0)s(26)
so that T-1∼T-1∼ geometric (pp0)(pp0).
Suppose a prospective employer is interviewing candidates for a job from a pool in which
twenty percent are qualified. Interview times (in hours) Yi are presumed to form an
iid class, each exponential (3). Thus, the average interview time is 1/3 hour (twenty
minutes). We take the probability for success on any interview to be p=0.2p=0.2. What
is the probability a satisfactory candidate will be found in four hours or less?
What is the probability the maximum interview time will be no greater than 0.5, 0.75,
1, 1.25, 1.5 hours?
SOLUTION
T∼T∼ exponential (0.2·3=0.6)(0.2·3=0.6), so that P(T≤4)=1-e-0.6·4=0.9093P(T≤4)=1-e-0.6·4=0.9093.
P
(
W
≤
t
)
=
g
N
[
P
(
Y
≤
t
)
]
=
0
.
2
(
1
-
e
-
3
t
)
1
-
0
.
8
(
1
-
e
-
3
t
)
=
1
-
e
-
3
t
1
+
4
e
-
3
t
P
(
W
≤
t
)
=
g
N
[
P
(
Y
≤
t
)
]
=
0
.
2
(
1
-
e
-
3
t
)
1
-
0
.
8
(
1
-
e
-
3
t
)
=
1
-
e
-
3
t
1
+
4
e
-
3
t
(27)
MATLAB computations give
t = 0.5:0.25:1.5;
PWt = (1 - exp(-3*t))./(1 + 4*exp(-3*t));
disp([t;PWt]')
0.5000 0.4105
0.7500 0.6293
1.0000 0.7924
1.2500 0.8925
1.5000 0.9468
The average interview time is 1/3 hour; with probability 0.63 the maximum is 3/4 hour
or less; with probability 0.79 the maximum is one hour or less; etc.
In the general case, solving for the distribution of T requires transform theory,
and may be handled best by a program such as Maple or Mathematica.
For the case of simple Yi, we may use approximation procedures based on
properties of the geometric series. Since N-1∼N-1∼ geometric (p)(p),
g
N
(
s
)
=
p
s
1
-
q
s
=
p
s
∑
k
=
0
∞
(
q
s
)
k
=
p
s
∑
k
=
0
n
(
q
s
)
k
+
∑
k
=
n
+
1
∞
(
q
s
)
k
=
p
s
∑
k
=
0
n
(
q
s
)
k
+
(
q
s
)
n
+
1
∑
k
=
0
∞
(
q
s
)
k
g
N
(
s
)
=
p
s
1
-
q
s
=
p
s
∑
k
=
0
∞
(
q
s
)
k
=
p
s
∑
k
=
0
n
(
q
s
)
k
+
∑
k
=
n
+
1
∞
(
q
s
)
k
=
p
s
∑
k
=
0
n
(
q
s
)
k
+
(
q
s
)
n
+
1
∑
k
=
0
∞
(
q
s
)
k
(28)
=
p
s
∑
k
=
0
n
(
q
s
)
k
+
(
q
s
)
n
+
1
g
N
(
s
)
=
g
n
(
s
)
+
(
q
s
)
n
+
1
g
N
(
s
)
=
p
s
∑
k
=
0
n
(
q
s
)
k
+
(
q
s
)
n
+
1
g
N
(
s
)
=
g
n
(
s
)
+
(
q
s
)
n
+
1
g
N
(
s
)
(29)
Note that gn(s)gn(s) has the form of the generating function for a simple approximation
Nn which matches values and probabilities with N up to k=nk=n. Now
g
T
(
s
)
=
g
n
[
g
Y
(
s
)
]
+
(
q
s
)
n
+
1
g
N
[
g
Y
(
s
)
]
g
T
(
s
)
=
g
n
[
g
Y
(
s
)
]
+
(
q
s
)
n
+
1
g
N
[
g
Y
(
s
)
]
(30)
The evaluation involves convolution of coefficients which effectively sets s=1s=1. Since
gN(1)=gY(1)=1gN(1)=gY(1)=1,
(
q
s
)
n
+
1
g
N
[
g
Y
(
s
)
]
for
s
=
1
reduces
to
q
n
+
1
=
P
(
N
>
n
)
(
q
s
)
n
+
1
g
N
[
g
Y
(
s
)
]
for
s
=
1
reduces
to
q
n
+
1
=
P
(
N
>
n
)
(31)
which is negligible if n is large enough. Suitable n may be determined in each case.
With such an n, if the Yi are nonnegative, integer-valued, we may use the gend procedure
on gn[gY(s)]gn[gY(s)], where
g
n
(
s
)
=
p
s
+
p
q
s
2
+
p
q
2
s
3
+
⋯
+
p
q
n
s
n
+
1
g
n
(
s
)
=
p
s
+
p
q
s
2
+
p
q
2
s
3
+
⋯
+
p
q
n
s
n
+
1
(32)
For the integer-valued case, as in the general case of simple Yi, we could use mgd. However,
gend is usually faster and more efficient for the integer-valued case. Unless q is small,
the number of terms needed to approximate gn is likely to be too great.
Let p=0.3p=0.3 and Y be uniformly distributed on {1,2,⋯,10}{1,2,⋯,10}. Determine
the distribution for
T
=
∑
k
=
1
N
Y
k
T
=
∑
k
=
1
N
Y
k
(33)
SOLUTION
p = 0.3;
q = 1 - p;
a = [30 35 40]; % Check for suitable n
b = q.^a
b = 1.0e-04 * % Use n = 40
0.2254 0.0379 0.0064
n = 40;
k = 1:n;
gY = 0.1*[0 ones(1,10)];
gN = p*[0 q.^(k-1)]; % Probabilities, 0 <= k <= 40
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN gN
Enter gen fn COEFFICIENTS for gY gY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for gD.
sum(PD) % Check sum of probabilities
ans = 1.0000
FD = cumsum(PD); % Distribution function for D
plot(0:100,FD(1:101)) % See Figure 1
P50 = (D<=50)*PD'
P50 = 0.9497
P30 = (D<=30)*PD'
P30 = 0.8263
The same results may be achieved with mgd, although at the cost of more computing time.
In that case, use gNgN as in Example 8, but use the actual distribution for Y.
Suppose we have phenomena which take place at discrete instants of time, separated by
random waiting or interarrival times. These may be arrivals of customers in a store,
of noise pulses on a communications line, vehicles passing a position on a road, the
failures of a system, etc. We refer to these occurrences as arrivals and
designate the times of occurrence as arrival times. A stream of arrivals may be described in three equivalent ways.
- Arrival times: {Sn:0≤n}{Sn:0≤n}, with 0=S0<S1<⋯a.s.0=S0<S1<⋯a.s. (basic sequence)
- Interarrival times: {Wi:1≤i}{Wi:1≤i}, with each Wi>0a.s.Wi>0a.s. (incremental sequence)
The strict inequalities imply that with probability one there are no simultaneous arrivals.
The relations between the two sequences are simply
S
0
=
0
,
S
n
=
∑
i
=
1
n
W
i
and
W
n
=
S
n
-
S
n
-
1
for
all
n
≥
1
S
0
=
0
,
S
n
=
∑
i
=
1
n
W
i
and
W
n
=
S
n
-
S
n
-
1
for
all
n
≥
1
(34)
The formulation indicates the essential equivalence of the problem with that of the
compound demand. The notation and terminology are changed to correspond to that
customarily used in the treatment of arrival and counting processes.
The stream of arrivals may be described in a third way.
- Counting processes: Nt=N(t)Nt=N(t) is the number of arrivals in time period
(0,t](0,t]. It should be clear that this is a random quantity for each nonnegative
t. For a given t,ωt,ω the value is N(t,ω)N(t,ω). Such a family of random
variables constitutes a random process. In this
case the random process is a counting process.
We thus have three equivalent descriptions for the stream of arrivals.
{
S
n
:
0
≤
n
}
{
W
n
:
1
≤
n
}
{
N
t
:
0
≤
t
}
{
S
n
:
0
≤
n
}
{
W
n
:
1
≤
n
}
{
N
t
:
0
≤
t
}
(35)
Several properties of the counting process N should be noted:
- N(t+h)-N(t)N(t+h)-N(t) counts the arrivals in the interval (t,t+h](t,t+h], h>0h>0,
so that N(t+h)≥N(t)N(t+h)≥N(t) for h>0h>0.
- N0=0N0=0 and for t>0t>0 we have
Nt=∑i=1∞I(0,t](Si)=max{n:Sn≤t}=min{n:Sn+1>t}Nt=∑i=1∞I(0,t](Si)=max{n:Sn≤t}=min{n:Sn+1>t}(36)
- For any given ω, N(·,ω)N(·,ω) is a nondecreasing, right-continuous,
integer-valued function defined on [0,∞)[0,∞), with N(0,ω)=0N(0,ω)=0.
The essential relationships between the three ways of describing the stream of arrivals
is displayed in
W
n
=
S
n
-
S
n
-
1
,
{
N
t
≥
n
}
=
{
S
n
≤
t
}
,
{
N
t
=
n
}
=
{
S
n
≤
t
<
S
n
+
1
}
W
n
=
S
n
-
S
n
-
1
,
{
N
t
≥
n
}
=
{
S
n
≤
t
}
,
{
N
t
=
n
}
=
{
S
n
≤
t
<
S
n
+
1
}
(37)
This imples
P
(
N
t
=
n
)
=
P
(
S
n
≤
t
)
-
P
(
S
n
+
1
≤
t
)
=
P
(
S
n
+
1
>
t
)
-
P
(
S
n
>
t
)
P
(
N
t
=
n
)
=
P
(
S
n
≤
t
)
-
P
(
S
n
+
1
≤
t
)
=
P
(
S
n
+
1
>
t
)
-
P
(
S
n
>
t
)
(38)
Although there are many possibilities for the interarrival time distributions, we assume
{
W
i
:
1
≤
i
}
is
iid,
with
W
i
>
0
a
.
s
.
{
W
i
:
1
≤
i
}
is
iid,
with
W
i
>
0
a
.
s
.
(39)
Under such assumptions, the counting process is often referred to as a renewal
process and the interrarival times are called renewal times. In the literature
on renewal processes, it is common for the random variable to count an arrival at
t=0t=0. This requires an adjustment of the expressions relating Nt and the
Si. We use the convention above.
Exponential iid interarrival times
The case of exponential interarrival times is natural in many applications and
leads to important mathematical results. We utilize the following
propositions about the arrival times Sn, the interarrival times Wi, and the
counting process N.
- If {Wi:1≤i}{Wi:1≤i} is iid exponential (λ)(λ), then Sn∼Sn∼
gamma (n,λ)(n,λ) for all n≥1n≥1.
This is worked out in the unit on TRANSFORM METHODS, in the discussion
of the connection between the
gamma distribution and the exponential distribution.
- Sn∼Sn∼ gamma (n,λ)(n,λ) for all n≥1n≥1, and S0=0S0=0,
iff Nt∼Nt∼ Poisson (λt)(λt) for all t>0t>0.
This follows the result in the unit DISTRIBUTION APPROXI9MATIONS on
the relationship between the Poisson and gamma
distributions, along with the fact that {Nt≥n}={Sn≤t}{Nt≥n}={Sn≤t}.
Remark. The counting process is a Poisson process in the sense that Nt∼Nt∼
Poisson (λt)(λt) for all t>0t>0. More advanced treatments
show that the process has independent, stationary increments. That is
- N(t+h)-N(t)=N(h)N(t+h)-N(t)=N(h) for all t,h>0t,h>0, and
- For t1<t2≤t3<t4≤⋯≤tm-1<tmt1<t2≤t3<t4≤⋯≤tm-1<tm, the class
{N(t2)-N(N1),N(t4)-N(t3),⋯,N(tm)-N(tm-1)}{N(t2)-N(N1),N(t4)-N(t3),⋯,N(tm)-N(tm-1)} is independent.
In words, the number of arrivals in any time interval depends upon the length of the
interval and not its location in time, and the numbers of arrivals in nonoverlapping
time intervals are independent.
Emergency calls arrive at a police switchboard with interarrival times (in hours)
exponential (15). Thus, the average interarrival time is 1/15 hour (four minutes).
What is the probability the number of calls in an eight hour shift is no more than
100, 120, 140?
p = 1 - cpoisson(8*15,[101 121 141])
p = 0.0347 0.5243 0.9669
We develop next a simple computational result for arrival processes for which Sn∼Sn∼ gamma (n,λ)(n,λ).
Suppose the arrival times Sn∼Sn∼ gamma (n,λ)(n,λ) and g is such that
∫
0
∞
|
g
|
<
∞
and
E
∑
n
=
1
∞
|
g
(
S
n
)
|
<
∞
∫
0
∞
|
g
|
<
∞
and
E
∑
n
=
1
∞
|
g
(
S
n
)
|
<
∞
(40)
Then
E
∑
n
=
1
∞
g
(
S
n
)
=
λ
∫
0
∞
g
E
∑
n
=
1
∞
g
(
S
n
)
=
λ
∫
0
∞
g
(41)
VERIFICATION
We use the countable sums property (E8b) for expectation and the corresponding property
for integrals to assert
E
∑
n
=
1
∞
g
(
S
n
)
=
∑
n
=
1
∞
E
[
g
(
S
n
)
]
=
∑
n
=
1
∞
∫
0
∞
g
(
t
)
f
n
(
t
)
d
t
where
f
n
(
t
)
=
λ
e
-
λ
t
(
λ
t
)
n
-
1
(
n
-
1
)
!
E
∑
n
=
1
∞
g
(
S
n
)
=
∑
n
=
1
∞
E
[
g
(
S
n
)
]
=
∑
n
=
1
∞
∫
0
∞
g
(
t
)
f
n
(
t
)
d
t
where
f
n
(
t
)
=
λ
e
-
λ
t
(
λ
t
)
n
-
1
(
n
-
1
)
!
(42)
We may apply (E8b) to assert
∑
n
=
1
∞
∫
0
∞
g
f
n
=
∫
0
∞
g
∑
n
=
1
∞
f
n
∑
n
=
1
∞
∫
0
∞
g
f
n
=
∫
0
∞
g
∑
n
=
1
∞
f
n
(43)
Since
∑
n
=
1
∞
f
n
(
t
)
=
λ
e
-
λ
t
∑
n
=
1
∞
(
λ
t
)
n
-
1
(
n
-
1
)
!
=
λ
e
-
λ
t
e
λ
t
=
λ
∑
n
=
1
∞
f
n
(
t
)
=
λ
e
-
λ
t
∑
n
=
1
∞
(
λ
t
)
n
-
1
(
n
-
1
)
!
=
λ
e
-
λ
t
e
λ
t
=
λ
(44)
the proposition is established.
A critical unit in a production system has life duration exponential (λ)(λ). Upon failure the unit is replaced immediately by a similar unit. Units fail independently.
Cost of replacement of a unit is c dollars. If money is discounted at a rate α,
then a dollar spent t units of time in the future has a current value e-αte-αt. If Sn is the time of replacement of the nth unit, then Sn∼Sn∼ gamma (n,λ)(n,λ)
and the present value of all future replacements is
C
=
∑
n
=
1
∞
c
e
-
α
S
n
C
=
∑
n
=
1
∞
c
e
-
α
S
n
(45)
The expected replacement cost is
E
[
C
]
=
E
∑
n
=
1
∞
g
(
S
n
)
where
g
(
t
)
=
c
e
-
α
t
E
[
C
]
=
E
∑
n
=
1
∞
g
(
S
n
)
where
g
(
t
)
=
c
e
-
α
t
(46)
Hence
E
[
C
]
=
λ
∫
0
∞
c
e
-
α
t
d
t
=
λ
c
α
E
[
C
]
=
λ
∫
0
∞
c
e
-
α
t
d
t
=
λ
c
α
(47)
Suppose unit replacement cost c=1200c=1200, average time (in years) to failure 1/λ=1/41/λ=1/4, and the discount rate per year α=0.08α=0.08 (eight percent). Then
E
[
C
]
=
1200
·
4
0
.
08
=
60
,
000
E
[
C
]
=
1200
·
4
0
.
08
=
60
,
000
(48)
Suppose the cost of the nth replacement in Example 11 is a random quantity Cn, with
{Cn,Sn}{Cn,Sn} independent and E[Cn]=cE[Cn]=c, invariant with n. Then
E
[
C
]
=
E
∑
n
=
1
∞
C
n
e
-
α
S
n
=
∑
n
=
1
∞
E
[
C
n
]
E
[
e
-
α
S
n
]
=
∑
n
=
1
∞
c
E
[
e
-
α
S
n
]
=
λ
c
α
E
[
C
]
=
E
∑
n
=
1
∞
C
n
e
-
α
S
n
=
∑
n
=
1
∞
E
[
C
n
]
E
[
e
-
α
S
n
]
=
∑
n
=
1
∞
c
E
[
e
-
α
S
n
]
=
λ
c
α
(49)
The analysis to this point assumes the process will continue endlessly into the future. Often,
it is desirable to plan for a specific, finite period. The result of Example 10 may be
modified easily to account for a finite period, often referred to as a finite horizon.
Under the conditions assumed in Example 10, above, let Nt be the counting random
variable for arrivals in the interval (0,t](0,t].
If
Z
t
=
∑
n
=
1
N
t
g
(
S
n
)
,
then
E
[
Z
t
]
=
λ
∫
0
t
g
(
u
)
d
u
If
Z
t
=
∑
n
=
1
N
t
g
(
S
n
)
,
then
E
[
Z
t
]
=
λ
∫
0
t
g
(
u
)
d
u
(50)
VERIFICATION
Since Nt≥nNt≥n iff Sn≤tSn≤t, ∑n=1Ntg(Sn)=∑n=0∞I(0,t](Sn)g(Sn)∑n=1Ntg(Sn)=∑n=0∞I(0,t](Sn)g(Sn). In the result of Example 10,
replace g by I(0,t]gI(0,t]g and note that
∫
0
∞
I
(
0
,
t
]
(
u
)
g
(
u
)
d
u
=
∫
0
t
g
(
u
)
d
u
∫
0
∞
I
(
0
,
t
]
(
u
)
g
(
u
)
d
u
=
∫
0
t
g
(
u
)
d
u
(51)
Under the conditions of Example 11, consider the replacement costs over a two-year
period.
SOLUTION
E
[
C
]
=
λ
c
∫
0
t
e
-
α
u
d
u
=
λ
c
α
(
1
-
e
-
α
t
)
E
[
C
]
=
λ
c
∫
0
t
e
-
α
u
d
u
=
λ
c
α
(
1
-
e
-
α
t
)
(52)
Thus, the expected cost for the infinite horizon λc/αλc/α is reduced by the
factor 1-e-αt1-e-αt. For t=2t=2 and the numbers in Example 11, the
reduction factor is 1-e-0.16=0.14791-e-0.16=0.1479 to give
E[C]=60000·0.1479=8,871.37E[C]=60000·0.1479=8,871.37.
In the important special case that g(u)=ce-αug(u)=ce-αu, the expression for
E∑n=1∞g(Sn)E∑n=1∞g(Sn) may be put into a form which
does not require the interarrival times to be exponential.
Suppose S0=0S0=0 and Sn=∑i=1nWiSn=∑i=1nWi, where {Wi:1≤i}{Wi:1≤i} is iid.
Let {Vn:1≤n}{Vn:1≤n} be a class such that each E[Vn]=cE[Vn]=c and each pair {Vn,Sn}{Vn,Sn}
is independent. Then for α>0α>0
E
[
C
]
=
E
∑
n
=
1
∞
V
n
e
-
α
S
n
=
c
·
M
W
(
-
α
)
1
-
M
W
(
-
α
)
E
[
C
]
=
E
∑
n
=
1
∞
V
n
e
-
α
S
n
=
c
·
M
W
(
-
α
)
1
-
M
W
(
-
α
)
(53)
where MW is the moment generating function for W.
DERIVATION
First we note that
E
[
V
n
e
-
α
S
n
]
=
c
M
S
n
(
-
α
)
=
c
M
W
n
(
-
α
)
E
[
V
n
e
-
α
S
n
]
=
c
M
S
n
(
-
α
)
=
c
M
W
n
(
-
α
)
(54)
Hence, by properties of expectation and the geometric series
E
[
C
]
=
c
∑
n
=
1
∞
M
W
n
(
-
α
)
=
M
W
(
-
α
)
1
-
M
W
(
-
α
)
,
provided
|
M
W
(
-
α
)
|
<
1
E
[
C
]
=
c
∑
n
=
1
∞
M
W
n
(
-
α
)
=
M
W
(
-
α
)
1
-
M
W
(
-
α
)
,
provided
|
M
W
(
-
α
)
|
<
1
(55)
Since α>0α>0 and W>0W>0, we have 0<e-αW<10<e-αW<1, so that
MW(-α)=E[e-αW]<1MW(-α)=E[e-αW]<1.
Suppose each Wi∼Wi∼ uniform (a,b)(a,b). Then (see Appendix C),
M
W
(
-
α
)
=
e
-
a
α
-
e
-
b
α
α
(
b
-
a
)
so
that
E
[
C
]
=
c
·
e
-
a
α
-
e
-
b
α
α
(
b
-
a
)
-
[
e
-
a
α
-
e
-
b
α
]
M
W
(
-
α
)
=
e
-
a
α
-
e
-
b
α
α
(
b
-
a
)
so
that
E
[
C
]
=
c
·
e
-
a
α
-
e
-
b
α
α
(
b
-
a
)
-
[
e
-
a
α
-
e
-
b
α
]
(56)
Let a=1a=1, b=5b=5, c=100c=100 and α=0.08α=0.08. Then,
a = 1;
b = 5;
c = 100;
A = 0.08;
MW = (exp(-a*A) - exp(-b*A))/(A*(b - a))
MW = 0.7900
EC = c*MW/(1 - MW)
EC = 376.1643