We first discuss the problem when $r = 1$. That is, consider a sequence of independent Bernoulli trials, each with probability $p$ of success, observed until the first success occurs. Let $X$ denote the number of the trial on which the first success occurs.
For example, if F and S represent failure and success, respectively, and the sequence starts with F, F, F, S, …, then $X = 4$. Moreover, because the trials are independent, the probability of such a sequence, with $q = 1 - p$ denoting the probability of failure, is
$$P(X = 4) = (q)(q)(q)(p) = q^3 p = (1 - p)^3 p.$$
In general, the p.d.f. $f(x) = P(X = x)$ of $X$ is given by
$$f(x) = (1 - p)^{x-1} p, \qquad x = 1, 2, \ldots,$$
because there must be $x - 1$ failures before the first success, which occurs on trial $x$. We say that $X$ has a geometric distribution.
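The p.d.f. above translates directly into a short function. The following is a minimal sketch; the name `geometric_pmf` and its argument checks are illustrative assumptions, not part of the original text.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x): x - 1 failures followed by one success."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if x < 1:
        return 0.0  # the first success cannot occur before trial 1
    return (1 - p) ** (x - 1) * p


# The sequence F, F, F, S corresponds to x = 4; here p = 0.25 is an arbitrary choice.
print(geometric_pmf(4, 0.25))
```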
Recall that for a geometric series, the sum is given by
$$\sum_{k=0}^{\infty} a r^{k} = \sum_{k=1}^{\infty} a r^{k-1} = \frac{a}{1 - r},$$
when $|r| < 1$.
Thus,
$$\sum_{x=1}^{\infty} f(x) = \sum_{x=1}^{\infty} (1 - p)^{x-1} p = \frac{p}{1 - (1 - p)} = 1,$$
so that $f(x)$ does satisfy the properties of a p.d.f.
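As a numerical illustration of this normalization, the partial sums of $f(x)$ can be seen to approach 1. This is only a sketch: the helper name `partial_sum` and the choice $p = 0.25$ are assumptions made for the example.

```python
def partial_sum(p: float, n: int) -> float:
    """Sum of f(x) = (1 - p)**(x - 1) * p for x = 1, ..., n."""
    return sum((1 - p) ** (x - 1) * p for x in range(1, n + 1))


# The partial sums increase toward 1 as more terms are included.
for n in (1, 5, 20, 100):
    print(n, partial_sum(0.25, n))
```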
From the sum of a geometric series we also note that, when $k$ is a nonnegative integer,
$$P(X > k) = \sum_{x=k+1}^{\infty} (1 - p)^{x-1} p = \frac{(1 - p)^{k} p}{1 - (1 - p)} = (1 - p)^{k} = q^{k},$$
and thus the value of the distribution function at a positive integer $k$ is
$$P(X \le k) = \sum_{x=1}^{k} (1 - p)^{x-1} p = 1 - P(X > k) = 1 - (1 - p)^{k} = 1 - q^{k}.$$
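The closed forms for the tail probability and the distribution function can be compared with a direct summation of the p.d.f. The sketch below assumes the hypothetical helper names `geometric_tail` and `geometric_cdf` and arbitrary values $p = 0.3$, $k = 5$.

```python
def geometric_tail(k: int, p: float) -> float:
    """P(X > k) = (1 - p)**k for a nonnegative integer k."""
    return (1 - p) ** k


def geometric_cdf(k: int, p: float) -> float:
    """P(X <= k) = 1 - (1 - p)**k for a nonnegative integer k."""
    return 1 - (1 - p) ** k


p, k = 0.3, 5
# Direct summation of f(x) over x = 1, ..., k for comparison.
direct = sum((1 - p) ** (x - 1) * p for x in range(1, k + 1))
print(geometric_cdf(k, p), direct)
```

The two printed values should agree up to floating-point rounding.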
Some biology students were checking the eye color of a large number of fruit flies. For an individual fly, suppose that the probability of white eyes is $\frac{1}{4}$ and the probability of red eyes is $\frac{3}{4}$, and that we may treat the observations of these flies as independent Bernoulli trials. The probability that at least four flies have to be checked for eye color to observe a white-eyed fly is given by
$$P(X \ge 4) = P(X > 3) = q^{3} = \left(\frac{3}{4}\right)^{3} = 0.422.$$
The probability that at most four flies have to be checked for eye color to observe a white-eyed fly is given by
$$P(X \le 4) = 1 - q^{4} = 1 - \left(\frac{3}{4}\right)^{4} = 0.684.$$
The probability that the first fly with white eyes is the fourth fly that is checked is
$$P(X = 4) = q^{4-1} p = \left(\frac{3}{4}\right)^{3} \left(\frac{1}{4}\right) = 0.105.$$
It is also true that
$$P(X = 4) = P(X \le 4) - P(X \le 3) = \left[1 - \left(\frac{3}{4}\right)^{4}\right] - \left[1 - \left(\frac{3}{4}\right)^{3}\right] = \left(\frac{3}{4}\right)^{3} \left(\frac{1}{4}\right).$$
In general,
$$f(x) = P(X = x) = \left(\frac{3}{4}\right)^{x-1} \left(\frac{1}{4}\right), \qquad x = 1, 2, 3, \ldots$$
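The three fruit-fly probabilities can be reproduced with exact rational arithmetic. This is a small sketch using Python's standard `fractions` module; the variable names are illustrative assumptions.

```python
from fractions import Fraction

p = Fraction(1, 4)  # probability that a fly has white eyes
q = 1 - p           # probability that a fly has red eyes

at_least_four = q ** 3     # P(X >= 4) = P(X > 3)
at_most_four = 1 - q ** 4  # P(X <= 4)
exactly_four = q ** 3 * p  # P(X = 4)

for name, value in [
    ("P(X >= 4)", at_least_four),
    ("P(X <= 4)", at_most_four),
    ("P(X = 4)", exactly_four),
]:
    print(f"{name} = {value} (approx. {float(value):.3f})")
```

The exact values are 27/64, 175/256, and 27/256, which round to the 0.422, 0.684, and 0.105 quoted in the example.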
To find the mean and variance of the geometric distribution, we use the following results about the sum and the first and second derivatives of a geometric series. For $-1 < r < 1$, let
$$g(r) = \sum_{k=0}^{\infty} a r^{k} = \frac{a}{1 - r}.$$
Then
$$g'(r) = \sum_{k=1}^{\infty} a k r^{k-1} = \frac{a}{(1 - r)^{2}},$$
and
$$g''(r) = \sum_{k=2}^{\infty} a k (k - 1) r^{k-2} = \frac{2a}{(1 - r)^{3}}.$$
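These three identities can be checked numerically by truncating each series. The sketch below is only an illustration: the helper `truncated_sums`, the number of terms, and the values $a = 2$, $r = 0.6$ are all arbitrary assumptions.

```python
def truncated_sums(a: float, r: float, n_terms: int = 2000):
    """Partial sums of the series for g(r), g'(r), and g''(r)."""
    g = sum(a * r ** k for k in range(n_terms))
    g1 = sum(a * k * r ** (k - 1) for k in range(1, n_terms))
    g2 = sum(a * k * (k - 1) * r ** (k - 2) for k in range(2, n_terms))
    return g, g1, g2


a, r = 2.0, 0.6
g, g1, g2 = truncated_sums(a, r)
# Each line prints a truncated series next to its closed form.
print(g, a / (1 - r))
print(g1, a / (1 - r) ** 2)
print(g2, 2 * a / (1 - r) ** 3)
```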
If $X$ has a geometric distribution and $0 < p < 1$, then the mean of $X$ is given by
$$E(X) = \sum_{x=1}^{\infty} x q^{x-1} p = \frac{p}{(1 - q)^{2}} = \frac{1}{p}, \tag{1}$$
using the formula for $g'(r)$ with $a = p$ and $r = q$.
For example, if $p = 1/4$ is the probability of success, then $E(X) = 1/(1/4) = 4$ trials are needed on average to observe a success.
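The interpretation of $E(X) = 1/p$ as a long-run average can be illustrated by simulation. The following is a rough sketch, not a proof: the function name `sample_geometric`, the seed, and the sample size are assumptions chosen for the example.

```python
import random


def sample_geometric(p: float, rng: random.Random) -> int:
    """Run Bernoulli trials until the first success; return its trial number."""
    trial = 1
    while rng.random() >= p:  # this branch is a failure, with probability 1 - p
        trial += 1
    return trial


rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
p = 0.25
n = 100_000
sample_mean = sum(sample_geometric(p, rng) for _ in range(n)) / n
print(sample_mean, 1 / p)
```

With this many samples the printed sample mean should lie close to the theoretical value 4.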
To find the variance of $X$, we first find the second factorial moment $E[X(X - 1)]$. We have
$$E[X(X - 1)] = \sum_{x=1}^{\infty} x (x - 1) q^{x-1} p = \sum_{x=2}^{\infty} pq\, x (x - 1) q^{x-2} = \frac{2pq}{(1 - q)^{3}} = \frac{2q}{p^{2}},$$
using the formula for $g''(r)$ with $a = pq$ and $r = q$ (the $x = 1$ term is zero, so the sum may start at $x = 2$). Thus the variance of $X$ is
$$\begin{aligned}
\operatorname{Var}(X) &= E(X^{2}) - [E(X)]^{2} = \{E[X(X - 1)] + E(X)\} - [E(X)]^{2} \\
&= \frac{2q}{p^{2}} + \frac{1}{p} - \frac{1}{p^{2}} = \frac{2q + p - 1}{p^{2}} = \frac{1 - p}{p^{2}}.
\end{aligned}$$
The standard deviation of $X$ is
$$\sigma = \sqrt{(1 - p)/p^{2}}.$$
Continuing with Example 1, with $p = 1/4$, we obtain
$$\mu = \frac{1}{1/4} = 4, \qquad \sigma^{2} = \frac{3/4}{(1/4)^{2}} = 12,$$
and
$$\sigma = \sqrt{12} = 3.464.$$
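The formulas for the mean, variance, and standard deviation can be bundled into one small function. This is a minimal sketch; the name `geometric_moments` is a hypothetical helper introduced only for this example.

```python
import math


def geometric_moments(p: float):
    """Return (mean, variance, standard deviation) of a geometric distribution."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    mean = 1 / p
    variance = (1 - p) / p ** 2
    return mean, variance, math.sqrt(variance)


mu, var, sigma = geometric_moments(0.25)
print(mu, var, round(sigma, 3))
```

For $p = 1/4$ this reproduces $\mu = 4$, $\sigma^{2} = 12$, and $\sigma \approx 3.464$.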