Let $X_1, X_2, \ldots, X_n$ be a random sample from the exponential distribution with p.d.f.
$$f(x;\theta)=\frac{1}{\theta}e^{-x/\theta},\quad 0<x<\infty,\quad \theta\in\Omega=\{\theta: 0<\theta<\infty\}.$$
The likelihood function is given by
$$L(\theta)=L(\theta;x_1,x_2,\ldots,x_n)=\left(\frac{1}{\theta}e^{-x_1/\theta}\right)\left(\frac{1}{\theta}e^{-x_2/\theta}\right)\cdots\left(\frac{1}{\theta}e^{-x_n/\theta}\right)=\frac{1}{\theta^n}\exp\left(-\frac{\sum_{i=1}^{n}x_i}{\theta}\right),\quad 0<\theta<\infty.$$
The natural logarithm of $L(\theta)$ is
$$\ln L(\theta)=-n\ln(\theta)-\frac{1}{\theta}\sum_{i=1}^{n}x_i,\quad 0<\theta<\infty.$$
Thus,
$$\frac{d[\ln L(\theta)]}{d\theta}=\frac{-n}{\theta}+\frac{\sum_{i=1}^{n}x_i}{\theta^2}=0.$$
The solution of this equation for $\theta$ is
$$\theta=\frac{1}{n}\sum_{i=1}^{n}x_i=\bar{x}.$$
Note that,
$$\frac{d[\ln L(\theta)]}{d\theta}=\frac{1}{\theta}\left(-n+\frac{n\bar{x}}{\theta}\right)>0,\quad \theta<\bar{x},$$
$$\frac{d[\ln L(\theta)]}{d\theta}=\frac{1}{\theta}\left(-n+\frac{n\bar{x}}{\theta}\right)=0,\quad \theta=\bar{x},$$
$$\frac{d[\ln L(\theta)]}{d\theta}=\frac{1}{\theta}\left(-n+\frac{n\bar{x}}{\theta}\right)<0,\quad \theta>\bar{x}.$$
Hence, $\ln L(\theta)$ does have a maximum at $\bar{x}$, and thus the maximum likelihood estimator for $\theta$ is
$$\hat{\theta}=\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i.$$
This is both an unbiased estimator and the method of moments estimator for $\theta$.
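The result can be checked numerically. The following is a minimal sketch, assuming simulated data with a hypothetical true value of $\theta$; the function names `exp_log_likelihood` and `exp_mle` are illustrative and do not come from the text. It evaluates the log-likelihood at the sample mean and at a few other values of $\theta$.

```python
import math
import random


def exp_log_likelihood(theta, xs):
    """ln L(theta) = -n ln(theta) - (1/theta) * sum(xs), for theta > 0."""
    n = len(xs)
    return -n * math.log(theta) - sum(xs) / theta


def exp_mle(xs):
    """Closed-form maximum likelihood estimate: the sample mean."""
    return sum(xs) / len(xs)


random.seed(0)        # arbitrary seed, for reproducibility only
true_theta = 2.0      # hypothetical value chosen for the simulation
# random.expovariate takes the rate 1/theta, so the mean is theta.
xs = [random.expovariate(1.0 / true_theta) for _ in range(1000)]

theta_hat = exp_mle(xs)

# The log-likelihood at theta_hat should exceed its value elsewhere.
for t in (0.5 * theta_hat, 0.9 * theta_hat, 1.1 * theta_hat, 2.0 * theta_hat):
    assert exp_log_likelihood(theta_hat, xs) > exp_log_likelihood(t, xs)

print("theta_hat =", theta_hat)
```

Comparing against a handful of points is only a spot check; the sign analysis of the derivative above is what establishes the maximum.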
Let $X_1, X_2, \ldots, X_n$ be a random sample from the geometric distribution with p.d.f.
$$f(x;p)=(1-p)^{x-1}p,\quad x=1,2,3,\ldots.$$
The likelihood function is given by
$$L(p)=(1-p)^{x_1-1}p\,(1-p)^{x_2-1}p\cdots(1-p)^{x_n-1}p=p^n(1-p)^{\sum_{i=1}^{n}x_i-n},\quad 0\le p\le 1.$$
The natural logarithm of $L(p)$ is
$$\ln L(p)=n\ln p+\left(\sum_{i=1}^{n}x_i-n\right)\ln(1-p),\quad 0<p<1.$$
Thus, restricting $p$ to $0<p<1$ so as to be able to take the derivative, we have
$$\frac{d\ln L(p)}{dp}=\frac{n}{p}-\frac{\sum_{i=1}^{n}x_i-n}{1-p}=0.$$
Solving for p, we obtain
$$p=\frac{n}{\sum_{i=1}^{n}x_i}=\frac{1}{\bar{x}}.$$
So the maximum likelihood estimator of p is
$$\hat{p}=\frac{n}{\sum_{i=1}^{n}X_i}=\frac{1}{\bar{X}}.$$
Again this estimator is the method of moments estimator, and it agrees with intuition because, in $n$ observations of a geometric random variable, there are $n$ successes in the $\sum_{i=1}^{n}x_i$ trials. Thus the estimate of $p$ is the number of successes divided by the total number of trials.
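The "successes divided by trials" reading can be illustrated with a short simulation. This is a sketch under stated assumptions: the true value of $p$ is hypothetical, and the helper names (`geometric_sample`, `geom_log_likelihood`, `geom_mle`) are illustrative only. Each geometric observation is generated by counting Bernoulli trials until the first success.

```python
import math
import random


def geometric_sample(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:   # failure with probability 1 - p
        trials += 1
    return trials


def geom_log_likelihood(p, xs):
    """ln L(p) = n ln p + (sum(xs) - n) ln(1 - p), for 0 < p < 1."""
    n = len(xs)
    return n * math.log(p) + (sum(xs) - n) * math.log(1.0 - p)


def geom_mle(xs):
    """n successes divided by the total number of trials, i.e. 1 / mean."""
    return len(xs) / sum(xs)


rng = random.Random(1)   # arbitrary seed
true_p = 0.3             # hypothetical value chosen for the simulation
xs = [geometric_sample(true_p, rng) for _ in range(2000)]

p_hat = geom_mle(xs)

# Spot check: the log-likelihood is larger at p_hat than at nearby values.
# (If every observation equalled 1, p_hat would be 1 and ln(1 - p) undefined.)
for p in (0.8 * p_hat, 0.95 * p_hat, 1.05 * p_hat, 1.2 * p_hat):
    assert geom_log_likelihood(p_hat, xs) > geom_log_likelihood(p, xs)

print("p_hat =", p_hat)
```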
Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\theta_1,\theta_2)$, where
$$\Omega=\{(\theta_1,\theta_2): -\infty<\theta_1<\infty,\ 0<\theta_2<\infty\}.$$
That is, here let $\theta_1=\mu$ and $\theta_2=\sigma^2$. Then
$$L(\theta_1,\theta_2)=\prod_{i=1}^{n}\left(\frac{1}{\sqrt{2\pi\theta_2}}\exp\left[-\frac{(x_i-\theta_1)^2}{2\theta_2}\right]\right),$$
or equivalently,
$$L(\theta_1,\theta_2)=\left(\frac{1}{\sqrt{2\pi\theta_2}}\right)^{n}\exp\left[-\frac{\sum_{i=1}^{n}(x_i-\theta_1)^2}{2\theta_2}\right],\quad (\theta_1,\theta_2)\in\Omega.$$
The natural logarithm of the likelihood function is
$$\ln L(\theta_1,\theta_2)=-\frac{n}{2}\ln(2\pi\theta_2)-\frac{\sum_{i=1}^{n}(x_i-\theta_1)^2}{2\theta_2}.$$
The partial derivatives with respect to $\theta_1$ and $\theta_2$ are
$$\frac{\partial(\ln L)}{\partial\theta_1}=\frac{1}{\theta_2}\sum_{i=1}^{n}(x_i-\theta_1)$$
and
$$\frac{\partial(\ln L)}{\partial\theta_2}=\frac{-n}{2\theta_2}+\frac{1}{2\theta_2^{2}}\sum_{i=1}^{n}(x_i-\theta_1)^2.$$
The equation $\partial(\ln L)/\partial\theta_1=0$ has the solution $\theta_1=\bar{x}$. Setting $\partial(\ln L)/\partial\theta_2=0$ and replacing $\theta_1$ by $\bar{x}$ yields
$$\theta_2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2.$$
By considering the usual condition on the second partial derivatives, these solutions do provide a maximum. Thus the maximum likelihood estimators of $\mu=\theta_1$ and $\sigma^2=\theta_2$ are
$$\hat{\theta}_1=\bar{X}$$
and
$$\hat{\theta}_2=\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})^2.$$
When we compare the above example with the introductory one, we see that the method of moments estimators and the maximum likelihood estimators for $\mu$ and $\sigma^2$ are the same. But this is not always the case. If they are not the same, which is better? For large $n$, and under suitable regularity conditions, the maximum likelihood estimator of $\theta$ has an approximate normal distribution with mean $\theta$ and a variance equal to a certain lower bound (the Cramér-Rao lower bound); thus, at least approximately, it is an unbiased minimum variance estimator. Accordingly, most statisticians prefer maximum likelihood estimators to estimators found using the method of moments.
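For the normal example, the two closed-form estimators can be compared with Python's standard `statistics` module. This is a minimal sketch on simulated data, with hypothetical values of $\mu$ and $\sigma$; `normal_mle` and `normal_log_likelihood` are illustrative names. Note that the maximum likelihood estimate of $\sigma^2$ divides by $n$, which corresponds to `statistics.pvariance` rather than `statistics.variance` (which divides by $n-1$).

```python
import math
import random
import statistics


def normal_mle(xs):
    """Return (theta1_hat, theta2_hat) = (sample mean, divisor-n variance)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var


def normal_log_likelihood(theta1, theta2, xs):
    """ln L = -(n/2) ln(2 pi theta2) - sum((x - theta1)^2) / (2 theta2)."""
    n = len(xs)
    ss = sum((x - theta1) ** 2 for x in xs)
    return -0.5 * n * math.log(2.0 * math.pi * theta2) - ss / (2.0 * theta2)


rng = random.Random(2)          # arbitrary seed
mu, sigma = 5.0, 1.5            # hypothetical values chosen for the simulation
xs = [rng.gauss(mu, sigma) for _ in range(500)]

t1_hat, t2_hat = normal_mle(xs)

# The closed forms agree with the standard library's mean and
# population variance (divisor n).
assert math.isclose(t1_hat, statistics.fmean(xs), rel_tol=1e-9)
assert math.isclose(t2_hat, statistics.pvariance(xs), rel_tol=1e-9)

# Spot check: moving either parameter away from the estimate lowers ln L.
best = normal_log_likelihood(t1_hat, t2_hat, xs)
assert best > normal_log_likelihood(t1_hat + 0.1, t2_hat, xs)
assert best > normal_log_likelihood(t1_hat, 1.1 * t2_hat, xs)
assert best > normal_log_likelihood(t1_hat, 0.9 * t2_hat, xs)

print("theta1_hat =", t1_hat, "theta2_hat =", t2_hat)
```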
Observations: $k$ successes in $n$ Bernoulli trials. Write $x_i\in\{0,1\}$ for the outcome of trial $i$, so that $k=\sum_{i=1}^{n}x_i$. The number of successes has the binomial p.d.f.
$$f(k)=\frac{n!}{k!(n-k)!}\,p^{k}(1-p)^{n-k},\quad k=0,1,\ldots,n,$$
so the likelihood function is
$$L(p)=\frac{n!}{k!(n-k)!}\,p^{\sum_{i=1}^{n}x_i}(1-p)^{n-\sum_{i=1}^{n}x_i},\quad 0\le p\le 1.$$
The binomial coefficient does not depend on $p$, so
$$\ln L(p)=\ln\frac{n!}{k!(n-k)!}+\sum_{i=1}^{n}x_i\ln p+\left(n-\sum_{i=1}^{n}x_i\right)\ln(1-p),\quad 0<p<1.$$
$$\frac{d\ln L(p)}{dp}=\frac{1}{p}\sum_{i=1}^{n}x_i-\left(n-\sum_{i=1}^{n}x_i\right)\frac{1}{1-p}=0$$
$$\frac{(1-\hat{p})\sum_{i=1}^{n}x_i-\left(n-\sum_{i=1}^{n}x_i\right)\hat{p}}{\hat{p}(1-\hat{p})}=0$$
$$\sum_{i=1}^{n}x_i-\hat{p}\sum_{i=1}^{n}x_i-n\hat{p}+\hat{p}\sum_{i=1}^{n}x_i=0$$
$$\hat{p}=\frac{\sum_{i=1}^{n}x_i}{n}=\frac{k}{n}$$
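As a small worked check of $\hat{p}=k/n$, the sketch below uses a short, made-up sequence of 0/1 outcomes (the data and the function names `bernoulli_log_likelihood` and `bernoulli_mle` are hypothetical). The constant term involving the binomial coefficient is omitted because it does not depend on $p$ and so does not affect where the maximum occurs.

```python
import math


def bernoulli_log_likelihood(p, outcomes):
    """k ln p + (n - k) ln(1 - p), omitting the constant binomial term."""
    n = len(outcomes)
    k = sum(outcomes)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def bernoulli_mle(outcomes):
    """Proportion of successes, k / n."""
    return sum(outcomes) / len(outcomes)


# Hypothetical data: 1 = success, 0 = failure (k = 6 successes in n = 10 trials).
outcomes = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
p_hat = bernoulli_mle(outcomes)

# Evaluate the log-likelihood on a grid and confirm the grid maximum
# sits at the grid point closest to p_hat.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: bernoulli_log_likelihood(p, outcomes))
assert abs(best_p - p_hat) < 0.01 + 1e-12

print("p_hat =", p_hat, "grid maximum at", best_p)
```

The grid search is only there to confirm the closed form; with $0<k<n$ the log-likelihood is strictly concave in $p$, so the stationary point is the unique maximum.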
Observations: $x_1, x_2, \ldots, x_n$ from the Poisson distribution with p.d.f.
$$f(x)=\frac{\lambda^{x}e^{-\lambda}}{x!},\quad x=0,1,2,\ldots$$
$$L(\lambda)=\prod_{i=1}^{n}\left(\frac{\lambda^{x_i}e^{-\lambda}}{x_i!}\right)=\frac{e^{-\lambda n}\lambda^{\sum_{i=1}^{n}x_i}}{\prod_{i=1}^{n}x_i!}$$
$$\ln L(\lambda)=-\lambda n+\sum_{i=1}^{n}x_i\ln\lambda-\ln\left(\prod_{i=1}^{n}x_i!\right)$$
$$\frac{d\ln L(\lambda)}{d\lambda}=-n+\frac{1}{\lambda}\sum_{i=1}^{n}x_i$$
$$-n+\frac{1}{\hat{\lambda}}\sum_{i=1}^{n}x_i=0$$
$$\hat{\lambda}=\frac{\sum_{i=1}^{n}x_i}{n}=\bar{x}$$
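The Poisson result can be checked the same way. The sketch below uses a small set of made-up counts (the data and the names `poisson_log_likelihood` and `poisson_mle` are hypothetical). The term $\ln\left(\prod x_i!\right)$ is computed with `math.lgamma`, using the identity $\ln(x!)=\ln\Gamma(x+1)$.

```python
import math


def poisson_log_likelihood(lam, xs):
    """ln L(lam) = -lam * n + sum(xs) * ln(lam) - sum(ln(x_i!))."""
    n = len(xs)
    log_factorials = sum(math.lgamma(x + 1) for x in xs)  # ln(x!) = lgamma(x + 1)
    return -lam * n + sum(xs) * math.log(lam) - log_factorials


def poisson_mle(xs):
    """Closed-form maximum likelihood estimate: the sample mean."""
    return sum(xs) / len(xs)


counts = [2, 0, 3, 1, 4, 2]      # hypothetical observed counts
lam_hat = poisson_mle(counts)    # 12 / 6

# Spot check: the log-likelihood is lower on either side of lam_hat.
for lam in (0.5 * lam_hat, 0.9 * lam_hat, 1.1 * lam_hat, 1.5 * lam_hat):
    assert poisson_log_likelihood(lam_hat, counts) > poisson_log_likelihood(lam, counts)

print("lambda_hat =", lam_hat)
```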