Let $L$ statistically independent observations be obtained, each of which is expressed by
$$ r_l = \theta + n_l . $$
Each $n_l$ is a Gaussian random variable having zero mean and variance $\sigma_n^2$. Thus, the unknown parameter in this problem is the mean of the observations. Assume it to be a Gaussian random variable *a priori* (mean $m_\theta$ and variance $\sigma_\theta^2$).
The likelihood function is easily found to be
$$ p_{\mathbf{r}|\theta}(\mathbf{r}|\theta) = \prod_{l=0}^{L-1} \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left\{-\frac{1}{2}\frac{(r_l-\theta)^2}{\sigma_n^2}\right\} \qquad (5) $$
so that the *a posteriori* density is given by
$$ p_{\theta|\mathbf{r}}(\theta|\mathbf{r}) = \frac{\dfrac{1}{\sqrt{2\pi\sigma_\theta^2}} \exp\left\{-\dfrac{1}{2}\dfrac{(\theta-m_\theta)^2}{\sigma_\theta^2}\right\} \displaystyle\prod_{l=0}^{L-1} \dfrac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left\{-\dfrac{1}{2}\dfrac{(r_l-\theta)^2}{\sigma_n^2}\right\}}{p(\mathbf{r})} \qquad (6) $$
In an attempt to find the expected value of this distribution, lump all terms that do not depend *explicitly* on the quantity $\theta$ into a proportionality term.

$$ p_{\theta|\mathbf{r}}(\theta|\mathbf{r}) \propto \exp\left\{-\frac{1}{2}\left(\frac{\sum_l (r_l-\theta)^2}{\sigma_n^2} + \frac{(\theta-m_\theta)^2}{\sigma_\theta^2}\right)\right\} \qquad (7) $$
After some manipulation, this expression can be written as
$$ p_{\theta|\mathbf{r}}(\theta|\mathbf{r}) \propto \exp\left\{-\frac{1}{2\sigma^2}\left(\theta - \sigma^2\left(\frac{m_\theta}{\sigma_\theta^2} + \frac{\sum_l r_l}{\sigma_n^2}\right)\right)^2\right\} \qquad (8) $$
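The manipulation is a completion of the square in $\theta$. A sketch of the intermediate step: expanding the exponent of (7) gives
$$ \frac{\sum_l (r_l-\theta)^2}{\sigma_n^2} + \frac{(\theta-m_\theta)^2}{\sigma_\theta^2} = \left(\frac{L}{\sigma_n^2} + \frac{1}{\sigma_\theta^2}\right)\theta^2 - 2\theta\left(\frac{\sum_l r_l}{\sigma_n^2} + \frac{m_\theta}{\sigma_\theta^2}\right) + \text{const}, $$
where the constant collects all terms not depending on $\theta$. Identifying $1/\sigma^2 = L/\sigma_n^2 + 1/\sigma_\theta^2$ and completing the square in $\theta$ yields the exponent of (8), with $\sigma^2$ as defined below.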
where $\sigma^2$ is a quantity that succinctly expresses the ratio
$$ \sigma^2 = \frac{\sigma_n^2 \sigma_\theta^2}{\sigma_n^2 + L\sigma_\theta^2} . $$
The form of the *a posteriori* density suggests that it too is Gaussian; its mean, and therefore the MMSE estimate of $\theta$, is given by

$$ \hat{\theta}_{\mathrm{MMSE}}(\mathbf{r}) = \sigma^2\left(\frac{m_\theta}{\sigma_\theta^2} + \frac{\sum_l r_l}{\sigma_n^2}\right) \qquad (9) $$
More insight into the nature of this estimate is gained by
rewriting it as

$$ \hat{\theta}_{\mathrm{MMSE}}(\mathbf{r}) = \frac{\sigma_n^2/L}{\sigma_\theta^2 + \sigma_n^2/L}\, m_\theta + \frac{\sigma_\theta^2}{\sigma_\theta^2 + \sigma_n^2/L} \left(\frac{1}{L}\sum_{l=0}^{L-1} r_l\right) \qquad (10) $$
The term $\sigma_n^2/L$ is the variance of the averaged observations for a given value of $\theta$; it expresses the squared error encountered in estimating the mean by simple averaging. If this error is much greater than the *a priori* variance of $\theta$ ($\sigma_n^2/L \gg \sigma_\theta^2$), implying that the observations are noisier than
the variation of the parameter, the MMSE estimate ignores the observations and tends to yield the *a priori* mean $m_\theta$
as its value. If the averaged observations are less variable
than the parameter, the second term dominates, and the average
of the observations is the estimate's value. The estimate's
behavior between these extremes is very intuitive. The
detailed form of the estimate indicates how the squared error
can be minimized by a linear combination of these extreme
estimates.
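The weighted-average structure of (10) can be sketched numerically. The following is a minimal NumPy illustration; the values chosen for $m_\theta$, $\sigma_\theta^2$, $\sigma_n^2$, and $L$ are assumptions for illustration only, not taken from the text.

```python
import numpy as np

# Minimal sketch of the MMSE estimate in equation (10).  The values of
# m_theta, var_theta, var_n, and L below are illustrative assumptions.
rng = np.random.default_rng(0)

m_theta, var_theta = 1.0, 4.0   # a priori mean and variance of theta
var_n, L = 2.0, 25              # noise variance and number of observations

theta = rng.normal(m_theta, np.sqrt(var_theta))      # draw the parameter
r = theta + rng.normal(0.0, np.sqrt(var_n), size=L)  # observations r_l

avg_var = var_n / L                           # variance of the averaged observations
w_prior = avg_var / (var_theta + avg_var)     # weight on the a priori mean
w_sample = var_theta / (var_theta + avg_var)  # weight on the sample average

theta_mmse = w_prior * m_theta + w_sample * r.mean()
print(f"prior mean {m_theta:.3f}, sample mean {r.mean():.3f}, "
      f"MMSE estimate {theta_mmse:.3f}")
```

Because the two weights sum to one, the estimate always lies between the *a priori* mean and the sample average; as $L$ grows, `w_sample` approaches one and the estimate approaches simple averaging.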

The conditional expected value of the estimate equals
$$ E[\hat{\theta}_{\mathrm{MMSE}}\,|\,\theta] = \frac{\sigma_n^2/L}{\sigma_\theta^2 + \sigma_n^2/L}\, m_\theta + \frac{\sigma_\theta^2}{\sigma_\theta^2 + \sigma_n^2/L}\, \theta \qquad (11) $$
This estimate is biased because its expected value does not equal the value of the sought-after parameter. It is asymptotically unbiased, as the squared measurement error $\sigma_n^2/L$ tends to zero when $L$ becomes large. The consistency of the estimator is determined by
investigating the expected value of the squared error. Note that the variance of the *a posteriori* density is the quantity $\sigma^2$; as this quantity does not depend on $\mathbf{r}$, it also equals the
unconditional variance. As the number of observations
increases, this variance tends to zero. In concert with the
estimate being asymptotically unbiased, the expected value of
the squared estimation error thus tends to zero, implying that we have
a consistent estimate.
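This consistency argument can be checked by simulation. The following Monte Carlo sketch (prior and noise parameters, as well as the trial count, are assumed values) compares the predicted *a posteriori* variance $\sigma^2$ with the empirical mean squared error of estimate (9) as $L$ grows:

```python
import numpy as np

# Monte Carlo check that the mean squared error of the MMSE estimate of
# equation (9) matches sigma^2 = var_n*var_theta/(var_n + L*var_theta)
# and shrinks toward zero as L grows.  All parameter values are assumptions.
rng = np.random.default_rng(1)
m_theta, var_theta, var_n = 0.0, 1.0, 4.0
trials = 200_000

mses = []
for L in (1, 10, 100):
    theta = rng.normal(m_theta, np.sqrt(var_theta), size=trials)
    # Sum of the L observations: given theta it is N(L*theta, L*var_n).
    r_sum = L * theta + rng.normal(0.0, np.sqrt(L * var_n), size=trials)
    sigma2 = var_n * var_theta / (var_n + L * var_theta)
    theta_hat = sigma2 * (m_theta / var_theta + r_sum / var_n)  # equation (9)
    mses.append(np.mean((theta_hat - theta) ** 2))
    print(f"L={L:3d}  predicted sigma^2={sigma2:.4f}  empirical MSE={mses[-1]:.4f}")
```

The empirical mean squared error tracks $\sigma^2$ at each $L$ and decreases toward zero, in line with the consistency conclusion above.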