By far the easiest detection problem to solve occurs when the
noise vector consists of statistically independent, identically
distributed, Gaussian random variables. In this book, a
white sequence consists of statistically
independent random variables. The white sequence's mean is
usually taken to be zero,
and each component's variance is $\sigma^2$. The equal-variance assumption implies the noise
characteristics are unchanging throughout the entire set of
observations. The probability density of the zero-mean noise
vector evaluated at $\mathbf{r}-\mathbf{s}_i$ equals that of a Gaussian random vector having independent components ($\mathbf{K}=\sigma^2\mathbf{I}$) with mean $\mathbf{s}_i$.
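$$p_{\mathbf{n}}(\mathbf{r}-\mathbf{s}_i)=\left(\frac{1}{2\pi\sigma^2}\right)^{L/2}\exp\left\{-\frac{1}{2\sigma^2}(\mathbf{r}-\mathbf{s}_i)^T(\mathbf{r}-\mathbf{s}_i)\right\}$$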
The resulting detection problem is similar to the Gaussian
example examined so frequently in the hypothesis testing
sections, with the distinction here being a non-zero mean under
both models. The logarithm of the likelihood ratio becomes
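$$(\mathbf{r}-\mathbf{s}_0)^T(\mathbf{r}-\mathbf{s}_0)-(\mathbf{r}-\mathbf{s}_1)^T(\mathbf{r}-\mathbf{s}_1)\underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}}2\sigma^2\ln\eta$$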
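and the usual simplifications yield
$$\left(\mathbf{r}^T\mathbf{s}_1-\frac{\mathbf{s}_1^T\mathbf{s}_1}{2}\right)-\left(\mathbf{r}^T\mathbf{s}_0-\frac{\mathbf{s}_0^T\mathbf{s}_0}{2}\right)\underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}}\sigma^2\ln\eta$$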
The quantities in parentheses express the signal processing
operations for each model. If more than two signals were assumed
possible, quantities such as these would need to be computed for
each signal and the largest selected. This decision rule is
optimum for the additive, white Gaussian noise problem.
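As a minimal sketch of the "compute one quantity per signal and select the largest" rule, the following Python fragment evaluates $\mathbf{r}^T\mathbf{s}_i-\mathbf{s}_i^T\mathbf{s}_i/2$ for each candidate signal. The function names, the signal set, and the observation values are illustrative assumptions, and the rule as written corresponds to equally likely signals (a threshold of zero in the binary case).

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def detect(r, signals):
    """Return the index i that maximizes r^T s_i - s_i^T s_i / 2."""
    scores = [dot(r, s) - dot(s, s) / 2 for s in signals]
    return max(range(len(signals)), key=lambda i: scores[i])


# Hypothetical signal set and a noisy observation of the second signal.
signals = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
    [0.0, 0.0, 0.0, 0.0],
]
r = [0.8, -1.3, 1.1, -0.6]

print(detect(r, signals))
```

Selecting the largest of these quantities is equivalent to choosing the signal closest to the observations in Euclidean distance, since the term $\mathbf{r}^T\mathbf{r}$ is common to every candidate.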
Each term in the computations for the optimum detector has a
signal processing interpretation. When expanded, the term
$\mathbf{s}_i^T\mathbf{s}_i$ equals $\sum_{l=0}^{L-1}s_i^2(l)$, which is the signal energy $E_i$.
The remaining term, $\mathbf{r}^T\mathbf{s}_i$, is the only one involving the observations and hence
constitutes the sufficient statistic $\Upsilon_i(\mathbf{r})$ for the additive white Gaussian noise detection problem.
$$\Upsilon_i(\mathbf{r})=\mathbf{r}^T\mathbf{s}_i$$
An abstract, but physically relevant, interpretation of this
important quantity comes from the theory of linear vector
spaces. There, the quantity
$\mathbf{r}^T\mathbf{s}_i$ would be termed the dot product between $\mathbf{r}$ and $\mathbf{s}_i$ or the projection of $\mathbf{r}$ onto $\mathbf{s}_i$. By the Schwarz inequality, the largest value of this
quantity, for vectors of given lengths, occurs when the vectors are proportional to each
other. Thus, a dot product computation measures how much alike
two vectors are: they are completely alike when they are
parallel (proportional) and completely dissimilar when
orthogonal (the dot product is zero). More precisely, the dot
product removes those components from the observations which are
orthogonal to the signal. The dot product thereby generalizes
the familiar notion of filtering a signal contaminated by
broadband noise. In filtering, the signal-to-noise ratio of a
bandlimited signal can be drastically improved by lowpass
filtering; the output would consist only of the signal and
"in-band" noise. The dot product serves a similar role, ideally
removing those "out-of-band" components (the orthogonal ones)
and retaining the "in-band" ones (those parallel to the signal).
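The following sketch illustrates the two claims made here under assumed, hand-picked vectors: a component of the observations orthogonal to the signal contributes nothing to the dot product, and the Schwarz inequality holds with equality only for proportional vectors. All names and values are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


s = [1.0, 1.0, 1.0, 1.0]             # hypothetical signal
parallel = [2.0, 2.0, 2.0, 2.0]      # proportional to s ("in-band")
orthogonal = [1.0, -1.0, 1.0, -1.0]  # orthogonal to s ("out-of-band")

# Observations made of an in-band part plus an out-of-band part.
r = [p + o for p, o in zip(parallel, orthogonal)]

# The orthogonal part is removed by the dot product.
print(dot(r, s), dot(parallel, s))

# Schwarz inequality: |r.s| <= ||r|| ||s||, with equality for parallel vectors.
print(abs(dot(r, s)), norm(r) * norm(s))
print(abs(dot(parallel, s)), norm(parallel) * norm(s))
```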
Expanding the dot product,
$$\mathbf{r}^T\mathbf{s}_i=\sum_{l=0}^{L-1}r(l)\,s_i(l),$$
another signal processing interpretation emerges. The dot
product now describes a finite impulse response (FIR) filtering
operation evaluated at a specific index. To demonstrate this
interpretation, let
$h(l)$ be the unit-sample response of a linear, shift-invariant filter
where $h(l)=0$ for $l<0$ and $l\ge L$. Letting $r(l)$
be the filter's input sequence, the convolution sum
expresses the output.
$$r(k)*h(k)=\sum_{l=k-(L-1)}^{k}r(l)\,h(k-l)$$
Letting
$k=L-1$, the index at which the unit-sample response's last
value overlaps the input's value at the origin, we have
$$r(k)*h(k)\Big|_{k=L-1}=\sum_{l=0}^{L-1}r(l)\,h(L-1-l)$$
If we set the unit-sample response equal to the index-reversed,
then delayed signal
$h(l)=s_i(L-1-l)$, we have
$$r(k)*s_i(L-1-k)\Big|_{k=L-1}=\sum_{l=0}^{L-1}r(l)\,s_i(l)$$
which equals the observation-dependent component of the optimal
detector's sufficient statistic. Figure 1 depicts these computations graphically.
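The equivalence between the sampled filter output and the dot product can be checked numerically. The sketch below is an illustration only: the helper names and the signal and observation values are assumptions, and the convolution is written out directly rather than taken from a library.

```python
def convolve(x, h):
    """Full linear convolution: y(k) = sum_l x(l) h(k - l)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for l, xl in enumerate(x):
        for m, hm in enumerate(h):
            y[l + m] += xl * hm
    return y


def matched_filter_output(r, s):
    """Filter r with h(l) = s(L-1-l) and sample the output at k = L-1."""
    L = len(s)
    h = s[::-1]  # index-reversed, then delayed, signal
    return convolve(r, h)[L - 1]


s = [1.0, 2.0, -1.0, 0.5]  # hypothetical signal
r = [0.9, 2.2, -0.7, 0.1]  # hypothetical observations

direct = sum(rl * sl for rl, sl in zip(r, s))  # the dot product r^T s
print(matched_filter_output(r, s), direct)
```

Only the output sample at index $L-1$ is needed for the detector; the other convolution outputs are computed here simply to keep the filtering interpretation explicit.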
The sufficient statistic for the
$i^{\text{th}}$ signal is thus expressed in signal processing notation as
$r(k)*s_i(L-1-k)\big|_{k=L-1}-\frac{E_i}{2}$. The filtering term is called a matched
filter because the observations are passed through a
filter whose unit-sample response "matches" that of the signal
being sought. We sample the matched filter's output at the
precise moment when all of the observations fall within the
filter's memory and then adjust this value by half the signal
energy. The adjusted values for the two assumed signals are
subtracted and compared to a threshold.
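A minimal sketch of this binary detector follows, assuming the threshold $\sigma^2\ln\eta$ from the simplified likelihood ratio. The function names and example values are hypothetical, and the matched-filter output is computed directly as the sum it equals at index $L-1$.

```python
import math


def correlate(r, s):
    """Matched-filter output sampled at k = L-1, i.e. sum_l r(l) s(l)."""
    return sum(rl * sl for rl, sl in zip(r, s))


def binary_detector(r, s0, s1, sigma2, eta=1.0):
    """Return 1 if model 1 is chosen and 0 if model 0 is chosen."""
    # Adjust each matched-filter output by half the signal energy.
    adjusted1 = correlate(r, s1) - correlate(s1, s1) / 2
    adjusted0 = correlate(r, s0) - correlate(s0, s0) / 2
    # Subtract the adjusted values and compare with the threshold.
    return 1 if adjusted1 - adjusted0 > sigma2 * math.log(eta) else 0


s0 = [0.0, 0.0, 0.0, 0.0]  # hypothetical signals
s1 = [1.0, 1.0, 1.0, 1.0]
r = [0.9, 1.2, 0.4, 1.1]   # hypothetical observations

print(binary_detector(r, s0, s1, sigma2=1.0))
print(binary_detector(r, s0, s1, sigma2=1.0, eta=math.exp(2)))
```

Raising $\eta$ raises the threshold, so the same observations can lead to different decisions, as the two calls illustrate.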
To compute the performance probabilities, the expressions should
be simplified in the ways discussed in the hypothesis testing
sections. As the energy terms are known a
priori, they can be incorporated into the threshold,
with the result
$$\sum_{l=0}^{L-1}r(l)\left[s_1(l)-s_0(l)\right]\underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}}\sigma^2\ln\eta+\frac{E_1-E_0}{2}$$
The left term constitutes the sufficient statistic for the binary
detection problem. Because the additive noise is presumed Gaussian,
the sufficient statistic is a Gaussian random variable no matter
which model is assumed. Under
$\mathcal{M}_i$, the specifics of this probability distribution are
$$\sum_{l=0}^{L-1}r(l)\left[s_1(l)-s_0(l)\right]\sim\mathcal{N}\left(\sum s_i(l)\left[s_1(l)-s_0(l)\right],\;\sigma^2\sum\left[s_1(l)-s_0(l)\right]^2\right)$$
The false-alarm probability is given by
$$P_F=Q\left(\frac{\sigma^2\ln\eta+\frac{E_1-E_0}{2}-\sum s_0(l)\left[s_1(l)-s_0(l)\right]}{\sigma\left\{\sum\left[s_1(l)-s_0(l)\right]^2\right\}^{1/2}}\right)$$
The signal-related terms in the numerator of this expression can
be combined, so that the false-alarm probability (and the
detection probability) for the optimal white Gaussian noise
detector is succinctly expressed by
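$$P_F=Q\left(\frac{\ln\eta+\frac{1}{2\sigma^2}\sum\left[s_1(l)-s_0(l)\right]^2}{\frac{1}{\sigma}\left\{\sum\left[s_1(l)-s_0(l)\right]^2\right\}^{1/2}}\right)$$
$$P_D=Q\left(\frac{\ln\eta-\frac{1}{2\sigma^2}\sum\left[s_1(l)-s_0(l)\right]^2}{\frac{1}{\sigma}\left\{\sum\left[s_1(l)-s_0(l)\right]^2\right\}^{1/2}}\right)$$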
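The false-alarm and detection probabilities can be evaluated numerically from the difference-signal energy, the noise standard deviation, and the threshold $\eta$. The sketch below is one way to do so; it assumes $Q(x)$ is computed from the complementary error function as $\frac{1}{2}\operatorname{erfc}(x/\sqrt{2})$, and the function names and example signals are illustrative.

```python
import math


def Q(x):
    """Gaussian tail probability: P(X > x) for a unit-variance, zero-mean X."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def performance(s0, s1, sigma, eta=1.0):
    """Return (P_F, P_D) for the optimal white Gaussian noise detector.

    Assumes s0 and s1 differ, so the difference-signal energy is nonzero.
    """
    d2 = sum((b - a) ** 2 for a, b in zip(s0, s1))  # difference-signal energy
    d = math.sqrt(d2) / sigma
    pf = Q((math.log(eta) + d2 / (2 * sigma ** 2)) / d)
    pd = Q((math.log(eta) - d2 / (2 * sigma ** 2)) / d)
    return pf, pd


# Hypothetical example: signal absent versus a constant signal.
pf, pd = performance([0.0] * 4, [1.0] * 4, sigma=1.0)
print(pf, pd)
```

With $\eta=1$ the two arguments of $Q$ are negatives of each other, so the two probabilities sum to one in this example.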
Note that the only signal-related quantity
affecting this performance probability (and all of the others)
is the ratio of energy in the difference signal to
the noise variance. The larger this ratio, the better
(smaller) the performance probabilities become. Note that the
details of the signal waveforms do not greatly affect the energy
of the difference signal. For example, consider the case where
the two signal energies are equal
($E_0=E_1=E$); the energy of the difference signal is given by
$2E-2\sum s_0(l)s_1(l)$. The largest value of this energy occurs when the
signals are negatives of each other, with the difference-signal
energy equaling $4E$. Thus, equal-energy but opposite-signed signals such
as sine waves, square-waves, Bessel functions,
etc. all yield exactly the same performance
levels. The essential signal properties that do yield good
performance values are elucidated by an alternate
interpretation. The term
$\sum\left[s_1(l)-s_0(l)\right]^2$ equals $\|\mathbf{s}_1-\mathbf{s}_0\|^2$, the square of the
$L^2$ norm of the difference signal. Geometrically, the
difference-signal energy is the same quantity as the square of
the Euclidean distance between the two signals. In these terms,
a larger distance between the two signals will mean better
performance.
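The equal-energy case can be checked with a few lines of Python. The waveforms below (one period of a sampled sine, a cosine, and a square wave, scaled to a common energy) are assumptions chosen for illustration.

```python
import math


def energy(s):
    """Sum of squared signal values."""
    return sum(x * x for x in s)


def difference_energy(s0, s1):
    """Squared Euclidean distance between the two signals."""
    return sum((b - a) ** 2 for a, b in zip(s0, s1))


L = 8
sine = [math.sin(2 * math.pi * l / L) for l in range(L)]
cosine = [math.cos(2 * math.pi * l / L) for l in range(L)]
# Square wave scaled so that its energy matches the sine's.
square = [math.sqrt(0.5)] * (L // 2) + [-math.sqrt(0.5)] * (L // 2)

E = energy(sine)
antipodal_sine = difference_energy(sine, [-x for x in sine])
antipodal_square = difference_energy(square, [-x for x in square])
orthogonal_pair = difference_energy(sine, cosine)

# Antipodal pairs reach 4E whatever the waveform; an orthogonal pair gives 2E.
print(E, antipodal_sine, antipodal_square, orthogonal_pair)
```

The sine and square-wave antipodal pairs have the same distance between signals, which is why they yield the same performance, while the orthogonal pair is closer together and performs worse at the same energy.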
A common detection problem in array processing is to determine
whether a signal is present ($\mathcal{M}_1$) or not ($\mathcal{M}_0$) in the array output. In this case, $s_0(l)=0$.
The optimal detector relies on filtering the array output with
a matched filter having an impulse response based on the
assumed signal. Letting the signal under
$\mathcal{M}_1$ be denoted simply by $s(l)$, the optimal detector consists of
$$r(l)*s(L-1-l)\Big|_{l=L-1}-\frac{E}{2}\underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}}\sigma^2\ln\eta$$
or
$$r(l)*s(L-1-l)\Big|_{l=L-1}\underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}}\gamma$$
The false-alarm and detection probabilities are given by
$$P_F=Q\left(\frac{\gamma}{E^{1/2}\sigma}\right)$$
$$P_D=Q\left(Q^{-1}(P_F)-\sqrt{\frac{E}{\sigma^2}}\right)$$
Figure 2 displays the probability of detection as a
function of the signal-to-noise ratio
$E/\sigma^2$
for several values of false-alarm probability. Given an
estimate of the expected signal-to-noise ratio, these curves
can be used to assess the trade-off between the false-alarm
and detection probabilities.
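Curves of this kind can be tabulated directly from the expression for $P_D$. The sketch below assumes Python's standard `statistics.NormalDist` for the Gaussian distribution function and its inverse; the function names and the particular grid of false-alarm probabilities and signal-to-noise ratios are illustrative choices, not values taken from the figure.

```python
import math
from statistics import NormalDist

_standard_normal = NormalDist()  # zero mean, unit variance


def Q(x):
    """Gaussian tail probability."""
    return 1.0 - _standard_normal.cdf(x)


def Q_inv(p):
    """Inverse of Q: the x for which Q(x) = p, with 0 < p < 1."""
    return _standard_normal.inv_cdf(1.0 - p)


def detection_probability(pf, snr):
    """P_D = Q(Q^{-1}(P_F) - sqrt(E / sigma^2)), with snr = E / sigma^2."""
    return Q(Q_inv(pf) - math.sqrt(snr))


for pf in (1e-1, 1e-2, 1e-3):
    row = [round(detection_probability(pf, snr), 3) for snr in (1, 4, 9, 16)]
    print(pf, row)
```

For a fixed false-alarm probability, each row increases with the signal-to-noise ratio, and at zero signal-to-noise ratio the detection probability falls back to the false-alarm probability.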
The important parameter determining detector performance derived
in this example is the signal-to-noise ratio
$E/\sigma^2$: the larger it is, the smaller the false-alarm
probability is (generally speaking). Signal-to-noise ratios can be
measured in many different ways. For example, one measure might be
the ratio of the rms signal amplitude to the rms noise
amplitude. Note that the important one for the detection problem
is much different. The signal portion is the
sum of the squared signal values over the
entire set of observed values (the signal
energy); the noise portion is the variance of
each noise component (the noise power). Thus,
energy can be increased in two ways that increase the
signal-to-noise ratio: the signal can be made larger
or the observations can be extended to
encompass a larger number of values.
To illustrate this point, two signals having the same energy are
shown in Figure 3. When these signals are shown in
the presence of additive noise, the signal is visible on the
left because its amplitude is larger; the one on the right is
much more difficult to discern. The instantaneous
signal-to-noise ratio (the ratio of signal amplitude to average
noise amplitude) is the important visual cue. However, the kind
of signal-to-noise ratio that determines detection performance
is not the one the eye perceives. The matched filter outputs have similar maximal
values, indicating that total signal energy rather than
amplitude determines the performance of a matched filter
detector.
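The same point can be reproduced with two hypothetical equal-energy signals, one short with large amplitude and one long with small amplitude. These rectangular waveforms are assumptions for illustration and are not the signals of Figure 3. The sketch computes the noise-free matched-filter output and the standard deviation that white noise of variance $\sigma^2$ would have at that output.

```python
import math


def convolve(x, h):
    """Full linear convolution: y(k) = sum_l x(l) h(k - l)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for l, xl in enumerate(x):
        for m, hm in enumerate(h):
            y[l + m] += xl * hm
    return y


L = 16
sigma = 1.0
candidates = {
    "short/tall": [2.0] * 4 + [0.0] * (L - 4),  # amplitude 2, four samples
    "long/low": [1.0] * L,                      # amplitude 1, sixteen samples
}

results = {}
for name, s in candidates.items():
    energy = sum(x * x for x in s)
    peak_amplitude = max(abs(x) for x in s)
    output = convolve(s, s[::-1])                 # noise-free matched-filter output
    peak_output = max(output)                     # attained at index L-1
    output_noise_std = sigma * math.sqrt(energy)  # std of the filtered white noise
    results[name] = (peak_amplitude, energy, peak_output, output_noise_std)
    print(name, results[name])
```

Although the amplitudes differ by a factor of two, both signals produce the same peak output and the same output noise level, so the detector treats them identically.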
The optimal detection paradigm for the additive, white
Gaussian noise problem has a relatively simple solution:
construct FIR filters whose unit-sample responses are related
to the presumed signals and compare the filtered outputs with
a threshold. We may well wonder which assumptions made in this
problem are most questionable in "real-world"
applications. Noise is additive in most cases. In many
situations, the additive noise present in observed data is
Gaussian. Because of the Central Limit Theorem, if numerous
noise sources impinge on a measuring device, their
superposition will be Gaussian to a great extent. As we know
from the discussion on the Central
Limit Theorem, glibly appealing to the Central Limit
Theorem is not without hazards; the non-Gaussian detection
problem will be discussed in some detail later. Interestingly,
the weakest assumption is the "whiteness" of the noise. Note
that the observation sequence is obtained as a result of
sampling the sensor outputs. Assuming
white noise samples does not mean that
the continuous-time noise was white. White noise in continuous
time has infinite variance and cannot be sampled;
discrete-time white noise has a finite variance with a
constant power spectrum. The Sampling Theorem suggests that a
signal is represented accurately by its samples only if we
choose a sampling frequency commensurate with the signal's
bandwidth. One should note that fidelity of representation
does not mean that the sample values are
independent. In most cases, satisfying the Sampling Theorem
means that the samples are correlated. As shown in Sampling and Random Sequences, the
correlation function of sampled noise equals samples of the
original correlation function. For the sampled noise to be
white,
$E\left[n(l_1T)\,n(l_2T)\right]=0$ for $l_1\neq l_2$: the samples of the correlation function at locations
other than the origin must all be zero. While some correlation
functions have this property, many examples satisfy
the sampling theorem but do not yield uncorrelated
samples. In many practical situations,
undersampling the noise will reduce
inter-sample correlation. Thus, we obtain uncorrelated samples
either by deliberately undersampling, which wastes signal
energy, or by imposing anti-aliasing filters that have a
bandwidth larger than the signal and sampling at the signal's
Nyquist rate. Since the noise power spectrum usually extends to
higher frequencies than the signal, this intentional
undersampling can result in larger noise variance. In either
case, by trying to make the problem at hand match the solution,
we are actually reducing performance! We need a
direct approach to attacking the correlated
noise issue that arises in virtually all
sampled-data detection problems rather than trying to work
around it.
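To make the whiteness condition concrete, the sketch below samples two assumed correlation functions: that of noise with a flat spectrum over $|f|<W$, which is $\sigma^2\sin(2\pi W\tau)/(2\pi W\tau)$, and an exponential correlation function $\sigma^2e^{-a|\tau|}$. Both noise models, the parameter values, and the function names are hypothetical choices for illustration.

```python
import math


def sinc_correlation(tau, W, sigma2=1.0):
    """Correlation function of noise with a flat spectrum over |f| < W."""
    x = 2 * math.pi * W * tau
    return sigma2 if x == 0 else sigma2 * math.sin(x) / x


def exponential_correlation(tau, a, sigma2=1.0):
    """Correlation function sigma2 * exp(-a |tau|); it is never zero."""
    return sigma2 * math.exp(-a * abs(tau))


W = 1.0
# Sampling interval 1/(2W) lands on the zeros of the first correlation
# function; sampling twice as fast does not.
for T in (1 / (2 * W), 1 / (4 * W)):
    samples = [sinc_correlation(l * T, W) for l in range(4)]
    print("flat spectrum, T =", T, [round(v, 3) for v in samples])

T = 1 / (2 * W)
samples = [exponential_correlation(l * T, a=1.0) for l in range(4)]
print("exponential, T =", T, [round(v, 3) for v in samples])
```

Only the first case, a flat spectrum sampled at exactly $T=1/(2W)$, yields uncorrelated samples; sampling the same noise faster, or sampling noise with the exponential correlation function at any rate, leaves nonzero correlation between samples.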