When the additive Gaussian noise in the sensors' outputs is
colored (i.e., the noise values are
correlated in some fashion), the linearity of beamforming
algorithms means that the array processing output
$\mathbf{r}$ also contains colored noise.
The solution to the colored-noise, binary detection problem
remains the likelihood ratio, but differs in the form of the
a priori densities. The noise will again be
assumed zero mean, but the noise vector has non-trivial
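covariance matrix $\mathbf{K}$:
$\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \mathbf{K})$.
$$p_{\mathbf{n}}(\mathbf{n}) = \frac{1}{\sqrt{\det(2\pi\mathbf{K})}}\, e^{-\frac{1}{2}\mathbf{n}^T\mathbf{K}^{-1}\mathbf{n}}$$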
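In this case, the log-likelihood ratio test, with both sides multiplied by two, is
$$(\mathbf{r}-\mathbf{s}_0)^T\mathbf{K}^{-1}(\mathbf{r}-\mathbf{s}_0) - (\mathbf{r}-\mathbf{s}_1)^T\mathbf{K}^{-1}(\mathbf{r}-\mathbf{s}_1) \underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}} 2\ln\eta$$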
which, after the usual simplifications, is written
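$$\left[\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_1 - \frac{\mathbf{s}_1^T\mathbf{K}^{-1}\mathbf{s}_1}{2}\right] - \left[\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_0 - \frac{\mathbf{s}_0^T\mathbf{K}^{-1}\mathbf{s}_0}{2}\right] \underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}} \ln\eta$$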
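The simplification amounts to expanding each quadratic form. Because $\mathbf{K}^{-1}$ is symmetric, for $i = 0, 1$
$$(\mathbf{r}-\mathbf{s}_i)^T\mathbf{K}^{-1}(\mathbf{r}-\mathbf{s}_i) = \mathbf{r}^T\mathbf{K}^{-1}\mathbf{r} - 2\,\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i + \mathbf{s}_i^T\mathbf{K}^{-1}\mathbf{s}_i$$
The term $\mathbf{r}^T\mathbf{K}^{-1}\mathbf{r}$ does not depend on the model and cancels when the two quadratic forms are differenced; dividing what remains by two leaves, for each signal, the pair of terms $\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i - \mathbf{s}_i^T\mathbf{K}^{-1}\mathbf{s}_i/2$.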
The sufficient statistic for the colored Gaussian noise
detection problem is
$$\Upsilon_i(\mathbf{r}) = \mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i \tag{1}$$
The quantities computed for each signal have a similar, but
more complicated interpretation than in the white noise case.
$\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i$
is a dot product, but with respect to the so-called
kernel $\mathbf{K}^{-1}$. The effect of the kernel is to weight certain
components more heavily than others. A positive-definite
symmetric matrix (the covariance matrix is one such example) can
be expressed in terms of its eigenvectors and eigenvalues.
$$\mathbf{K}^{-1} = \sum_{k=1}^{L} \frac{1}{\lambda_k}\, \mathbf{v}_k \mathbf{v}_k^T$$
The sufficient statistic can thus be written as the complicated
summation
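$$\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i = \sum_{k=1}^{L} \frac{1}{\lambda_k} \left(\mathbf{r}^T\mathbf{v}_k\right)\left(\mathbf{v}_k^T\mathbf{s}_i\right)$$
where $\lambda_k$ and $\mathbf{v}_k$ denote the $k$th
eigenvalue and eigenvector of the covariance matrix
$\mathbf{K}$. Each of the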
constituent dot products is largest when the signal and the
observation vectors have strong components parallel to
$\mathbf{v}_k$. However, the product of these dot
products is weighted by the reciprocal of the associated
eigenvalue. Thus, components in the observation vector parallel
to the signal will tend to be accentuated; those components
parallel to the eigenvectors having the
smaller eigenvalues will receive greater
accentuation than others. The usual notions of parallelism and
orthogonality become "skewed" because of the presence of the
kernel. A covariance matrix's eigenvalue has "units" of
variance; these accentuated directions thus correspond to small
noise variance. We can therefore view the weighted dot product
as a computation that is simultaneously trying to select
components in the observations similar to the signal, but
concentrating on those where the noise variance is small.
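As a numerical illustration of this weighting, the following sketch evaluates the terms of the eigenvector summation one by one and compares their sum with a direct evaluation of $\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i$. The helper name `weighted_dot_terms` and the two-dimensional values of `K`, `s`, and `r` are hypothetical, chosen only so that the signal lies along the low-noise eigenvector.

```python
import numpy as np

def weighted_dot_terms(r, s, K):
    """Return the terms (1/lambda_k)(r^T v_k)(v_k^T s) and the eigenvalues.

    Hypothetical helper for illustration; K must be symmetric positive-definite.
    """
    lam, V = np.linalg.eigh(K)              # eigenvalues ascending; columns of V are eigenvectors
    terms = (V.T @ r) * (V.T @ s) / lam     # one term per eigenvector
    return terms, lam

# Illustrative values (assumptions, not taken from the text)
K = np.array([[1.0, 0.9],
              [0.9, 1.0]])                  # eigenvalues 0.1 and 1.9
s = np.array([1.0, -1.0])                   # parallel to the eigenvector with eigenvalue 0.1
r = np.array([0.8, -1.1])                   # a noisy observation of s

terms, lam = weighted_dot_terms(r, s, K)
direct = r @ np.linalg.solve(K, s)          # r^T K^{-1} s without forming K^{-1}

print("eigenvalues:", lam)
print("per-eigenvector terms:", terms)
print("sum of terms:", terms.sum(), " direct:", direct)
```

With these values essentially the whole statistic comes from the component along the eigenvector having the smaller eigenvalue, which is the accentuation described above.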
The second term in each of the expressions constituting the optimal
detector is of the form
$\mathbf{s}_i^T\mathbf{K}^{-1}\mathbf{s}_i$. This quantity is a special case of the dot product
just discussed. The two vectors involved in this dot product
are identical; they are parallel by definition. The weighting
of the signal components by the reciprocal eigenvalues remains.
Recalling the units of the eigenvalues of $\mathbf{K}$,
$\mathbf{s}_i^T\mathbf{K}^{-1}\mathbf{s}_i$
has the units of a signal-to-noise ratio, which is
computed in a way that enhances the contribution of those signal
components parallel to the "low noise" directions.
To compute the performance probabilities, we express the
detection rule in terms of the sufficient statistic.
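$$\mathbf{r}^T\mathbf{K}^{-1}(\mathbf{s}_1-\mathbf{s}_0) \underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}} \ln\eta + \frac{1}{2}\left(\mathbf{s}_1^T\mathbf{K}^{-1}\mathbf{s}_1 - \mathbf{s}_0^T\mathbf{K}^{-1}\mathbf{s}_0\right)$$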
The distribution of the sufficient statistic on the left side of
this equation is Gaussian because it is a linear
transformation of the Gaussian random vector $\mathbf{r}$. Assuming the
$i$th model to be true,
$$\mathbf{r}^T\mathbf{K}^{-1}(\mathbf{s}_1-\mathbf{s}_0) \sim \mathcal{N}\!\left(\mathbf{s}_i^T\mathbf{K}^{-1}(\mathbf{s}_1-\mathbf{s}_0),\; (\mathbf{s}_1-\mathbf{s}_0)^T\mathbf{K}^{-1}(\mathbf{s}_1-\mathbf{s}_0)\right)$$
The false-alarm probability for the optimal Gaussian colored
noise detector is given by
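$$P_F = Q\!\left(\frac{\ln\eta + \frac{1}{2}(\mathbf{s}_1-\mathbf{s}_0)^T\mathbf{K}^{-1}(\mathbf{s}_1-\mathbf{s}_0)}{\left[(\mathbf{s}_1-\mathbf{s}_0)^T\mathbf{K}^{-1}(\mathbf{s}_1-\mathbf{s}_0)\right]^{1/2}}\right) \tag{2}$$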
As in the white noise case, the important signal-related
quantity in this expression is the signal-to-noise ratio of the
difference signal. The distance interpretation of this quantity
remains, but the distance is now warped by the kernel's presence
in the dot product.
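The false-alarm expression can be evaluated numerically as sketched below. The function names `q_function` and `false_alarm_probability`, the signals, the covariance matrices, and the threshold are all hypothetical choices made for illustration; $Q(\cdot)$ is computed from the standard library's complementary error function.

```python
import math
import numpy as np

def q_function(x):
    """Tail probability of a standard Gaussian, Q(x) = Pr[N(0,1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def false_alarm_probability(s0, s1, K, eta):
    """Evaluate P_F from equation (2) for signals s0, s1, noise covariance K, threshold eta."""
    d = s1 - s0
    snr = float(d @ np.linalg.solve(K, d))      # (s1 - s0)^T K^{-1} (s1 - s0)
    return q_function((math.log(eta) + 0.5 * snr) / math.sqrt(snr))

# Illustrative values (assumptions, not taken from the text)
s0 = np.zeros(2)
s1 = np.array([1.0, -1.0])
K_white = np.eye(2)
K_colored = np.array([[1.0, 0.9],
                      [0.9, 1.0]])

pf_white = false_alarm_probability(s0, s1, K_white, eta=1.0)
pf_colored = false_alarm_probability(s0, s1, K_colored, eta=1.0)
print("P_F, white noise:  ", pf_white)
print("P_F, colored noise:", pf_colored)
```

In this particular example the difference signal lies along the low-noise direction of the colored covariance, so its kernel-weighted signal-to-noise ratio is larger than in the white case and the false-alarm probability is correspondingly smaller.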
The sufficient statistic computed for each signal can be given
two signal processing interpretations in the colored noise case.
Both of these rest on considering the quantity
$\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i$
as a simple dot product, but with different ideas on
grouping terms. The simplest is to group the kernel with the
signal so that the sufficient statistic is the dot product
between the observations and a modified
version of the signal
$\tilde{\mathbf{s}}_i = \mathbf{K}^{-1}\mathbf{s}_i$. This modified signal thus becomes the equivalent of
the unit-sample response of the matched filter. In this form,
the observed data are unaltered and passed through a matched
filter whose unit-sample response depends on both the signal and
the noise characteristics. The size of the noise covariance
matrix, equal to the number of observations used by the
detector, is usually large: hundreds if not thousands of samples
are possible. Thus, computation of the inverse of the noise
covariance matrix becomes an issue. This problem needs to be
solved only once if the noise characteristics are static; the
inverse can be precomputed on a general purpose computer using
well-established numerical algorithms. The signal-to-noise
ratio term of the sufficient statistic is the dot product of the
signal with the modified signal
$\tilde{\mathbf{s}}_i$. This view of
the receiver structure is shown in Figure 1.
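A minimal sketch of this first interpretation follows. It assumes a static noise covariance, so each modified signal is computed once and then reused for every observation vector. Rather than forming the inverse explicitly, the sketch solves the linear system $\mathbf{K}\tilde{\mathbf{s}}_i = \mathbf{s}_i$, which gives the same vector; the function names and numerical values are hypothetical.

```python
import numpy as np

def modified_signal(s, K):
    """Solve K s_tilde = s, i.e. s_tilde = K^{-1} s, without forming the inverse."""
    return np.linalg.solve(K, s)

def detector_output(r, s, s_tilde):
    """Matched-filter output r^T s_tilde minus half the signal-to-noise term s^T s_tilde."""
    return r @ s_tilde - 0.5 * (s @ s_tilde)

# Illustrative values (assumptions, not taken from the text)
K = np.array([[1.0, 0.9],
              [0.9, 1.0]])
s0 = np.zeros(2)
s1 = np.array([1.0, -1.0])

# Precompute the modified signals once; they depend on the signal and the noise only
s0_tilde = modified_signal(s0, K)
s1_tilde = modified_signal(s1, K)

r = np.array([0.8, -1.1])       # one observation vector
eta = 1.0                       # hypothetical threshold
out0 = detector_output(r, s0, s0_tilde)
out1 = detector_output(r, s1, s1_tilde)
choose_model_1 = (out1 - out0) > np.log(eta)
print(s1_tilde, out1, choose_model_1)
```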
A second and more theoretically powerful view of the
computations involved in the colored noise detector emerges when
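we factor the covariance matrix. The
Cholesky factorization of a positive-definite,
symmetric matrix (such as a covariance matrix or its inverse)
has the form
$\mathbf{K} = \mathbf{L}\mathbf{D}\mathbf{L}^T$, where $\mathbf{L}$ is lower-triangular with ones on its
main diagonal and $\mathbf{D}$ is diagonal with positive entries. With this factorization, the sufficient statistic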
can be written as
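$$\mathbf{r}^T\mathbf{K}^{-1}\mathbf{s}_i = \left(\mathbf{D}^{-1/2}\mathbf{L}^{-1}\mathbf{r}\right)^T \left(\mathbf{D}^{-1/2}\mathbf{L}^{-1}\mathbf{s}_i\right)$$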
The components of the dot product are multiplied by the same
matrix ($\mathbf{D}^{-1/2}\mathbf{L}^{-1}$), which is lower-triangular. If
this matrix were also Toeplitz, the product of this kind between
a Toeplitz matrix and a vector would be equivalent to the
convolution of the components of the vector with the first
column of the matrix. If the matrix is not Toeplitz (which,
inconveniently, is the typical case), a convolution also
results, but with a unit-sample response that varies with the
index of the output--a time-varying, linear filtering operation.
The variation of the unit-sample response corresponds to the
different rows of the matrix
$\mathbf{D}^{-1/2}\mathbf{L}^{-1}$ running backwards from the
main-diagonal entry. What is the physical interpretation of the
action of this filter? The covariance of the random vector
$\mathbf{x} = \mathbf{A}\mathbf{r}$
is given by
$\mathbf{K}_x = \mathbf{A}\mathbf{K}_r\mathbf{A}^T$. Applying this result to the current situation, we
set
$\mathbf{A} = \mathbf{D}^{-1/2}\mathbf{L}^{-1}$
and
$\mathbf{K}_r = \mathbf{K} = \mathbf{L}\mathbf{D}\mathbf{L}^T$
with the result that the covariance matrix
$\mathbf{K}_x$
is the identity matrix! Thus, the matrix
$\mathbf{D}^{-1/2}\mathbf{L}^{-1}$
corresponds to a (possibly time-varying)
whitening filter: we have converted the
colored-noise component of the observed data to white noise! As
the filter is always linear, the Gaussian observation noise
remains Gaussian at the output. Thus, the colored noise problem
is converted into a simpler one with the whitening filter: the
whitened observations are first match-filtered with the
"whitened" signal
$\mathbf{s}_i^{+} = \mathbf{D}^{-1/2}\mathbf{L}^{-1}\mathbf{s}_i$
(whitened with respect to noise characteristics only),
and then half the energy of the whitened signal is subtracted
(Figure 1).
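The whitening interpretation can be checked numerically with the sketch below. It relies on NumPy's Cholesky routine, which returns a lower-triangular $\mathbf{C}$ with $\mathbf{K} = \mathbf{C}\mathbf{C}^T$ rather than the $\mathbf{L}\mathbf{D}\mathbf{L}^T$ form; since $\mathbf{C} = \mathbf{L}\mathbf{D}^{1/2}$, its inverse is the matrix $\mathbf{D}^{-1/2}\mathbf{L}^{-1}$. The function name `whitening_matrix` and the randomly generated covariance are assumptions made for illustration.

```python
import numpy as np

def whitening_matrix(K):
    """Return A = D^{-1/2} L^{-1} for the factorization K = L D L^T.

    np.linalg.cholesky returns C with K = C C^T; because C = L D^{1/2},
    the inverse of C is D^{-1/2} L^{-1}.
    """
    C = np.linalg.cholesky(K)        # lower-triangular factor
    return np.linalg.inv(C)

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
K = B @ B.T + 5.0 * np.eye(5)        # a hypothetical positive-definite covariance
s = rng.standard_normal(5)           # a hypothetical signal
r = s + np.linalg.cholesky(K) @ rng.standard_normal(5)   # signal plus colored noise

A = whitening_matrix(K)
whitened_cov = A @ K @ A.T                    # covariance of the whitened noise
stat_direct = r @ np.linalg.solve(K, s)       # r^T K^{-1} s
stat_whitened = (A @ r) @ (A @ s)             # dot product of whitened vectors

print(np.round(whitened_cov, 10))
print(stat_direct, stat_whitened)
```

The whitened covariance should come out as the identity matrix to within rounding error, and the two ways of computing the sufficient statistic should agree.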
To demonstrate the interpretation of the Cholesky
factorization of the covariance matrix as a time-varying
whitening filter, consider the covariance matrix
$$\mathbf{K} = \begin{pmatrix} 1 & a & a^2 & a^3 \\ a & 1 & a & a^2 \\ a^2 & a & 1 & a \\ a^3 & a^2 & a & 1 \end{pmatrix}$$
This covariance matrix indicates that the noise was produced
by passing white Gaussian noise through a first-order filter
having coefficient $a$:
$n(l) = a\,n(l-1) + w(l)$, where
$w(l)$
is white noise whose variance, $1-a^2$, gives $n(l)$ the unit variance
found on the diagonal of $\mathbf{K}$. Thus, we would expect
that if a whitening filter emerged from the matrix
manipulations (derived just below), it would be a first-order
FIR filter having a unit-sample response proportional to
$$h(l) = \begin{cases} 1 & \text{if } l = 0 \\ -a & \text{if } l = 1 \\ 0 & \text{otherwise} \end{cases}$$
Simple arithmetic calculations of the Cholesky decomposition
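suffice to show that the matrices $\mathbf{L}$ and $\mathbf{D}$ are given by
$$\mathbf{L} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ a & 1 & 0 & 0 \\ a^2 & a & 1 & 0 \\ a^3 & a^2 & a & 1 \end{pmatrix}$$
$$\mathbf{D} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1-a^2 & 0 & 0 \\ 0 & 0 & 1-a^2 & 0 \\ 0 & 0 & 0 & 1-a^2 \end{pmatrix}$$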
and that their inverses are
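$$\mathbf{L}^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -a & 1 & 0 & 0 \\ 0 & -a & 1 & 0 \\ 0 & 0 & -a & 1 \end{pmatrix}$$
$$\mathbf{D}^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{1-a^2} & 0 & 0 \\ 0 & 0 & \frac{1}{1-a^2} & 0 \\ 0 & 0 & 0 & \frac{1}{1-a^2} \end{pmatrix}$$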
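Because $\mathbf{D}$ is
diagonal, the matrix
$\mathbf{D}^{-1/2}$
equals the term-by-term square root of the inverse
of $\mathbf{D}$. The product
of interest here is therefore given by
$$\mathbf{D}^{-1/2}\mathbf{L}^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ \frac{-a}{\sqrt{1-a^2}} & \frac{1}{\sqrt{1-a^2}} & 0 & 0 \\ 0 & \frac{-a}{\sqrt{1-a^2}} & \frac{1}{\sqrt{1-a^2}} & 0 \\ 0 & 0 & \frac{-a}{\sqrt{1-a^2}} & \frac{1}{\sqrt{1-a^2}} \end{pmatrix}$$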
Let
$\tilde{\mathbf{r}}$
express the product
$\mathbf{D}^{-1/2}\mathbf{L}^{-1}\mathbf{r}$. This vector's elements are given by
$$\tilde{r}_0 = r_0, \qquad \tilde{r}_1 = \frac{1}{\sqrt{1-a^2}}\left(r_1 - a\,r_0\right), \qquad \ldots$$
Thus, the expected FIR whitening filter emerges after the
first term. The expression for $\tilde{r}_0$ could not be
of this form as no observations were assumed to precede
$r_0$. This edge effect is the source of
the time-varying aspect of the whitening filter. If the
system modeling the noise generation process has only poles,
this whitening filter will always stabilize - not vary with
time - once sufficient data are present within the memory of
the FIR inverse filter. In contrast, the presence of zeros in
the generation system would imply an IIR whitening filter.
With finite data, the unit-sample response would then change
on each output sample.
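The example can be reproduced numerically, as in the sketch below. The coefficient `a = 0.5`, the vector length `N = 6`, and the observation values are arbitrary assumptions. The code builds the covariance matrix with entries $a^{|i-j|}$, forms $\mathbf{D}^{-1/2}\mathbf{L}^{-1}$ as the inverse of NumPy's Cholesky factor, and compares the whitened observations with the output of the first-order FIR filter.

```python
import numpy as np

a, N = 0.5, 6                                   # illustrative values (assumptions)
idx = np.arange(N)
K = a ** np.abs(idx[:, None] - idx[None, :])    # K[i, j] = a^|i - j|

# np.linalg.cholesky gives C = L D^{1/2}, so its inverse is D^{-1/2} L^{-1}
A = np.linalg.inv(np.linalg.cholesky(K))

r = np.array([1.0, -0.5, 2.0, 0.3, -1.2, 0.7])  # arbitrary observations
r_white = A @ r

# FIR filter with unit-sample response [1, -a] / sqrt(1 - a^2),
# applied as if zeros preceded r[0]
h = np.array([1.0, -a]) / np.sqrt(1.0 - a**2)
fir_out = np.convolve(r, h)[:N]

print(np.round(A, 4))
print("whitened:", r_white)
print("FIR:     ", fir_out)
```

From the second output onward the two results coincide; only the first differs, because the first row of the whitening matrix passes $r_0$ through unscaled. That difference is the edge effect described above.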