As a popular alternative to the Kalman filter, we will
investigate the so-called
The price we pay with LMS instead of a Kalman filter is that the
rate of convergence and adaptation to sudden changes is slower for
LMS than for the Kalman filter (with correct prior assumptions).
Most adaptive filtering alogrithms (LMS
included) are modifications of standard iterative procedures
for solving minimization problems in a real-time
or on-line fashion. Therefore, before deriving
the LMS algorithm we will look at iterative methods of
minimizing error criteria such as MSE.
Conider the following set-up:
x
k
:
observation
x
k
:
observation
y
k
:
signal to be estimated
y
k
:
signal to be estimated
ŷk=
w
1
x
k
+
w
2
x
k
1
+…+
w
p
x
k
p
1
y
k
w
1
x
k
w
2
x
k
1
…
w
p
x
k
p
1
(1)
Impulse response of the filter:
…
,
0
,
0
,
w
1
,
w
2
,
…
w
p
,
0
,
0
,
…
…
,
0
,
0
,
w
1
,
w
2
,
…
w
p
,
0
,
0
,
…
ŷk=xkTw
y
k
x
k
w
(2)
Where
xk=
x
k
x
k
1
⋮
x
k
p
1
x
k
x
k
x
k
1
⋮
x
k
p
1
and
w=
w
1
w
2
⋮
w
p
w
w
1
w
2
⋮
w
p
e
k
=
y
k
-ŷk=
y
k
-xkTw
e
k
y
k
y
k
y
k
x
k
w
(3)
(
xk
,
y
k
)
(
x
k
,
y
k
)
are jointly stationary with zero-mean.
E
e
k
2=E
y
k
-xkTw2=E
y
k
2-2wTExk
y
k
+wTExkxkTw=
R
yy
-2wTRxy+wTRxxw
e
k
2
y
k
x
k
w
2
y
k
2
2
w
x
k
y
k
w
x
k
x
k
w
R
yy
2
w
R
xy
w
R
xx
w
(4)
Where
R
yy
R
yy
is the variance of
y
k
2
y
k
2
,
Rxx
R
xx
is the covariance matrix of
xk
x
k
, and
Rxy=Exk
y
k
R
xy
x
k
y
k
is the cross-covariance between
xk
x
k
and
y
k
y
k
The MSE is quadratic in
W W which implies the MSE
surface is "bowl" shaped with a unique minimum point (
Figure 8).
Minimize MSE:
∂∂wE
e
k
2=-2Rxy+2Rxxw=0⇒wopt=Rxx-1Rxy
w
e
k
2
-2
R
xy
2
R
xx
w
0
w
opt
R
xx
R
xy
(5)
Notice that we can re-write
Equation 5 as
ExkxkTw=Exk
y
k
x
k
x
k
w
x
k
y
k
(6)
or
Exk
y
k
-xkTw=Exk
e
k
=0
x
k
y
k
x
k
w
x
k
e
k
0
(7)
Which shows that the error signal is orthogonal to the input
xk
x
k
(by the
orthogonality principle of
minimum MSE estimator).
Although we can easily determine
wopt
w
opt
by solving the system of equations
Rxxw=Rxy
R
xx
w
R
xy
(8)
Let's look at an iterative procedure for solving this
problem. This will set the stage for our adaptive filtering
algorithm.
We want to minimize the MSE. The idea is
simple. Starting at some initial weight vector
w0
w
0
, iteratively adjust the values to decrease the
MSE (Figure 9).
We want to
move
w0
w
0
towards the optimal vector
wopt
w
opt
. In order to move in the correct direction, we
must move
downhill or in the direction opposite
to the gradient of the MSE surface at the point
w0
w
0
. Thus, a natural and simple adjustment takes the form
w1=w0-12μ∂∂wE
e
k
2|w=w0
w
1
w
0
1
2
μ
w
w
0
w
e
k
2
(9)
Where
μμ is the step size and
tells us how far to move in the negative gradient direction
(
Figure 10).
Generalizing this idea to an iterative strategy, we get
wk=wk-1-12μ∂∂wE
e
k
2|w=wk-1
w
k
w
k
1
1
2
μ
w
w
k
1
w
e
k
2
(10)
and we can repeatedly update
w
w:
w0
,
w1
,
…
,
wk
w
0
,
w
1
,
…
,
w
k
. Hopefully each subsequent
wk
w
k
is closer to
wopt
w
opt
. Does the procedure converge? Can we adapt it to
an on-line, real-time, dynamic situation in which the
signals may not be stationary?