The least squares optimal filter design problem is quadratic
in the filter coefficients:

$E[\epsilon^2] = r_{dd}(0) - 2P^T W + W^T R W$
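As a numerical sketch of this quadratic cost, the snippet below evaluates $E[\epsilon^2]$ for an arbitrary coefficient vector. The statistics $R$, $P$, and $r_{dd}(0)$ here are illustrative values chosen for the example, not derived from any particular signal.

```python
import numpy as np

# Illustrative second-order statistics for a 2-tap filter:
# R  : input autocorrelation matrix (positive definite here)
# P  : cross-correlation vector between input and desired signal
# rdd0 : desired-signal power r_dd(0)
R = np.array([[2.0, 1.0],
              [1.0, 2.0]])
P = np.array([3.0, 1.0])
rdd0 = 6.0

def mse(W):
    """Quadratic cost E[eps^2] = r_dd(0) - 2 P^T W + W^T R W."""
    return rdd0 - 2.0 * P @ W + W @ R @ W

# The cost can be evaluated directly for any W; at W = 0 it is just r_dd(0):
print(mse(np.zeros(2)))  # -> 6.0
```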
If $R$ is positive definite, the error surface
$E[\epsilon^2](w_0, w_1, \dots, w_{M-1})$
is a unimodal "bowl" in $\mathbb{R}^M$.
The problem is to find the bottom of the bowl. In an
adaptive filter context, the shape and bottom of the bowl may
drift slowly with time; hopefully slowly enough that the
adaptive algorithm can track it.
For a quadratic error surface, the bottom of the bowl can be
found in one step by computing $R^{-1}P$. Most modern nonlinear
optimization methods (which are used, for example, to solve the
$L_p$ optimal IIR filter design problem!) locally
approximate a nonlinear function with a second-order
(quadratic) Taylor series approximation and step to the bottom
of this quadratic approximation on each iteration. However, an
older and simpler approach to nonlinear optimization exists,
based on gradient descent.
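The one-step solution can be sketched numerically, again with illustrative statistics (the same hypothetical $R$ and $P$ values used throughout these examples). Setting the gradient of the quadratic cost to zero gives the linear system $RW = P$, which is solved directly rather than by forming $R^{-1}$ explicitly.

```python
import numpy as np

# Illustrative statistics (not from any particular signal).
R = np.array([[2.0, 1.0],
              [1.0, 2.0]])
P = np.array([3.0, 1.0])

# The minimum of r_dd(0) - 2 P^T W + W^T R W satisfies R W = P,
# i.e. W_opt = R^{-1} P.  np.linalg.solve is preferred to inverting R.
W_opt = np.linalg.solve(R, P)
print(W_opt)  # -> [ 5/3, -1/3 ]
```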
The idea is to iteratively find the minimizer by computing
the gradient of the error function:
$\nabla = \frac{\partial E[\epsilon^2]}{\partial W}$,
whose $i$th element is $\frac{\partial E[\epsilon^2]}{\partial w_i}$.
The gradient is a vector in $\mathbb{R}^M$
pointing in the steepest uphill direction on the
error surface at a given point $W_i$, with
$\nabla$ having a
magnitude proportional to the slope of the error surface in
this steepest direction.
By updating the coefficient vector by taking a step
opposite the gradient direction:
$W_{i+1} = W_i - \mu \nabla_i$,
we go (locally) "downhill" in the steepest
direction, which seems to be a sensible way to iteratively
solve a nonlinear optimization problem. The performance
obviously depends on $\mu$; if
$\mu$ is too large, the
iterations could bounce back and forth up out of the
bowl. However, if $\mu$ is too
small, it could take many iterations to approach the
bottom. We will determine criteria for choosing
$\mu$ later.
In summary, the gradient descent algorithm for solving the
Wiener filter problem is:

Guess $W_0$
do $i = 1, \infty$
    $\nabla_i = -2P + 2RW_i$
    $W_{i+1} = W_i - \mu \nabla_i$
repeat
$W_{\mathrm{opt}} = W_\infty$
The gradient descent idea is used in the LMS adaptive filter
algorithm. As presented, this algorithm costs
$O(M^2)$
computations per iteration and doesn't appear very
attractive, but LMS requires only
$O(M)$
computations and is stable, so it is very attractive
when computation is an issue, even though it converges more
slowly than the RLS algorithms we have discussed so far.