The adaptive filter, $W$, is adapted using the least mean-square (LMS)
algorithm, which is the most widely used adaptive filtering
algorithm. First, the error signal, $e[n]$, is computed as
$e[n] = d[n] - y[n]$, the difference between the output of the
unknown system, $d[n]$, and the output of the adaptive filter, $y[n]$.
On the basis of this measure, the adaptive filter will
change its coefficients in an attempt to reduce the error.
The coefficient update relation is a function of the error
signal squared and is given by

$$h^{n+1}[i] = h^{n}[i] + \frac{\mu}{2}\left(-\frac{\partial |e|^{2}}{\partial h^{n}[i]}\right) \tag{1}$$

The term inside the parentheses represents the gradient of
the squared-error with respect to the $i$th coefficient. The gradient is a vector pointing in
the direction of the change in filter coefficients that will
cause the greatest increase in the squared error. Because
the goal is to minimize the error, however, Equation 1 updates the filter coefficients in the
direction opposite the gradient; that is why the gradient
term is negated. The constant $\mu$ is a step-size, which controls the amount of
gradient information used to update each coefficient. After
repeatedly adjusting each coefficient in the direction
opposite to the gradient of the error, the adaptive filter
should converge; that is, the difference between the unknown
and adaptive systems should get smaller and smaller.
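
As a rough illustration of Equation 1, the following Python sketch performs a single update of every coefficient and checks that the squared error shrinks. It is only a sketch under stated assumptions: the gradient is approximated numerically with a central difference rather than derived in closed form, and the names `squared_error` and `gradient_step`, the three-tap filter, the input samples, and the value of `d` are all made up for the example.

```python
import numpy as np

def squared_error(h, x_window, d):
    """Return |e|^2 = (d - y)^2, where y = sum_i h[i] * x[n-i].

    x_window[i] is assumed to hold the input sample x[n-i].
    """
    e = d - np.dot(h, x_window)
    return e ** 2

def gradient_step(h, x_window, d, mu, delta=1e-6):
    """Apply Equation 1 once to every coefficient, using a numerical gradient."""
    grad = np.zeros_like(h)
    for i in range(len(h)):
        bump = np.zeros_like(h)
        bump[i] = delta
        # Central-difference approximation of d|e|^2 / dh[i].
        grad[i] = (squared_error(h + bump, x_window, d)
                   - squared_error(h - bump, x_window, d)) / (2 * delta)
    # Step in the direction opposite the gradient, scaled by mu / 2.
    return h + (mu / 2) * (-grad)

# Illustrative (made-up) values: x[n], x[n-1], x[n-2] and the desired output d[n].
h = np.zeros(3)
x_window = np.array([1.0, -0.5, 0.25])
d = 0.8

before = squared_error(h, x_window, d)
h_new = gradient_step(h, x_window, d, mu=0.1)
after = squared_error(h_new, x_window, d)
print("squared error before:", before, "after:", after)
```

With a sufficiently small step-size, the squared error after the update should be smaller than before it, which is the behavior the paragraph above describes.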

To express the gradient descent coefficient update equation
in a more usable manner, we can rewrite the derivative of the
squared-error term as

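$$\frac{\partial |e|^{2}}{\partial h[i]} = 2\frac{\partial e}{\partial h[i]}\,e = 2\frac{\partial (d - y)}{\partial h[i]}\,e = \left(2\frac{\partial \left(d - \sum_{k=0}^{N-1} h[k]\,x[n-k]\right)}{\partial h[i]}\right)e \tag{2}$$

Because $d$ does not depend on the filter coefficients, and only the term of the sum that contains $h[i]$ has a nonzero derivative, this reduces to

$$\frac{\partial |e|^{2}}{\partial h[i]} = 2\left(-x[n-i]\right)e \tag{3}$$

which in turn gives us the final LMS coefficient
update,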

$$h^{n+1}[i] = h^{n}[i] + \mu\,e\,x[n-i] \tag{4}$$

The step-size $\mu$ directly affects how quickly the adaptive filter
will converge toward the unknown system. If $\mu$ is very small, then the coefficients change only a
small amount at each update, and the filter converges
slowly. With a larger step-size, more gradient information
is included in each update, and the filter converges more
quickly; however, when the step-size is too large, the
coefficients may change too quickly and the filter will
diverge. (It is possible in some cases to determine
analytically the largest value of $\mu$ ensuring convergence.)
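
The effect of the step-size can be seen in a small simulation. The Python sketch below applies the update of Equation 4 to identify a hypothetical four-tap unknown system driven by white noise, and compares three step-sizes. Everything specific here is an assumption chosen for illustration, not part of the original text: the function name `lms_identify`, the coefficients in `h_true`, the signal length, the noise-free desired signal, and the three values of `mu`.

```python
import numpy as np

def lms_identify(x, d, num_taps, mu):
    """Run the LMS update h[i] <- h[i] + mu * e * x[n-i] over the whole signal.

    Returns the final coefficient vector and the error signal e[n].
    """
    h = np.zeros(num_taps)
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        # Most recent sample first: x[n], x[n-1], ..., x[n-N+1].
        x_window = x[n - num_taps + 1:n + 1][::-1]
        y = np.dot(h, x_window)          # adaptive filter output y[n]
        e[n] = d[n] - y                  # error signal e[n] = d[n] - y[n]
        h = h + mu * e[n] * x_window     # Equation 4, all coefficients at once
    return h, e

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)              # white-noise input (assumed)
h_true = np.array([0.5, -0.3, 0.2, 0.1])   # hypothetical unknown system
d = np.convolve(x, h_true)[:len(x)]        # unknown system output d[n], no noise

results = {}
for mu in (0.001, 0.05, 1.5):
    # Suppress overflow warnings: the largest step-size is expected to diverge.
    with np.errstate(over="ignore", invalid="ignore"):
        h, e = lms_identify(x, d, num_taps=4, mu=mu)
    results[mu] = np.linalg.norm(h - h_true)
    print(f"mu = {mu}: coefficient mismatch = {results[mu]:.3g}")
```

In this setup the smallest step-size should still be some distance from the true coefficients after 2000 samples, the middle one should match them closely, and the largest should blow up to a very large or non-finite mismatch, mirroring the slow, fast, and divergent cases described above.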
