The adaptive filter, $W$, is adapted using the least mean-square (LMS) algorithm, which is the most widely used adaptive filtering algorithm.
First the error signal, $e[n]$, is computed as
$$e[n] = y[n] - \hat{y}[n],$$
which measures the difference between the output of the adaptive filter
and the output of the unknown system. On the basis of this measure, the
adaptive filter will change its coefficients in an attempt to reduce the error.
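As a minimal sketch of the error computation, assume both systems are FIR filters driven by the same input, which matches the sum $\sum_i h_i x[n-i]$ used for $\hat{y}$ in the derivation further on. The input samples, the coefficients in `h_unknown`, and the helper `fir_output` are hypothetical values and names chosen for illustration only.

```python
import numpy as np

def fir_output(h, x, n):
    """Output at time n of an FIR filter with coefficients h: sum_i h[i] * x[n - i]."""
    # Samples before time 0 are treated as zero (an assumption of this sketch).
    return sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)

x = np.array([1.0, 2.0, 3.0])         # input shared by both systems (made-up values)
h_unknown = np.array([0.5, -0.25])    # hypothetical "unknown" system
h_adaptive = np.array([0.0, 0.0])     # adaptive filter before any update

n = 2
y = fir_output(h_unknown, x, n)       # output of the unknown system, y[n]
y_hat = fir_output(h_adaptive, x, n)  # output of the adaptive filter, y_hat[n]
e = y - y_hat                         # error signal e[n]
```

With all adaptive coefficients at zero, the error is simply the output of the unknown system; it shrinks as the adaptive coefficients approach those of the unknown system.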
The coefficient update relation is a function of the
error signal squared and is given by
$$h_{n+1}[i] = h_n[i] + \frac{\mu}{2}\left(-\frac{\partial |e|^2}{\partial h_n[i]}\right) \tag{1}$$
The term inside the parentheses represents the gradient of the squared error with respect to the $i$th coefficient, that is, the $i$th component of the gradient vector. The gradient is a vector pointing in the direction of the change in filter
coefficients that will cause the greatest increase in the error signal.
Because the goal is to minimize the error, however, Equation 1 updates the filter coefficients in the
direction opposite the gradient; that is why the gradient term is negated.
The constant $\mu$ is a step-size, which controls the
amount of gradient information used to update each coefficient.
After repeatedly adjusting each coefficient in the direction
opposite to the gradient of the error, the adaptive filter
should converge; that is, the difference between the
unknown and adaptive systems should get smaller and smaller.
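The update in Equation 1 can be tried numerically before any closed-form derivative is available. The sketch below estimates each partial derivative of the squared error by central differences and takes one step against it. The sample values, the step-size, and the helper names `squared_error` and `numerical_gradient` are assumptions made for illustration.

```python
import numpy as np

def squared_error(h, x_recent, y):
    """|e|^2 for coefficients h, where x_recent[i] holds x[n - i]."""
    e = y - np.dot(h, x_recent)
    return e ** 2

def numerical_gradient(h, x_recent, y, delta=1e-6):
    """Central-difference estimate of d|e|^2 / dh[i] for every i."""
    grad = np.zeros_like(h)
    for i in range(len(h)):
        step = np.zeros_like(h)
        step[i] = delta
        grad[i] = (squared_error(h + step, x_recent, y)
                   - squared_error(h - step, x_recent, y)) / (2 * delta)
    return grad

x_recent = np.array([3.0, 2.0])  # x[n], x[n-1] (made-up values)
y = 1.0                          # output of the unknown system at time n
h = np.zeros(2)                  # current coefficients h_n
mu = 0.05                        # step-size, chosen arbitrarily for this sketch

before = squared_error(h, x_recent, y)
# Equation 1: move each coefficient opposite to its gradient component.
h_next = h + (mu / 2) * (-numerical_gradient(h, x_recent, y))
after = squared_error(h_next, x_recent, y)
```

For these values the squared error drops after a single step, which is the behavior the negated gradient term is meant to produce.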
To express the gradient descent coefficient update equation
in a more usable manner, we can rewrite the derivative of the
squared-error term as
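$$\begin{aligned}
\frac{\partial |e|^2}{\partial h_i} &= 2\,\frac{\partial e}{\partial h_i}\,e \\
&= 2\,\frac{\partial (y - \hat{y})}{\partial h_i}\,e \\
&= 2\,\frac{\partial \left(y - \sum_{k=0}^{N-1} h_k\,x[n-k]\right)}{\partial h_i}\,e \\
&= 2\,(-x[n-i])\,e
\end{aligned} \tag{2}$$
Here $k$ is the summation index, so only the $k = i$ term of the sum depends on $h_i$. Substituting this result into Equation 1, where the factor of 2 and the two negative signs cancel, gives us the final LMS coefficient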
update,
$$h_{n+1}[i] = h_n[i] + \mu\,e\,x[n-i] \tag{3}$$
The step-size $\mu$ directly affects how quickly the adaptive filter will converge toward the unknown system. If $\mu$ is very small, then the coefficients
change only a small amount at each update, and the filter converges slowly. With a larger step-size, more gradient
information is included in each update, and the filter converges more quickly; however, when the step-size is too large,
the coefficients may change too quickly and the filter will diverge. (It is possible in some cases to determine
analytically the largest value of
$\mu$ ensuring convergence.)
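The effect of the step-size can be seen in a small simulation. The following is a sketch under stated assumptions, not a reference implementation: the unknown system is a hypothetical three-tap FIR filter, the input is white Gaussian noise, there is no measurement noise, and the three values of `mu` are arbitrary choices for this synthetic input. The function name `lms_identify` is likewise made up for illustration.

```python
import numpy as np

def lms_identify(x, y, num_taps, mu):
    """Apply the LMS update h[i] += mu * e * x[n - i] at every sample."""
    h = np.zeros(num_taps)
    errors = np.zeros(len(x))
    for n in range(len(x)):
        # x_recent[i] = x[n - i], with zeros before the start of the record.
        x_recent = np.array([x[n - i] if n - i >= 0 else 0.0
                             for i in range(num_taps)])
        e = y[n] - np.dot(h, x_recent)  # error signal e[n]
        h = h + mu * e * x_recent       # coefficient update
        errors[n] = e
    return h, errors

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)             # white-noise input (assumption)
h_unknown = np.array([0.5, -0.25, 0.1])   # hypothetical unknown system
y = np.convolve(x, h_unknown)[:len(x)]    # unknown system output y[n]

h_slow, e_slow = lms_identify(x, y, num_taps=3, mu=0.001)
h_fast, e_fast = lms_identify(x, y, num_taps=3, mu=0.05)
# A step-size this large is expected to make the coefficients blow up,
# so floating-point overflow warnings are silenced for this run only.
with np.errstate(over="ignore", invalid="ignore"):
    h_big, e_big = lms_identify(x, y, num_taps=3, mu=5.0)
```

Comparing `h_slow`, `h_fast`, and `h_big` against `h_unknown` illustrates the three regimes described above: slow convergence, faster convergence, and divergence.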