Connexions

You are here: Home » Content » Second-order Convergence Analysis of the LMS Algorithm and Misadjustment Error
Content Actions
Lenses

What is a lens?

Lenses

A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual Connexions member, a community, or a respected organization.

This content is ...
In these lenses
  • This module is included inLens: richb's DSP resources
    By: Richard BaraniukAs a part of collection:"Adaptive Filters"

    Comments:

    "A good introduction in adaptive filters, a major DSP application."

    Click the "richb's DSP" link to see all content selected in this lens.

    richb's DSP
Tags

(?)

These tags come from the endorsement, affiliation, and other lenses that include this content.

Second-order Convergence Analysis of the LMS Algorithm and Misadjustment Error

Module by: Douglas L. Jones

Convergence of the mean (first-order analysis) is insufficient to guarantee desirable behavior of the LMS algorithm; the variance could still be infinite. It is important to show that the variance of the filter coefficients is finite, and to determine how close the average squared error is to the minimum possible error using an exact Wiener filter.
E ε k 2=E d k - W k T X k 2=E d k 2-2 d k X k T W k - W k T X k X k T W k = r dd 0-2 W k TP+ W k TR W k ε k 2 d k W k X k 2 d k 2 2 d k X k W k W k X k X k W k r dd 0 2 W k P W k R W k (1)
The minimum error is obtained using the Wiener filter W opt =R-1P W opt R P
ε min 2=Eε2= r dd 0-2PTR-1P+PTR-1RR-1P= r dd 0-PTR-1P ε min 2 ε 2 r dd 0 2 P R P P R R R P r dd 0 P R P (2)
To analyze the average error in LMS, write Equation 1 in terms of V ' =QW- W opt V ' Q W W opt , where Q-1ΛQ=R Q Λ Q R
E ε k 2= r dd 0-2 W k TP+ W k TR W k +- W k TR W opt - W opt TR W k + W opt TR W opt + W k TR W opt + W opt TR W k - W opt TR W opt = r dd 0+ V k TR V k -PTR-1P= ε min 2+ V k TR V k = ε min 2+ V k TQ-1QRQ-1Q V k = ε min 2+ V 'k TΛ V 'k ε k 2 r dd 0 2 W k P W k R W k W k R W opt W opt R W k W opt R W opt W k R W opt W opt R W k W opt R W opt r dd 0 V k R V k P R P ε min 2 V k R V k ε min 2 V k Q Q R Q Q V k ε min 2 V 'k Λ V 'k (3)
E ε k 2= ε min 2+j=0N-1 λ j E v j 'k 2 ε k 2 ε min 2 j 0 N 1 λ j v j 'k 2 So we need to know E v j 'k 2 v j 'k 2 , which are the diagonal elements of the covariance matrix of V 'k V 'k , or E V 'k V 'k T V 'k V 'k .
From the LMS update equation W k + 1 = W k +2μ ε k X k W k + 1 W k 2 μ ε k X k we get V 'k + 1 = W 'k +2μ ε k Q X k V 'k + 1 W 'k 2 μ ε k Q X k
𝒱 k + 1 =E V ' k + 1 V ' k + 1 T=E4μ2 ε k 2Q X k X k TQT= 𝒱 k +2μ ε k Q X k V 'k T+2μ ε k V 'k X k TQT+4μ2E ε k 2Q X k X k TQT 𝒱 k + 1 V ' k + 1 V ' k + 1 V 'k V 'k 2 μ ε k Q X k V 'k 2 μ ε k V 'k X k Q 4 μ 2 ε k 2 Q X k X k Q 𝒱 k 2 μ ε k Q X k V 'k 2 μ ε k V 'k X k Q 4 μ 2 ε k 2 Q X k X k Q (4)
Note that ε k = d k - W k T X k = d k - W opt T- V 'k TQ X k ε k d k W k X k d k W opt V 'k Q X k so
E ε k Q X k V 'k T=E d k Q X k V 'k T- W opt T X k Q X k V 'k T- V 'k TQ X k V 'k T=0+0-Q X k X k TQT V 'k V 'k T=-QE X k X k TQTE V 'k V 'k T=-Λ 𝒱 k ε k Q X k V 'k d k Q X k V 'k W opt X k Q X k V 'k V 'k Q X k V 'k 0 0 Q X k X k Q V 'k V 'k Q X k X k Q V 'k V 'k Λ 𝒱 k (5)
Note that the Patently False independence Assumption was invoked here.
To analyze E ε k 2Q X k X k TQT ε k 2 Q X k X k Q , we make yet another obviously false assumptioon that ε k 2 ε k 2 and X k X k are statistically independent. This is obviously false, since ε k = d k - W k T X k ε k d k W k X k . Otherwise, we get 4th-order terms in XX in the product. These can be dealt with, at the expense of a more complicated analysis, if a particular type of distribution (such as Gaussian) is assumed. See, for example Gardner. A questionable justification for this assumption is that as W k W opt W k W opt , W k W k becomes uncorrelated with X k X k (if we invoke the original independence assumption), which tends to randomize the error signal relative to X k X k . With this assumption, E ε k 2Q X k X k TQT=E ε k 2EQ X k X k TQT=E ε k 2Λ ε k 2 Q X k X k Q ε k 2 Q X k X k Q ε k 2 Λ Now ε k 2= ε min 2+ V 'k TΛ V 'k ε k 2 ε min 2 V 'k Λ V 'k so
E ε k 2= ε min 2+E λ j V j 'k 2= ε min 2+ λ j 𝒱 jj k ε k 2 ε min 2 j λ j V j 'k 2 ε min 2 j λ j 𝒱 jj k (6)
Thus, Equation 4 becomes
𝒱 k + 1 =I-4μΛ 𝒱 k +4μ2 λ j 𝒱 jj k Λ+4μ2 ε min 2Λ 𝒱 k + 1 I 4 μ Λ 𝒱 k 4 μ 2 j λ j 𝒱 jj k Λ 4 μ 2 ε min 2 Λ (7)
Now if this system is stable and converges, it converges to 𝒱 = 𝒱 + 1 𝒱 𝒱 + 1 4μΛ 𝒱 =4μ2 λ j 𝒱 jj + ε min 2Λ 4 μ Λ 𝒱 4 μ 2 j λ j 𝒱 jj ε min 2 Λ 𝒱 =μ λ j 𝒱 jj + ε min 2I 𝒱 μ j λ j 𝒱 jj ε min 2 I So it is a diagonal matrix with all elements on the diagonal equal:
Then 𝒱 ii =μ 𝒱 ii λ j + ε min 2 𝒱 ii μ 𝒱 ii j λ j ε min 2 𝒱 ii 1-μ λ j =μ ε min 2 𝒱 ii 1 μ j λ j μ ε min 2 𝒱 ii =μ ε min 21-μ λ j 𝒱 ii μ ε min 2 1 μ j λ j Thus the error in the LMS adaptive filter after convergence is
E ε 2= ε min 2+E V '∞ λ V '∞ = ε min 2+μ ε min 2 λ j 1-μ λ j = ε min 211-μ λ j = ε min 211-μtrR= ε min 211-μ r xx 0N ε 2 ε min 2 V '∞ λ V '∞ ε min 2 μ ε min 2 j λ j 1 μ j λ j ε min 2 1 1 μ j λ j ε min 2 1 1 μ tr R ε min 2 1 1 μ r xx 0 N (8)
E ε 2= ε min 211-μN σ x 2 ε 2 ε min 2 1 1 μ N σ x 2 (9)
1-μN σ x 2 1 μ N σ x 2 is called the misadjustment factor. Oftern, one chooses μμ to select a desired misadjustment factor, such as an error 10% higher than the Wiener filter error.

2nd-Order Convergence (Stability)

To determine the range for μμ for which Equation 7 converges, we must determine the μμ for which the matrix difference equation converges. 𝒱 k + 1 =I-4μΛ 𝒱 k +4μ2 λ j 𝒱 jj k Λ+4μ2 ε min 2Λ 𝒱 k + 1 I 4 μ Λ 𝒱 k 4 μ 2 j λ j 𝒱 jj k Λ 4 μ 2 ε min 2 Λ The off-diagonal elements each evolve independently according to 𝒱 ij k + 1 =1-4μ λ i 𝒱 ij k 𝒱 ij k + 1 1 4 μ λ i 𝒱 ij k These terms will decay to zero if i:4μ λ i <2 i 4 μ λ i 2 , or μ<12 λ max μ 1 2 λ max
The diagonal terms evolve according to 𝒱 ii k + 1 =1-4μ λ i 𝒱 ii k +4μ2 λ i λ j 𝒱 jj k +4μ2 ε min 2 λ i 𝒱 ii k + 1 1 4 μ λ i 𝒱 ii k 4 μ 2 λ i j λ j 𝒱 jj k 4 μ 2 ε min 2 λ i For the homoegeneous equation 𝒱 ii k + 1 =1-4μ λ i 𝒱 ii k +4μ2 λ i λ j 𝒱 jj k 𝒱 ii k + 1 1 4 μ λ i 𝒱 ii k 4 μ 2 λ i j λ j 𝒱 jj k for 1-4μ λ i 1 4 μ λ i positive,
𝒱 ii k + 1 1-4μ λ i 𝒱 iimax k +4μ2 λ i λ j 𝒱 jjmax k =1-4μ λ i +4μ2 λ i λ j 𝒱 jjmax k 𝒱 ii k + 1 1 4 μ λ i 𝒱 iimax k 4 μ 2 λ i j λ j 𝒱 jjmax k 1 4 μ λ i 4 μ 2 λ i j λ j 𝒱 jjmax k (10)
𝒱 ii k + 1 𝒱 ii k + 1 will be strictly less than 𝒱 jjmax k 𝒱 jjmax k for 1-4μ λ i +4μ2 λ i λ j <1 1 4 μ λ i 4 μ 2 λ i j λ j 1 or 4μ2 λ i λ j <4μ λ i 4 μ 2 λ i j λ j 4 μ λ i or
μ<1 λ j =1trR=1N r xx 0=1N σ x 2 μ 1 j λ j 1 tr R 1 N r xx 0 1 N σ x 2 (11)
This is a more rigorous bound than the first-order bounds. Ofter engineers choose μμ a few times smaller than this, since more rigorous analyses yield a slightly smaller bound. μ=μ3N σ x 2 μ μ 3 N σ x 2 is derived in some analyses assuming Gaussian x k x k , d k d k .