Second-order Convergence Analysis of the LMS Algorithm and Misadjustment Error

Module by: Douglas L. Jones


Convergence of the mean (first-order analysis) is insufficient to guarantee desirable behavior of the LMS algorithm; the variance could still be infinite. It is important to show that the variance of the filter coefficients is finite, and to determine how close the average squared error is to the minimum possible error using an exact Wiener filter.

$$E[\varepsilon_k^2] = E[(d_k - W_k^T X_k)^2] = E[d_k^2 - 2 d_k X_k^T W_k + W_k^T X_k X_k^T W_k] = r_{dd}(0) - 2 W_k^T P + W_k^T R W_k \tag{1}$$

The minimum error is obtained using the Wiener filter $W_{\mathrm{opt}} = R^{-1} P$:

$$\varepsilon_{\min}^2 = E[\varepsilon^2] = r_{dd}(0) - 2 P^T R^{-1} P + P^T R^{-1} R R^{-1} P = r_{dd}(0) - P^T R^{-1} P \tag{2}$$

To analyze the average error in LMS, write Equation 1 in terms of $V' = Q[W - W_{\mathrm{opt}}]$, where $Q^{-1} \Lambda Q = R$. With $V_k = W_k - W_{\mathrm{opt}}$,

$$\begin{aligned}
E[\varepsilon_k^2] &= r_{dd}(0) - 2 W_k^T P + W_k^T R W_k + \left(-W_k^T R W_{\mathrm{opt}} - W_{\mathrm{opt}}^T R W_k + W_{\mathrm{opt}}^T R W_{\mathrm{opt}}\right) + W_k^T R W_{\mathrm{opt}} + W_{\mathrm{opt}}^T R W_k - W_{\mathrm{opt}}^T R W_{\mathrm{opt}} \\
&= r_{dd}(0) + V_k^T R V_k - P^T R^{-1} P \\
&= \varepsilon_{\min}^2 + V_k^T R V_k \\
&= \varepsilon_{\min}^2 + V_k^T Q^{-1} Q R Q^{-1} Q V_k \\
&= \varepsilon_{\min}^2 + {V'_k}^T \Lambda V'_k
\end{aligned} \tag{3}$$

$$E[\varepsilon_k^2] = \varepsilon_{\min}^2 + \sum_{j=0}^{N-1} \lambda_j E[(v'_{j,k})^2]$$

So we need to know $E[(v'_{j,k})^2]$, which are the diagonal elements of the covariance matrix of $V'_k$, or $E[V'_k {V'_k}^T]$.
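As a quick numerical check of Equations 1 to 3, the sketch below evaluates the mean-squared error of a 2-tap filter in both forms. The values of `R`, `P`, `r_dd0`, and `W_test` are made-up illustrative numbers, not taken from the module, and the helper names are hypothetical.

```python
# Hypothetical 2-tap example; R, P, and r_dd0 are made-up illustrative values.
R = [[1.0, 0.5],
     [0.5, 1.0]]      # input autocorrelation matrix E[X X^T]
P = [0.7, 0.2]        # cross-correlation vector E[d X]
r_dd0 = 1.0           # desired-signal power E[d^2]

def solve2(A, b):
    """Solve the 2x2 linear system A w = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def dot(u, v):
    """Inner product of two 2-vectors."""
    return u[0] * v[0] + u[1] * v[1]

def quad(u, A, v):
    """Return u^T A v for 2-vectors u, v and a 2x2 matrix A."""
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

def mse_direct(W):
    """Equation 1: r_dd(0) - 2 W^T P + W^T R W."""
    return r_dd0 - 2.0 * dot(W, P) + quad(W, R, W)

W_opt = solve2(R, P)                  # Wiener filter R^{-1} P
eps_min2 = r_dd0 - dot(P, W_opt)      # Equation 2: r_dd(0) - P^T R^{-1} P

def mse_excess_form(W):
    """Equation 3: eps_min^2 + V^T R V with V = W - W_opt."""
    V = [W[0] - W_opt[0], W[1] - W_opt[1]]
    return eps_min2 + quad(V, R, V)

W_test = [0.3, -0.4]                  # arbitrary coefficient vector
print(W_opt, eps_min2)
print(mse_direct(W_test), mse_excess_form(W_test))
```

The two printed error values should agree up to rounding, which is the content of Equation 3: any deviation from the Wiener filter adds the non-negative term $V^T R V$ to the minimum error.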

From the LMS update equation $W_{k+1} = W_k + 2 \mu \varepsilon_k X_k$, subtracting $W_{\mathrm{opt}}$ from both sides and multiplying by $Q$, we get $V'_{k+1} = V'_k + 2 \mu \varepsilon_k Q X_k$.
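The update equation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the system-identification setup, the unknown system `W_true`, the white Gaussian input, and the function names are all assumptions made for the example.

```python
import random

def lms_step(W, X, d, mu):
    """One LMS update. Returns (new coefficients, a-priori error eps_k)."""
    eps = d - sum(w * x for w, x in zip(W, X))            # eps_k = d_k - W_k^T X_k
    W_new = [w + 2.0 * mu * eps * x for w, x in zip(W, X)]  # W_{k+1} = W_k + 2 mu eps_k X_k
    return W_new, eps

def run_lms(W_true, mu, n_samples, noise_std=0.0, seed=0):
    """Identify a hypothetical FIR system W_true from white Gaussian input."""
    rng = random.Random(seed)
    N = len(W_true)
    W = [0.0] * N
    X = [0.0] * N                      # tapped delay line [x_k, x_{k-1}, ...]
    for _ in range(n_samples):
        X = [rng.gauss(0.0, 1.0)] + X[:-1]
        d = sum(w * x for w, x in zip(W_true, X)) + rng.gauss(0.0, noise_std)
        W, _ = lms_step(W, X, d, mu)
    return W

W_true = [0.8, -0.2, 0.1]              # hypothetical unknown system
W_hat = run_lms(W_true, mu=0.01, n_samples=5000)
print(W_hat)
```

With no observation noise the minimum error is zero, so the coefficients settle at `W_true`; adding noise through `noise_std` leaves a residual coefficient variance, which is what the rest of this analysis quantifies.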

$$\begin{aligned}
\mathcal{V}_{k+1} = E[V'_{k+1} {V'_{k+1}}^T] &= E\left[V'_k {V'_k}^T + 2 \mu \varepsilon_k Q X_k {V'_k}^T + 2 \mu \varepsilon_k V'_k X_k^T Q^T + 4 \mu^2 \varepsilon_k^2 Q X_k X_k^T Q^T\right] \\
&= \mathcal{V}_k + 2 \mu E[\varepsilon_k Q X_k {V'_k}^T] + 2 \mu E[\varepsilon_k V'_k X_k^T Q^T] + 4 \mu^2 E[\varepsilon_k^2 Q X_k X_k^T Q^T]
\end{aligned} \tag{4}$$

Note that $\varepsilon_k = d_k - W_k^T X_k = d_k - W_{\mathrm{opt}}^T X_k - {V'_k}^T Q X_k$, so

$$\begin{aligned}
E[\varepsilon_k Q X_k {V'_k}^T] &= E\left[d_k Q X_k {V'_k}^T - W_{\mathrm{opt}}^T X_k Q X_k {V'_k}^T - {V'_k}^T Q X_k Q X_k {V'_k}^T\right] \\
&= 0 + 0 - E[Q X_k X_k^T Q^T V'_k {V'_k}^T] \\
&= -Q E[X_k X_k^T] Q^T E[V'_k {V'_k}^T] \\
&= -\Lambda \mathcal{V}_k
\end{aligned} \tag{5}$$

Note that the Patently False Independence Assumption was invoked here.

To analyze $E[\varepsilon_k^2 Q X_k X_k^T Q^T]$, we make yet another obviously false assumption: that $\varepsilon_k^2$ and $X_k$ are statistically independent. This is obviously false, since $\varepsilon_k = d_k - W_k^T X_k$. Without this assumption, we get 4th-order terms in $X$ in the product. These can be dealt with, at the expense of a more complicated analysis, if a particular type of distribution (such as Gaussian) is assumed. See, for example, Gardner [1]. A questionable justification for this assumption is that as $W_k \to W_{\mathrm{opt}}$, $W_k$ becomes uncorrelated with $X_k$ (if we invoke the original independence assumption), which tends to randomize the error signal relative to $X_k$. With this assumption,

$$E[\varepsilon_k^2 Q X_k X_k^T Q^T] = E[\varepsilon_k^2] \, E[Q X_k X_k^T Q^T] = E[\varepsilon_k^2] \Lambda$$

Now $\varepsilon_k^2 = \varepsilon_{\min}^2 + {V'_k}^T \Lambda V'_k$, so

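$$E[\varepsilon_k^2] = \varepsilon_{\min}^2 + E\Big[\sum_j \lambda_j (V'_{j,k})^2\Big] = \varepsilon_{\min}^2 + \sum_j \lambda_j \mathcal{V}_{jj,k} \tag{6}$$

Thus, Equation 4 becomes

$$\mathcal{V}_{k+1} = (I - 4 \mu \Lambda) \mathcal{V}_k + 4 \mu^2 \sum_j \lambda_j \mathcal{V}_{jj,k} \Lambda + 4 \mu^2 \varepsilon_{\min}^2 \Lambda \tag{7}$$

Now if this system is stable and converges, it converges to $\mathcal{V}_\infty = \mathcal{V}_{\infty+1}$, so that

$$4 \mu \Lambda \mathcal{V}_\infty = 4 \mu^2 \Big(\sum_j \lambda_j \mathcal{V}_{jj,\infty} + \varepsilon_{\min}^2\Big) \Lambda$$

$$\mathcal{V}_\infty = \mu \Big(\sum_j \lambda_j \mathcal{V}_{jj,\infty} + \varepsilon_{\min}^2\Big) I$$

So it is a diagonal matrix with all elements on the diagonal equal: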

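Then

$$\mathcal{V}_{ii,\infty} = \mu \Big(\mathcal{V}_{ii,\infty} \sum_j \lambda_j + \varepsilon_{\min}^2\Big)$$

$$\mathcal{V}_{ii,\infty} \Big(1 - \mu \sum_j \lambda_j\Big) = \mu \varepsilon_{\min}^2$$

$$\mathcal{V}_{ii,\infty} = \frac{\mu \varepsilon_{\min}^2}{1 - \mu \sum_j \lambda_j}$$

Thus the error in the LMS adaptive filter after convergence is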
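The fixed point can be checked numerically by iterating the diagonal part of Equation 7 until it settles. The sketch below does this for a hypothetical set of eigenvalues, step size, and minimum error (all assumed values chosen for illustration) and compares the result with the closed-form steady-state variance.

```python
def iterate_variances(lams, mu, eps_min2, steps):
    """Iterate the diagonal of Equation 7, starting from zero variances."""
    v = [0.0] * len(lams)
    for _ in range(steps):
        s = sum(l * vj for l, vj in zip(lams, v))        # sum_j lambda_j V_jj
        v = [(1.0 - 4.0 * mu * l) * vi + 4.0 * mu ** 2 * l * (s + eps_min2)
             for l, vi in zip(lams, v)]
    return v

def steady_state_variance(lams, mu, eps_min2):
    """Closed form: mu eps_min^2 / (1 - mu sum_j lambda_j)."""
    return mu * eps_min2 / (1.0 - mu * sum(lams))

lams = [0.5, 1.0, 1.5, 2.0]     # hypothetical eigenvalues of R
mu = 0.02                       # hypothetical step size
eps_min2 = 0.1                  # hypothetical minimum (Wiener) error

v_iter = iterate_variances(lams, mu, eps_min2, steps=5000)
v_closed = steady_state_variance(lams, mu, eps_min2)
mse_inf = eps_min2 + sum(l * v for l, v in zip(lams, v_iter))
print(v_iter, v_closed, mse_inf)
```

All entries of `v_iter` should end up equal to `v_closed`, and `mse_inf` should match $\varepsilon_{\min}^2 / (1 - \mu \sum_j \lambda_j)$ even though the eigenvalues differ.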

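$$\begin{aligned}
E[\varepsilon_\infty^2] &= \varepsilon_{\min}^2 + E[{V'_\infty}^T \Lambda V'_\infty] \\
&= \varepsilon_{\min}^2 + \frac{\mu \varepsilon_{\min}^2 \sum_j \lambda_j}{1 - \mu \sum_j \lambda_j} \\
&= \varepsilon_{\min}^2 \frac{1}{1 - \mu \sum_j \lambda_j} \\
&= \varepsilon_{\min}^2 \frac{1}{1 - \mu \, \mathrm{tr}(R)} \\
&= \varepsilon_{\min}^2 \frac{1}{1 - \mu \, r_{xx}(0) N}
\end{aligned} \tag{8}$$

$$E[\varepsilon_\infty^2] = \varepsilon_{\min}^2 \frac{1}{1 - \mu N \sigma_x^2} \tag{9}$$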
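A rough Monte Carlo check of Equation 9 can be sketched as follows. The setup is an assumption made for illustration: a hypothetical 4-tap unknown system, unit-variance white Gaussian input, and additive observation noise whose variance plays the role of $\varepsilon_{\min}^2$. Because the derivation rests on independence assumptions that do not hold exactly, the measured and predicted ratios should be close but need not coincide.

```python
import random

def simulate_lms_mse(N, mu, noise_var, n_samples, n_skip, seed=1):
    """Average eps_k^2 of LMS after skipping the initial transient."""
    rng = random.Random(seed)
    W_true = [1.0 / (i + 1) for i in range(N)]   # hypothetical unknown system
    W = [0.0] * N
    X = [0.0] * N
    noise_std = noise_var ** 0.5
    total = 0.0
    for k in range(n_samples):
        X = [rng.gauss(0.0, 1.0)] + X[:-1]       # white input, sigma_x^2 = 1
        d = sum(w * x for w, x in zip(W_true, X)) + rng.gauss(0.0, noise_std)
        eps = d - sum(w * x for w, x in zip(W, X))
        W = [w + 2.0 * mu * eps * x for w, x in zip(W, X)]
        if k >= n_skip:
            total += eps * eps
    return total / (n_samples - n_skip)

N, mu, noise_var = 4, 0.02, 0.01
measured = simulate_lms_mse(N, mu, noise_var, n_samples=60000, n_skip=10000)
predicted = noise_var / (1.0 - mu * N * 1.0)     # Equation 9 with sigma_x^2 = 1
print(measured / noise_var, predicted / noise_var)
```

Both printed numbers are ratios of steady-state error to minimum error; the second is $1/(1 - 0.08)$ for these assumed parameters.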
$1 - \mu N \sigma_x^2$, the denominator in Equation 9, is called the misadjustment factor. Often, one chooses $\mu$ to select a desired misadjustment factor, such as an error 10% higher than the Wiener filter error.
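Solving Equation 9 for $\mu$ gives the step size for a target ratio of steady-state error to Wiener error. The sketch below assumes that ratio, the filter length, and the input power are known; the function name and the example values ($N = 10$, $\sigma_x^2 = 1$) are hypothetical.

```python
def step_size_for_error_ratio(ratio, N, sigma_x2):
    """Solve ratio = 1 / (1 - mu N sigma_x^2) for mu (Equation 9)."""
    if ratio <= 1.0:
        raise ValueError("ratio must exceed 1: LMS cannot beat the Wiener error")
    return (1.0 - 1.0 / ratio) / (N * sigma_x2)

# Target: steady-state error 10% higher than the Wiener filter error.
mu = step_size_for_error_ratio(1.1, N=10, sigma_x2=1.0)
achieved = 1.0 / (1.0 - mu * 10 * 1.0)
print(mu, achieved)
```

For small excess error the result is close to the simpler rule $\mu \approx (\text{ratio} - 1) / (N \sigma_x^2)$, since $1 - 1/\text{ratio} \approx \text{ratio} - 1$ when the ratio is near 1.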

2nd-Order Convergence (Stability)

To determine the range for $\mu$ for which Equation 7 converges, we must determine the $\mu$ for which the matrix difference equation converges:

$$\mathcal{V}_{k+1} = (I - 4 \mu \Lambda) \mathcal{V}_k + 4 \mu^2 \sum_j \lambda_j \mathcal{V}_{jj,k} \Lambda + 4 \mu^2 \varepsilon_{\min}^2 \Lambda$$

The off-diagonal elements each evolve independently according to

$$\mathcal{V}_{ij,k+1} = (1 - 4 \mu \lambda_i) \mathcal{V}_{ij,k}$$

These terms will decay to zero if $\forall i: 4 \mu \lambda_i < 2$, or $\mu < \frac{1}{2 \lambda_{\max}}$.

The diagonal terms evolve according to

$$\mathcal{V}_{ii,k+1} = (1 - 4 \mu \lambda_i) \mathcal{V}_{ii,k} + 4 \mu^2 \lambda_i \sum_j \lambda_j \mathcal{V}_{jj,k} + 4 \mu^2 \varepsilon_{\min}^2 \lambda_i$$

For the homogeneous equation

$$\mathcal{V}_{ii,k+1} = (1 - 4 \mu \lambda_i) \mathcal{V}_{ii,k} + 4 \mu^2 \lambda_i \sum_j \lambda_j \mathcal{V}_{jj,k}$$

with $1 - 4 \mu \lambda_i$ positive, and with $\mathcal{V}_{jj\max,k}$ denoting the largest diagonal element at time $k$,

$$\mathcal{V}_{ii,k+1} \le (1 - 4 \mu \lambda_i) \mathcal{V}_{jj\max,k} + 4 \mu^2 \lambda_i \sum_j \lambda_j \mathcal{V}_{jj\max,k} = \Big(1 - 4 \mu \lambda_i + 4 \mu^2 \lambda_i \sum_j \lambda_j\Big) \mathcal{V}_{jj\max,k} \tag{10}$$

$\mathcal{V}_{ii,k+1}$ will be strictly less than $\mathcal{V}_{jj\max,k}$ for

$$1 - 4 \mu \lambda_i + 4 \mu^2 \lambda_i \sum_j \lambda_j < 1$$

or

$$4 \mu^2 \lambda_i \sum_j \lambda_j < 4 \mu \lambda_i$$

or

$$\mu < \frac{1}{\sum_j \lambda_j} = \frac{1}{\mathrm{tr}(R)} = \frac{1}{N r_{xx}(0)} = \frac{1}{N \sigma_x^2} \tag{11}$$

This is a more rigorous bound than the first-order bounds. Often engineers choose $\mu$ a few times smaller than this, since more rigorous analyses yield a slightly smaller bound. A bound of $\mu < \frac{1}{3 N \sigma_x^2}$ is derived in some analyses assuming Gaussian $x_k$, $d_k$.
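The bound of Equation 11 can be illustrated by iterating the homogeneous diagonal recursion just below and just above $\mu = 1/\mathrm{tr}(R)$. The eigenvalues below are hypothetical, chosen so that $1 - 4 \mu \lambda_i$ stays positive at both step sizes, as the derivation of Equation 10 assumes.

```python
def max_variance_after(lams, mu, steps):
    """Iterate the homogeneous diagonal recursion from all ones; return max V_ii."""
    v = [1.0] * len(lams)
    for _ in range(steps):
        s = sum(l * vj for l, vj in zip(lams, v))        # sum_j lambda_j V_jj
        v = [(1.0 - 4.0 * mu * l) * vi + 4.0 * mu ** 2 * l * s
             for l, vi in zip(lams, v)]
    return max(v)

lams = [0.5, 0.75, 1.0, 1.0, 1.0, 1.25, 1.5, 1.0]   # hypothetical, tr(R) = 8
mu_bound = 1.0 / sum(lams)                           # Equation 11

below = max_variance_after(lams, 0.9 * mu_bound, steps=2000)
above = max_variance_after(lams, 1.1 * mu_bound, steps=2000)
print(mu_bound, below, above)
```

With the step size 10% under the bound the variances decay toward zero, and with it 10% over the bound they grow without limit, which matches the second-order stability condition for this example.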

References

  1. W.A. Gardner. (1984). Learning Characteristics of Stochastic-Gradient-Descent Algorithms: A General Study, Analysis, and Critique. Signal Processing, 6, 113-133.
