Skip to content Skip to navigation

Connexions

You are here: Home » Content » LMS Algorithm Analysis

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

      What are tags? tag icon

      Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

    • External bookmarks
  • E-mail the authors
  • Rate this module (How does the rating system work?)

    Rating system

    Ratings

    Ratings allow you to judge the quality of modules. If other users have ranked the module then its average rating is displayed below. Ratings are calculated on a scale from one star (Poor) to five stars (Excellent).

    How to rate a module

    Hover over the star that corresponds to the rating you wish to assign. Click on the star to add your rating. Your rating should be based on the quality of the content. You must have an account and be logged in to rate content.

    (0 ratings)

Recently Viewed

This feature requires Javascript to be enabled.

LMS Algorithm Analysis

Module by: Clayton Scott, Robert Nowak

Note: Your browser may not currently support MathML. See our browser support page for additional details. You can always view the correct math in the PDF version.

Objective

Minimize instantaneous squared error

e k 2w= y k xkTw2 e k w 2 y k x k w 2 (1)

LMS Algorithm

ŵk=ŵk1+μxk e k w k w k 1 μ x k e k (2)
Where wk w k is the new weight vector, wk1 w k 1 is the old weight vector, and μxk e k μ x k e k is a small step in the instantaneous error gradient direction.

Interpretation in Terms of Weight Error Vector

Define

vk=wkwopt v k w k w opt (3)
Where wopt w opt is the optimal weight vector and
ε k = y k xkTwopt ε k y k x k w opt (4)
where ε k ε k is the minimum error. The stochastic difference equation is:
vk=IμxkxkTvk1+μxk ε k v k I μ x k x k v k 1 μ x k ε k (5)

Convergence/Stability Analysis

Show that (tightness)

limBmax{PrvkB}=0 B v k B 0 (6)
With probability 1, the weight error vector is bounded for all kk.

Chebyshev's inequality is

PrvkBEvk2B2 v k B v k 2 B 2 (7)
and
PrvkB=1B2Evk2+σvk2 v k B 1 B 2 v k 2 v k (8)
where Evk2 v k 2 is the squared bias. If Evk2+σvk2 v k 2 v k is finite for all kk, then limBPrvkB=0 B v k B 0 for all kk.

Also,

σvk2=trEvkvkT v k tr v k v k (9)
Therefore σvk2 v k is finite if the diagonal elements of Γ k EvkvkT Γ k v k v k are bounded.

Convergence in Mean

Evk0 v k 0 as k k . Take expectation of Equation 5 using smoothing property to simplify the calculation. We have convergence in mean if

  1. Rxx R xx is positive definite (invertible).
  2. μ<2 λ max Rxx μ 2 λ max R xx .

Bounded Variance

Show that Γ k =EvkvkT Γ k v k v k , the weight vector error covariance is bounded for all kk.

note:

We could have Evk0 v k 0 , but σvk2 v k ; in which case the algorithm would not be stable.
Recall that it is fairly straightforward to show that the diagonal elements of the transformed covariance Ck=UΓkUT C k U Γ k U tend to zero if μ<1 λ max Rxx μ 1 λ max R xx ( U U is the eigenvector matrix of Rxx R xx ; Rxx=UDUT R xx U D U ). The diagonal elements of Ck C k were denoted by γ k , i i,i=1p γ k , i i i 1 p .

Note:

σvk2=trΓk=trUTCkU=trCk=i=1p γ k , i v k tr Γ k tr U C k U tr C k i 1 p γ k , i
Thus, to guarantee boundedness of σvk2 v k we need to show that the "steady-state" values γ k , i γ i < γ k , i γ i .

We showed that

γ i =μα+ σ ε 221μ λ i γ i μ α σ ε 2 2 1 μ λ i (10)
where σ ε 2=E ε k 2 σ ε 2 ε k 2 , λ i λ i is the i th i th eigenvalue of Rxx R xx ( Rxx=U λ 1 00 λ p UT R xx U λ 1 0 0 λ p U ), and α=c σ ε 21c α c σ ε 2 1 c .
0<c=12i=1pμ λ i 1μ λ i <1 0 c 1 2 i 1 p μ λ i 1 μ λ i 1 (11)
We found a sufficient condition for μμ that guaranteed that the steady-state γ i γ i 's (and hence σvk2 v k ) are bounded: μ<23i=1p λ i μ 2 3 i 1 p λ i Where i=1p λ i =trRxx i 1 p λ i tr R xx is the input vector energy.

With this choice of μμ we have:

  1. convergence in mean
  2. bounded steady-state variance
This implies
limBmax{PrvkB}=0 B v k B 0 (12)
In other words, the LMS algorithm is stable about the optimum weight vector wopt w opt .

Learning Curve

Recall that

e k = y k xkTwk1 e k y k x k w k 1 (13)
and Equation 4. These imply
e k = ε k xkTvk1 e k ε k x k v k 1 (14)
where vk=wkwopt v k w k w opt . So the MSE
E e k 2= σ ε 2+Evk1TxkxkTvk1= σ ε 2+EEvk1TxkxkTvk1| x n ε n n,n<k= σ ε 2+Evk1TRxxvk1= σ ε 2+EtrRxxvk1vk1T= σ ε 2+trRxxΓk1 e k 2 σ ε 2 v k 1 x k x k v k 1 σ ε 2 x n ε n n n k v k 1 x k x k v k 1 σ ε 2 v k 1 R xx v k 1 σ ε 2 tr R xx v k 1 v k 1 σ ε 2 tr R xx Γ k 1 (15)
Where trRxxΓk1 α k 1 α=c σ ε 21c tr R xx Γ k 1 α k 1 α c σ ε 2 1 c . So the limiting MSE is
ε =limkE e k 2= σ ε 2+c σ ε 21c= σ ε 21c ε k e k 2 σ ε 2 c σ ε 2 1 c σ ε 2 1 c (16)
Since 0<c<1 0 c 1 was required for convergence, ε > σ ε 2 ε σ ε 2 so that we see noisy adaptation leads to an MSE larger than the optimal
E ε k 2=E y k xkTwopt2= σ ε 2 ε k 2 y k x k w opt 2 σ ε 2 (17)
To quantify the increase in the MSE, define the so-called misadjustment:
M= ε σ ε 2 σ ε 2= ε σ ε 21=α σ ε 2=c1c M ε σ ε 2 σ ε 2 ε σ ε 2 1 α σ ε 2 c 1 c (18)
We would of course like to keep MM as small as possible.

Learning Speed and Misadjustment Trade-off

Fast adaptation and quick convergence require that we take steps as large as possible. In other words, learning speed is proportional to μμ; larger μμ means faster convergence. How does μμ affect the misadjustment?

To guarantee convergence/stability we require μ<23i=1p λ i Rxx μ 2 3 i 1 p λ i R xx Let's assume that in fact μ1i=1p λ i μ 1 i 1 p λ i so that there is no problem with convergence. This condition implies μ1 λ i μ 1 λ i or μ λ i 1 i,i=1p μ λ i 1 i i 1 p . From here we see that

c=12i=1pμ λ i 1μ λ i 12μi=1p λ i 1 c 1 2 i 1 p μ λ i 1 μ λ i 1 2 μ i 1 p λ i 1 (19)
This misadjustment
M=c1cc=12μi=1p λ i M c 1 c c 1 2 μ i 1 p λ i (20)
This shows that larger step size μμ leads to larger misadjustment.

Since we still have convergence in mean, this essentially means that with a larger step size we "converge" faster but have a larger variance (rattling) about wopt w opt .

Summary

small μμ implies

  • small misadjustment in steady-state
  • slow adaptation/tracking
large μμ implies
  • large misadjustment in steady-state
  • fast adaptation/tracking

Example 1

wopt=11 w opt 1 1 xk01001 x k 0 1 0 0 1 y k =xkTwopt+ ε k y k x k w opt ε k ε k 00.01 ε k 0 0.01

LMS Algorithm

initialization w0=00 w 0 0 0 and wk=wk1+μxk e k k,k1 w k w k 1 μ x k e k k k 1 , where e k = y k xkTwk1 e k y k x k w k 1

Figure 1: μ=0.05 μ 0.05
Learning Curve
Learning Curve (.png)
Figure 2: μ=0.3 μ 0.3
LMS Learning Curve
LMS Learning Curve (.png)
Figure 3
Comparison of Learning Curves
Comparison of Learning Curves (.png)

Comments, questions, feedback, criticisms?

Send feedback