Skip to content Skip to navigation

OpenStax-CNX

You are here: Home » Content » Least Squares

Navigation

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • Rice Digital Scholarship

    This module is included in aLens by: Digital Scholarship at Rice UniversityAs a part of collection: "Matrix Analysis"

    Click the "Rice Digital Scholarship" link to see all content affiliated with them.

Also in these lenses

  • Lens for Engineering

    This module is included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.
 

Least Squares

Module by: CJ Ganier. E-mail the author

Summary: Describes the use of the least squares method for matrix analysis.

Introduction

We learned in the previous chapter that Ax=b Ax b need not possess a solution when the number of rows of A A exceeds its rank, i.e., r<m rm. As this situation arises quite often in practice, typically in the guise of 'more equations than unknowns,' we establish a rationale for the absurdity Ax=b Ax b .

The Normal Equations

The goal is to choose xx such that AxAx is as close as possible to bb. Measuring closeness in terms of the sum of the squares of the components we arrive at the 'least squares' problem of minimizing

res

Axb2=(Axb)T(Axb) Ax b 2 Ax b Ax b
(1)
over all xR x. The path to the solution is illuminated by the Fundamental Theorem. More precisely, we write bR , bN ,bRAbNAT:b=bR+bN bR bN bR A bN A b bR bN . On noting that (i) bR ,x n :(AxbR)A bR x n Ax bR A and (ii) AAT A A we arrive at the Pythagorean Theorem.

Pythagorean Theorem

norm2Axb=Ax bR + bN 2=Ax bR 2+bN2 norm Ax b 2 Ax bR bN 2 Ax bR 2 bN 2
(2)
It is now clear from the Pythagorean Theorem that the best xx is the one that satisfies
Ax=bR Ax bR
(3)
As bRA bR A this equation indeed possesses a solution. We have yet however to specify how one computes bR bR given bb. Although an explicit expression for bR bR, the so called orthogonal projection of bb onto A A, in terms of AA and bb is within our grasp we shall, strictly speaking, not require it. To see this, let us note that if xx satisfies the above equation then
Axb=AxbR+bN=bN Ax b Ax bR bN bN
(4)
As bN bN is no more easily computed than bR bR you may claim that we are just going in circles. The 'practical' information in the above equation however is that (Axb)AT Ax b A , i.e., AT(Axb)=0 A Ax b 0 , i.e.,
ATAx=ATb A Ax A b
(5)
As ATbAT A b A regardless of bb this system, often referred to as the normal equations, indeed has a solution. This solution is unique so long as the columns of ATA A A are linearly independent, i.e., so long as ATA=0 A A 0 . Recalling Chapter 2, Exercise 2, we note that this is equivalent to A=0 A 0 . We summarize our findings in

Theorem 1

The set of xbR x bR for which the misfit Axb2 Ax b 2 is smallest is composed of those xx for which ATAx=ATb A A x A b . There is always at least one such xx. There is exactly one such xx if A=0 A 0 .

As a concrete example, suppose with reference to Figure 1 that A=( 11 01 00 ) A 11 01 00 and b=( 1 1 1) b 1 1 1 .

Figure 1: The decomposition of bb.
Figure 1 (lsqfig.jpg)

As bA b A there is no xx such that Ax=b Ax b. Indeed, Axb2=x1+x2+-12+x212+11 Ax b 2 x1 x2 -1 2 x2 1 2 1 1 , with the minimum uniquely attained at x=( 0 1 ) x 0 1 , in agreement with the unique solution of the above equation, for ATA=( 11 12 ) A A 1 1 1 2 and ATb=( 1 2 ) A b 1 2 . We now recognize, a posteriori, that bR=Ax=( 1 1 0 ) bR Ax 1 1 0 is the orthogonal projection of bb onto the column space of AA.

Applying Least Squares to the Biaxial Test Problem

We shall formulate the identification of the 20 fiber stiffnesses in this previous figure, as a least squares problem. We envision loading, ff, the 9 nodes and measuring the associated 18 displacements, xx. From knowledge of xx and ff we wish to infer the components of K=diagk K diagk where kk is the vector of unknown fiber stiffnesses. The first step is to recognize that

ATKAx=f A K A x f
(6)
may be written as
B,B=ATdiagAx:Bk=f B B A diag Ax Bk f
(7)
Though conceptually simple this is not of great use in practice, for BB is 18-by-20 and hence the above equation possesses many solutions. The way out is to compute kk as the result of more than one experiment. We shall see that, for our small sample, 2 experiments will suffice. To be precise, we suppose that x1 x1 is the displacement produced by loading f1 f1 while x2 x2 is the displacement produced by loading f2 f2. We then piggyback the associated pieces in B=( ATdiagA x1 ATdiagA x2 ) B A diag A x1 A diag A x2 and f=( f1 f2 ) f f1 f2 This BB is 36-by-20 and so the system Bk=f Bk f is overdetermined and hence ripe for least squares.

We proceed then to assemble BB and ff. We suppose f1 f1 and f2 f2 to correspond to horizontal and vertical stretching

f1=( -100010-100010-100010 )T f1 -1 0 0 0 1 0 -1 0 0 0 1 0 -1 0 0 0 1 0
(8)
f2=( 0101010000000-10-10-1 )T f2 0 1 0 1 0 1 0 0 0 0 0 0 0 -1 0 -1 0 -1
(9)
respectively. For the purpose of our example we suppose that each kj=1 kj 1 except k8=5 k8 5 . We assemble ATKA A K A as in Chapter 2 and solve
ATKAxj=fj A K A xj fj
(10)
with the help of the pseudoinverse. In order to impart some `reality' to this problem we taint each xj xj with 10 percent noise prior to constructing BB. Regarding
BTBk=BTf B B k B f
(11)
we note that Matlab solves this system when presented with k=B\f when BB is rectangular. We have plotted the results of this procedure in the Figure 2. The stiff fiber is readily identified.

Figure 2: Results of a successful biaxial test.
Figure 2 (lsqtst.jpg)

Projections

From an algebraic point of view Equation 5)is an elegant reformulation of the least squares problem. Though easy to remember it unfortunately obscures the geometric content, suggested by the word 'projection,' of Equation 4. As projections arise frequently in many applications we pause here to develop them more carefully. With respect to the normal equations we note that if A=0 A 0 then

x=ATA-1ATb x A A -1 A b
(12)
and so the orthogonal projection of bb onto A A is:
bR=Ax=AATA-1ATb bR Ax A A A -1 A b
(13)
Defining
P=AATA-1AT P A A A -1 A
(14)
Equation 13 takes the form bR=Pb bR P b . Commensurate with our notion of what a 'projection' should be we expect that PP map vectors not in A A onto A A while leaving vectors already in A A unscathed. More succinctly, we expect that PbR=bR P bR bR , i.e., PbR=PbR P bR P bR . As the latter should hold for all bRm b Rm we expect that
P2=P P2 P
(15)
With respect to Equation 14 we find that indeed
P2=AATA-1ATAATA-1AT=AATA-1AT=P P2 A A A -1 A A A A -1 A A A A -1 A P
(16)
We also note that the PP in Equation 14 is symmetric. We dignify these properties through
Definition 1: orthogonal projection
A matrix PP that satisfies P2=P P2 P is called a projection. A symmetric projection is called an orthogonal projection.
We have taken some pains to motivate the use of the word 'projection.' You may be wondering however what symmetry has to do with orthogonality. We explain this in terms of the tautology
b=PbIb b Pb IP b
(17)
Now, if PP is a projection then so too is IP IP. Moreover, if PP is symmetric then the dot product of bb's two constituents is
(Pb)T(IP)b=bTPT(IP)b=bT(PP2)b=bT0b=0 Pb IP b b P IP b b P P2 b b 0 b 0
(18)
i.e., Pb P b is orthogonal to (IP)b IP b . As examples of a nonorthogonal projections we offer P=( 100 -1200 -14-121 ) P 1 0 0 -12 0 0 -14 -12 1 and IP IP. Finally, let us note that the central formula, P=AATA-1=AT P A A A -1 A , is even a bit more general than advertised. It has been billed as the orthogonal projection onto the column space of AA. The need often arises however for the orthogonal projection onto some arbitrary subspace MM. The key to using the old PP is simply to realize that every subspace is the column space of some matrix. More precisely, if
x1...xm x1 ... xm
(19)
is a basis for MM then clearly if these xj xj are placed into the columns of a matrix called AA then A=M A M . For example, if MM is the line through ( 11 )T 11 then
P=( 1 1 )12( 11 )=12( 11 11 ) P 1 1 12 11 12 1 1 1 1
(20)
is orthogonal projection onto MM.

Exercises

  1. Gilbert Strang was stretched on a rack to lengths =6 6 , 77, and 88 feet under applied forces of f=1 f1 , 22, and 44 tons. Assuming Hooke's law L=cf L cf , find his compliance, cc, and original height, LL, by least squares.
  2. With regard to the example of § 3 note that, due to the the random generation of the noise that taints the displacements, one gets a different 'answer' every time the code is invoked.
    1. Write a loop that invokes the code a statistically significant number of times and submit bar plots of the average fiber stiffness and its standard deviation for each fiber, along with the associated M--file.
    2. Experiment with various noise levels with the goal of determining the level above which it becomes difficult to discern the stiff fiber. Carefully explain your findings.
  3. Find the matrix that projects R3 3 onto the line spanned by ( 101 )T 1 0 1 .
  4. Find the matrix that projects R3 3 onto the plane spanned by ( 101 )T 1 0 1 and ( 11-1 )T 1 1 -1 .
  5. If PP is the projection of Rmm onto a kk--dimensional subspace MM, what is the rank of PP and what is RP P?

Content actions

Download module as:

Add module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks