We learned in the previous chapter that
Ax=b
Ax
b
need not possess a solution when the number of rows of
A
A exceeds its rank, i.e.,
r<m
rm. As this situation arises quite often in practice,
typically in the guise of 'more equations than unknowns,' we
establish a rationale for the absurdity
Ax=b
Ax
b
.
The goal is to choose xx such
that AxAx is as
close as possible to
bb. Measuring closeness in terms
of the sum of the squares of the components we arrive at the
'least squares' problem of minimizing
∥Ax−b∥2=Ax−bTAx−b
Ax
b
2
Ax
b
Ax
b
(1)
over all
x∈ℝ
x. The path to the solution is illuminated by the
Fundamental Theorem. More precisely, we write
∀bR,bN,bR∈ℝA∧bN∈ℕAT:b=bR+bN
bR
bN
bR
ℝA
bN
ℕ
A
b
bR
bN
. On noting that (i)
∀bR,x∈ℝn:Ax−bR∈ℝA
bR
x
ℝn
Ax
bR
ℝA
and (ii)
ℝA⊥ℕAT
⊥
ℝA
ℕ
A
we arrive at the Pythagorean Theorem.
norm2Ax−b=∥Ax−bR+bN∥2=∥Ax−bR∥2+∥bN∥2
norm
Ax
b
2
Ax
bR
bN
2
Ax
bR
2
bN
2
(2)
It is now clear from
the Pythagorean Theorem that
the best
xx is the one that
satisfies
Ax=bR
Ax
bR
(3)
As
bR∈ℝA
bR
ℝA
this equation indeed possesses a solution.
We have yet however to specify how one computes
bR
bR
given
bb.
Although an explicit expression for
bR
bR, the so called
orthogonal projection
of
bb onto
ℝA
ℝA,
in terms of
AA and
bb is within our grasp we shall,
strictly speaking, not require it. To see this, let us note
that if
xx satisfies
the above equation then
Ax−b=Ax−bR+bN=-bN
Ax
b
Ax
bR
bN
bN
(4)
As
bN
bN is no more easily computed than
bR
bR you may claim that we are just going in circles. The
'practical' information in
the
above equation however is that
Ax−b∈AT
Ax
b
A
,
i.e.,
ATAx−b=0
A
Ax
b
0
,
i.e.,
ATAx=ATb
A
Ax
A
b
(5)
As
ATb∈ℝAT
A
b
ℝ
A
regardless of
bb this
system, often referred to as the
normal
equations, indeed has a solution. This solution is
unique so long as the columns of
ATA
A
A
are linearly independent,
i.e.,
so long as
ℕATA=0
ℕ
A
A
0
. Recalling Chapter 2, Exercise
2, we note that this is equivalent to
ℕA=0
ℕA
0 . We
summarize our findings in
The set of
x∈bR
x
bR
for which the misfit
∥Ax−b∥2
Ax
b
2
is smallest is composed of those
xx for which
ATAx=ATb
A
A
x
A
b
. There is always at least one such
xx. There is exactly one
such xx if
ℕA=0
ℕA
0
.
As a concrete example, suppose with reference to Figure 1 that
A=110100
A
11
01
00
and
b=111
b
1
1
1
.
As
b≠ℝA
b
ℝA
there is no
xx such that
Ax=b
Ax
b. Indeed,
∥Ax−b∥2=x1+x2+-12+x2−12+1≥1
Ax
b
2
x1
x2
-1
2
x2
1
2
1
1
,
with the minimum uniquely attained at
x=01
x
0
1
,
in agreement with the unique solution of the above equation, for
ATA=1112
A
A
1 1
1 2
and
ATb=12
A
b
1
2
.
We now recognize, a posteriori, that
bR=Ax=110
bR
Ax
1
1
0
is the orthogonal projection of
bb onto the column space of
AA.
We shall formulate the identification of the 20 fiber
stiffnesses in this previous figure, as
a least squares problem. We envision loading,
ff, the 9 nodes and measuring the
associated 18 displacements,
xx. From knowledge of
xx and
ff we wish to infer the
components of K=diagk K
diagk
where kk is
the vector of unknown fiber stiffnesses. The first step is to
recognize that
ATKAx=f
A
K
A
x
f
(6)
may be written as
∀B,B=ATdiagAx:Bk=f
B
B
A
diag
Ax
Bk
f
(7)
Though conceptually simple this is not of great use in
practice, for
BB is 18-by-20 and
hence
the above
equation possesses many solutions. The way out is to
compute
kk as the result of more
than one experiment. We shall see that, for our small sample,
2 experiments will suffice.
To be precise, we suppose that
x1
x1
is the displacement produced by loading
f1
f1
while
x2
x2
is the displacement produced by loading
f2
f2. We
then piggyback the associated pieces in
B=ATdiagAx1ATdiagAx2
B
A
diag
A
x1
A
diag
A
x2
and
f=f1f2
f
f1
f2
This
BB is 36-by-20 and so the system
Bk=f
Bk
f
is overdetermined and hence ripe for least squares.
We proceed then to assemble BB
and ff. We suppose
f1
f1
and
f2
f2
to correspond to horizontal and vertical stretching
f1=-100010-100010-100010T
f1
-1 0
0 0 1
0 -1 0
0 0 1
0 -1 0
0 0 1
0
(8)
f2=0101010000000-10-10-1T
f2
0 1
0 1 0
1 0 0
0 0 0
0 0 -1
0 -1 0
-1
(9)
respectively. For the purpose of our example we suppose that each
kj=1
kj
1
except
k8=5
k8
5
. We assemble
ATKA
A
K
A
as in Chapter 2 and
solve
ATKAxj=fj
A
K
A
xj
fj
(10)
with the help of the pseudoinverse. In order to impart some
`reality' to this problem we taint each
xj
xj
with 10 percent noise prior to constructing
BB. Regarding
BTBk=BTf
B
B
k
B
f
(11)
we note that Matlab solves
this system when presented with
k=B\f when
BB is rectangular. We have
plotted the results of this procedure in the
Figure 2. The stiff fiber is readily identified.
From an algebraic point of view Equation 5)is an elegant reformulation of the least
squares problem. Though easy to remember it unfortunately
obscures the geometric content, suggested by the word
'projection,' of Equation 4. As
projections arise frequently in many applications we pause
here to develop them more carefully.
With respect to the normal equations we note that if
ℕA=0
ℕA
0
then
x=ATA-1ATb
x
A
A
-1
A
b
(12)
and so the orthogonal projection of
bb
onto
ℝA
ℝA
is:
bR=Ax=AATA-1ATb
bR
Ax
A
A
A
-1
A
b
(13)
Defining
P=AATA-1AT
P
A
A
A
-1
A
(14)
Equation 13 takes the form
bR=Pb
bR
P b
. Commensurate with our
notion of what a 'projection' should be we expect that
PP map vectors not in
ℝA
ℝA
onto
ℝA
ℝA
while leaving vectors already in
ℝA
ℝA
unscathed. More succinctly, we expect that
PbR=bR
P
bR
bR
,
i.e.,
PbR=PbR
P
bR
P
bR
. As the latter should hold for all
b∈Rm
b
Rm
we expect that
P2=P
P2
P
(15)
With respect to
Equation 14 we find that indeed
P2=AATA-1ATAATA-1AT=AATA-1AT=P
P2
A
A
A
-1
A
A
A
A
-1
A
A
A
A
-1
A
P
(16)
We also note that the
PP in
Equation 14 is symmetric. We dignify these
properties through
- Definition 1: orthogonal projection
A matrix PP that satisfies
P2=P
P2
P
is called a projection. A symmetric
projection is called an orthogonal projection.
We have taken some pains to motivate the use of the word
'projection.' You may be wondering however what symmetry has
to do with orthogonality. We explain this in terms of the
tautology
b=Pb+I−Pb
b
Pb
IP
b
(17)
Now, if
PP is a projection then
so too is
I−P
IP. Moreover, if
PP is
symmetric then the dot product of
bb's two constituents is
PbTI−Pb=bTPTI−Pb=bTP−P2b=bT0b=0
Pb
IP
b
b
P
IP
b
b
P
P2
b
b
0
b
0
(18)
i.e.,
Pb
P
b
is orthogonal to
I−Pb
IP
b
.
As examples of a nonorthogonal projections we offer
P=100-1200-14-121
P
1 0 0
-12
0 0
-14
-12
1
and
I−P
IP.
Finally, let us note that the central formula,
P=AATA-1=AT
P
A
A
A
-1
A
, is even a bit more general than advertised. It has
been billed as the orthogonal projection onto the column space
of
AA. The need often arises
however for the orthogonal projection onto some arbitrary
subspace
MM. The key to using the
old
PP is simply to realize that
every subspace is the column space of
some matrix. More precisely, if
x1...xm
x1
...
xm
(19)
is a basis for
MM then clearly if these
xj
xj
are placed into the columns of a matrix called
AA then
ℝA=M
ℝA
M
. For example, if
MM is
the line through
11T
11
then
P=111211=121111
P
1
1
12
11
12
1 1
1 1
(20)
is orthogonal projection onto
MM.
- Gilbert Strang was stretched on a rack to lengths
ℓ=6
ℓ
6
, 77, and
88 feet under applied forces of
f=1
f1
, 22,
and 44 tons. Assuming Hooke's
law
ℓ−L=cf
ℓL
cf
, find his compliance,
cc, and original height,
LL, by least squares.
- With regard to the example of § 3 note that, due
to the the random generation of the noise that taints the
displacements, one gets a different 'answer' every time the
code is invoked.
- Write a loop that invokes the code a statistically
significant number of times and submit bar plots of the
average fiber stiffness and its standard deviation for
each fiber, along with the associated M--file.
- Experiment with various noise levels with the
goal of determining the level above which it becomes
difficult to discern the stiff fiber. Carefully explain
your findings.
-
Find the matrix that projects
ℝ3
3
onto the line spanned by
101T
1
0
1
.
- Find the matrix that projects
ℝ3
3
onto the plane spanned by
101T
1 0
1
and
11-1T
1
1
-1
.
- If
PP is the projection of
ℝmm
onto a kk--dimensional subspace
MM, what is the rank of
PP and what is
ℝP
P?