From an algebraic point of view, Equation 5 is an elegant reformulation of the least
squares problem. Though easy to remember, it unfortunately
obscures the geometric content, suggested by the word
'projection,' of Equation 4. As
projections arise frequently in many applications, we pause
here to develop them more carefully.
With respect to the normal equations we note that if
$\mathcal{N}(A) = \{0\}$ then
$$x = (A^T A)^{-1} A^T b \tag{12}$$
and so the orthogonal projection of $b$ onto $\mathcal{R}(A)$ is:
$$b_R = Ax = A (A^T A)^{-1} A^T b \tag{13}$$
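As a concrete check of Equations 12 and 13, here is a minimal sketch in plain Python with exact fractions. The matrix `A` and vector `b` are made-up illustrative data, and `solve_normal_equations` is a hypothetical helper (not from the text) that handles only the two-column case, where $A^T A$ is 2-by-2 and can be inverted in closed form.

```python
from fractions import Fraction as F

def solve_normal_equations(A, b):
    """Return x = (A^T A)^{-1} A^T b for a matrix A with exactly two columns."""
    # Entries of the 2x2 matrix A^T A and of the vector A^T b.
    g11 = sum(row[0] * row[0] for row in A)
    g12 = sum(row[0] * row[1] for row in A)
    g22 = sum(row[1] * row[1] for row in A)
    c1 = sum(row[0] * bi for row, bi in zip(A, b))
    c2 = sum(row[1] * bi for row, bi in zip(A, b))
    det = g11 * g22 - g12 * g12
    if det == 0:
        raise ValueError("columns of A are dependent, so N(A) != {0}")
    # Closed-form inverse of a symmetric 2x2 matrix applied to (c1, c2).
    return [(g22 * c1 - g12 * c2) / det, (g11 * c2 - g12 * c1) / det]

# Illustrative data (an assumption, not taken from the text).
A = [[F(1), F(0)], [F(1), F(1)], [F(1), F(2)]]
b = [F(1), F(0), F(2)]

x = solve_normal_equations(A, b)
b_R = [row[0] * x[0] + row[1] * x[1] for row in A]   # b_R = A x
residual = [bi - pi for bi, pi in zip(b, b_R)]

# The residual b - b_R should be orthogonal to each column of A.
dots = [sum(row[j] * r for row, r in zip(A, residual)) for j in range(2)]
```

Both entries of `dots` come out to zero, which is the geometric content of the normal equations: what is left of $b$ after removing $b_R$ is perpendicular to the column space.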
Defining
$$P = A (A^T A)^{-1} A^T \tag{14}$$
Equation 13 takes the form $b_R = Pb$. Commensurate with our
notion of what a 'projection' should be, we expect that
$P$ maps vectors not in $\mathcal{R}(A)$ onto $\mathcal{R}(A)$
while leaving vectors already in $\mathcal{R}(A)$
unscathed. More succinctly, we expect that $P b_R = b_R$,
i.e., $PPb = Pb$. As the latter should hold for all
$b \in \mathbb{R}^m$ we expect that
$$P^2 = P \tag{15}$$
With respect to Equation 14 we find that indeed
$$P^2 = A (A^T A)^{-1} A^T A (A^T A)^{-1} A^T = A (A^T A)^{-1} A^T = P \tag{16}$$
We also note that the $P$ in Equation 14 is symmetric. We dignify these
properties through

Definition 1 (orthogonal projection). A matrix $P$ that satisfies
$P^2 = P$ is called a projection. A symmetric
projection is called an orthogonal projection.
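Both properties in the definition can be checked numerically for the $P$ of Equation 14. The sketch below uses exact fractions and an illustrative 3-by-2 matrix `A` that is an assumption, not data from the text. The helpers `matmul`, `transpose`, and `inverse_2x2` are written here for the example only.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def inverse_2x2(G):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    (a, b), (c, d) = G
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative 3x2 matrix with independent columns (an assumption).
A = [[F(1), F(0)], [F(1), F(1)], [F(1), F(2)]]
At = transpose(A)

# Equation 14: P = A (A^T A)^{-1} A^T.
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

is_projection = matmul(P, P) == P   # P^2 = P
is_symmetric = transpose(P) == P    # P^T = P
```

Exact rational arithmetic is used so that the equality tests are meaningful; with floating point one would compare within a tolerance instead.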
We have taken some pains to motivate the use of the word
'projection.' You may be wondering, however, what symmetry has
to do with orthogonality. We explain this in terms of the
tautology
$$b = Pb + (I - P)b \tag{17}$$
Now, if $P$ is a projection then so too is $I - P$. Moreover, if $P$ is
symmetric then the dot product of $b$'s two constituents is
$$(Pb)^T (I - P) b = b^T P^T (I - P) b = b^T (P - P^2) b = b^T 0\, b = 0 \tag{18}$$
i.e., $Pb$ is orthogonal to $(I - P)b$.
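The claim that $I - P$ inherits the projection property from $P$ takes one line to verify. Expanding the square and using $P^2 = P$ gives:

```latex
(I - P)^2 = I - 2P + P^2 = I - 2P + P = I - P
```

If $P$ is also symmetric then so is $I - P$, since $(I - P)^T = I - P^T = I - P$.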
As examples of nonorthogonal projections we offer
$$P = \begin{pmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 0 & 0 \\ -\tfrac{1}{4} & -\tfrac{1}{2} & 1 \end{pmatrix}$$
and $I - P$.
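One way to confirm that this $P$ is a projection but not an orthogonal one is to test the defining properties directly. The sketch below does so with exact fractions. The test vector `b` is an arbitrary choice made for illustration, and `matmul` is a helper written for this example.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

P = [[F(1), F(0), F(0)],
     [F(-1, 2), F(0), F(0)],
     [F(-1, 4), F(-1, 2), F(1)]]
I = [[F(int(i == j)) for j in range(3)] for i in range(3)]
Q = [[I[i][j] - P[i][j] for j in range(3)] for i in range(3)]  # Q = I - P

P_is_projection = matmul(P, P) == P
Q_is_projection = matmul(Q, Q) == Q
P_is_symmetric = [list(col) for col in zip(*P)] == P

# Split a test vector b (chosen arbitrarily) as b = Pb + (I - P)b.
b = [[F(1)], [F(0)], [F(0)]]
Pb = matmul(P, b)
Qb = matmul(Q, b)
dot = sum(p[0] * q[0] for p, q in zip(Pb, Qb))
```

Here `P` and `Q` both satisfy the projection property, `P` is not symmetric, and `dot` is nonzero, so the two constituents of `b` are not perpendicular.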
Finally, let us note that the central formula,
$P = A (A^T A)^{-1} A^T$, is even a bit more general than advertised. It has
been billed as the orthogonal projection onto the column space
of $A$. The need often arises,
however, for the orthogonal projection onto some arbitrary
subspace $M$. The key to using the
old $P$ is simply to realize that
every subspace is the column space of
some matrix. More precisely, if
$$\{x_1, \ldots, x_m\} \tag{19}$$
is a basis for $M$ then clearly if these $x_j$
are placed into the columns of a matrix called
$A$ then $\mathcal{R}(A) = M$. For example, if
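$M$ is the line through $(1 \;\; 1)^T$ then
$$P = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \tag{20}$$
is orthogonal projection onto $M$.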
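The recipe of placing a basis into the columns of $A$ can be sketched as a small routine. The function names `projector` and `inverse` are hypothetical, chosen for this illustration. The inverse of $A^T A$ is computed by Gauss-Jordan elimination in exact fractions, which is one reasonable choice among several and not a method prescribed by the text.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(G):
    """Invert a square matrix by Gauss-Jordan elimination with exact fractions."""
    n = len(G)
    # Augment G with the identity: [G | I].
    M = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(G)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("vectors are linearly dependent, not a basis")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def projector(basis):
    """Orthogonal projection onto span(basis); the basis vectors become the columns of A."""
    At = [[F(v) for v in vec] for vec in basis]   # rows of A^T are the basis vectors
    A = [list(col) for col in zip(*At)]
    return matmul(matmul(A, inverse(matmul(At, A))), At)

P = projector([[1, 1]])              # the line through (1 1)^T, as in Equation 20
Pb = matmul(P, [[F(3)], [F(1)]])     # project b = (3 1)^T, an arbitrary test vector
```

For the line through $(1 \;\; 1)^T$ this reproduces the matrix of Equation 20, and the test vector $(3 \;\; 1)^T$ lands on $(2 \;\; 2)^T$, the nearest point of the line.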