A linear transformation $P: X \to X$ is called a projection if
$P(x) = x$ for all $x \in R(P)$, i.e., $P(P(x)) = P(x)$ for all $x \in X$.
For example, $P: \mathbb{R}^3 \to \mathbb{R}^3$ defined by $P(x_1, x_2, x_3) = (x_1, x_2, 0)$ is a projection.
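As a quick sanity check, the defining property $P(P(x)) = P(x)$ can be verified numerically for the map $P(x_1, x_2, x_3) = (x_1, x_2, 0)$. The function name `project` and the test vector below are illustrative choices, not part of the original notes.

```python
def project(x):
    """Map (x1, x2, x3) to (x1, x2, 0): projection onto the x1-x2 plane."""
    x1, x2, _ = x
    return (x1, x2, 0.0)

# Arbitrary test vector (an assumption made for illustration).
x = (3.0, -1.0, 4.0)

once = project(x)      # P(x)
twice = project(once)  # P(P(x))

print(once)            # (3.0, -1.0, 0.0)
print(twice == once)   # True: applying P a second time changes nothing
```

A single vector does not prove idempotence, but here the property is clear from the formula: the third coordinate is already zero after the first application.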
If $P$ is a projection operator on an inner product space $V$, we say that
$P$ is an orthogonal projection if $R(P) \perp N(P)$, i.e., $\langle x, y \rangle = 0$ for all $x \in R(P)$, $y \in N(P)$.
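Not every projection is orthogonal. The sketch below compares two projections on $\mathbb{R}^2$ with the standard inner product; both have range $\mathrm{span}\{(1, 0)\}$, but only one has a null space perpendicular to it. The matrices `P_orth` and `P_obl` and the helper names are hypothetical examples chosen for illustration, not taken from the original notes.

```python
def matvec(M, x):
    """Multiply a matrix (tuple of rows) by a vector."""
    return tuple(sum(m * xi for m, xi in zip(row, x)) for row in M)

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def is_idempotent(P, samples):
    """Check P(P(x)) == P(x) on a few sample vectors."""
    return all(matvec(P, matvec(P, x)) == matvec(P, x) for x in samples)

# Illustrative projections (assumptions, not from the source):
P_orth = ((1.0, 0.0), (0.0, 0.0))  # (x1, x2) -> (x1, 0)
P_obl = ((1.0, 1.0), (0.0, 0.0))   # (x1, x2) -> (x1 + x2, 0)

samples = [(2.0, 5.0), (-1.0, 3.0)]
range_vec = (1.0, 0.0)   # spans R(P) for both maps
null_orth = (0.0, 1.0)   # spans N(P_orth)
null_obl = (1.0, -1.0)   # spans N(P_obl)

print(is_idempotent(P_orth, samples), is_idempotent(P_obl, samples))  # True True
print(inner(range_vec, null_orth))  # 0.0: R(P) is perpendicular to N(P)
print(inner(range_vec, null_obl))   # 1.0: R(P) is not perpendicular to N(P)
```

Both maps satisfy $P(P(x)) = P(x)$, so both are projections, but only `P_orth` meets the orthogonality condition $R(P) \perp N(P)$.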
If $P$ is an orthogonal projection, then for any $x \in V$ we can write:

$$x = Px + (I - P)x \tag{1}$$
where $Px \in R(P)$ and $(I - P)x \in N(P)$ (since $P(I - P)x = Px - P(Px) = Px - Px = 0$).
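The decomposition $x = Px + (I - P)x$ can be illustrated with a small numerical sketch. It assumes the projection $P(x_1, x_2, x_3) = (x_1, x_2, 0)$ on $\mathbb{R}^3$ with the standard inner product; the function names and the test vector are illustrative choices.

```python
def project(x):
    """P(x1, x2, x3) = (x1, x2, 0)."""
    x1, x2, _ = x
    return (x1, x2, 0.0)

def inner(x, y):
    """Standard inner product on R^3."""
    return sum(a * b for a, b in zip(x, y))

# Arbitrary test vector (an assumption made for illustration).
x = (3.0, -1.0, 4.0)

Px = project(x)                                  # component in R(P)
residual = tuple(a - b for a, b in zip(x, Px))   # (I - P)x

print(Px)                 # (3.0, -1.0, 0.0)
print(residual)           # (0.0, 0.0, 4.0)
print(project(residual))  # (0.0, 0.0, 0.0), so (I - P)x lies in N(P)

# The two components add back to x and are orthogonal to each other.
recombined = tuple(a + b for a, b in zip(Px, residual))
print(recombined == x)            # True
print(inner(Px, residual) == 0)   # True
```

The orthogonality of the two components is what distinguishes the orthogonal case; for a general projection the decomposition still holds, but the pieces need not be perpendicular.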
Now we see that the solution to our “best approximation in a linear
subspace” problem is an orthogonal projection: we wish to find a $P$ such that $R(P) = A$.
The question is now, how can we design such a projection operator?