So far, our approximation problem has been posed in an inner product space, and we have thus measured our approximation error using norms that are induced by an inner product such as the L2/ℓ2L2/ℓ2 norms (or weighted L2/ℓ2L2/ℓ2 norms). Sometimes this is a natural choice – it can be interpreted as the “energy” in the error and arises often in the case of signals corrupted by Gaussian noise. However, more often than not, it is used simply because it is easy to deal with.
In some cases we might be interested in approximating with respect to other norms – in particular we will consider approximation with respect to
ℓpℓp-norms for p≠0p≠0. First, we introduce the concept of a “unit ball”. Any norm gives us rise to a unit ball, i.e., x:x=1.x:x=1. Some important examples of unit balls for the ℓpℓp norms in R2R2 are depicted below.
We now consider an example of approximating a point in R2R2 with a point
in a 1-D subspace while measuring error using the ℓpℓp norm for p=1,2,∞p=1,2,∞.
Suppose V=R2V=R2,
A
=
span
2
-
1
,
and
x
=
2
1
.
A
=
span
2
-
1
,
and
x
=
2
1
.
(1)
We will want to find x^∈Ax^∈A that minimizes x-x^px-x^p. Since x^∈Ax^∈A, we can write
x
^
=
2
α
-
α
x
^
=
2
α
-
α
(2)
and thus
e
=
x
-
x
^
=
2
-
2
α
1
+
α
.
e
=
x
-
x
^
=
2
-
2
α
1
+
α
.
(3)
While we can solve for α∈Rα∈R to minimize epep
directly in some cases, a geometric interpretation is also useful. In each
case, on can imagine growing an ℓpℓp ball centered on xx until the ball
intersects with AA. This will be the point x^∈Ax^∈A. that is
closest to xx in the ℓpℓp norm. We first illustrate this for the ℓ2ℓ2 norm below:
In order to calculate x^x^ we can apply the orthogonality principle. Since 〈e,[21]T〉=0〈e,[21]T〉=0 we obtain a solution defined by α=35α=35.
We now observe that in the case of the ℓ∞ℓ∞ norm the picture changes somewhat. The closest point in ℓ∞ℓ∞ is illustrated below:
Note that the error is no longer orthogonal to the subspace AA. In this case we can still calculate x^x^ from the observation that the two terms in the error should be equal, which yields α=13α=13.
The situation is even more different for the case of the ℓ1ℓ1 norm, which is illustrated below:
We now observe that x^x^ corresponds to α=1α=1. Note that in this case the error term is [02]T[02]T. This punctuates a general trend: for large values of pp, the ℓpℓp norm tends to spread error evenly across all terms, while for small values of pp the error is more highly concentrated.
When is it useful to approximate in ℓpℓp or LpLp norms for p≠0p≠0?
- Filter Design: In some cases we will want the best fit to a specified frequency response in an L∞L∞ sense rather than the L2L2 sense. This minimizes the maximum error rather than total energy in the error. In the figure below we illustrate a desired frequency response. If the L∞L∞ norm of the error is small, then we are guaranteed that the approximation to our desired frequency response will lie within the illustrated bounds.
- Geometry representation: In compressing 3D geometry, can be useful to bound the L∞L∞ error to ensure
that basic shapes of narrow features (like poles, power lines, etc.) are preserved.
- Sparsity: In the case where the error is known to be sparse (i.e., zero on most indices) it can be useful to measure the error in the ℓ1ℓ1 norm.