For much of its history, signal processing has focused on signals produced by physical systems. Many natural and man-made systems can be modeled as linear, so it is natural to consider signal models that complement this kind of linear structure. This notion has been incorporated into modern signal processing by modeling signals as vectors living in an appropriate vector space. This captures the linear structure that we often desire, namely that if we add two signals together we obtain a new, physically meaningful signal. Moreover, vector spaces allow us to apply intuitions and tools from geometry in $\mathbb{R}^3$, such as lengths, distances, and angles, to describe and compare signals of interest. This is useful even when our signals live in high-dimensional or infinite-dimensional spaces.
Throughout this course, we will treat signals as real-valued functions having domains that are either continuous or discrete, and either infinite or finite. These assumptions will be made clear as necessary in each chapter. We will assume that the reader is relatively comfortable with the key concepts of vector spaces, and we provide only a brief review of those concepts required to develop the theory of compressive sensing (CS). For a more thorough review of vector spaces, see this introductory course in Digital Signal Processing.
We will typically be concerned with normed vector spaces, i.e., vector spaces endowed with a norm. In the case of a discrete, finite domain, we can view our signals as vectors in an $N$-dimensional Euclidean space, denoted by $\mathbb{R}^N$. When dealing with vectors in $\mathbb{R}^N$, we will make frequent use of the $\ell_p$ norms, which are defined for $p \in [1, \infty]$ as
$$
\|x\|_p =
\begin{cases}
\left( \displaystyle\sum_{i=1}^{N} |x_i|^p \right)^{1/p}, & p \in [1, \infty); \\[2ex]
\displaystyle\max_{i = 1, 2, \ldots, N} |x_i|, & p = \infty.
\end{cases}
\qquad (1)
$$
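As a concrete check of Equation (1), the following sketch computes the $\ell_p$ norm of a small vector in pure Python (the function name `lp_norm` is our own, not from any library):

```python
import math

def lp_norm(x, p):
    """l_p norm of a vector x, following Equation (1)."""
    if p == math.inf:
        return max(abs(xi) for xi in x)          # p = infinity: largest magnitude
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [3.0, -4.0]
print(lp_norm(x, 1))         # 7.0  (sum of magnitudes)
print(lp_norm(x, 2))         # 5.0  (Euclidean length of a 3-4-5 triangle)
print(lp_norm(x, math.inf))  # 4.0  (largest magnitude)
```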
In Euclidean space we can also consider the standard inner product in $\mathbb{R}^N$, which we denote
$$
\langle x, z \rangle = z^T x = \sum_{i=1}^{N} x_i z_i.
\qquad (2)
$$
This inner product leads to the $\ell_2$ norm: $\|x\|_2 = \sqrt{\langle x, x \rangle}$.
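A minimal sketch of the relationship between Equation (2) and the $\ell_2$ norm (the function name `inner` is our own):

```python
import math

def inner(x, z):
    """Standard inner product <x, z> = sum_i x_i z_i, as in Equation (2)."""
    return sum(xi * zi for xi, zi in zip(x, z))

x = [1.0, 2.0, 2.0]
# The l_2 norm is the square root of the inner product of x with itself.
l2 = math.sqrt(inner(x, x))
print(l2)  # 3.0
```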
In some contexts it is useful to extend the notion of $\ell_p$ norms to the case where $p < 1$. In this case, the "norm" defined in Equation (1) fails to satisfy the triangle inequality, so it is actually a quasinorm. We will also make frequent use of the notation $\|x\|_0 := |\mathrm{supp}(x)|$, where $\mathrm{supp}(x) = \{i : x_i \neq 0\}$ denotes the support of $x$ and $|\mathrm{supp}(x)|$ denotes the cardinality of $\mathrm{supp}(x)$. Note that $\|\cdot\|_0$ is not even a quasinorm, but one can easily show that
$$
\lim_{p \to 0} \|x\|_p^p = |\mathrm{supp}(x)|,
\qquad (3)
$$
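A quick numerical illustration of the limit in Equation (3): as $p \to 0$, each nonzero entry's contribution $|x_i|^p$ tends to $1$, so $\sum_i |x_i|^p$ approaches the number of nonzero entries. (The vector below is an arbitrary example of our own choosing.)

```python
x = [0.5, 0.0, -3.0, 0.0, 2.0]   # three nonzero entries, so |supp(x)| = 3

for p in [1.0, 0.5, 0.1, 0.01]:
    # ||x||_p^p = sum_i |x_i|^p; note 0**p == 0 for p > 0
    print(p, sum(abs(xi) ** p for xi in x))
# The sums approach 3 = |supp(x)| as p shrinks toward 0.
```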
justifying this choice of notation. The $\ell_p$ (quasi-)norms have notably different properties for different values of $p$. To illustrate this, in Figure 1 we show the unit sphere, i.e., $\{x : \|x\|_p = 1\}$, induced by each of these norms in $\mathbb{R}^2$. Note that for $p < 1$ the corresponding unit sphere is non-convex (reflecting the quasinorm's violation of the triangle inequality).
We typically use norms as a measure of the strength of a signal or the size of an error. For example, suppose we are given a signal $x \in \mathbb{R}^2$ and wish to approximate it using a point in a one-dimensional affine space $A$. If we measure the approximation error using an $\ell_p$ norm, then our task is to find the $\widehat{x} \in A$ that minimizes $\|x - \widehat{x}\|_p$. The choice of $p$ will have a significant effect on the properties of the resulting approximation error. An example is illustrated in Figure 2. To compute the closest point in $A$ to $x$ using each $\ell_p$ norm, we can imagine growing an $\ell_p$ sphere centered on $x$ until it intersects with $A$; the point of intersection is the $\widehat{x} \in A$ that is closest to $x$ in the corresponding $\ell_p$ norm. We observe that larger $p$ tends to spread the error more evenly among the two coefficients, while smaller $p$ leads to an error that is more unevenly distributed and tends to be sparse. This intuition generalizes to higher dimensions and plays an important role in the development of CS theory.