We now show how to exploit the concentration of measure properties of sub-Gaussian distributions to provide a simple proof that sub-Gaussian matrices satisfy the restricted isometry property (RIP). Specifically, we wish to show that by constructing an M×NM×N matrix ΦΦ at random with MM sufficiently large, then with high probability there exists a δK∈(0,1)δK∈(0,1) such that
(
1
-
δ
K
)
x
2
2
≤
Φ
x
2
2
≤
(
1
+
δ
K
)
x
2
2
(
1
-
δ
K
)
x
2
2
≤
Φ
x
2
2
≤
(
1
+
δ
K
)
x
2
2
(1)
holds for all x∈ΣKx∈ΣK (where ΣKΣK denotes the set of all signals xx with at most KK nonzeros).
We begin by observing that if all we require is that δ2K>0δ2K>0, then we may set M=2KM=2K and draw a ΦΦ according to a Gaussian distribution, or indeed any continuous univariate distribution. In this case, with probability 1, any subset of 2K2K columns will be linearly independent, and hence all subsets of 2K2K columns will be bounded below by 1-δ2K1-δ2K where δ2K>0δ2K>0. However, suppose we wish to know the constant δ2Kδ2K. In order to find the value of the constant we must consider all possible NKNKKK-dimensional subspaces of RNRN. From a computational perspective, this is impossible for any realistic values of NN and KK. Moreover, in light of the lower bounds we described earlier in this course, the actual value of δ2Kδ2K in this case is likely to be very close to 1. Thus, we focus instead on the problem of achieving the RIP of order 2K2K for a specified constant δ2Kδ2K.
To ensure that the matrix will satisfy the RIP, we will impose two conditions on the random distribution. First, we require that the distribution is sub-Gaussian. In order to simplify our argument, we will use the simpler results stated in Corollary 2 from "Concentration of measure for sub-Gaussian random variables", which we briefly recall.
Suppose that ΦΦ is an M×NM×N matrix whose entries φijφij are i.i.d. with φij∼ SSub (1/M)φij∼ SSub (1/M). Let Y=ΦxY=Φx for x∈RNx∈RN. Then for any ϵ>0ϵ>0, and any x∈RNx∈RN,
E
∥
Y
∥
2
2
=
∥
x
∥
2
2
E
∥
Y
∥
2
2
=
∥
x
∥
2
2
(2)and
P
∥
Y
∥
2
2
-
∥
x
∥
2
2
≥
ϵ
∥
x
∥
2
2
≤
2
exp
-
M
ϵ
2
κ
*
P
∥
Y
∥
2
2
-
∥
x
∥
2
2
≥
ϵ
∥
x
∥
2
2
≤
2
exp
-
M
ϵ
2
κ
*
(3)with κ*=2/(1-log(2))≈6.52κ*=2/(1-log(2))≈6.52.
By exploiting this theorem, we assume that the distribution used to construct ΦΦ is strictly sub-Gaussian. This is done simply to yield more concrete constants. The argument could easily be modified to establish a similar result for general sub-Gaussian distributions by instead using Theorem 2 from "Concentration of measure for sub-Gaussian random variables".
Our second condition is that we require that the distribution yield a matrix that is approximately norm-preserving, which will require that
E
(
φ
i
j
2
)
=
1
M
,
E
(
φ
i
j
2
)
=
1
M
,
(4)
and hence the variance is 1/M1/M.
We shall now show how the concentration of measure inequality in Corollary 1 can be used together with covering arguments to prove the RIP for sub-Gaussian random matrices. Our general approach will be to construct nets of points in each KK-dimensional subspace, apply Equation 3 to all of these points through a union bound, and then extend the result from our finite set of points to all possible KK-dimensional signals. Thus, in order to prove the result, we will require the following upper bound on the number of points required to construct the nets of points. (For an overview of results similar to Lemma 1 and of various related concentration of measure results, we refer the reader to the excellent introduction of [1].)
Let ϵ∈(0,1)ϵ∈(0,1) be given. There exists a set of points QQ such that |Q|≤(3/ϵ)K|Q|≤(3/ϵ)K and for any x∈RKx∈RK with ∥x∥2≤1∥x∥2≤1 there is a point q∈Qq∈Q satisfying ∥x-q∥2≤ϵ∥x-q∥2≤ϵ.
We construct QQ in a greedy fashion. We first select an arbitrary point q1∈RKq1∈RK with ∥q1∥2≤1∥q1∥2≤1. We then continue adding points to QQ so that at step ii we add a point qi∈RKqi∈RK with ∥qi∥2≤1∥qi∥2≤1 which satisfies ∥qi-qj∥2>ϵ∥qi-qj∥2>ϵ for all j<ij<i. This continues until we can add no more points (and hence for any x∈RKx∈RK with ∥x∥2≤1∥x∥2≤1 there is a point q∈Qq∈Q satisfying ∥x-q∥2≤ϵ∥x-q∥2≤ϵ.) Now we wish to bound |Q||Q|. Observe that if we center balls of radius ϵ/2ϵ/2 at each point in QQ, then these balls are disjoint and lie within a ball of radius 1+ϵ/21+ϵ/2. Thus, if BK(r)BK(r) denotes a ball of radius rr in RKRK, then
|
Q
|
·
Vol
B
K
(
ϵ
/
2
)
≤
Vol
B
K
(
1
+
ϵ
/
2
)
|
Q
|
·
Vol
B
K
(
ϵ
/
2
)
≤
Vol
B
K
(
1
+
ϵ
/
2
)
(5)and hence
|
Q
|
≤
Vol
B
K
(
1
+
ϵ
/
2
)
Vol
B
K
(
ϵ
/
2
)
=
(
1
+
ϵ
/
2
)
K
(
ϵ
/
2
)
K
≤
3
/
ϵ
K
.
|
Q
|
≤
Vol
B
K
(
1
+
ϵ
/
2
)
Vol
B
K
(
ϵ
/
2
)
=
(
1
+
ϵ
/
2
)
K
(
ϵ
/
2
)
K
≤
3
/
ϵ
K
.
(6)
We now turn to our main theorem, which is based on the proof given in [2].
Fix δ∈(0,1)δ∈(0,1). Let ΦΦ be an M×NM×N random matrix whose entries φijφij are i.i.d. with φij∼ SSub (1/M)φij∼ SSub (1/M). If
M
≥
κ
1
K
log
N
K
,
M
≥
κ
1
K
log
N
K
,
(7)then ΦΦ satisfies the RIP of order KK with the prescribed δδ with probability exceeding 1-2e-κ2M1-2e-κ2M, where κ1κ1 is arbitrary and κ2=δ2/2κ*-log(42e/δ)/κ1κ2=δ2/2κ*-log(42e/δ)/κ1.
First note that it is enough to prove Equation 1 in the case ∥x∥2=1∥x∥2=1, since ΦΦ is linear. Next, fix an index set T⊂{1,2,...,N}T⊂{1,2,...,N} with |T|=K|T|=K, and let XTXT denote the KK-dimensional subspace spanned by the columns of ΦTΦT. We choose a finite set of points QTQT such that QT⊆XTQT⊆XT, ∥q∥2≤1∥q∥2≤1 for all q∈QTq∈QT, and for all x∈XTx∈XT with ∥x∥2≤1∥x∥2≤1 we have
min
q
∈
Q
T
∥
x
-
q
∥
2
≤
δ
/
14
.
min
q
∈
Q
T
∥
x
-
q
∥
2
≤
δ
/
14
.
(8)From Lemma 1, we can choose such a set QTQT with |QT|≤(42/δ)K|QT|≤(42/δ)K. We then repeat this process for each possible index set TT, and collect all the sets QTQT together:
Q
=
⋃
T
:
|
T
|
=
K
Q
T
.
Q
=
⋃
T
:
|
T
|
=
K
Q
T
.
(9)There are NKNK possible index sets TT. We can bound this number by
N
K
=
N
(
N
-
1
)
(
N
-
2
)
⋯
(
N
-
K
+
1
)
K
!
≤
N
K
K
!
≤
e
N
K
K
,
N
K
=
N
(
N
-
1
)
(
N
-
2
)
⋯
(
N
-
K
+
1
)
K
!
≤
N
K
K
!
≤
e
N
K
K
,
(10)where the last inequality follows since from Sterling's approximation we have K!≥(K/e)KK!≥(K/e)K. Hence |Q|≤(42eN/δK)K|Q|≤(42eN/δK)K. Since the entries of ΦΦ are drawn according to a strictly sub-Gaussian distribution, from Corollary 1 we have Equation 3. We next use the union bound to apply Equation 3 to this set of points with ϵ=δ/2ϵ=δ/2, with the result that, with probability exceeding
1
-
2
(
42
e
N
/
δ
K
)
K
e
-
M
δ
2
/
2
κ
*
,
1
-
2
(
42
e
N
/
δ
K
)
K
e
-
M
δ
2
/
2
κ
*
,
(11)we have
(
1
-
δ
/
2
)
∥
q
∥
2
2
≤
∥
Φ
q
∥
2
2
≤
(
1
+
δ
/
2
)
∥
q
∥
2
2
,
for
all
q
∈
Q
.
(
1
-
δ
/
2
)
∥
q
∥
2
2
≤
∥
Φ
q
∥
2
2
≤
(
1
+
δ
/
2
)
∥
q
∥
2
2
,
for
all
q
∈
Q
.
(12)We observe that if MM satisfies Equation 7 then
log
42
e
N
δ
K
K
≤
K
log
N
K
log
42
e
δ
≤
M
log
(
42
e
/
δ
)
κ
1
log
42
e
N
δ
K
K
≤
K
log
N
K
log
42
e
δ
≤
M
log
(
42
e
/
δ
)
κ
1
(13)and thus Equation 11 exceeds 1-2e-κ2M1-2e-κ2M as desired.
We now define AA as the smallest number such that
∥
Φ
x
∥
2
≤
1
+
A
∥
x
∥
2
,
for
all
x
∈
Σ
K
,
∥
x
∥
2
≤
1
.
∥
Φ
x
∥
2
≤
1
+
A
∥
x
∥
2
,
for
all
x
∈
Σ
K
,
∥
x
∥
2
≤
1
.
(14)Our goal is to show that A≤δA≤δ. For this, we recall that for any x∈ΣKx∈ΣK with ∥x∥2≤1∥x∥2≤1, we can pick a q∈Qq∈Q such that ∥x-q∥2≤δ/14∥x-q∥2≤δ/14 and such that x-q∈ΣKx-q∈ΣK (since if x∈XTx∈XT, we can pick q∈QT⊂XTq∈QT⊂XT satisfying ∥x-q∥2≤δ/14∥x-q∥2≤δ/14). In this case we have
∥
Φ
x
∥
2
≤
∥
Φ
q
∥
2
+
∥
Φ
(
x
-
q
)
∥
2
≤
1
+
δ
/
2
+
1
+
A
·
δ
/
14
.
∥
Φ
x
∥
2
≤
∥
Φ
q
∥
2
+
∥
Φ
(
x
-
q
)
∥
2
≤
1
+
δ
/
2
+
1
+
A
·
δ
/
14
.
(15)Since by definition AA is the smallest number for which Equation 14 holds, we obtain 1+A≤1+δ/2+1+A·δ/141+A≤1+δ/2+1+A·δ/14. Therefore
1
+
A
≤
1
+
δ
/
2
1
-
δ
/
14
≤
1
+
δ
,
1
+
A
≤
1
+
δ
/
2
1
-
δ
/
14
≤
1
+
δ
,
(16)as desired. We have proved the upper inequality in Equation 1. The lower inequality follows from this since
∥
Φ
x
∥
2
≥
∥
Φ
q
∥
2
-
∥
Φ
(
x
-
q
)
∥
2
≥
1
-
δ
/
2
-
1
+
δ
·
δ
/
14
≥
1
-
δ
,
∥
Φ
x
∥
2
≥
∥
Φ
q
∥
2
-
∥
Φ
(
x
-
q
)
∥
2
≥
1
-
δ
/
2
-
1
+
δ
·
δ
/
14
≥
1
-
δ
,
(17)which completes the proof.
Above we prove above that the RIP holds with high probability when the matrix ΦΦ is drawn according to a strictly sub-Gaussian distribution. However, we are often interested in signals that are sparse or compressible in some orthonormal basis Ψ≠IΨ≠I, in which case we would like the matrix ΦΨΦΨ to satisfy the RIP. In this setting it is easy to see that by choosing our net of points in the KK-dimensional subspaces spanned by sets of KK columns of ΨΨ, Theorem 1 will establish the RIP for ΦΨΦΨ for ΦΦ again drawn from a sub-Gaussian distribution. This universality of ΦΦ with respect to the sparsity-inducing basis is an attractive property that was initially observed for the Gaussian distribution (based on symmetry arguments), but we can now see is a property of more general sub-Gaussian distributions. Indeed, it follows that with high probability such a ΦΦ will simultaneously satisfy the RIP with respect to an exponential number of fixed bases.
-
Ball, K. (1997). Flavors of Geometry: Vol. 31. An Elementary Introduction to Modern Convex Geometry. (). Cambridge, England: MSRI Publ., Cambridge Univ. Press.
-
Baraniuk, R. and Davenport, M. and DeVore, R. and Wakin, M. (2008). A simple proof of the restricted isometry property for random matrices. Const. Approx., 28(3), 253–263.