Finally, we also examine the performance of these approaches in the presence of Gaussian noise. The case of Gaussian noise was first considered in [4], which examined the performance of ℓ0ℓ0 minimization with noisy measurements. We now see that Theorem 1 and Theorem 2 can be leveraged to provide similar guarantees for ℓ1ℓ1 minimization. To simplify our discussion we will restrict our attention to the case where x∈ΣK =
x
:
x
0
≤
K
x∈ΣK =
x
:
x
0
≤
K
, so that σK(x)1=0σK(x)1=0 and the error bounds in Theorem 1 and Theorem 2 depend only on the noise ee.
To begin, suppose that the coefficients of e∈RMe∈RM are i.i.d. according to a Gaussian distribution with mean zero and variance σ2σ2. Since the Gaussian distribution is itself sub-Gaussian, we can apply results such as Corollary 1 from "Concentration of measure for sub-Gaussian random variables" to show that there exists a constant c0>0c0>0 such that for any ϵ>0ϵ>0,
P
e
2
≥
(
1
+
ϵ
)
M
σ
≤
exp
-
c
0
ϵ
2
M
.
P
e
2
≥
(
1
+
ϵ
)
M
σ
≤
exp
-
c
0
ϵ
2
M
.
(17)Applying this result to Theorem 1 with ϵ=1ϵ=1, we obtain the following result for the special case of Gaussian noise.
Suppose that ΦΦ satisfies the RIP of order 2K2K with δ2K<2-1δ2K<2-1. Furthermore, suppose that x∈ΣKx∈ΣK and that we obtain measurements of the form y=Φx+ey=Φx+e where the entries of ee are i.i.d. N(0,σ2)N(0,σ2). Then when B(y)={z:Φz-y2≤2Mσ}B(y)={z:Φz-y2≤2Mσ}, the solution x^x^ to Equation 1 obeys
x
^
-
x
2
≤
8
1
+
δ
2
K
1
-
(
1
+
2
)
δ
2
K
M
σ
x
^
-
x
2
≤
8
1
+
δ
2
K
1
-
(
1
+
2
)
δ
2
K
M
σ
(18)with probability at least 1-exp(-c0M)1-exp(-c0M).
We can similarly consider Theorem 2 in the context of Gaussian noise. If we assume that the columns of ΦΦ have unit norm, then each coefficient of ΦTeΦTe is a Gaussian random variable with mean zero and variance σ2σ2. Using standard tail bounds for the Gaussian distribution (see Theorem 1 from "Sub-Gaussian random variables"), we have that
P
Φ
T
e
i
≥
t
σ
≤
exp
-
t
2
/
2
P
Φ
T
e
i
≥
t
σ
≤
exp
-
t
2
/
2
(19)for i=1,2,...,ni=1,2,...,n. Thus, using the union bound over the bounds for different ii, we obtain
P
Φ
T
e
∞
≥
2
log
N
σ
≤
N
exp
-
2
log
N
=
1
N
.
P
Φ
T
e
∞
≥
2
log
N
σ
≤
N
exp
-
2
log
N
=
1
N
.
(20)Applying this to Theorem 2, we obtain the following result, which is a simplified version of Theorem 1.1 of [3].
Suppose that ΦΦ has unit-norm columns and satisfies the RIP of order 2K2K with δ2K<2-1δ2K<2-1. Furthermore, suppose that x∈ΣKx∈ΣK and that we obtain measurements of the form y=Φx+ey=Φx+e where the entries of ee are i.i.d. N(0,σ2)N(0,σ2). Then when B(y)={z:ΦT(Φz-y)∞≤2logNσ}B(y)={z:ΦT(Φz-y)∞≤2logNσ}, the solution x^x^ to Equation 1 obeys
x
^
-
x
2
≤
4
2
1
+
δ
2
K
1
-
(
1
+
2
)
δ
2
K
K
log
N
σ
x
^
-
x
2
≤
4
2
1
+
δ
2
K
1
-
(
1
+
2
)
δ
2
K
K
log
N
σ
(21)with probability at least 1-1N1-1N.
Ignoring the precise constants and the probabilities with which the bounds hold (which we have made no effort to optimize), we observe that if M=O(KlogN)M=O(KlogN) then these results appear to be essentially the same. However, there is a subtle difference. Specifically, if MM and NN are fixed and we consider the effect of varying KK, we can see that Corollary 2 yields a bound that is adaptive to this change, providing a stronger guarantee when KK is small, whereas the bound in Corollary 1 does not improve as KK is reduced. Thus, while they provide very similar guarantees, there are certain circumstances where the Dantzig selector is preferable. See [3] for further discussion of the comparative advantages of these approaches.