Signal recovery in noise

Module by: Mark A. Davenport

Summary: This module establishes a number of results concerning various ℓ1 minimization algorithms designed for sparse signal recovery from noisy measurements. The results in this module apply to both bounded noise and Gaussian (or, more generally, sub-Gaussian) noise.

The ability to perfectly reconstruct a sparse signal from noise-free measurements represents a promising result. However, in most real-world systems the measurements are likely to be contaminated by some form of noise. For instance, in order to process data in a computer we must be able to represent it using a finite number of bits, and hence the measurements will typically be subject to quantization error. Moreover, systems which are implemented in physical hardware will be subject to a variety of different types of noise depending on the setting.

Perhaps somewhat surprisingly, one can show that it is possible to modify

\hat{x} = \arg\min_z \|z\|_1 \quad \text{subject to} \quad z \in B(y)
(1)

to stably recover sparse signals under a variety of common noise models [2], [3], [4]. As might be expected, the restricted isometry property (RIP) is extremely useful in establishing performance guarantees in noise.
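
To make the discussion concrete, the following is a minimal sketch of how one might solve Equation 1 numerically when B(y) is instantiated as the bounded-noise ball used later in Theorem 1. It uses the cvxpy package and arbitrary problem sizes, neither of which comes from this module; it illustrates the optimization problem rather than prescribing an implementation.

    import numpy as np
    import cvxpy as cp

    # Hypothetical problem sizes chosen only for illustration.
    N, M, K = 256, 80, 8
    rng = np.random.default_rng(0)

    Phi = rng.standard_normal((M, N)) / np.sqrt(M)                 # random sensing matrix
    x = np.zeros(N)
    x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)    # K-sparse signal

    eps = 0.01
    e = rng.standard_normal(M)
    e *= eps / np.linalg.norm(e)                                   # bounded noise, ||e||_2 <= eps
    y = Phi @ x + e

    # Equation 1 with B(y) = {z : ||Phi z - y||_2 <= eps}
    z = cp.Variable(N)
    prob = cp.Problem(cp.Minimize(cp.norm1(z)),
                      [cp.norm(Phi @ z - y, 2) <= eps])
    prob.solve()
    print("recovery error:", np.linalg.norm(z.value - x))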

In our analysis we will make repeated use of Lemma 1 from "Noise-free signal recovery", so we repeat it here for convenience.

Lemma 1

Suppose that Φ satisfies the RIP of order 2K with δ_{2K} < √2 − 1. Let x, x̂ ∈ R^N be given, and define h = x̂ − x. Let Λ_0 denote the index set corresponding to the K entries of x with largest magnitude and Λ_1 the index set corresponding to the K entries of h_{Λ_0^c} with largest magnitude. Set Λ = Λ_0 ∪ Λ_1. If ‖x̂‖_1 ≤ ‖x‖_1, then

\|h\|_2 \le C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_1 \frac{|\langle \Phi h_\Lambda, \Phi h \rangle|}{\|h_\Lambda\|_2},
(2)

where

C_0 = 2 \frac{1-(1-\sqrt{2})\delta_{2K}}{1-(1+\sqrt{2})\delta_{2K}}, \qquad C_1 = \frac{2}{1-(1+\sqrt{2})\delta_{2K}}.
(3)

Bounded noise

We first provide a bound on the worst-case performance for uniformly bounded noise, as first investigated in [2].

Theorem 1: (Theorem 1.2 of [1])

Suppose that Φ satisfies the RIP of order 2K with δ_{2K} < √2 − 1 and let y = Φx + e, where ‖e‖_2 ≤ ϵ. Then when B(y) = {z : ‖Φz − y‖_2 ≤ ϵ}, the solution x̂ to Equation 1 obeys

\|\hat{x} - x\|_2 \le C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_2 \epsilon,
(4)

where

C_0 = 2 \frac{1-(1-\sqrt{2})\delta_{2K}}{1-(1+\sqrt{2})\delta_{2K}}, \qquad C_2 = \frac{4\sqrt{1+\delta_{2K}}}{1-(1+\sqrt{2})\delta_{2K}}.
(5)

Proof

We are interested in bounding ‖h‖_2 = ‖x̂ − x‖_2. Since ‖e‖_2 ≤ ϵ, we have x ∈ B(y), and therefore we know that ‖x̂‖_1 ≤ ‖x‖_1. Thus we may apply Lemma 1, and it remains to bound |⟨Φh_Λ, Φh⟩|. To do this, we observe that

\|\Phi h\|_2 = \|\Phi(\hat{x} - x)\|_2 = \|\Phi \hat{x} - y + y - \Phi x\|_2 \le \|\Phi \hat{x} - y\|_2 + \|y - \Phi x\|_2 \le 2\epsilon,
(6)

where the last inequality follows since x, x̂ ∈ B(y). Combining this with the RIP and the Cauchy-Schwarz inequality we obtain

|\langle \Phi h_\Lambda, \Phi h \rangle| \le \|\Phi h_\Lambda\|_2 \|\Phi h\|_2 \le 2\epsilon \sqrt{1+\delta_{2K}}\, \|h_\Lambda\|_2.
(7)

Thus,

\|h\|_2 \le C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_1\, 2\epsilon\sqrt{1+\delta_{2K}} = C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_2 \epsilon,
(8)

completing the proof.
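
To get a feel for the size of the constants in Theorem 1, the short computation below evaluates C0 and C2 for one hypothetical value of δ_{2K} (any value below √2 − 1 is admissible):

    import numpy as np

    delta = 0.2                      # hypothetical RIP constant, must satisfy delta < sqrt(2) - 1
    denom = 1 - (1 + np.sqrt(2)) * delta
    C0 = 2 * (1 - (1 - np.sqrt(2)) * delta) / denom
    C2 = 4 * np.sqrt(1 + delta) / denom
    print(C0, C2)                    # roughly 4.19 and 8.47

So for an exactly K-sparse signal (where σ_K(x)_1 = 0), the guarantee of Theorem 1 reads ‖x̂ − x‖_2 ≤ 8.47 ϵ for this particular choice of δ_{2K}.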

In order to place this result in context, consider how we would recover a sparse vector x if we happened to already know the K locations of the nonzero coefficients, which we denote by Λ_0. This is referred to as the oracle estimator. In this case a natural approach is to reconstruct the signal using a simple pseudoinverse:

\hat{x}_{\Lambda_0} = \Phi_{\Lambda_0}^\dagger y = (\Phi_{\Lambda_0}^T \Phi_{\Lambda_0})^{-1} \Phi_{\Lambda_0}^T y, \qquad \hat{x}_{\Lambda_0^c} = 0.
(9)

The implicit assumption in Equation 9 is that Φ_{Λ_0} has full column rank (and hence we are considering the case where Φ_{Λ_0} is the M × K matrix with the columns indexed by Λ_0^c removed), so that there is a unique solution to the equation y = Φ_{Λ_0} x_{Λ_0}. With this choice, the recovery error is given by

\|\hat{x} - x\|_2 = \|(\Phi_{\Lambda_0}^T \Phi_{\Lambda_0})^{-1} \Phi_{\Lambda_0}^T (\Phi x + e) - x\|_2 = \|(\Phi_{\Lambda_0}^T \Phi_{\Lambda_0})^{-1} \Phi_{\Lambda_0}^T e\|_2.
(10)
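
A sketch of the oracle estimator is below; it simply solves a least-squares problem restricted to the known support and sets the remaining entries to zero. The sizes and noise level are arbitrary choices made for illustration.

    import numpy as np

    N, M, K = 256, 80, 8
    rng = np.random.default_rng(1)
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)
    support = np.sort(rng.choice(N, K, replace=False))   # the oracle support Lambda_0
    x = np.zeros(N)
    x[support] = rng.standard_normal(K)
    y = Phi @ x + 0.01 * rng.standard_normal(M)

    # Equation 9: least squares on Phi_{Lambda_0}, zero off the support.
    sol, _, _, _ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
    x_hat = np.zeros(N)
    x_hat[support] = sol
    print("oracle recovery error:", np.linalg.norm(x_hat - x))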

We now consider the worst-case bound for this error. Using standard properties of the singular value decomposition, it is straightforward to show that if Φ satisfies the RIP of order 2K (with constant δ_{2K}), then the largest singular value of the pseudoinverse Φ_{Λ_0}^† lies in the range [1/√(1+δ_{2K}), 1/√(1−δ_{2K})]. Thus, if we consider the worst-case recovery error over all e such that ‖e‖_2 ≤ ϵ, then the recovery error can be bounded by

\frac{\epsilon}{\sqrt{1+\delta_{2K}}} \le \|\hat{x} - x\|_2 \le \frac{\epsilon}{\sqrt{1-\delta_{2K}}}.
(11)

Therefore, if x is exactly K-sparse, then the guarantee for the pseudoinverse recovery method, which is given perfect knowledge of the true support of x, cannot improve upon the bound in Theorem 1 by more than a constant value.
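
The singular-value range quoted above is easy to probe numerically. The sketch below restricts a random Gaussian matrix to a size-K support and prints the singular values of the resulting pseudoinverse; comparing them to 1/√(1 ± δ) for a hypothetical δ illustrates the claim, although it does not verify that this particular Φ satisfies the RIP.

    import numpy as np

    N, M, K = 256, 80, 8
    rng = np.random.default_rng(2)
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)
    Phi_sub = Phi[:, rng.choice(N, K, replace=False)]     # Phi restricted to Lambda_0

    # Singular values of the pseudoinverse are the reciprocals of those of Phi_sub.
    print("singular values of pinv:", 1.0 / np.linalg.svd(Phi_sub, compute_uv=False))

    delta = 0.3                                           # hypothetical RIP constant for comparison
    print("claimed range:", 1 / np.sqrt(1 + delta), 1 / np.sqrt(1 - delta))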

We now examine a slightly different noise model. Whereas Theorem 1 assumed that the noise norm ‖e‖_2 was small, the theorem below analyzes a different recovery algorithm known as the Dantzig selector for the case where ‖Φ^T e‖_∞ is small [3]. We will see below that this leads to a simple analysis of the performance of this algorithm in Gaussian noise.

Theorem 2

Suppose that Φ satisfies the RIP of order 2K with δ_{2K} < √2 − 1 and we obtain measurements of the form y = Φx + e, where ‖Φ^T e‖_∞ ≤ λ. Then when B(y) = {z : ‖Φ^T(Φz − y)‖_∞ ≤ λ}, the solution x̂ to Equation 1 obeys

\|\hat{x} - x\|_2 \le C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_3 \sqrt{K} \lambda,
(12)

where

C_0 = 2 \frac{1-(1-\sqrt{2})\delta_{2K}}{1-(1+\sqrt{2})\delta_{2K}}, \qquad C_3 = \frac{4\sqrt{2}}{1-(1+\sqrt{2})\delta_{2K}}.
(13)

Proof

The proof mirrors that of Theorem 1. Since ‖Φ^T e‖_∞ ≤ λ, we again have that x ∈ B(y), so ‖x̂‖_1 ≤ ‖x‖_1 and thus Lemma 1 applies. We follow a similar approach as in Theorem 1 to bound |⟨Φh_Λ, Φh⟩|. We first note that

\|\Phi^T \Phi h\|_\infty \le \|\Phi^T(\Phi \hat{x} - y)\|_\infty + \|\Phi^T(y - \Phi x)\|_\infty \le 2\lambda,
(14)

where the last inequality again follows since x, x̂ ∈ B(y). Next, note that Φh_Λ = Φ_Λ h_Λ. Using this we can apply the Cauchy-Schwarz inequality to obtain

|\langle \Phi h_\Lambda, \Phi h \rangle| = |\langle h_\Lambda, \Phi_\Lambda^T \Phi h \rangle| \le \|h_\Lambda\|_2 \|\Phi_\Lambda^T \Phi h\|_2.
(15)

Finally, since ‖Φ^T Φh‖_∞ ≤ 2λ, every coefficient of Φ^T Φh is at most 2λ in magnitude, and since Λ contains at most 2K indices we have ‖Φ_Λ^T Φh‖_2 ≤ √(2K)(2λ). Thus,

\|h\|_2 \le C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_1\, 2\sqrt{2K}\lambda = C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_3 \sqrt{K} \lambda,
(16)

as desired.
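
As with Theorem 1, the Dantzig selector is itself a convex program and is straightforward to prototype. The sketch below again uses cvxpy (an assumption, not part of the module) and sets λ = 2√(log N) σ, anticipating the Gaussian-noise corollary in the next section; all sizes are hypothetical.

    import numpy as np
    import cvxpy as cp

    # Hypothetical sizes and noise level for illustration.
    N, M, K, sigma = 256, 80, 8, 0.01
    rng = np.random.default_rng(3)
    Phi = rng.standard_normal((M, N))
    Phi /= np.linalg.norm(Phi, axis=0)                   # unit-norm columns
    x = np.zeros(N)
    x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    y = Phi @ x + sigma * rng.standard_normal(M)

    lam = 2 * np.sqrt(np.log(N)) * sigma
    z = cp.Variable(N)
    # Dantzig selector feasible set: ||Phi^T (Phi z - y)||_inf <= lambda
    prob = cp.Problem(cp.Minimize(cp.norm1(z)),
                      [cp.norm_inf(Phi.T @ (Phi @ z - y)) <= lam])
    prob.solve()
    print("recovery error:", np.linalg.norm(z.value - x))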

Gaussian noise

Finally, we also examine the performance of these approaches in the presence of Gaussian noise. The case of Gaussian noise was first considered in [4], which examined the performance of ℓ0 minimization with noisy measurements. We now see that Theorem 1 and Theorem 2 can be leveraged to provide similar guarantees for ℓ1 minimization. To simplify our discussion we will restrict our attention to the case where x ∈ Σ_K = {x : ‖x‖_0 ≤ K}, so that σ_K(x)_1 = 0 and the error bounds in Theorem 1 and Theorem 2 depend only on the noise e.

To begin, suppose that the coefficients of e ∈ R^M are i.i.d. according to a Gaussian distribution with mean zero and variance σ². Since the Gaussian distribution is itself sub-Gaussian, we can apply results such as Corollary 1 from "Concentration of measure for sub-Gaussian random variables" to show that there exists a constant c_0 > 0 such that for any ϵ > 0,

P\big(\|e\|_2 \ge (1+\epsilon)\sqrt{M}\,\sigma\big) \le \exp(-c_0 \epsilon^2 M).
(17)

Applying this result to Theorem 1 with ϵ = 1, we obtain the following result for the special case of Gaussian noise.

Corollary 1

Suppose that Φ satisfies the RIP of order 2K with δ_{2K} < √2 − 1. Furthermore, suppose that x ∈ Σ_K and that we obtain measurements of the form y = Φx + e where the entries of e are i.i.d. N(0, σ²). Then when B(y) = {z : ‖Φz − y‖_2 ≤ 2√M σ}, the solution x̂ to Equation 1 obeys

\|\hat{x} - x\|_2 \le 8 \frac{\sqrt{1+\delta_{2K}}}{1-(1+\sqrt{2})\delta_{2K}} \sqrt{M}\,\sigma
(18)

with probability at least 1 − exp(−c_0 M).
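
The concentration step behind this corollary is easy to check empirically. The quick Monte Carlo sketch below (with arbitrary M and σ) estimates how often a Gaussian noise draw violates ‖e‖_2 ≤ 2√M σ, the ϵ = 1 threshold used in the choice of B(y) above; in practice it essentially never does.

    import numpy as np

    M, sigma, trials = 500, 1.0, 10000
    rng = np.random.default_rng(4)
    norms = np.linalg.norm(sigma * rng.standard_normal((trials, M)), axis=1)
    # Empirical probability that ||e||_2 exceeds 2*sqrt(M)*sigma (the epsilon = 1 case)
    print(np.mean(norms >= 2 * np.sqrt(M) * sigma))      # prints 0.0 in practice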

We can similarly consider Theorem 2 in the context of Gaussian noise. If we assume that the columns of Φ have unit norm, then each coefficient of Φ^T e is a Gaussian random variable with mean zero and variance σ². Using standard tail bounds for the Gaussian distribution (see Theorem 1 from "Sub-Gaussian random variables"), we have that

P\big(|[\Phi^T e]_i| \ge t\sigma\big) \le \exp(-t^2/2)
(19)

for i = 1, 2, ..., N. Thus, using the union bound over the bounds for different i, we obtain

P\big(\|\Phi^T e\|_\infty \ge 2\sqrt{\log N}\,\sigma\big) \le N \exp(-2\log N) = \frac{1}{N}.
(20)
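
This union bound is also easy to check empirically. The sketch below (with arbitrary sizes) estimates the probability that ‖Φ^T e‖_∞ exceeds 2√(log N) σ for a matrix with unit-norm columns; the observed rate should sit well below the 1/N bound.

    import numpy as np

    N, M, sigma, trials = 1024, 256, 1.0, 2000
    rng = np.random.default_rng(5)
    Phi = rng.standard_normal((M, N))
    Phi /= np.linalg.norm(Phi, axis=0)                   # unit-norm columns

    E = sigma * rng.standard_normal((M, trials))         # each column is one noise draw
    sup_norms = np.max(np.abs(Phi.T @ E), axis=0)        # ||Phi^T e||_inf for each draw
    rate = np.mean(sup_norms >= 2 * np.sqrt(np.log(N)) * sigma)
    print("empirical violation rate:", rate, "union bound:", 1 / N)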

Applying this to Theorem 2, we obtain the following result, which is a simplified version of Theorem 1.1 of [3].

Corollary 2

Suppose that Φ has unit-norm columns and satisfies the RIP of order 2K with δ_{2K} < √2 − 1. Furthermore, suppose that x ∈ Σ_K and that we obtain measurements of the form y = Φx + e where the entries of e are i.i.d. N(0, σ²). Then when B(y) = {z : ‖Φ^T(Φz − y)‖_∞ ≤ 2√(log N) σ}, the solution x̂ to Equation 1 obeys

\|\hat{x} - x\|_2 \le 4\sqrt{2}\, \frac{\sqrt{1+\delta_{2K}}}{1-(1+\sqrt{2})\delta_{2K}} \sqrt{K \log N}\,\sigma
(21)

with probability at least 1 − 1/N.

Ignoring the precise constants and the probabilities with which the bounds hold (which we have made no effort to optimize), we observe that if M = O(K log N) then these results appear to be essentially the same. However, there is a subtle difference. Specifically, if M and N are fixed and we consider the effect of varying K, we can see that Corollary 2 yields a bound that is adaptive to this change, providing a stronger guarantee when K is small, whereas the bound in Corollary 1 does not improve as K is reduced. Thus, while they provide very similar guarantees, there are certain circumstances where the Dantzig selector is preferable. See [3] for further discussion of the comparative advantages of these approaches.
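
The difference is easy to see by plugging numbers into the two corollaries. The snippet below evaluates both bounds for fixed, hypothetical values of M, N, σ, and δ_{2K} while K varies; the Dantzig selector bound shrinks with K while the bound of Corollary 1 stays put.

    import numpy as np

    # Hypothetical values for M, N, sigma and the RIP constant.
    N, M, sigma, delta = 4096, 512, 1.0, 0.2
    denom = 1 - (1 + np.sqrt(2)) * delta
    for K in (5, 20, 80):
        bound_c1 = 8 * np.sqrt(1 + delta) / denom * np.sqrt(M) * sigma
        bound_c2 = 4 * np.sqrt(2) * np.sqrt(1 + delta) / denom * np.sqrt(K * np.log(N)) * sigma
        print(K, round(bound_c1, 1), round(bound_c2, 1))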

References

  1. Candès, E. (2008). The restricted isometry property and its implications for compressed sensing. Comptes rendus de l'Académie des Sciences, Série I, 346(9-10), 589–592.
  2. Candès, E. and Romberg, J. and Tao, T. (2006). Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math., 59(8), 1207–1223.
  3. Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Stat., 35(6), 2313–2351.
  4. Haupt, J. and Nowak, R. (2006). Signal Reconstruction from Noisy Random Projections. IEEE Trans. Inform. Theory, 52(9), 4036–4048.
