
Signal recovery via ℓ_1 minimization

Module by: Mark A. Davenport

Summary: This module introduces and motivates ℓ_1 minimization as a framework for sparse recovery.

As we will see later in this course, there now exist a wide variety of approaches to recover a sparse signal x from a small number of linear measurements. We begin by considering a natural first approach to the problem of sparse recovery.

Given measurements y = Φx and the knowledge that our original signal x is sparse or compressible, it is natural to attempt to recover x by solving an optimization problem of the form

x̂ = arg min_z ∥z∥_0 subject to z ∈ B(y),
(1)

where B(y) ensures that x̂ is consistent with the measurements y. Recall that ∥z∥_0 = |supp(z)| simply counts the number of nonzero entries in z, so Equation 1 simply seeks out the sparsest signal consistent with the observed measurements. For example, if our measurements are exact and noise-free, then we can set B(y) = {z : Φz = y}. When the measurements have been contaminated with a small amount of bounded noise, we could instead set B(y) = {z : ∥Φz − y∥_2 ≤ ε}. In both cases, Equation 1 finds the sparsest x that is consistent with the measurements y.
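To make Equation 1 concrete, the following sketch (not part of the original module; the function name `l0_min` is illustrative) solves the noise-free case by exhaustive search: it tries supports of increasing size and accepts the first one whose least-squares fit reproduces y exactly. The combinatorial loop over supports is exactly the source of the intractability discussed below.

```python
import itertools
import numpy as np

def l0_min(Phi, y, tol=1e-10):
    """Brute-force solution of Equation 1 with B(y) = {z : Phi z = y}.

    Tries supports of increasing size k and returns the first z whose
    least-squares fit on that support reproduces y. The search visits
    up to 2^n supports, which is why Equation 1 is intractable at scale.
    """
    m, n = Phi.shape
    if np.linalg.norm(y) < tol:          # the zero signal already explains y
        return np.zeros(n)
    for k in range(1, n + 1):
        for support in itertools.combinations(range(n), k):
            S = list(support)
            # least-squares coefficients restricted to the candidate support
            z_S, *_ = np.linalg.lstsq(Phi[:, S], y, rcond=None)
            if np.linalg.norm(Phi[:, S] @ z_S - y) < tol:
                z = np.zeros(n)
                z[S] = z_S
                return z
    return None  # unreachable when y = Phi x for some x

# Tiny demo: a 1-sparse signal observed through a 3x5 Gaussian matrix.
rng = np.random.default_rng(1)
Phi = rng.standard_normal((3, 5))
x = np.zeros(5)
x[2] = 4.0
z_hat = l0_min(Phi, Phi @ x)
```

For a generic Gaussian Φ, no single column other than the true one can explain y, so the search recovers x exactly; the point of the sketch is that even this small problem already probes supports one by one.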

Note that in Equation 1 we are inherently assuming that x itself is sparse. In the more common setting where x = Ψα, we can easily modify the approach and instead consider

α̂ = arg min_z ∥z∥_0 subject to z ∈ B(y)
(2)

where B(y) = {z : ΦΨz = y} or B(y) = {z : ∥ΦΨz − y∥_2 ≤ ε}. By setting Φ̃ = ΦΨ we see that Equation 1 and Equation 2 are essentially identical. Moreover, as noted in "Matrices that satisfy the RIP", in many cases the introduction of Ψ does not significantly complicate the construction of matrices Φ such that Φ̃ will satisfy the desired properties. Thus, for most of the remainder of this course we will restrict our attention to the case where Ψ = I. It is important to note, however, that this restriction does impose certain limits on our analysis when Ψ is a general dictionary and not an orthonormal basis. For example, in this case ∥x̂ − x∥_2 = ∥Ψα̂ − Ψα∥_2 ≠ ∥α̂ − α∥_2, and thus a bound on ∥α̂ − α∥_2 cannot directly be translated into a bound on ∥x̂ − x∥_2, which is often the metric of interest.

Although it is possible to analyze the performance of Equation 1 under the appropriate assumptions on Φ, we do not pursue this strategy since the objective function ∥·∥_0 is nonconvex, and hence Equation 1 is potentially very difficult to solve. In fact, one can show that for a general matrix Φ, even finding a solution that approximates the true minimum is NP-hard. One avenue for translating this problem into something more tractable is to replace ∥·∥_0 with its convex approximation ∥·∥_1. Specifically, we consider

x̂ = arg min_z ∥z∥_1 subject to z ∈ B(y).
(3)

Provided that B(y) is convex, Equation 3 is computationally feasible. In fact, when B(y) = {z : Φz = y}, the resulting problem can be posed as a linear program [2].
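The standard LP reformulation splits z into its positive and negative parts: writing z = u − v with u, v ≥ 0, the ℓ_1 norm becomes the linear objective 1ᵀ(u + v). The following sketch (not part of the original module; `basis_pursuit` is an illustrative name) solves this LP with `scipy.optimize.linprog`, assuming SciPy is available.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Solve min ||z||_1 subject to Phi z = y (Equation 3) as a linear program.

    Substituting z = u - v with u, v >= 0 gives a standard-form LP:
    minimize 1^T [u; v] subject to [Phi, -Phi] [u; v] = y.
    At the optimum u and v have disjoint supports, so sum(u + v) = ||z||_1.
    """
    m, n = Phi.shape
    c = np.ones(2 * n)                 # linear objective: sum of all 2n variables
    A_eq = np.hstack([Phi, -Phi])      # enforces Phi u - Phi v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    u, v = res.x[:n], res.x[n:]
    return u - v

# Demo: recover a 2-sparse signal of length 16 from 8 random measurements.
rng = np.random.default_rng(0)
n, m = 16, 8
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[3], x[11] = 1.0, -2.0
y = Phi @ x
x_hat = basis_pursuit(Phi, y)
```

By construction the solver returns a point that satisfies Φẑ = y with ∥ẑ∥_1 no larger than ∥x∥_1, since the true x is itself feasible; whether ẑ equals x exactly is precisely the question the recovery guarantees discussed later in the course address.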

It is clear that replacing Equation 1 with Equation 3 transforms a computationally intractable problem into a tractable one, but it may not be immediately obvious that the solution to Equation 3 will be at all similar to the solution to Equation 1. However, there are certainly intuitive reasons to expect that the use of ℓ_1 minimization will indeed promote sparsity. As an example, recall the example we discussed earlier, shown in Figure 1. In this case the solution to the ℓ_1 minimization problem coincides exactly with the solution to the ℓ_p minimization problem for any p < 1, and notably, it is sparse. Moreover, the use of ℓ_1 minimization to promote or exploit sparsity has a long history, dating back at least to the work of Beurling on Fourier transform extrapolation from partial observations [1].

Additionally, in a somewhat different context, in 1965 Logan [4] showed that a bandlimited signal can be perfectly recovered in the presence of arbitrary corruptions on a small interval. Again, the recovery method consists of searching for the bandlimited signal that is closest to the observed signal in the ℓ_1 norm. This can be viewed as further validation of the intuition gained from Figure 1: the ℓ_1 norm is well-suited to sparse errors.

Historically, the use of ℓ_1 minimization on large problems finally became practical with the explosion of computing power in the late 1970s and early 1980s. In one of its first applications, it was demonstrated that geophysical signals consisting of spike trains could be recovered from only the high-frequency components of these signals by exploiting ℓ_1 minimization [3], [6], [8]. Finally, in the 1990s there was renewed interest in these approaches within the signal processing community for the purpose of finding sparse approximations to signals and images when represented in overcomplete dictionaries or unions of bases [2], [5]. Separately, ℓ_1 minimization received significant attention in the statistics literature as a method for variable selection in linear regression, known as the Lasso [7].

Thus, there are a variety of reasons to suspect that ℓ_1 minimization will provide an accurate method for sparse signal recovery. More importantly, this also provides a computationally tractable approach to the sparse signal recovery problem. We now provide an overview of ℓ_1 minimization in both the noise-free and noisy settings from a theoretical perspective. We will then further discuss algorithms for performing ℓ_1 minimization later in this course.

References

1. Beurling, A. (1938). Sur les intégrales de Fourier absolument convergentes et leur application à une transformation fonctionelle. In Proc. Scandinavian Math. Congress. Helsinki, Finland
2. Chen, S. and Donoho, D. and Saunders, M. (1998). Atomic decomposition by basis pursuit. SIAM J. Sci. Comp., 20(1), 33–61.
3. Levy, S. and Fullagar, P. (1981). Reconstruction of a sparse spike train from a portion of its spectrum and application to high-resolution deconvolution. Geophysics, 46(9), 1235–1243.
4. Logan, B. (1965). Properties of High-Pass Signals. Ph. D. Thesis. Columbia University.
5. Mallat, S. (1999). A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press.
6. Taylor, H. and Banks, S. and McCoy, J. (1979). Deconvolution with the ℓ_1 norm. Geophysics, 44(1), 39–52.
7. Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Royal Statist. Soc B, 58(1), 267–288.
8. Walker, C. and Ulrych, T. (1983). Autoregressive recovery of the acoustic impedance. Geophysics, 48(10), 1338–1350.
