Sparse representations

Module by: Marco F. Duarte, Mark A. Davenport

Summary: This module provides an overview of sparsity and sparse representations, giving examples for both 1-D and 2-D signals.

Transforming a signal to a new basis or frame may allow us to represent it more concisely. The resulting compression is useful for reducing data storage and data transmission, which can be quite expensive in some applications. Hence, one might wish to transmit only the analysis coefficients obtained from a basis or frame expansion rather than the original high-dimensional signal. When the number of non-zero coefficients is small, we say that we have a sparse representation. Sparse signal models allow us to achieve high rates of compression, and in the case of compressive sensing, we may use the knowledge that our signal is sparse in a known basis or frame to recover the original signal from a small number of measurements. For sparse data, in many cases only the non-zero coefficients need to be stored or transmitted; the rest can be assumed to be zero.
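To make this concrete, here is a minimal NumPy sketch (the function names and tolerance are illustrative, not part of the module) that stores a sparse vector as (index, value) pairs and reconstructs it, so only the non-zero entries need to be kept:

```python
import numpy as np

def to_sparse(x, tol=1e-12):
    """Return the indices and values of the entries of x that are non-zero (up to tol)."""
    idx = np.flatnonzero(np.abs(x) > tol)
    return idx, x[idx]

def from_sparse(idx, vals, n):
    """Rebuild a length-n vector from its stored non-zero entries."""
    x = np.zeros(n)
    x[idx] = vals
    return x

# A length-1000 signal with only 5 non-zero coefficients
x = np.zeros(1000)
x[[3, 17, 256, 511, 900]] = [1.0, -2.5, 0.7, 3.1, -0.4]

idx, vals = to_sparse(x)
print(len(idx), "stored entries instead of", x.size)   # 5 instead of 1000
assert np.allclose(from_sparse(idx, vals, x.size), x)
```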

Mathematically, we say that a signal $x$ is $K$-sparse when it has at most $K$ nonzeros, i.e., $\|x\|_0 \le K$. We let

$$\Sigma_K = \{\, x : \|x\|_0 \le K \,\}$$
(1)

denote the set of all $K$-sparse signals. Typically, we will be dealing with signals that are not themselves sparse, but which admit a sparse representation in some basis $\Psi$. In this case we will still refer to $x$ as being $K$-sparse, with the understanding that we can express $x$ as $x = \Psi\alpha$ where $\|\alpha\|_0 \le K$.
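As a small illustration of these definitions (a sketch only; the random orthonormal basis below stands in for a generic $\Psi$), the following NumPy code builds a $K$-sparse coefficient vector $\alpha$ and synthesizes $x = \Psi\alpha$, which is dense in the time domain but $K$-sparse in the basis $\Psi$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 128, 5

# An orthonormal basis Psi (a random one here, purely for illustration)
Psi, _ = np.linalg.qr(rng.standard_normal((N, N)))

# A K-sparse coefficient vector alpha: at most K non-zero entries
alpha = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
alpha[support] = rng.standard_normal(K)

# The signal x = Psi @ alpha is dense, yet K-sparse in the basis Psi
x = Psi @ alpha
print("non-zeros in x:    ", np.count_nonzero(x))      # typically N
print("non-zeros in alpha:", np.count_nonzero(alpha))  # at most K
```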

Sparsity has long been exploited in signal processing and approximation theory for tasks such as compression [1], [7], [9] and denoising [2], and in statistics and learning theory as a method for avoiding overfitting [10]. Sparsity also figures prominently in the theory of statistical estimation and model selection [4], [8] and in the study of the human visual system [6], and it has been exploited heavily in image processing, since the multiscale wavelet transform [5] provides nearly sparse representations for natural images. Below, we briefly describe some one-dimensional (1-D) and two-dimensional (2-D) examples.

1-D signal models

We will now present an example of three basis expansions that yield different levels of sparsity for the same signal. A simple periodic signal is sampled and represented as a periodic train of weighted impulses (see Figure 1). One can interpret sampling as a basis expansion in which the basis elements are impulses placed at periodic points along the time axis. In this case, the dual basis consists of sinc functions used to reconstruct the signal from its discrete-time samples. This representation contains many non-zero coefficients, and due to the signal's periodicity, there are many redundant measurements. Representing the signal in the Fourier basis, on the other hand, requires only two non-zero basis vectors, scaled appropriately at the positive and negative frequencies (see Figure 1). Driving the number of coefficients needed even lower, we may apply the discrete cosine transform (DCT) to our signal, thereby requiring only a single non-zero coefficient in our expansion (see Figure 1). The DCT is given by $X_k = \sum_{n=0}^{N-1} x_n \cos\left(\frac{\pi}{N}\left(n + \frac{1}{2}\right)k\right)$ for $k = 0, \ldots, N-1$, where $x_n$ is the input signal and $N$ is the length of the transform.

Figure 1: Cosine signal in three representations: (a) train of impulses; (b) Fourier basis; (c) DCT basis.
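To make the comparison above concrete, the sketch below (assuming SciPy is available; the frequency index k0 is an illustrative choice) counts the significant coefficients of a sampled cosine in the three representations discussed: the raw samples, the Fourier (FFT) basis, and the DCT basis.

```python
import numpy as np
from scipy.fft import dct, fft   # DCT-II and FFT

N = 64
n = np.arange(N)
k0 = 4                                   # illustrative frequency index (even, so it falls on an FFT bin)
x = np.cos(np.pi / N * (n + 0.5) * k0)   # cosine aligned with the k0-th DCT-II basis vector

samples = x                              # "train of impulses": the samples themselves
X_fft = fft(x)                           # Fourier representation
X_dct = dct(x, type=2, norm="ortho")     # DCT representation

def count_significant(c, tol=1e-8):
    return int(np.sum(np.abs(c) > tol))

print("significant samples:   ", count_significant(samples))  # about N
print("significant FFT coeffs:", count_significant(X_fft))    # 2 (positive and negative frequency)
print("significant DCT coeffs:", count_significant(X_dct))    # 1
```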

2-D signal models

This same concept can be extended to 2-D signals as well. For instance, a binary picture of a nighttime sky is sparse in the standard pixel domain because most of the pixels are zero-valued black pixels. Likewise, natural images are characterized by large smooth or textured regions and relatively few sharp edges. Signals with this structure are known to be very nearly sparse when represented using a multiscale wavelet transform [5]. The wavelet transform consists of recursively dividing the image into its low- and high-frequency components. The lowest frequency components provide a coarse scale approximation of the image, while the higher frequency components fill in the detail and resolve edges. What we see when we compute a wavelet transform of a typical natural image, as shown in Figure 2, is that most coefficients are very small. Hence, we can obtain a good approximation of the signal by setting the small coefficients to zero, or thresholding the coefficients, to obtain a $K$-sparse representation. When measuring the approximation error using an $\ell_p$ norm, this procedure yields the best $K$-term approximation of the original signal, i.e., the best approximation of the signal using only $K$ basis elements.¹

Figure 2: Sparse representation of an image via a multiscale wavelet transform. (a) Original image. (b) Wavelet representation. Large coefficients are represented by light pixels, while small coefficients are represented by dark pixels. Observe that most of the wavelet coefficients are close to zero.
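A minimal sketch of this thresholding procedure is given below, assuming the PyWavelets package (pywt) is installed; the wavelet ('db4'), decomposition level, and value of K are illustrative choices rather than ones prescribed by the module.

```python
import numpy as np
import pywt  # PyWavelets, assumed to be installed

def best_k_term(image, K, wavelet="db4", level=4):
    """Keep the K largest-magnitude wavelet coefficients and reconstruct the image."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    arr, slices = pywt.coeffs_to_array(coeffs)            # flatten the coefficient pyramid
    thresh = np.partition(np.abs(arr).ravel(), -K)[-K]    # K-th largest magnitude
    arr_k = np.where(np.abs(arr) >= thresh, arr, 0.0)     # hard thresholding
    coeffs_k = pywt.array_to_coeffs(arr_k, slices, output_format="wavedec2")
    rec = pywt.waverec2(coeffs_k, wavelet)
    return rec[: image.shape[0], : image.shape[1]]        # crop any boundary padding

# Example with a random stand-in array (replace with a real grayscale image such as the one in Figure 2)
image = np.random.default_rng(0).standard_normal((256, 256))
approx = best_k_term(image, K=2000)
print("relative error:", np.linalg.norm(image - approx) / np.linalg.norm(image))
```

For a natural image the relative error stays small even for modest K, since most wavelet coefficients are close to zero; for the random array above it does not, which illustrates that the quality of the best K-term approximation depends on the signal being (nearly) sparse in the wavelet basis.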

Sparsity arises from this decomposition because in most natural images the value of a pixel varies little from that of its neighbors. Areas with little contrast can be represented with low-frequency wavelets, which are created by stretching a mother wavelet and thus expanding it in space. Discontinuities, or edges in the picture, on the other hand, require high-frequency wavelets, which are created by compressing a mother wavelet. At the same time, these transitions are generally limited in space, mimicking the localized nature of the compressed high-frequency wavelets. See "Compressible signals" for an example.
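To make the stretching and compressing of the mother wavelet concrete, the short sketch below (pure NumPy; the grid and scales are illustrative) builds a Haar-style mother wavelet together with a stretched, low-frequency copy and a compressed, high-frequency copy:

```python
import numpy as np

def haar(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

t = np.linspace(-1.0, 3.0, 1000)

mother = haar(t)                            # scale a = 1
stretched = haar(t / 2.0) / np.sqrt(2.0)    # a = 2: wider in space, low frequency
compressed = haar(t * 2.0) * np.sqrt(2.0)   # a = 1/2: narrow in space, high frequency

# Each copy has the form (1 / sqrt(a)) * psi(t / a): a larger scale a spreads the
# wavelet out to cover smooth regions, while a smaller scale concentrates it so
# that it can resolve sharp, spatially localized edges.
```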

Footnotes

  1. Thresholding yields the best $K$-term approximation of a signal with respect to an orthonormal basis. When redundant frames are used, we must rely on sparse approximation algorithms like those described later in this course [3], [5].

References

  1. DeVore, R. (1998). Nonlinear Approximation. Acta Numerica, 7, 51–150.
  2. Donoho, D. (1995). Denoising by soft-thresholding. IEEE Trans. Inform. Theory, 41(3), 613–627.
  3. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. New York, NY: Springer.
  4. Hastie, T. and Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning. New York, NY: Springer.
  5. Mallat, S. (1999). A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press.
  6. Olshausen, B. and Field, D. (1996). Emergence of simple-cell receptive field properties by learning a sparse representation. Nature, 381, 607–609.
  7. Pennebaker, W. and Mitchell, J. (1993). JPEG Still Image Data Compression Standard. Van Nostrand Reinhold.
  8. Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Royal Statist. Soc B, 58(1), 267–288.
  9. Taubman, D. and Marcellin, M. (2001). JPEG 2000: Image Compression Fundamentals, Standards and Practice. Kluwer.
  10. Vapnik, V. (1999). The Nature of Statistical Learning Theory. New York, NY: Springer-Verlag.
