Skip to content Skip to navigation Skip to collection information

Connexions

You are here: Home » Content » A Wavelet Tour of Signal Processing, The Sparse Way » Sparsity in Redundant Dictionaries

Navigation

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Endorsed by Endorsed (What does "Endorsed by" mean?)

This content has been endorsed by the organizations listed. Click each link for a list of all content endorsed by the organization.

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • NSF Partnership display tagshide tags

    This collection is included inLens: NSF Partnership in Signal Processing
    By: Sidney Burrus

    Click the "NSF Partnership" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

  • Featured Content display tagshide tags

    This collection is included inLens: Connexions Featured Content
    By: Connexions

    Click the "Featured Content" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

Also in these lenses

  • UniqU content

    This collection is included inLens: UniqU's lens
    By: UniqU, LLC

    Click the "UniqU content" link to see all content selected in this lens.

  • Lens for Engineering

    This module and collection are included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.

Tags

(What is a tag?)

These tags come from the endorsement, affiliation, and other lenses that include this content.
 

Sparsity in Redundant Dictionaries

Module by: Stephane Mallat. E-mail the author

Summary: This collection comprises Chapter 1 of the book A Wavelet Tour of Signal Processing, The Sparse Way (third edition, 2009) by Stéphane Mallat. The book's website at Academic Press is http://www.elsevier.com/wps/find/bookdescription.cws_home/714561/description#description The book's complementary materials are available at http://wavelet-tour.com

In natural languages, large dictionaries are needed to refine ideas with short sentences, and they evolve with usage. Eskimos have eight different words to describe snow quality, whereas a single word is typically sufficient in a Parisian dictionary.Similarly, large signal dictionaries of vectors are needed to construct sparse representations of complex signals. However, computing and optimizing a signal approximation by choosing the best M dictionary vectors is much more difficult.

Figure 1: A local cosine basis divides the time axis with smooth windows gp(t)gp(t) and translates these windows into frequency.
Figure 1 (newfig8.png)

Frame Analysis and Synthesis

Suppose that a sparse family of vectors {φp}pΛ{φp}pΛ has been selected to approximate a signal ff. An approximation can be recovered as an orthogonal projection in the space Vλ generated by these vectors. We then face one of the following twoproblems.

  1. In a dual-synthesis problem, the orthogonal projection fλfλ of ff in Vλ must be computed from dictionary coefficients, {f,φp}pλ{f,φp}pλ, provided by an analysis operator.  This is the case when a signal transform {f,φp}pΓ{f,φp}pΓ is calculated in some large dictionary and a subset of inner products are selected. Such inner products may correspond to coefficients above a threshold or local maxima values.
  2. In a dual-analysis problem, the decomposition coefficients of fλfλ must be computed on a family of selected vectors {φp}pΛ{φp}pΛ. This problem appears when sparse representation algorithms select vectors as opposed to inner products. This is the case for pursuit algorithms, which compute approximation supports in highly redundant dictionaries.

The frame theory gives energy equivalence conditions to solve both problems with stable operators. A family {φp}pΛ{φp}pΛ is a frame of the space V it generates if there exists BA>0BA>0 such that

h V , A h 2 m λ | h , φ p | 2 B h 2 . h V , A h 2 m λ | h , φ p | 2 B h 2 .
(1)

The representation is stable since any perturbation of frame coefficients implies a modification of similar magnitude on h. Chapter 5 proves that the existence of a dual frame {φ˜p}pΛ{φ˜p}pΛ that solves both the dual-synthesis and dual-analysisproblems:

f λ = p λ f , φ p φ ˜ p = p λ f , φ ˜ p φ p . f λ = p λ f , φ p φ ˜ p = p λ f , φ ˜ p φ p .
(2)

Algorithms are provided to calculate these decompositions. The dual frame is also stable:

f V , B - 1 f 2 m γ | f , φ ˜ p | 2 B - 1 f 2 . f V , B - 1 f 2 m γ | f , φ ˜ p | 2 B - 1 f 2 .
(3)

The frame bounds A and B are redundancy factors. If the vectors {φp}pΓ{φp}pΓ are normalized and linearly independent, then A1BA1B. Such a dictionary is called a Riesz basis of V and the dual frame is biorthogonal:

( p , p ' ) λ 2 , φ p , φ ˜ p ' = δ [ p - p ' ] . ( p , p ' ) λ 2 , φ p , φ ˜ p ' = δ [ p - p ' ] .
(4)

When the basis is orthonormal, then both bases are equal. Analysis and synthesis problems are then identical.

The frame theory is also used to construct redundant dictionaries that define complete, stable, and redundant signal representations, where V is then the whole signal space. The frame bounds measure the redundancy of such dictionaries. Chapter 5 studies the construction of windowed Fourier and wavelet frame dictionaries by sampling their time, frequency, and scaling parameters, while controlling frame bounds. In two dimensions, directional wavelet frames include wavelets sensitive to directional image structures such as textures or edges.

To improve the sparsity of images having edges along regular geometric curves, Candès and Donoho (CandesD:99) introduced curvelet frames, with elongated waveforms having different directions, positions, and scales. Images with piecewise regular edges have representations that are asymptotically more sparse by thresholding curvelet coefficients than wavelet coefficients.

Ideal Dictionary Approximations

In a redundant dictionary D={φp}pγD={φp}pγ,  we would like to find the best approximation support λ with M=|λ|M=|λ| vectors, which minimize the error f-fλ2f-fλ2. Chapter 12 proves that it is equivalent to find λT, which minimizes the corresponding approximation Lagrangian

L 0 ( T , f , λ ) = f - f λ 2 + T 2 | λ | , L 0 ( T , f , λ ) = f - f λ 2 + T 2 | λ | ,
(5)

for some multiplier T.

Compression and denoising are two applications of redundant dictionary approximations. When compressing signals by quantizing dictionary coefficients, the distortion rate varies, like the Lagrangian Equation 5, with a multiplier T that depends on the quantization step. Optimizing the coder is thus equivalent to minimizing this approximation Lagrangian. For sparse representations, most of the bits are devoted to coding the geometry of the sparse approximation set λT in γ.

Estimators reducing noise from observations X=f+WX=f+W are also optimized by finding a best orthogonal projector over a set of dictionary vectors. The model selection theory of Barron, Birgé, and Massart (massart-birge-barron) proves that finding λ˜Tλ˜T, which minimizes this same Lagrangian L0(T,X,λ)L0(T,X,λ), defines an estimator that has a risk on the same order as the minimum approximation error f-fΛT2f-fΛT2 up to a logarithmic factor.  This is similar to the optimality result obtained for thresholding estimators in an orthonormal basis.

The bad news is that minimizing the approximation Lagrangian L0 is an NP-hard problem and is therefore computationally intractable. It is necessary therefore to find algorithms that are sufficiently fast to compute suboptimal, but “good enough,” solutions.

Dictionaries of Orthonormal Bases

To reduce the complexity of optimal approximations, the search can be reduced to subfamilies of orthogonal dictionary vectors. In a dictionary of orthonormal bases, any family of orthogonal dictionary vectors can be complemented to form an orthogonal basis BB included in DD. As a result, the best approximation of ff from orthogonal vectors in BB is obtained by thresholding the coefficients of ff in a “best basis” in DD.

For tree dictionaries of orthonormal bases obtained by a recursive split of orthogonal vector spaces, the fast, dynamic programming algorithm of Coifman and Wickerhauser (CoifmanMW:92) finds such a best basis with O(P)O(P) operations, where P is the dictionary size.

Wavelet packet and local cosine bases are examples of tree dictionaries of time-frequency orthonormal bases of size P=Nlog2NP=Nlog2N. A best basis is a time-frequency tiling that is the best match to the signal time-frequency structures.

To approximate geometrically regular edges, wavelets are not as efficient as curvelets, but wavelets provide more sparse representations of singularities that are not distributed along geometrically regular curves. Bandlet dictionaries, introduced by Le Pennec, Mallat, and Peyré (bandelets-siam, bandlets-peyre), are dictionaries of orthonormal bases that can adapt to the variability of images' geometric regularity. Minimax optimal asymptotic rates are derived for compression and denoising.

Pursuit in Dictionaries

Approximating signals only from orthogonal vectors brings rigidity that limits the ability to optimize the representation. Pursuit algorithms remove this constraint with flexible procedures that search for sparse, although not necessarily optimal, dictionary approximations. Such approximations are computed by optimizing the choice of dictionary vectors {φp}pΛ{φp}pΛ.

Matching Pursuit

Matching pursuit algorithms introduced by Mallat and Zhang (MallatZ:93) are greedy algorithms that optimize approximations by selecting dictionary vectors one by one. The vector in φp0Dφp0D that best approximates a signal ff is

Φ p 0 = argmax p є Ґ | f , Φ p | Φ p 0 = argmax p є Ґ |f,Φp|
(6)

and the residual approximation error is

R f = f - f , φ p 0 φ p 0 . R f = f - f , φ p 0 φ p 0 .
(7)

A matching pursuit further approximates the residue RfRf by selecting another best vector φp1φp1 from the dictionary and continues this process over next-order residues RmfRmf, which produces a signal decomposition:

f = m = 0 M - 1 R m f , φ p m φ p m + R M f . f = m = 0 M - 1 R m f , φ p m φ p m + R M f .
(8)

The approximation from the M-selected vectors {φpm}0m<M{φpm}0m<M can be refined with an orthogonal back projection on the space generated by these vectors. An orthogonal matching pursuit further improves this decomposition by orthogonalizing progressively the projection directions φpmφpm during the decompositon. The resulting decompositions are applied to compression, denoising, and pattern recognition of various types of signals, images, and videos.

Basis Pursuit

Approximating ff with a minimum number of nonzero coefficients a[p]a[p] in a dictionary DD is equivalent to minimizing the 1010 norm a0a0, which gives the number of nonzero coefficients. This 1010 norm is highly nonconvex, which explains why the resulting minimization is NP-hard. Donoho and Chen (DonohoC:95) thus proposed replacing the 1010 norm by the 1111 norm a1=pγ|a[p]|a1=pγ|a[p]|, which is convex. The resulting basis pursuit algorithm computes a synthesis operator

f = p γ a [ p ] φ p , which minimizes a 1 = p γ | a [ p ] | . f = p γ a [ p ] φ p , which minimizes a 1 = p γ | a [ p ] | .
(9)

This optimal solution is calculated with a linear programming algorithm. A basis pursuit is computationally more intense than a matching pursuit, but it is a more global optimization that yields representations that can be moresparse.

In approximation, compression, or denoising applications, ff is recovered with an error bounded by a precision parameter ϵ. The optimization Equation 10 is thus relaxed by finding a synthesis such that

f - p γ a [ p ] φ p ϵ , which minimizes a 1 = p γ | a [ p ] | . f - p γ a [ p ] φ p ϵ , which minimizes a 1 = p γ | a [ p ] | .
(10)

This is a convex minimization problem, with a solution calculated by minimizing the corresponding 1111 Lagrangian

L 1 ( T , f , a ) = f - p γ a [ p ] φ p 2 + T a 1 , L 1 ( T , f , a ) = f - p γ a [ p ] φ p 2 + T a 1 ,
(11)

where T is a Lagrange multiplier that depends on ϵ. This is called an 1111 Lagrangian pursuit in this book. A solution a˜[p]a˜[p] is computed with iterative algorithms that are guaranteed to converge. The number of nonzero coordinates of a˜a˜ typically decrea-ses as T increases.

Incoherence for Support Recovery

Matching pursuit and 1111 Lagrangian pursuits are optimal if they recover the approx-imation support λT, which minimizes the approximation Lagrangian

L 0 ( T , f , λ ) = f - f λ 2 + T 2 | λ | , L 0 ( T , f , λ ) = f - f λ 2 + T 2 | λ | ,
(12)

where fλfλ is the orthogonal projection of ff in the space Vλ generated by {φp}pΛ{φp}pΛ. This is not always true and depends on λT. An Exact Recovery Criteria proved byTropp (tropp-multi-omp) guarantees that pursuit algorithms do recover the optimal supportλT if

E R C ( λ T ) = max q / λ T p λ T | φ ˜ p , φ q | < 1 , E R C ( λ T ) = max q / λ T p λ T | φ ˜ p , φ q | < 1 ,
(13)

where {φ˜p}pΛT{φ˜p}pΛT is the biorthogonal basis of {φp}pΛT{φp}pΛT in VΛTVΛT. This criterion implies that dictionary vectors φq outside λT should have a small inner product with vectors in λT.

This recovery is stable relative to noise perturbations if {φp}pΛ{φp}pΛ has Riesz bounds that are not too far from 1. These vectors should be nearly orthogonal and hence have small inner products. These small inner-product conditions are interpreted as a form of incoherence. A stable recovery of λT is possible if vectors in λT are incoherent with respect to other dictionary vectors and are incoherent between themselves. It depends on the geometric configuration of λT in γ.

Collection Navigation

Content actions

Download:

Collection as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add:

Collection to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks

Module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks