Skip to content Skip to navigation

OpenStax_CNX

You are here: Home » Content » Time-Frequency Dictionaries

Navigation

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • NSF Partnership display tagshide tags

    This module is included inLens: NSF Partnership in Signal Processing
    By: Sidney BurrusAs a part of collection: "A Wavelet Tour of Signal Processing, The Sparse Way"

    Click the "NSF Partnership" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

  • Featured Content display tagshide tags

    This module is included inLens: Connexions Featured Content
    By: ConnexionsAs a part of collection: "A Wavelet Tour of Signal Processing, The Sparse Way"

    Click the "Featured Content" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

Also in these lenses

  • UniqU content

    This module is included inLens: UniqU's lens
    By: UniqU, LLCAs a part of collection: "A Wavelet Tour of Signal Processing, The Sparse Way"

    Click the "UniqU content" link to see all content selected in this lens.

  • Lens for Engineering

    This module is included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.

Tags

(What is a tag?)

These tags come from the endorsement, affiliation, and other lenses that include this content.
 

Time-Frequency Dictionaries

Module by: Stephane Mallat. E-mail the author

Summary: This collection comprises Chapter 1 of the book A Wavelet Tour of Signal Processing, The Sparse Way (third edition, 2009) by Stéphane Mallat. The book's website at Academic Press is http://www.elsevier.com/wps/find/bookdescription.cws_home/714561/description#description The book's complementary materials are available at http://wavelet-tour.com

Motivated by quantum mechanics, in 1946 the physicist Gabor (Gabor:46) proposed decomposing signals over dictionaries of elementary waveforms which he called time-frequency atoms that have a minimal spread in a time-frequency plane. By showing that such decompositions are closely related to our perception of sounds, and that they exhibit important structures in speech and music recordings, Gabor demonstrated the importance of localized time-frequency signal processing. Beyond sounds, large classes of signals have sparse decompositions as sums of time-frequency atoms selected from appropriate dictionaries. The key issue is to understand how to construct dictionaries with time-frequency atoms adapted to signal properties.

Heisenberg Uncertainty

Figure 1: Heisenberg box representing an atom φγ.
Figure 1 (newfig3.png)

A time-frequency dictionary D={φγ}γΓD={φγ}γΓ is composed of waveforms of unit norm φγ=1φγ=1, which have a narrow localization in time and frequency. The time localization u of φγ and its spread around u, are defined by

u = t | φ γ ( t ) | 2 d t and σ t , γ 2 = | t - u | 2 | φ γ ( t ) | 2 d t . u = t | φ γ ( t ) | 2 d t and σ t , γ 2 = | t - u | 2 | φ γ ( t ) | 2 d t .
(1)

Similarly, the frequency localization and spread of φ^γφ^γ are defined by

ξ = ( 2 π ) - 1 ω | φ ^ γ ( ω ) | 2 d ω and σ ω , γ 2 = ( 2 π ) - 1 | ω - ξ | 2 | φ ^ γ ( ω ) | 2 d ω . ξ = ( 2 π ) - 1 ω | φ ^ γ ( ω ) | 2 d ω and σ ω , γ 2 = ( 2 π ) - 1 | ω - ξ | 2 | φ ^ γ ( ω ) | 2 d ω .
(2)

The Fourier Parseval formula

f , φ γ = - + f ( t ) φ γ * ( t ) d t = 1 2 π - + f ^ ( ω ) φ ^ γ * ( ω ) d ω f , φ γ = - + f ( t ) φ γ * ( t ) d t = 1 2 π - + f ^ ( ω ) φ ^ γ * ( ω ) d ω
(3)

shows that f,φγf,φγ depends mostly on the values f(t)f(t) and f^(ω)f^(ω), where φγ(t)φγ(t) and φ^γ(ω)φ^γ(ω) are nonnegligible , and hence for (t,ω)(t,ω) in a rectangle centered at (u,ξ)(u,ξ), of size σt,γ×σω,γσt,γ×σω,γ. This rectangle is illustrated by Figure 1 in this time-frequency plane (t,ω)(t,ω). It can be interpreted as a “quantum of information” over an elementary resolution cell. The uncertainty principle theorem proves (see Chapter 2) that this rectangle has a minimum surface that limits the joint time-frequency resolution:

σ t , γ σ ω , γ 1 2 . σ t , γ σ ω , γ 1 2 .
(4)

Constructing a dictionary of time-frequency atoms can thus be thought of as covering the time-frequency plane with resolution cells having a time width σt,γσt,γ anda frequency width σω,γσω,γ which may vary but with a surface larger than one-half. Windowed Fourier and wavelet transforms are two important examples.

Windowed Fourier Transform

A windowed Fourier dictionary is constructed by translating in time and frequency a time window g(t)g(t), of unit norm g=1g=1, centered at t=0t=0:

D = g u , ξ ( t ) = g ( t - u ) e i ξ t ( u , ξ ) R 2 . D = g u , ξ ( t ) = g ( t - u ) e i ξ t ( u , ξ ) R 2 .
(5)

The atom gu,ξgu,ξ is translated by u in time and by ξ in frequency. The time-and-frequency spread of gu,ξgu,ξ is independent of u and ξ. This means that each atom gu,ξgu,ξ corresponds to a Heisenberg rectangle that has a size σt×σωσt×σω independent of its position (u,ξ)(u,ξ), as shown by Figure 2.

Figure 2: Time-frequency boxes (“Heisenberg rectangles”) representing the energy spread of two windowed Fourier atoms.
Figure 2 (newfig4.png)

The windowed Fourier transform projects ff on each dictionary atom gu,ξgu,ξ:

S f ( u , ξ ) = f , g u , ξ = - + f ( t ) g ( t - u ) e - i ξ t d t . S f ( u , ξ ) = f , g u , ξ = - + f ( t ) g ( t - u ) e - i ξ t d t .
(6)

It can be interpreted as a Fourier transform of ff at the frequency ξ, localized by the window g(t-u)g(t-u) in the neighborhood of u. This windowed Fourier transform is highly redundant and represents one-dimensional signals by a time-frequency image in (u,ξ)(u,ξ). It is thus necessary to understand how to select many fewer time-frequency coefficients that represent the signal efficiently.

When listening to music, we perceive sounds that have a frequency that varies in time. Chapter 4 shows that a spectral line of ff creates high-amplitude windowed Fourier coefficients Sf(u,ξ)Sf(u,ξ) at frequencies ξ(u)ξ(u) that depend on time u. These spectral components are detected and characterized by ridge points, which are local maxima in this time-frequency plane. Ridge points define a time-frequency approximation support λ of ff with a geometry that depends on the time-frequency evolution of the signal spectral components. Modifying the sound duration or audio transpositions are implemented by modifying the geometry of the ridge support in time frequency.

A windowed Fourier transform decomposes signals over waveforms that have the same time and frequency resolution. It is thus effective as long as the signal does not include structures having different time-frequency resolutions, some being very localized in time and others very localized in frequency.  Wavelets address this issue by changing the time and frequency resolution.

Continuous Wavelet Transform

In reflection seismology, Morlet knew that the waveforms sent underground have a duration that is too long at high frequencies to separate the returns of fine, closely spaced geophysical layers. Such waveforms are called wavelets in geophysics. Instead of emitting pulses of equal duration, he thought of sending shorter waveforms at high frequencies. These waveforms were obtained by scaling the mother wavelet, hence the name of this transform. Although Grossmann was working in theoretical physics, he recognized in Morlet's approach some ideas that were close to his own work on coherent quantum states.

Nearly forty years after Gabor, Morlet and Grossmann reactivated a fundamental collaboration between theoretical physics and signal processing, which led to the formalization of the continuous wavelet transform (GrossmannM:84). These ideas were not totally new to mathematicians working in harmonic analysis, or to computer vision researchers studying multiscale image processing. It was thus only the beginning of a rapid catalysis that brought together scientists with very different backgrounds.

A wavelet dictionary is constructed from a mother wavelet ψ of zero average

- + ψ ( t ) d t = 0 , - + ψ ( t ) d t = 0 ,
(7)

which is dilated with a scale parameter s, and translated by u:

D = ψ u , s ( t ) = 1 s ψ t - u s u R , s > 0 . D = ψ u , s ( t ) = 1 s ψ t - u s u R , s > 0 .
(8)

The continuous wavelet transform of ff at any scale s and position u is the projection of ff on the corresponding wavelet atom:

W f ( u , s ) = f , ψ u , s = - + f ( t ) 1 s ψ * t - u s d t . W f ( u , s ) = f , ψ u , s = - + f ( t ) 1 s ψ * t - u s d t .
(9)

It represents one-dimensional signals by highly redundant time-scale images in (u,s)(u,s).

Varying Time-Frequency Resolution

As opposed to windowed Fourier atoms, wavelets have a time-frequency resolution that changes. The wavelet ψu,sψu,s has a time support centered at u and proportional to s. Let us choose a wavelet ψ whose Fourier transform ψ^(ω)ψ^(ω) is nonzero in a positive frequency interval centered at η. The Fourier transform ψ^u,s(ω)ψ^u,s(ω) is dilated by 1/s1/s and thus is localized in a positive frequency interval centered at ξ=η/sξ=η/s; its size is scaled by 1/s1/s. In the time-frequency plane, the Heisenberg box of a wavelet atom ψu,sψu,s is therefore a rectangle centered at (u,η/s)(u,η/s), with time and frequency widths, respectively, proportional to s and 1/s1/s. When s varies, the time and frequency width of this time-frequency resolution cell changes, but its area remains constant, as illustrated by Figure 3.

Large-amplitude wavelet coefficients can detect and measure short high-frequency variations because they have a narrow time localization at high frequencies. At low frequencies their time resolution is lower, but they have a better frequency resolution. This modification of time and frequency resolution is adapted to represent sounds with sharp attacks, or radar signals having a frequency that may vary quickly at high frequencies.

Multiscale Zooming

A wavelet dictionary is also adapted to analyze the scaling evolution of transients with zooming procedures across scales. Suppose now that ψ is real. Since it has a zero average, a wavelet coefficient Wf(u,s)Wf(u,s) measures the variation of ff in a neighborhood of u that has a size proportional to s. Sharp signal transitions create large-amplitude wavelet coefficients.

Figure 3: Heisenberg time-frequency boxes of two wavelets, ψu,sψu,s and ψu0,s0ψu0,s0. When the scale s decreases, the time support is reduced but the frequency spread increases and covers an interval that is shifted toward high frequencies.
Figure 3 (newfig5.png)

Signal singularities have specific scaling invariance characterized by Lipschitz exponents. Chapter 6 relates the pointwise regularity of ff to the asymptotic decay of the wavelet transform amplitude |Wf(u,s)||Wf(u,s)| when s goes to zero. Singularities are detected by following the local maxima of the wavelet transform acrossscales.

In images, wavelet local maxima indicate the position of edges, which are sharp variations of image intensity. It defines scale–space approximation support of ff from which precise image approximations are reconstructed. At different scales, the geometry of this local maxima support provides contours of image structures of varying sizes. This multiscale edge detection is particularly effective for pattern recognition in computer vision (Canny:86).

The zooming capability of the wavelet transform not only locates isolated singular events, but can also characterize more complex multifractal signals having nonisolated singularities. Mandelbrot (Mandelbrot:82) was the first to recognize the existence of multifractals in most corners of nature. Scaling one part of a multifractal produces a signal that is statistically similar to the whole. This self-similarity appears in the continuous wavelet transform, which modifies the analyzing scale. From global measurements of the wavelet transform decay, Chapter 6 measures the singularity distribution of multifractals. This is particularly important in analyzing their properties and testing multifractal models in physics or in financial time series.

Time-Frequency Orthonormal Bases

Orthonormal bases of time-frequency atoms remove all redundancy and define stable representations. A wavelet orthonormal basis is an example of the time-frequency basis obtained by scaling a wavelet ψ with dyadic scales s=2js=2j and translating it by 2jn2jn, which is written ψj,nψj,n. In the time-frequency plane, the Heisenberg resolution box of ψj,nψj,n is a dilation by 2j2j and translation by 2jn2jn of the Heisenberg box of ψ. A wavelet orthonormal is thus a subdictionary of the continuous wavelet transform dictionary, which yields a perfect tiling of the time-frequency plane illustrated in Figure 4.

Figure 4: The time-frequency boxes of a wavelet basis define a tiling of the time-frequency plane.
Figure 4 (newfig6.png)

One can construct many other orthonormal bases of time-frequency atoms, corresponding to different tilings of the time-frequency plane. Wavelet packet and local cosine bases are two important examples constructed in Chapter 8, with time-frequency atoms that split the frequency and the time axis, respectively, in intervals of varying sizes.

Wavelet Packet Bases

Wavelet bases divide the frequency axis into intervals of 1 octave bandwidth. Coifman, Meyer, and Wickerhauser (CoifmanMW:92) have generalized this construction with bases that split the frequency axis in intervals of bandwidth that may be adjusted. Each frequency interval is covered by the Heisenberg time-frequency boxes of wavelet packet functions translated in time, in order to cover the whole plane, as shown by Figure 5.

As for wavelets, wavelet-packet coefficients are obtained with a filter bank of conjugate mirror filters that split the frequency axis in several frequency intervals. Different frequency segmentations correspond to different wavelet packet bases. For images, a filter bank divides the image frequency support in squares of dyadic sizes that can be adjusted.

Figure 5: A wavelet packet basis divides the frequency axis in separate intervals of varying sizes. A tiling is obtained by translating in time the wavelet packets covering each frequency interval.
Figure 5 (newfig7.png)

Local Cosine Bases

Local cosine orthonormal bases are constructed by dividing the time axis instead of the frequency axis. The time axis is segmented in successive intervals [ap,ap+1][ap,ap+1]. The local cosine bases of Malvar (Malvar:88) are obtained by designing smooth windows gp(t)gp(t) that cover each interval [ap,ap+1][ap,ap+1], and by multiplying them by cosine functions cos(ξt+φ)cos(ξt+φ) of different frequencies. This is yet another idea that has been independently studied in physics, signal processing, and mathematics. Malvar's original construction was for discrete signals. At the same time, the physicist Wilson (Wilson:87) was designing a local cosine basis, with smooth windows of infinite support, to analyze the properties of quantum coherent states. Malvar bases were also rediscovered and generalized by the harmonic analysts Coifman and Meyer (CoifmanM:91). These different views of the same bases brought to light mathematical and algorithmic properties that opened new applications.

A multiplication by cos(ξt+φ)cos(ξt+φ) translates the Fourier transform g^p(ω)g^p(ω) of gp(t)gp(t) by ±ξ±ξ. Over positive frequencies, the time-frequency box of the modulated window gp(t)cos(ξt+φ)gp(t)cos(ξt+φ) is therefore equal to the time-frequency box of gp translated by ξ along frequencies. (Reference) shows the time-frequency tiling corresponding to such a local cosine basis. For images, a two-dimensional cosine basis is constructed by dividing the image support in squares of varying sizes.

Content actions

Download module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks