
Wavelet-Based Signal Processing and Applications

Module by: C. Sidney Burrus, Haitao Guo

This chapter gives a brief discussion of several areas of application. It is intended to show what areas and what tools are being developed and to give some references to books, articles, and conference papers where the topics can be further pursued. In other words, it is a sort of annotated bibliography that does not pretend to be complete. Indeed, it is impossible to be complete or up-to-date in such a rapidly developing new area and in an introductory book.

In this chapter, we briefly consider the application of wavelet systems from two perspectives. First, we look at wavelets as a tool for denoising and compressing a wide variety of signals. Second, we very briefly list several problems where the application of these tools shows promise or has already achieved significant success. References will be given to guide the reader to the details of these applications, which are beyond the scope of this book.

Wavelet-Based Signal Processing

To accomplish frequency domain signal processing, one can take the Fourier transform (or Fourier series or DFT) of a signal, multiply some of the Fourier coefficients by zero (or some other constant), then take the inverse Fourier transform. It is possible to completely remove certain components of a signal while leaving others completely unchanged. The same can be done by using wavelet transforms to achieve wavelet-based, wavelet domain signal processing, or filtering. Indeed, it is sometimes possible to remove or separate parts of a signal that overlap in both time and frequency using wavelets, something impossible to do with conventional Fourier-based techniques.

Figure 1: Transform-Based Signal Processor

The classical paradigm for transform-based signal processing is illustrated in Figure 1, where the center “box" could be either a linear or nonlinear operation. The “dynamics" of the processing are all contained in the transform and inverse transform operations, which are linear. The transform-domain processing operation has no dynamics; it is an algebraic operation. By dynamics, we mean that a process depends on the present and past, and by algebraic, we mean it depends only on the present. An FIR (finite impulse response) filter, such as those used in a filter bank, is dynamic: each output depends on the current input and a finite number of past inputs (see (Reference)). The process of operating point-wise on the DWT of a signal is static or algebraic: it does not depend on past (or future) values, only the present one. This structure, which separates the linear, dynamic parts from the nonlinear, static parts of the processing, allows practical and theoretical results that are impossible or very difficult to obtain using a completely general nonlinear dynamic system.

Linear wavelet-based signal processing consists of the processor block in Figure 1 multiplying the DWT of the signal by some set of constants (perhaps by zero). If undesired signals or noise can be separated from the desired signal in the wavelet transform domain, they can be removed by multiplying their coefficients by zero. This allows a more powerful and flexible processing or filtering than can be achieved using Fourier transforms. The result of this total process is a linear, time-varying processing that is far more versatile than linear, time-invariant processing. The next section gives an example of using the concentrating properties of the DWT to allow a faster calculation of the FFT.
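As a concrete illustration of this linear, transform-domain masking, the following sketch applies a one-stage orthonormal Haar DWT, multiplies the coefficients by a set of constants, and inverts the transform. The Haar choice and the helper names (`haar_dwt`, `haar_idwt`, `wavelet_mask`) are illustrative assumptions, not part of the authors' development.

```python
import math

def haar_dwt(x):
    # One-stage orthonormal Haar DWT: lowpass half, then highpass half.
    s = math.sqrt(2.0)
    n = len(x) // 2
    return ([(x[2 * k] + x[2 * k + 1]) / s for k in range(n)] +
            [(x[2 * k] - x[2 * k + 1]) / s for k in range(n)])

def haar_idwt(c):
    # Inverse of the one-stage Haar DWT above.
    s = math.sqrt(2.0)
    n = len(c) // 2
    x = []
    for k in range(n):
        x.append((c[k] + c[n + k]) / s)
        x.append((c[k] - c[n + k]) / s)
    return x

def wavelet_mask(x, gains):
    # Multiply each DWT coefficient by a constant, then invert the transform.
    c = haar_dwt(x)
    return haar_idwt([g * ci for g, ci in zip(gains, c)])

# Zeroing the highpass half leaves a locally averaged (smoothed) signal.
x = [4.0, 2.0, 8.0, 8.0]
gains = [1, 1, 0, 0]          # keep lowpass coefficients, remove highpass
y = wavelet_mask(x, gains)    # -> [3.0, 3.0, 8.0, 8.0]
```

With all gains equal to one, the transform-process-inverse chain is the identity, which is a quick sanity check on the pair of transforms.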

Approximate FFT using the Discrete Wavelet Transform

In this section, we give an example of wavelet domain signal processing. Rather than computing the DFT from the time domain signal using the FFT algorithm, we will first transform the signal into the wavelet domain, then calculate the FFT, and finally go back to the signal domain which is now the Fourier domain.

Most methods of approximately calculating the discrete Fourier transform (DFT) involve calculating only a few output points (pruning), using a small number of bits to represent the various calculations, or approximating the kernel, perhaps by using CORDIC methods. Here we use the characteristics of the signal being transformed to reduce the amount of arithmetic. Since the wavelet transform concentrates the energy of many classes of signals onto a small number of wavelet coefficients, this can be used to improve the efficiency of the DFT [44], [48], [55], [43] and convolution [45].
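The energy-concentration claim is easy to check numerically. The sketch below, a hypothetical example using a multiscale Haar DWT on a piecewise-constant signal, shows the energy of all N samples collapsing onto just two wavelet coefficients.

```python
import math

def haar_dwt_full(x):
    # Multiscale orthonormal Haar DWT (length must be a power of two).
    x = list(x)
    out = []
    s = math.sqrt(2.0)
    while len(x) > 1:
        low = [(x[2 * k] + x[2 * k + 1]) / s for k in range(len(x) // 2)]
        high = [(x[2 * k] - x[2 * k + 1]) / s for k in range(len(x) // 2)]
        out = high + out          # highpass details accumulate, finest last
        x = low                   # iterate on the lowpass branch
    return x + out                # final scaling coefficient first

# A piecewise-constant signal: energy is spread over all 16 samples ...
x = [1.0] * 8 + [5.0] * 8
c = haar_dwt_full(x)
# ... but concentrated on very few wavelet coefficients.
nonzero = sum(1 for ci in c if abs(ci) > 1e-9)   # -> 2
```

Because the transform is orthonormal, the coefficient energy equals the signal energy (Parseval), so zeroing the many near-zero coefficients costs almost nothing.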


The DFT is probably the most important computational tool in signal processing. Because of the characteristics of its basis functions, the DFT has enormous capacity for the improvement of its arithmetic efficiency [15]. The classical Cooley-Tukey fast Fourier transform (FFT) algorithm has complexity O(N log_2 N). Thus the Fourier transform and its fast algorithm, the FFT, are widely used in many areas, including signal processing and numerical analysis. Any scheme to speed up the FFT would be very desirable.

Although the FFT has been studied extensively, there are still some desired properties that are not provided by the classical FFT. Here are some of the disadvantages of the FFT algorithm:

  1. Pruning is not easy. When the number of input points or output points is small compared to the length of the DFT, a special technique called pruning [90] is often used. However, this often requires that the nonzero input data be grouped together. Classical FFT pruning algorithms do not work well when the few nonzero inputs are randomly located. In other words, a sparse signal may not necessarily give rise to a faster algorithm.
  2. No speed versus accuracy tradeoff. It is common to have a situation where some error would be allowed if there could be a significant increase in speed. However, this is not easy with the classical FFT algorithm. One of the main reasons is that the twiddle factors in the butterfly operations are unit magnitude complex numbers. So all parts of the FFT structure are of equal importance. It is hard to decide which part of the FFT structure to omit when error is allowed and the speed is crucial. In other words, the FFT is a single speed and single accuracy algorithm.
  3. No built-in noise reduction capacity. Many real world signals are noisy. What people are really interested in is the DFT of the signal without the noise. The classical FFT algorithm has no built-in noise reduction capacity. Even if other denoising algorithms are used, the FFT requires the same computational complexity on the denoised signal. Due to the above mentioned shortcomings, the fact that the signal has been denoised cannot easily be used to speed up the FFT.

Review of the Discrete Fourier Transform and FFT

The discrete Fourier transform (DFT) is defined for a length-N complex data sequence by

X(k) = \sum_{n=0}^{N-1} x(n) \, e^{-j 2 \pi n k / N}, \qquad k = 0, \ldots, N-1,

where j = \sqrt{-1}. There are several ways to derive the different fast Fourier transform (FFT) algorithms. It can be done by using index mapping [15], by matrix factorization, or by polynomial factorization. In this chapter, we only discuss the matrix factorization approach, and only the so-called radix-2 decimation in time (DIT) variant of the FFT.

Instead of repeating the derivation of the FFT algorithm, we show the block diagram and matrix factorization, in an effort to highlight the basic idea and gain some insight. The block diagram of the last stage of a length-8 radix-2 DIT FFT is shown in Figure 2. First, the input data are separated into even and odd groups. Then, each group goes through a length-4 DFT block. Finally, butterfly operations are used to combine the shorter DFTs into longer DFTs.
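The even-odd split, half-length DFTs, and butterfly recombination described above can be sketched recursively. This is a textbook radix-2 DIT FFT in Python, checked against a direct evaluation of the DFT sum; it is illustrative only and makes no claim about the authors' implementation.

```python
import cmath

def dit_fft(x):
    # Radix-2 decimation-in-time FFT (len(x) must be a power of two).
    N = len(x)
    if N == 1:
        return list(x)
    even = dit_fft(x[0::2])                     # length-N/2 DFT of even samples
    odd = dit_fft(x[1::2])                      # length-N/2 DFT of odd samples
    X = [0j] * N
    for k in range(N // 2):
        w = cmath.exp(-2j * cmath.pi * k / N)   # twiddle factor W_N^k
        X[k] = even[k] + w * odd[k]             # butterfly: top output
        X[k + N // 2] = even[k] - w * odd[k]    # butterfly: bottom output
    return X

def naive_dft(x):
    # Direct O(N^2) evaluation of the DFT sum, used as a reference.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, 2.0, 5.0]
X = dit_fft(x)
ref = naive_dft(x)
```

The recursion mirrors the block diagram: two half-length DFT boxes followed by one column of butterflies.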

Figure 2: Last Stage of a Length-8 Radix-2 DIT FFT

The details of the butterfly operations are shown in Figure 3, where W_N^i = e^{-j 2 \pi i / N} is called the twiddle factor. All the twiddle factors have magnitude one; they lie on the unit circle. This is the main reason that there is no complexity versus accuracy tradeoff for the classical FFT. If some of the twiddle factors had very small magnitude, then the corresponding branches of the butterfly operations could be dropped (pruned) to reduce complexity while minimizing the error introduced. Of course the error also depends on the values of the data to be multiplied by the twiddle factors. When the values of the data are unknown, the best strategy is to cut off the branches with small twiddle factors.

The computational complexity of the FFT algorithm can be easily established. If we let C_{FFT}(N) be the complexity for a length-N FFT, we can show

C_{FFT}(N) = O(N) + 2 \, C_{FFT}(N/2),

where O(N) denotes linear complexity. The solution to Equation 2 is well known:

C_{FFT}(N) = O(N \log_2 N).

This is a classical case where the divide and conquer approach results in a very effective solution.

Figure 3: Butterfly Operations in a Radix-2 DIT FFT

The matrix point of view gives us additional insight. Let F_N be the N \times N DFT matrix; i.e., F_N(m,n) = e^{-j 2 \pi m n / N}, where m, n \in \{0, 1, \ldots, N-1\}. Let S_N be the N \times N even-odd separation matrix; e.g.,

S_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.

Clearly S_N' S_N = I_N, where I_N is the N \times N identity matrix. Then the DIT FFT is based on the following matrix factorization:

F_N = F_N S_N' S_N = \begin{pmatrix} I_{N/2} & T_{N/2} \\ I_{N/2} & -T_{N/2} \end{pmatrix} \begin{pmatrix} F_{N/2} & 0 \\ 0 & F_{N/2} \end{pmatrix} S_N,

where T_{N/2} is a diagonal matrix with W_N^i, i \in \{0, 1, \ldots, N/2-1\}, on the diagonal. We can visualize the above factorization as


where we image the real part of the DFT matrices, and the magnitude of the matrices for the butterfly operations and the even-odd separation. N is taken to be 128 here.

Review of the Discrete Wavelet Transform

In this section, we briefly review the fundamentals of the discrete wavelet transform and introduce the necessary notation for future sections. The details of the DWT have been covered in other chapters.

At the heart of the discrete wavelet transform is a pair of filters h and g — lowpass and highpass respectively. They have to satisfy a set of constraints (see Figure: Sinc Scaling Function and Wavelet) [110], [99], [108]. The block diagram of the DWT is shown in Figure 4. The input data are first filtered by h and g and then downsampled. The same building block is further iterated on the lowpass outputs.

Figure 4: Building Block for the Discrete Wavelet Transform

The computational complexity of the DWT algorithm can also be easily established. Let C_{DWT}(N) be the complexity for a length-N DWT. Since after each scale we only further operate on half of the output data, we can show

C_{DWT}(N) = O(N) + C_{DWT}(N/2),

which gives rise to the solution

C_{DWT}(N) = O(N).

The operation in Figure 4 can also be expressed in matrix form as W_N; e.g., for the Haar wavelet,

W_4^{Haar} = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}.

The orthogonality conditions on h and g ensure W_N' W_N = I_N. The matrix for a multiscale DWT is formed from W_N for different N; e.g., for a three-scale DWT,

\begin{pmatrix} W_{N/4} & 0 & 0 \\ 0 & I_{N/4} & 0 \\ 0 & 0 & I_{N/2} \end{pmatrix} \begin{pmatrix} W_{N/2} & 0 \\ 0 & I_{N/2} \end{pmatrix} W_N.

We could further iterate the building block on some of the highpass outputs. This generalization is called wavelet packets [25].
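A quick numerical check of the matrix view: the sketch below builds W_4 for the Haar wavelet and verifies the orthogonality condition W_4' W_4 = I_4 by explicit multiplication. The pure-Python helpers `matmul` and `transpose` are ad hoc for this example.

```python
import math

s = math.sqrt(2.0) / 2.0
# W_4 for the Haar wavelet: two lowpass rows, then two highpass rows.
W4 = [[s,    s,   0.0, 0.0],
      [0.0,  0.0, s,   s],
      [s,   -s,   0.0, 0.0],
      [0.0,  0.0, s,  -s]]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Orthogonality: W4' W4 should be the 4x4 identity matrix.
P = matmul(transpose(W4), W4)
```

The same check applied to the multiscale product of block-diagonal factors would confirm that the full transform matrix is orthogonal as well.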

The Algorithm Development

The key to the fast Fourier transform is the factorization of F_N into several sparse matrices, one of which represents two DFTs of half the length. In a manner similar to the DIT FFT, the following matrix factorization can be made:

F_N = F_N W_N^T W_N = \begin{pmatrix} A_{N/2} & B_{N/2} \\ C_{N/2} & D_{N/2} \end{pmatrix} \begin{pmatrix} F_{N/2} & 0 \\ 0 & F_{N/2} \end{pmatrix} W_N,

where A_{N/2}, B_{N/2}, C_{N/2}, and D_{N/2} are all diagonal matrices. The values on the diagonal of A_{N/2} and C_{N/2} are the length-N DFT (i.e., frequency response) of h, and the values on the diagonal of B_{N/2} and D_{N/2} are the length-N DFT of g. We can visualize the above factorization as


where we image the real part of DFT matrices, and the magnitude of the matrices for butterfly operations and the one-scale DWT using length-16 Daubechies' wavelets [26], [27]. Clearly we can see that the new twiddle factors have non-unit magnitudes.

Figure 5: Last stage of a length-8 DWT based FFT.

The above factorization suggests a DWT-based FFT algorithm. The block diagram of the last stage of a length-8 algorithm is shown in Figure 5. This scheme is iteratively applied to shorter length DFTs to get the full DWT based FFT algorithm. The final system is equivalent to a full binary tree wavelet packet transform [24] followed by classical FFT butterfly operations, where the new twiddle factors are the frequency response of the wavelet filters.

The details of the butterfly operation are shown in Figure 6, where i \in \{0, 1, \ldots, N/2-1\}. Now the twiddle factors are the length-N DFTs of h and g. For well defined wavelet filters, these have well known properties; e.g., for Daubechies' family of wavelets, the frequency responses are monotone, and nearly half of their values have magnitude close to zero. This fact can be exploited to achieve a speed versus accuracy tradeoff. The classical radix-2 DIT FFT is a special case of the above algorithm with h = [1, 0] and g = [0, 1]. Although these do not satisfy some of the conditions required for wavelets, they do constitute a legitimate (and trivial) orthogonal filter bank and are often called the lazy wavelets in the context of lifting.

Figure 6: Butterfly Operations in the DWT-Based FFT
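To see that the new twiddle factors indeed have non-unit, decreasing magnitudes, one can compute the length-N DFT of the wavelet filters directly. The sketch below uses the Haar pair h = [1/√2, 1/√2], g = [1/√2, −1/√2] as a simple stand-in for the filters whose responses populate the diagonals of A, B, C, and D.

```python
import cmath, math

def dft(x):
    # Direct evaluation of the length-N DFT.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

N = 8
inv = 1.0 / math.sqrt(2.0)
h = [inv, inv] + [0.0] * (N - 2)    # Haar lowpass, zero-padded to length N
g = [inv, -inv] + [0.0] * (N - 2)   # Haar highpass, zero-padded to length N

H = [abs(v) for v in dft(h)]        # magnitudes of the new "twiddle factors"
G = [abs(v) for v in dft(g)]
```

Unlike the classical unit-magnitude twiddle factors, |H(k)| falls from √2 toward zero over the first half of the spectrum, which is precisely what makes pruning the small branches attractive.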

Computational Complexity

For the DWT-based FFT algorithm, the computational complexity is of the same order as that of the FFT, O(N \log_2 N), since the recursive relation in Equation 2 is again satisfied. However, the constant in front of N \log_2 N depends on the wavelet filters used.

Fast Approximate Fourier Transform

The basic idea of the fast approximate Fourier transform (FAFT) is pruning; i.e., cutting off part of the diagram. Traditionally, when only part of the inputs are nonzero, or only part of the outputs are required, the part of the FFT diagram where either the inputs are zero or the outputs are undesired is pruned [90], so that the computational complexity is reduced. However, the classical pruning algorithm is quite restrictive, since for a majority of the applications, both the inputs and the outputs are of full length.

The structure of the DWT-based FFT algorithm can be exploited to generalize the classical pruning idea to arbitrary signals. From the input data side, the signals are made sparse by the wavelet transform [78], [75], [76], [27]; thus approximations can be made to speed up the algorithm by dropping the insignificant data. In other words, although the input signal is normally not sparse, the DWT creates sparse inputs for the butterfly stages of the FFT. So any scheme that prunes the butterfly stages of the classical FFT can be used here. Of course, the price we pay is the computational complexity of the DWT operations. In an actual implementation, the wavelets have to be carefully chosen to balance the benefit of the pruning against the cost of the transform. Clearly, the optimal choice depends on the class of data we expect to encounter.

From the transform side, since the twiddle factors of the new algorithm have decreasing magnitudes, approximations can be made to speed up the algorithm by pruning the sections of the algorithm that correspond to the insignificant twiddle factors. The frequency responses of the Daubechies' family of wavelets are shown in Figure 7. We can see that they are monotone decreasing; as the length increases, more and more points are close to zero. It should be noted that these filters are not designed for their frequency responses; they are designed for flatness at 0 and π. Various methods can be used to design wavelets or orthogonal filter banks [83], [94], [108] to achieve better frequency responses. Again, there is a tradeoff between the good frequency response of the longer filters and the higher complexity they require.

Figure 7: The Frequency Responses of Daubechies' Family of Wavelets
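The monotone decay in Figure 7 can be reproduced from the length-4 Daubechies lowpass filter. The db2 coefficients below are the standard closed-form values, normalized so the taps sum to √2; sampling |H(e^{jω})| from 0 to π shows the magnitude falling monotonically from √2 to zero.

```python
import cmath, math

r3 = math.sqrt(3.0)
d = 4.0 * math.sqrt(2.0)
# Daubechies length-4 (db2) lowpass filter, normalized so sum(h) = sqrt(2).
h = [(1 + r3) / d, (3 + r3) / d, (3 - r3) / d, (1 - r3) / d]

def freq_mag(h, w):
    # |H(e^{jw})| for an FIR filter with taps h.
    return abs(sum(hn * cmath.exp(-1j * w * n) for n, hn in enumerate(h)))

ws = [k * math.pi / 8 for k in range(9)]   # sample w from 0 to pi
mags = [freq_mag(h, w) for w in ws]        # monotone decreasing magnitudes
```

The maximal flatness at ω = 0 and ω = π, not the overall response shape, is what the design enforces; the monotone decay is a byproduct that the pruning scheme exploits.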

Computational Complexity

The wavelet coefficients are mostly sparse, so the inputs of the shorter DFTs are sparse. If the implementation scales well with respect to the percentage of significant inputs (e.g., it uses half the time if only half of the inputs are significant), then we can further lower the complexity. Assume that for N inputs, αN of them are significant (α ≤ 1); then we have

C_{FAFT}(N) = O(N) + 2 \alpha \, C_{FAFT}(N/2).

For example, if α = 1/2, Equation 13 simplifies to

C_{FAFT}(N) = O(N) + C_{FAFT}(N/2),

which leads to

C_{FAFT}(N) = O(N).

So under the above conditions, we have a linear complexity approximate FFT. Of course, the complexity depends on the input data, the wavelets we use, the threshold value used to drop insignificant data, and the threshold value used to prune the butterfly operations. It remains to find a good tradeoff. Also the implementation would be more complicated than the classical FFT.

Noise Reduction Capacity

It has been shown that the thresholding of wavelet coefficients has near optimal noise reduction property for many classes of signals [40]. The thresholding scheme used in the approximation in the proposed FAFT algorithm is exactly the hard thresholding scheme used to denoise the data. Soft thresholding can also be easily embedded in the FAFT. Thus the proposed algorithm also reduces the noise while doing approximation. If we need to compute the DFT of noisy signals, the proposed algorithm not only can reduce the numerical complexity but also can produce cleaner results.


In the past, the FFT has been used to calculate the DWT [110], [99], [108], which leads to an efficient algorithm when the filters are infinite impulse response (IIR). In this chapter, we did just the opposite: we used the DWT to calculate the FFT. We have shown that when no intermediate coefficients are dropped and no approximations are made, the proposed algorithm computes the exact result, and its computational complexity is of the same order as that of the FFT; i.e., O(N \log_2 N). The advantage of our algorithm is twofold. From the input data side, the signals are made sparse by the wavelet transform, so approximations can be made to speed up the algorithm by dropping the insignificant data. From the transform side, since the twiddle factors of the new algorithm have decreasing magnitudes, approximations can be made to speed up the algorithm by pruning the sections of the algorithm that correspond to the insignificant twiddle factors. Since wavelets are an unconditional basis for many classes of signals [99], [76], [27], the algorithm is very efficient and has built-in denoising capacity. An alternative approach, based on subband filter banks, has been developed by Shentov, Mitra, Heute, and Hossen [98], [56].

Nonlinear Filtering or Denoising with the DWT

Wavelets became known to most engineers and scientists with the publication of Daubechies' important paper [26] in 1988. Indeed, the work of Daubechies [27], Mallat [72], [71], [82], Meyer [77], [78], and others produced beautiful and interesting structures, but many engineers and applied scientists felt they had a “solution looking for a problem." With the recent work of Donoho and Johnstone, together with ideas from Coifman, Beylkin, and others, the field is moving into a second phase with a better understanding of why wavelets work. This new understanding, combined with nonlinear processing, not only solves currently important problems but offers the potential of formulating and solving completely new ones. We now have a coherence of approach and a theoretical basis for the success of our methods that should be extraordinarily productive over the next several years. Some of the Donoho and Johnstone references are [29], [37], [40], [36], [35], [32], [31], [30], [17], [38], [39], [19], and related ones are [89], [81], [104], [103], [9]. Ideas from Coifman are in [24], [25], [20], [22], [21], [18], [5].

These methods are based on taking the discrete wavelet transform (DWT) of a signal, passing this transform through a threshold, which removes the coefficients below a certain value, then taking the inverse DWT, as illustrated in Figure 1. They are able to remove noise and achieve high compression ratios because of the “concentrating" ability of the wavelet transform. If a signal has its energy concentrated in a small number of wavelet dimensions, its coefficients will be relatively large compared to any other signal or noise that has its energy spread over a large number of coefficients. This means that thresholding or shrinking the wavelet transform will remove the low amplitude noise or undesired signal in the wavelet domain, and an inverse wavelet transform will then retrieve the desired signal with little loss of detail. In traditional Fourier-based signal processing, we arrange our signals such that the signals and any noise overlap as little as possible in the frequency domain and linear time-invariant filtering will approximately separate them. Where their Fourier spectra overlap, they cannot be separated. Using linear wavelet or other time-frequency or time-scale methods, one can try to choose basis systems such that in that coordinate system, the signals overlap as little as possible, and separation is possible.

The new nonlinear method is entirely different. The spectra can overlap as much as they want; the idea is to make the amplitudes, rather than the locations, of the spectra as different as possible. This allows clipping, thresholding, and shrinking of the amplitude of the transform to separate signals or remove noise. It is the localizing or concentrating properties of the wavelet transform that make it particularly effective when used with these nonlinear methods. Usually the same properties that make a system good for denoising or separation by nonlinear methods make it good for compression, which is also a nonlinear process.

Denoising by Thresholding

We develop the basic ideas of thresholding the wavelet transform using Donoho's formulations [40], [29], [62]. Assume a finite length signal with additive noise of the form

y_i = x_i + \epsilon \, n_i, \qquad i = 1, \ldots, N,

that is, a finite length sequence of observations of the signal x_i corrupted by i.i.d. zero mean, white Gaussian noise with standard deviation \epsilon, where n_i \sim \mathcal{N}(0,1). The goal is to recover the signal x from the noisy observations y. Here and in the following, v denotes a vector with the ordered elements v_i if the index i is omitted. Let W be a left invertible wavelet transformation matrix of the discrete wavelet transform (DWT). Then Equation 16 can be written in the transform domain as

Y = X + N, \qquad \text{or} \qquad Y_i = X_i + N_i,

where capital letters denote variables in the transform domain; i.e., Y = W y. The inverse transform matrix W^{-1} exists, and we have

W^{-1} W = I.

The following presentation follows Donoho's approach [29], [37], [40], [36], [62], which assumes an orthogonal wavelet transform with a square W; i.e., W^{-1} = W^T. We will use the same assumption throughout this section.

Let \hat{X} denote an estimate of X based on the observations Y. We consider diagonal linear projections

\Delta = \mathrm{diag}(\delta_1, \ldots, \delta_N), \qquad \delta_i \in \{0, 1\}, \quad i = 1, \ldots, N,

which give rise to the estimate

\hat{x} = W^{-1} \hat{X} = W^{-1} \Delta Y = W^{-1} \Delta W y.

The estimate \hat{X} is obtained by simply keeping or zeroing the individual wavelet coefficients. Since we are interested in the l_2 error, we define the risk measure

R(\hat{X}, X) = E \| \hat{x} - x \|_2^2 = E \| W^{-1} (\hat{X} - X) \|_2^2 = E \| \hat{X} - X \|_2^2.

Notice that the last equality in Equation 21 is a consequence of the orthogonality of W. The optimal coefficients in the diagonal projection scheme are \delta_i = 1_{\{|X_i| > \epsilon\}}; i.e., only those values of Y where the corresponding elements of X are larger in magnitude than \epsilon are kept, and all others are set to zero. This leads to the ideal risk

R_{id}(\hat{X}, X) = \sum_{i=1}^{N} \min(X_i^2, \epsilon^2).

The ideal risk cannot be attained in practice, since it requires knowledge of X, the wavelet transform of the unknown vector x. However, it does give us a lower limit for the l_2 error.
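The ideal risk is straightforward to evaluate when X is known, as in a simulation. A minimal sketch (the coefficient values and noise level here are made up for illustration):

```python
def ideal_risk(X, eps):
    # R_id = sum_i min(X_i^2, eps^2): a kept (large) coefficient contributes
    # its noise variance eps^2, a discarded (small) one its own energy X_i^2.
    return sum(min(Xi * Xi, eps * eps) for Xi in X)

X = [10.0, 0.1, 4.0, 0.05, 0.0]   # hypothetical wavelet coefficients
eps = 0.5                         # noise standard deviation
R = ideal_risk(X, eps)            # -> 0.5125
```

The two large coefficients each contribute eps² = 0.25, while the three small ones contribute their own squared values, illustrating why concentrated signals have small ideal risk.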

Donoho proposes the following scheme for denoising:

  1. compute the DWT Y = W y
  2. perform thresholding in the wavelet domain, according to so-called hard thresholding, \hat{X}_i = Y_i \cdot 1_{\{|Y_i| > t\}},
    or according to so-called soft thresholding, \hat{X}_i = \mathrm{sign}(Y_i) \max(|Y_i| - t, 0)
  3. compute the inverse DWT \hat{x} = W^{-1} \hat{X}
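The three steps can be sketched end to end with a one-stage Haar DWT standing in for W; the transform choice and the threshold value are assumptions for illustration, since Donoho's formulation is transform-generic.

```python
import math

def haar_dwt(x):
    # One-stage orthonormal Haar DWT: lowpass half, then highpass half.
    s = math.sqrt(2.0)
    n = len(x) // 2
    return ([(x[2 * k] + x[2 * k + 1]) / s for k in range(n)] +
            [(x[2 * k] - x[2 * k + 1]) / s for k in range(n)])

def haar_idwt(c):
    # Inverse of the one-stage Haar DWT above.
    s = math.sqrt(2.0)
    n = len(c) // 2
    x = []
    for k in range(n):
        x.append((c[k] + c[n + k]) / s)
        x.append((c[k] - c[n + k]) / s)
    return x

def hard(Y, t):
    # Hard thresholding: keep coefficients above t, zero the rest.
    return [Yi if abs(Yi) > t else 0.0 for Yi in Y]

def soft(Y, t):
    # Soft thresholding: shrink every coefficient toward zero by t.
    return [math.copysign(max(abs(Yi) - t, 0.0), Yi) for Yi in Y]

def denoise(y, t, threshold=hard):
    # Donoho's scheme: DWT, threshold, inverse DWT.
    return haar_idwt(threshold(haar_dwt(y), t))

y = [2.1, 1.9, 2.1, 1.9]    # a constant signal plus small alternating noise
xhat = denoise(y, 0.5)      # hard thresholding removes the small details
```

Note that `soft` always satisfies the shrinkage condition |X̂_i| ≤ |Y_i|, which is the property behind the smoothness guarantee discussed below.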

This simple scheme has several interesting properties. Its risk is within a logarithmic factor (\log N) of the ideal risk for both thresholding schemes and properly chosen thresholds t(N, \epsilon). If one employs soft thresholding, then the estimate is with high probability at least as smooth as the original function. The proof of this proposition relies on the fact that wavelets are unconditional bases for a variety of smoothness classes and that soft thresholding guarantees (with high probability) that the shrinkage condition |\hat{X}_i| < |X_i| holds. The shrinkage condition guarantees that \hat{x} is in the same smoothness class as x. Moreover, the soft threshold estimate is the optimal estimate that satisfies the shrinkage condition. The smoothness property guarantees an estimate free of the spurious oscillations which may result from hard thresholding or Fourier methods. Also, it can be shown that it is not possible to come closer to the ideal risk than within a factor of \log N. Not only does Donoho's method have nice theoretical properties, but it also works very well in practice.

Some comments have to be made at this point. As with traditional approaches (e.g., lowpass filtering), there is a trade-off between suppression of noise and oversmoothing of details, although to a smaller extent. Also, hard thresholding yields better results in terms of the l_2 error. That is not surprising, since the observed value y_i itself is clearly a better estimate of the real value x_i than a shrunk value in a zero mean noise scenario. However, the estimate obtained from hard thresholding typically exhibits undesired, spurious oscillations and does not have the desired smoothness properties.

Shift-Invariant or Nondecimated Discrete Wavelet Transform

As is well known, the discrete wavelet transform is not shift invariant; i.e., there is no “simple” relationship between the wavelet coefficients of the original and the shifted signal. In this section we will develop a shift-invariant DWT using ideas of a nondecimated filter bank or a redundant DWT [62], [63], [64]. Because this system is redundant, it is not a basis but is instead a frame or tight frame (see Section: Overcomplete Representations, Frames, Redundant Transforms, and Adaptive Bases). Let X = W x be the (orthogonal) DWT of x and S_R be a matrix performing a circular right shift by R, with R \in \mathbb{Z}. Then

X_s = W x_s = W S_R x = W S_R W^{-1} X,

which establishes the connection between the wavelet transforms of two shifted versions of a signal, x and x_s, through the orthogonal matrix W S_R W^{-1}. As an illustrative example, consider Figure 8.

Figure 8: Shift Variance of the Wavelet Transform
(a) DWT of skyline; (b) DWT of skyline circularly shifted by one

The first and most obvious way of computing a shift invariant discrete wavelet transform (SIDWT) is simply to compute the wavelet transform of all shifts. Usually the two-band wavelet transform is computed as follows: 1) filter the input signal by a lowpass and a highpass filter, respectively, 2) downsample each filter output, and 3) iterate on the lowpass output. Because of the downsampling, the total number of output values across all stages of the filter bank (corresponding to coarser and coarser scales of the DWT) is equal to the number of input values. Precisely N values have to be stored, and the computational complexity is O(N). Directly computing the wavelet transform of all shifts therefore requires the storage of N^2 elements and has computational complexity O(N^2).

Beylkin [10], Shensa [96], and the Rice group independently realized that 1) there are only N \log N different coefficient values among those corresponding to all shifts of the input signal, and 2) these can be computed with computational complexity N \log N. This can be easily seen by considering one stage of the filter bank. Let

y = [\, y_0 \; y_1 \; y_2 \; \cdots \; y_{N-1} \,]^T = h \, x,

where y is the output of either the highpass or the lowpass filter in the analysis filter bank, x is the input, and the matrix h describes the filtering operation. Downsampling of y by a factor of two means keeping the even indexed elements and discarding the odd ones. Consider the case of an input signal shifted by one. Then the output signal is shifted by one as well, and sampling with the same operator as before corresponds to keeping the odd-indexed coefficients as opposed to the even ones. Thus, the set of data points to be further processed is completely different. However, for a shift of the input signal by two, the downsampled output signal differs from the output of the nonshifted input only by a shift of one. This is easily generalized for any odd and even shift, and we see that the set of wavelet coefficients of the first stage of the filter bank for arbitrary shifts consists of only 2N different values. Considering the fact that only the lowpass component (N values) is iterated, one recognizes that after L stages exactly LN values result. Using the same arguments as in the shift variant case, one can prove that the computational complexity is O(N \log N). The derivation for the synthesis is analogous.
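The observation that all shifts of the input generate only 2N distinct values at the first stage can be verified numerically: the undecimated (nondecimated) filter output contains exactly the decimated coefficients of the unshifted input (its even samples) together with those of the input shifted by one (its odd samples, up to a circular shift). The Haar lowpass filter and the helper names below are illustrative choices.

```python
import math

def circ_filter(x, h):
    # Circular (periodic) FIR filtering: y[n] = sum_k h[k] * x[(n-k) mod N].
    N = len(x)
    return [sum(hk * x[(n - k) % N] for k, hk in enumerate(h))
            for n in range(N)]

def analysis_stage(x, h):
    # One decimated filter-bank stage: filter, then keep even-indexed samples.
    return circ_filter(x, h)[0::2]

inv = 1.0 / math.sqrt(2.0)
h = [inv, inv]                          # Haar lowpass (illustrative choice)
x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
x1 = x[-1:] + x[:-1]                    # x circularly right-shifted by one

full = circ_filter(x, h)                # undecimated output: N values
even = analysis_stage(x, h)             # coefficients of the unshifted input
odd = analysis_stage(x1, h)             # coefficients of the shifted input
```

Together, `even` and `odd` are a permutation of the N undecimated values, so one stage of the nondecimated transform already covers every possible shift.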

Mallat proposes a scheme for computing an approximation of the continuous wavelet transform [73] that turns out to be equivalent to the method described above. This has been realized and proved by Shensa [96]. Moreover, Shensa shows that Mallat's algorithm exhibits the same structure as the so-called algorithm à trous. Interestingly, Mallat's intention in [73] was not in particular to overcome the shift variance of the DWT but to get an approximation of the continuous wavelet transform.

In the following, we shall refer to the algorithm for computing the SIDWT as the Beylkin algorithm, since this is the one we have implemented. Alternative algorithms for computing a shift-invariant wavelet transform [68] are based on the scheme presented in [10]. They explicitly or implicitly try to find an optimal, signal-dependent shift of the input signal. Thus, the transform becomes shift-invariant and orthogonal but signal dependent and, therefore, nonlinear. We mention that the generalization of the Beylkin algorithm to the multidimensional case, to an M-band multiresolution analysis, and to wavelet packets is straightforward.

Combining the Shensa-Beylkin-Mallat-à trous Algorithms and Wavelet Denoising

It was Coifman who suggested that applying Donoho's method to several shifts of the observation and averaging the results yields a considerable improvement.5 This observation first led us to the following algorithm: 1) apply Donoho's method not just to "some" but to all circular shifts of the input signal; 2) average the adjusted output signals. As shown in the previous section, all possible shifts can be computed efficiently using Beylkin's algorithm. Thus, instead of the algorithm just described, one simply applies thresholding to the SIDWT of the observation and computes the inverse transform.
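The shift-and-average view can be made concrete. The sketch below implements "threshold every circular shift, then average" with a Haar transform; the Haar filters, the test signal, and the universal threshold value are illustrative assumptions, not the exact choices used in the text.

```python
import numpy as np

def haar_dwt(x):
    """Full-depth Haar DWT of a length-2^J signal."""
    x = np.asarray(x, dtype=float).copy()
    details = []
    while len(x) > 1:
        a = (x[0::2] + x[1::2]) / np.sqrt(2)
        d = (x[0::2] - x[1::2]) / np.sqrt(2)
        details.append(d)
        x = a
    return x, details[::-1]            # scaling coeff, details coarse-to-fine

def haar_idwt(a, details):
    x = a.copy()
    for d in details:
        up = np.empty(2 * len(x))
        up[0::2] = (x + d) / np.sqrt(2)
        up[1::2] = (x - d) / np.sqrt(2)
        x = up
    return x

def denoise(y, thr):
    a, ds = haar_dwt(y)
    # Soft-threshold the detail coefficients only.
    ds = [np.sign(d) * np.maximum(np.abs(d) - thr, 0.0) for d in ds]
    return haar_idwt(a, ds)

rng = np.random.default_rng(1)
N = 256
t = np.arange(N) / N
clean = np.where(t < 0.3, 1.0, -1.0)           # piecewise-constant test signal
noisy = clean + 0.3 * rng.standard_normal(N)
thr = 0.3 * np.sqrt(2 * np.log(N))             # universal threshold (sigma known)

# Average the denoised estimates over all N circular shifts ("cycle spinning").
est = np.mean([np.roll(denoise(np.roll(noisy, s), thr), -s)
               for s in range(N)], axis=0)

assert np.mean((est - clean) ** 2) < np.mean((noisy - clean) ** 2)
```

Averaging over all N shifts this way costs O(N² log N); the point of the SIDWT is that the same estimate is obtained in O(N log N) by thresholding the redundant transform once.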

Before going into details, we briefly discuss the differences between the traditional orthogonal and the shift-invariant wavelet transform. Obviously, by using more than N wavelet coefficients, we introduce redundancy. Several authors have stated that redundant wavelet transforms, or frames, add numerical robustness [27] when white noise is added in the transform domain, e.g., by quantization. This is, however, different from the scenario we are interested in, since 1) we have correlated noise due to the redundancy, and 2) we try to remove noise in the transform domain rather than considering the effect of adding some noise [62], [63].

Performance Analysis

The analysis of the ideal risk for the SIDWT is similar to that by Guo [54]. Define the sets A and B according to

A = \{ i : |X_i| \ge \epsilon \}, \qquad B = \{ i : |X_i| < \epsilon \}

and an ideal diagonal projection estimator, or oracle,

\tilde{X}_i = \begin{cases} Y_i = X_i + N_i, & i \in A \\ 0, & i \in B. \end{cases}

The pointwise estimation error is then

\tilde{X}_i - X_i = \begin{cases} N_i, & i \in A \\ -X_i, & i \in B. \end{cases}

In the following, a vector or matrix indexed by A (or B) indicates that only those rows are kept whose indices are in A (or B); all others are set to zero. With these definitions and Equation 21, the ideal risk for the SIDWT can be derived:

\begin{aligned}
R_{id}(\tilde{X}, X) &= E \left\| W^{-1} (\tilde{X} - X) \right\|_2^2 \\
&= E \left\| W^{-1} (N_A - X_B) \right\|_2^2 \\
&= E \, (N_A - X_B)^T (W^{-1})^T W^{-1} (N_A - X_B) \\
&= E \, N_A^T (W^{-1})^T W^{-1} N_A - 2 X_B^T C_{W^{-1}} E \, N_A + X_B^T C_{W^{-1}} X_B \\
&= \operatorname{tr}\left( E \, W^{-1} N_A N_A^T (W^{-1})^T \right) + X_B^T C_{W^{-1}} X_B \\
&= \operatorname{tr}\left( W^{-1} E\!\left[ W_A \epsilon n (\epsilon n)^T W_A^T \right] (W^{-1})^T \right) + X_B^T C_{W^{-1}} X_B \\
&= \epsilon^2 \operatorname{tr}\left( W^{-1} W_A W_A^T (W^{-1})^T \right) + X_B^T C_{W^{-1}} X_B ,
\end{aligned}

where tr(X) denotes the trace of X and C_{W^{-1}} := (W^{-1})^T W^{-1}. For the derivation we have used the fact that N_A = \epsilon W_A n and, consequently, that the N_{A,i} have zero mean. Notice that for orthogonal W, Equation 30 immediately specializes to Equation 22. Equation 30 depends on the particular signal X_B, the transform W^{-1}, and the noise level \epsilon.

It can be shown that when using the SIDWT introduced above with the thresholding scheme proposed by Donoho (including his choice of the threshold), the same upper bound holds for the actual risk as in the case of the orthogonal DWT: the ideal risk times a factor logarithmic in N. We give only an outline of the proof. Johnstone and Silverman state [61] that for colored noise an oracle chooses \delta_i = 1_{\{|X_i| > \epsilon_i\}}, where \epsilon_i is the standard deviation of the i-th component. Since Donoho's method applies uniform thresholding to all components, one has to show that the diagonal elements of C_{W^{-1}} (the variances of the components of N) are identical. This can be shown by considering the reconstruction scheme of the SIDWT. With these statements, the rest of the proof can be carried out in the same way as the one given by Donoho and Johnstone [29].

Examples of Denoising

The two examples illustrated in Figure 9 show how wavelet-based denoising works. The first is a chirp or Doppler signal whose frequency and amplitude change over time. Noise is added to this chirp in (b); the result of basic Donoho denoising is shown in (c) and that of redundant DWT denoising in (d). Notice how well the noise is removed, with almost no sacrifice of the signal; this would be impossible with traditional linear filters.

The second example is the Houston skyline, where the improvement from the redundant DWT is more obvious.

Figure 9: Example of Noise Reduction using \psi_{D8'}

Statistical Estimation

This problem is very similar to the signal recovery problem: a signal has to be estimated from additive white Gaussian noise. By linearity, additive noise remains additive in the transform domain, where the problem becomes: estimate \theta from y = \theta + \epsilon z, where z is a noise vector (each component a zero-mean, unit-variance Gaussian random variable) and \epsilon > 0 is a scalar noise level. The performance, measured by the mean squared error (via Parseval), is given by

R_\epsilon(\hat{\theta}, \theta) = E \left\| \hat{\theta}(y) - \theta \right\|_2^2 .

It depends on the signal \theta, the estimator \hat{\theta}, the noise level \epsilon, and the basis.

For a fixed \epsilon, the optimal minimax procedure is the one that minimizes the error for the worst possible signal from the coefficient body \Theta:

R_\epsilon^*(\Theta) = \inf_{\hat{\theta}} \sup_{\theta \in \Theta} R_\epsilon(\hat{\theta}, \theta) .

Consider the particular nonlinear procedure \hat{\theta} that corresponds to soft-thresholding every noisy coefficient y_i:

T_\epsilon(y_i) = \operatorname{sgn}(y_i) \left( |y_i| - \epsilon \right)_+ .

Let r_\epsilon(\theta) be the corresponding error for signal \theta and let r_\epsilon(\Theta) be the worst-case error over the coefficient body \Theta.
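The soft-thresholding nonlinearity above is a one-liner in practice; the example values below are made up for illustration.

```python
import numpy as np

# Soft thresholding: eta_eps(y) = sgn(y) * (|y| - eps)_+ .
# Coefficients smaller than eps are set to zero; larger ones shrink toward zero.
def soft_threshold(y, eps):
    return np.sign(y) * np.maximum(np.abs(y) - eps, 0.0)

y = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
assert np.allclose(soft_threshold(y, 1.0), [-1.0, 0.0, 0.0, 0.0, 0.5])
```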

If the coefficient body is solid and orthosymmetric in a particular basis, then asymptotically (\epsilon \to 0) the error decays at least as fast in this basis as in any other; that is, r_\epsilon(\Theta) approaches zero at least as fast as r_\epsilon(U\Theta) for any orthogonal matrix U. Therefore, unconditional bases are nearly optimal asymptotically. Moreover, for small \epsilon we can relate this procedure to any other procedure as follows [37]:

R^*(\epsilon, \Theta) \le r^*(\epsilon, \Theta) \le O\!\left( \log(1/\epsilon) \right) \cdot R^*(\epsilon, \Theta), \qquad \epsilon \to 0 .

Signal and Image Compression

Fundamentals of Data Compression

From basic information theory, we know that the minimum average number of bits needed to represent realizations of an independent and identically distributed discrete random variable X is its entropy H(X) [23]. If the distribution p(X) is known, we can design Huffman codes or use arithmetic coding to approach this minimum [7]; otherwise we need to use adaptive methods [112].
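A small worked example of this bound, using an assumed toy distribution: for a dyadic distribution, a Huffman code with codeword lengths (1, 2, 3, 3) meets the entropy exactly.

```python
import numpy as np

# Entropy H(X) = -sum p_i log2 p_i is the lower bound on bits per symbol.
p = np.array([0.5, 0.25, 0.125, 0.125])     # assumed toy distribution
H = -np.sum(p * np.log2(p))                 # entropy in bits/symbol
lengths = np.array([1, 2, 3, 3])            # Huffman codeword lengths

assert abs(H - 1.75) < 1e-12                # H(X) = 1.75 bits
assert abs(np.sum(p * lengths) - H) < 1e-12 # average code length equals H(X)
```

For non-dyadic distributions the Huffman average length exceeds H(X) by less than one bit; arithmetic coding can come arbitrarily close.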

Continuous random variables require an infinite number of bits to represent exactly, so quantization is always necessary for a practical finite representation. However, quantization introduces error, so the goal is to achieve the best rate-distortion tradeoff [60], [23], [16]. Text compression [7], waveform coding [60], and subband coding [110] have been studied extensively over the years. Here we concentrate on wavelet compression, or more generally transform coding, at low bit rates.

Figure 10: Prototype Transform Coder

Prototype Transform Coder

The simple three-step structure of a prototype transform coder is shown in Figure 10. The first step is the transform of the signal. For a length-N discrete signal f(n), we expand it using a set of orthonormal basis functions as

f(n) = \sum_{i=1}^{N} c_i \, \psi_i(n),

where
c_i = \langle f(n), \psi_i(n) \rangle .

We then use the uniform scalar quantizer Q shown in Figure 11, which is widely used for wavelet-based image compression [95], [100],

\hat{c}_i = Q(c_i) .

Denote the quantization step size by T. Notice in the figure that the quantizer has a dead zone: if |c_i| < T, then Q(c_i) = 0. We define an index set for these insignificant coefficients

Figure 11: Uniform Scalar Quantizer

I = \{ i : |c_i| < T \}. Let M be the number of coefficients with magnitudes greater than T (the significant coefficients); the size of I is thus N - M. The squared error caused by the quantization is

\sum_{i=1}^{N} (c_i - \hat{c}_i)^2 = \sum_{i \in I} c_i^2 + \sum_{i \notin I} (c_i - \hat{c}_i)^2 .

Since the transform is orthonormal, this equals the reconstruction error. Assume T is small enough that the significant coefficients are uniformly distributed within each quantization bin. Then the second term of the error expression is

\sum_{i \notin I} (c_i - \hat{c}_i)^2 = \frac{M T^2}{12} .
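The M T²/12 term can be verified empirically. The sketch below assumes midpoint reconstruction within each bin and a Laplacian coefficient model; the text only specifies the dead zone and the step size T, so these are illustrative choices.

```python
import numpy as np

# Dead-zone uniform scalar quantizer: |c| < T maps to zero,
# larger values reconstruct to the midpoint of their bin (assumption).
def deadzone_quantize(c, T):
    q = np.sign(c) * np.floor(np.abs(c) / T)    # bin index; 0 is the dead zone
    return np.where(q == 0, 0.0, np.sign(q) * (np.abs(q) + 0.5) * T)

rng = np.random.default_rng(2)
c = rng.laplace(scale=1.0, size=100_000)        # assumed coefficient model
T = 0.1
c_hat = deadzone_quantize(c, T)

I = np.abs(c) < T                               # insignificant coefficients
M = int(np.count_nonzero(~I))                   # number of significant ones
err_sig = np.sum((c[~I] - c_hat[~I]) ** 2)

# For small T the significant-coefficient error is close to M*T^2/12.
assert abs(err_sig - M * T**2 / 12) < 0.05 * M * T**2 / 12
```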

For the first term, we need the following standard approximation theorem [34], which relates it to the \ell_p norm of the coefficients,

\| f \|_p = \left( \sum_{i=1}^{N} |c_i|^p \right)^{1/p} .

Theorem 56. Let \lambda = 1/p > 1/2. Then

\sum_{i \in I} c_i^2 \le \frac{\| f \|_p^2}{2\lambda - 1} \, M^{1 - 2\lambda} .

This theorem generalizes to infinite-dimensional spaces if \| f \|_p^2 < +\infty. It has been shown that for functions in a Besov space, \| f \|_p^2 < +\infty does not depend on the particular choice of wavelet, as long as each wavelet in the basis has q > \lambda - 1/2 vanishing moments and is q times continuously differentiable [77]. Besov spaces include piecewise regular functions that may have discontinuities. The theorem indicates that the first term of the error expression decreases very quickly as the number of significant coefficients increases.

The bit rate of the prototype compression algorithm can also be separated into two parts. First, we must indicate which coefficients are significant, i.e., transmit the significant map: for example, 1 for significant and 0 for insignificant, for a total of N indicators. Second, we must represent the values of the significant coefficients, of which there are only M. Because the distributions of the values and of the indicators are not known in general, adaptive entropy coding is often used [112].

Energy concentration is one of the most important properties for low-bit-rate transform coding. Suppose that, for the same quantization step size T, a second basis generates fewer significant coefficients. The distribution of the significant-map indicators is then more skewed and thus requires fewer bits to code, and we also need to code fewer significant values, which again may require fewer bits. At the same time, a smaller M reduces the second error term in Equation 39. Overall, it is very likely that the new basis improves the rate-distortion performance. Wavelets have better energy concentration than the Fourier transform for signals with discontinuities; this is one of the main reasons that wavelet-based compression methods usually outperform DCT-based JPEG, especially at low bit rates.
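The energy-concentration claim is easy to check numerically for a discontinuous signal. Haar is used below as a simple stand-in wavelet, and the step location is arbitrary; this is a sketch, not a rate-distortion measurement.

```python
import numpy as np

def haar_coeffs(x):
    """All Haar DWT coefficients of a length-2^J signal, concatenated."""
    x = np.asarray(x, dtype=float).copy()
    out = []
    while len(x) > 1:
        out.append((x[0::2] - x[1::2]) / np.sqrt(2))   # detail band
        x = (x[0::2] + x[1::2]) / np.sqrt(2)           # low-pass band
    out.append(x)                                      # final scaling coeff
    return np.concatenate(out)

def coeffs_for_energy(c, frac=0.999):
    """Number of largest-magnitude coefficients holding `frac` of the energy."""
    e = np.sort(np.abs(c) ** 2)[::-1]
    return int(np.searchsorted(np.cumsum(e) / e.sum(), frac)) + 1

N = 1024
t = np.arange(N) / N
x = np.where(t < 0.37, 1.0, -1.0)                      # step signal

m_haar    = coeffs_for_energy(haar_coeffs(x))
m_fourier = coeffs_for_energy(np.fft.fft(x) / np.sqrt(N))

# The wavelet basis needs far fewer coefficients for 99.9% of the energy.
assert m_haar < m_fourier
```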

Improved Wavelet Based Compression Algorithms

The above prototype algorithm works well [79], [55], but its various building blocks can be further improved [47]. As we can see from Figure 12, the significant map still has considerable structure, which can be exploited. Modifications and improvements use the following ideas:

  • Insignificant coefficients are often clustered together; in particular, they often cluster around the same location across several scales. Since the distance between nearby coefficients doubles at every scale, the insignificant coefficients often form a tree shape, as we can see from Figure: Discrete Wavelet Transform of the Houston Skyline, using \psi_{D8'} with a gain of \sqrt{2} for each higher scale. These so-called zero-trees can be exploited [95], [100] to achieve excellent results.
  • The choice of basis is very important. Methods have been developed to choose the basis adaptively for the signal [86], [117]. Although they can be computationally intensive, substantial improvement can be realized.
  • Special run-length codes can be used to code the significant map and the values [107], [106].
    Figure 12: The Significant Map for the Lenna image.
    The Significant Map for the Lenna image.
  • Advanced quantization methods could be used to replace the simple scalar quantizer [59].
  • Methods based on statistical analysis, such as classification, modeling, estimation, and prediction, also produce impressive results [70].
  • Instead of using one fixed quantization step size, we can successively refine the quantization by using smaller and smaller step sizes. These embedded schemes allow both the encoder and the decoder to stop at any bit rate [95], [100].
  • The wavelet transform can be replaced by an integer-to-integer wavelet transform; then no quantization is necessary, and the compression is lossless [100].

Other references are: [40], [29], [37], [89], [53], [95], [91], [92], [109], [100], [101], [16], [55].

Why are Wavelets so Useful?

The basic wavelet in wavelet analysis can be chosen to be smooth, where smoothness is measured in a variety of ways [75]. To represent f(t) with K derivatives, one can choose a wavelet \psi(t) that is K (or more) times continuously differentiable; the penalty for imposing greater smoothness in this sense is that the supports of the basis functions, the filter lengths, and hence the computational complexity all increase. Moreover, smooth wavelet bases are also the "best bases" for representing signals with arbitrarily many singularities [37], a remarkable property.

The usefulness of wavelets in representing functions in these and several other classes stems from the fact that for most of these spaces the wavelet basis is an unconditional basis, which is a near-optimal property.

To complete this discussion, we motivate why an unconditional basis is asymptotically optimal for a particular problem, say data compression [37]. Figure 13 suggests why a basis in which the coefficient body is solid and orthosymmetric may be desired. The signal class is defined to be the interior of the rectangle bounded by the lines x = \pm a and y = \pm b. The signal corresponding to point A is the worst-case signal for the two bases shown in the figure; the residual error (with n = 1) is given by a \sin\theta + b \cos\theta for \theta \in [0, \alpha] and is minimized by \theta = 0, showing that the orthosymmetric basis is preferred. This result is really a consequence of the fact that a \gg b (which is typically the case, and why one uses transform coding; if a = b, it turns out that the "diagonal" basis with \theta = \pi/4 is optimal for n = 1). The closer the coefficient body is to a solid, orthosymmetric body with varying side lengths, the less the individual coefficients are correlated with each other and the greater the compression in this basis.

In summary, the wavelet bases have a number of useful properties:

  1. They can represent smooth functions.
  2. They can represent singularities.
  3. The basis functions are local. This makes most coefficient-based algorithms naturally adaptive to inhomogeneities in the function.
  4. They have the unconditional basis (or near-optimal in a minimax sense) property for a variety of function classes, implying that if one knows very little about a signal, the wavelet basis is usually a reasonable choice.
    Figure 13: Optimal Basis for Data Compression
  5. They are computationally inexpensive, perhaps one of the few really useful linear transforms with O(N) complexity, as compared to a Fourier transform, which is O(N log N), or an arbitrary linear transform, which is O(N^2).
  6. Nonlinear soft-thresholding is near optimal for statistical estimation.
  7. Nonlinear soft-thresholding is near optimal for signal recovery.
  8. Coefficient vector truncation is near optimal for data compression.
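Point 5 in the list above follows from a geometric series: each DWT stage filters only the current low-pass band, so the total number of samples touched is N + N/2 + N/4 + ... < 2N. A tiny counter confirms this (the filter length, a constant factor, is omitted).

```python
# Count the samples processed by a full DWT pyramid on a length-n signal.
def dwt_sample_count(n):
    ops = 0
    while n > 1:
        ops += n      # each stage processes only the current low-pass band
        n //= 2
    return ops

# Geometric series: N + N/2 + ... + 2 = 2N - 2, i.e. strictly O(N).
for N in (256, 1024, 4096):
    assert dwt_sample_count(N) == 2 * N - 2
```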


Listed below are several application areas in which wavelet methods have had some success.

Numerical Solutions to Partial Differential Equations

The use of wavelets as basis functions for the discretization of PDEs has had excellent success. They seem to give a generalization of finite-element methods with some characteristics of multigrid methods. It seems to be the localizing ability of wavelet expansions that gives rise to sparse operators and good numerical stability [52], [84], [87], [5], [11], [57], [6], [69], [13], [14].

Seismic and Geophysical Signal Processing

One of the exciting application areas of wavelet-based signal processing is seismic and geophysical signal processing. Applications of denoising, compression, and detection are all important here, especially with higher-dimensional signals and images. Some references can be found in [85], [118], [93], [42], [89], [67], [51], [46], [50], [49], [28], [12].

Medical and Biomedical Signal and Image Processing

Another exciting application of wavelet-based signal processing is in medical and biomedical signal and image processing. Again, applications of denoising, compression, and detection are all important here, especially with higher dimensional signals and images. Some of the references can be found in [2], [105], [58].

Application in Communications

Some applications of wavelet methods to communications problems are in [104], [66], [65], [116], [97].


Wavelet-based signal processing has been combined with fractals and with chaotic systems [41], [74], [115], [33], [1], [3], [114], [113]. The multiresolution formulation of the wavelet and the self-similar character of certain fractals make the wavelet a natural tool for this analysis. An application to noise removal from music is given in [4].

Other applications include the automatic target recognition (ATR) problem, among many others.

Wavelet Software

There are several software packages available to study, experiment with, and apply wavelet signal analysis. MathWorks, Inc. has a Wavelet Toolbox [80]; Donoho's group at Stanford has WaveLab; the Yale group has XWPL and WPLab [111]; Taswell at Stanford has WavBox [102]; a group in Spain has Uvi-Wave; MathSoft, Inc. has S+WAVELETS; Aware, Inc. has WaveTool; and the DSP group at Rice has a Matlab wavelet toolbox available over the Internet. There is a good description and list of several wavelet software packages in [8]. There are also several Matlab programs in Appendix C of this book; they were used to create the various examples and figures and should be studied along with the theory of each topic.


  1. It is interesting to note that allowing arbitrary \delta_i \in \mathbb{R} improves the ideal risk by at most a factor of 2 [30].
  2. Since we deal with finite length signals, we really mean circular shift.
  3. Those are the ones we are aware of.
  4. However, it should be noted that Mallat published his algorithm earlier.
  5. A similar remark can be found in [88], p. 53.


  1. Argoul, F. and Arneodo, A. and Elezgaray, J. and Grasseau, G. (1989, March). Wavelet Transform of Fractal Aggregates. Physics Letters A., 135, 327-336.
  2. (1996). Wavelets in Medicine and Biology. Boca Raton: CRC Press.
  3. Auscher, P. (1989). Ondelettes fractales et applications. Ph. D. Thesis.
  4. Berger, Jonathan and Coifman, Ronald R. and Goldberg, Maxim J. (1994, October). Removing Noise from Music Using Local Trigonometric Bases and Wavelet Packets. Journal of the Audio Engineering Society, 42(10), 808–817.
  5. Beylkin, G. and Coifman, R. R. and Rokhlin, V. (1991). Fast Wavelet Transforms and Numerical Algorithms I. Communications on Pure and Applied Mathematics, 44, 141–183.
  6. Beylkin, G. and Coifman, R. R. and Rokhlin, V. (1992). Wavelets in Numerical Analysis. In Wavelets and Their Applications. (p. 181–210). [Outgrowth of the NSF/CBMS Conference on Wavelets, Lowell, June 1990]. Boston: Jones and Bartlett.
  7. Bell, T. C. and Cleary, J. G. and Witten, I. H. (1990). Text Compression. N.J.: Prentice Hall.
  8. Brice, Andrew and Donoho, David and Gao, Hong-Ye. (1996, October). Wavelet Analysis. IEEE Spectrum, 33(10), 26–35.
  9. Bruce, A. G. and Donoho, D. L. and Gao, H.-Y. and Martin, R. D. (1994, April). Denoising and Robust Nonlinear Wavelet Analysis. In Proceedings of Conference on Wavelet Applications. (Vol. 2242, p. 325–336). SPIE. Orlando, FL
  10. Beylkin, G. (1992, December). On the Representation of Operators in Bases of Compactly Supported Wavelets. SIAM Journal on Numerical Analysis, 29(6), 1716–1740.
  11. Beylkin, G. (1994). On Wavelet-Based Algorithms for Solving Differential Equations. In Wavelets: Mathematics and Applications. (p. 449–466). Boca Raton: CRC Press.
  12. Beylkin, Gregory. (1997). An Adaptive Pseudo-Wavelet Approach for Solving Nonlinear Partial Differential Equations. In Multiscale Wavelet Methods for Partial Differential Equations. [Volume 6 in the series: Wavelet Analysis and Applications]. San Diego: Academic Press.
  13. Beylkin, Gregory and Keiser, James M. (1997). On the Adaptive Numerical Solution of Nonlinear Partial Differential Equations in Wavelet Bases. Journal of Computational Physics, 132, 233–259.
  14. Beylkin, G. and Keiser, J. M. and Vozovoi, L. (1993). A New Class of Stable Time Discretization Schemes for the Solution of Nonlinear PDE's. Technical report. Boulder, CO: Applied Mathematics Program, University of Colorado.
  15. Burrus, C. S. and Parks, T. W. (1985). DFT/FFT and Convolution Algorithms. New York: John Wiley & Sons.
  16. Burrows, Michael and Wheeler, David J. (1994). A Block-Sorting Lossless Data Compression Algorithm. (124). Technical report. Palo Alto: Digital Systems Research Center.
  17. Chen, Shaobing and Donoho, David L. (1994, November). Basis Pursuit. In Proceedings of the 28th Asilomar Conference on Signals, Systems, and Computers. (p. 41–44). [Also Stanford Statistics Dept. Report, 1994]. Pacific Grove, CA
  18. Coifman, R. R. and Donoho, D. L. (1995). Translation-Invariant De-Noising. In Wavelets and Statistics. (p. 125–150). [Springer Lecture Notes in Statistics]. Springer-Verlag.
  19. Chen, Shaobing and Donoho, David L. (1995, May). Atomic Decomposition by Basis Pursuit. (479). [preprint]. Technical report. Stanford: Statistics Department.
  20. Coifman, Ronald R. and Majid, Fazal. (1994). Adapted Waveform Analysis and Denoising. Technical report. New Haven: Yale University.
  21. Coifman, R. R. and Meyer, Y. and Quake, S. and Wickerhauser, M. V. (1992). Signal Processing and Compression with Wave Packets. In Proceedings of the International Conference on Wavelets, 1989 Marseille. Paris: Masson.
  22. Coifman, R. R. and Meyer, Y. and Wickerhauser, M. V. (1992). Wavelet Analysis and Signal Processing. In Wavelets and Their Applications. Boston: Jones and Bartlett.
  23. Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory. N.Y.: John Wiley & Sons.
  24. Coifman, Ronald R. and Wickerhauser, M. V. (1990). Best-Adapted Wave Packet Bases. Technical report. New Haven: Math Dept., Yale University.
  25. Coifman, R. R. and Wickerhauser, M. V. (1992, March). Entropy-Based Algorithms for Best Basis Selection. IEEE Transaction on Information Theory, 38(2), 713–718.
  26. Daubechies, Ingrid. (1988, November). Orthonormal Bases of Compactly Supported Wavelets. Communications on Pure and Applied Mathematics, 41, 909–996.
  27. Daubechies, Ingrid. (1992). Ten Lectures on Wavelets. [Notes from the 1990 CBMS-NSF Conference on Wavelets and Applications at Lowell, MA]. Philadelphia, PA: SIAM.
  28. (1997). Multiscale Wavelet Methods for Partial Differential Equations. [Volume 6 in the series: Wavelet Analysis and its Applications]. San Diego: Academic Press.
  29. Donoho, David L. and Johnstone, Iain M. (1994). Ideal Spatial Adaptation via Wavelet Shrinkage. [Also Stanford Statistics Dept. Report TR-400, July 1992]. Biometrika, 81, 425–455.
  30. Donoho, David L. and Johnstone, Iain M. (to appear 1994). Ideal Denoising in an Orthonormal Basis Chosen from a Library of Bases. [Also Stanford Statistics Dept. Report 461, Sept. 1994]. C. R. Acad. Sci. Paris, Ser. I, 319,
  31. Donoho, David L. and Johnstone, Iain M. (to appear 1995). Adapting to Unknown Smoothness via Wavelet Shrinkage. [Also Stanford Statistics Dept. Report TR-425, June 1993]. Journal of American Statist. Assn..
  32. Donoho, David L. and Johnstone, Iain M. and Kerkyacharian, Gérard and Picard, Dominique. (1995). Wavelet Shrinkage: Asymptopia? [Also Stanford Statistics Dept. Report TR-419, March 1993]. Journal Royal Statistical Society B., 57(2), 301–337.
  33. Daubechies, Ingrid and Lagarias, Jeffrey C. (1992, July). Two-Scale Difference Equations, Part II. Local Regularity, Infinite Products of Matrices and Fractals. [From an internal report, AT&T Bell Labs, Sept. 1988]. SIAM Journal of Mathematical Analysis, 23, 1031–1079.
  34. DeVore, R. and Lorentz, G. (1993). Constructive Approximation. Springer-Verlag.
  35. Donoho, David L. (to appear). Interpolating Wavelet Transforms. [Also Stanford Statistics Dept. report TR-408, Nov. 1992]. Applied and Computational Harmonic Analysis.
  36. Donoho, David L. (1993). Nonlinear Wavelet Methods for Recovery of Signals, Densities, and Spectra from Indirect and Noisy Data. In Different Perspectives on Wavelets, I. (p. 173–205). [Proceedings of Symposia in Applied Mathematics and Stanford Report 437, July 1993]. American Mathematical Society. Providence
  37. Donoho, David L. (1993, December). Unconditional Bases are Optimal Bases for Data Compression and for Statistical Estimation. [Also Stanford Statistics Dept. Report TR-410, Nov. 1992]. Applied and Computational Harmonic Analysis, 1(1), 100–115.
  38. Donoho, David L. (1993, January). Wavelet Skrinkage and W. V. D. – A Ten Minute Tour. (TR-416). [Preprint]. Technical report. Stanford, CA: Statistics Department, Stanford University.
  39. Donoho, David L. (1994). On Minimum Entropy Segmentation. In Wavelets: Theory, Algorithms, and Applications. [Also Stanford Tech Report TR-450, 1994; Volume 5 in the series: Wavelet Analysis and its Applications]. San Diego: Academic Press.
  40. Donoho, David L. (1995, May). De-Noising by Soft-Thresholding. [also Stanford Statistics Dept. report TR-409, Nov. 1992]. IEEE Transactions on Information Theory, 41(3), 613–627.
  41. (1993). Wavelets, Fractals, and Fourier Transforms. [Proceedings of a conference on Wavelets at Newnham College, Cambridge, Dec. 1990.]. Oxford: Clarendon Press.
  42. (1994). Wavelets in Geophysics. [Volume 4 in the series: Wavelet Analysis and its Applications]. San Diego: Academic Press.
  43. Guo, Haitao and Burrus, C. Sidney. (2000, to be submitted). Fast Approximate Fourier Transform via Wavelet Transforms. IEEE Transactions.
  44. Guo, Haitao and Burrus, C. Sidney. (1996, August 6–9). Approximate FFT via the Discrete Wavelet Transform. In Proceedings of SPIE Conference 2825. Denver
  45. Guo, Haitao and Burrus, C. Sidney. (1996, May 7–10). Convolution using the Discrete Wavelet Transform. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. (Vol. 3, pp. III-1291–1294). IEEE ICASSP-96, Atlanta
  46. Guo, Haitao and Burrus, C. Sidney. (1996, April 4). Phase-Preserving Compression of Seismic Images using the Self-Adjusting Wavelet Transform. In NASA Combined Industry, Space and Earth Science Data Compression Workshop (in conjunction with the IEEE Data Compression Conference, DCC-96), JPL Pub. 96-11. (p. 101–109). Snowbird, Utah
  47. Guo, Haitao and Burrus, C. Sidney. (1997, October 26-29). Waveform and Image Compression with the Burrows Wheeler Transform and the Wavelet Transform. In Proceedings of the IEEE International Conference on Image Processing. (Vol. I, p. I:65–68). IEEE ICIP-97, Santa Barbara
  48. Guo, Haitao and Burrus, C. Sidney. (1997, April 21–24). Wavelet Transform Based Fast Approximate Fourier Transform. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. (Vol. 3, p. III:1973–1976). IEEE ICASSP-97, Munich
  49. Goupillaud, P. and Grossman, A. and Morlet, J. (1984). Cyclo-octave and related transforms in seismic signal analysis. SIAM J. Math. Anal., 15, 723-736.
  50. Goupillaud, P. and Grossman, A. and Morlet, J. (1984). Cyclo-octave and related transforms in seismic signal analysis. Geoexploration, (23),
  51. Ginette, S. and Grossmann, A. and Tchamitchian, Ph. (1989). Use of Wavelet Transforms in the Study of Propagation of Transient Acoustic Signals Across a Plane Interface Between Two Homogeneous Media. In Wavelets: Time-Frequency Methods and Phase Space. (p. 139–146). [Proceedings of the International Conference, Marseille, Dec. 1987]. Berlin: Springer-Verlag.
  52. Glowinski, R. and Lawton, W. and Ravachol, M. and Tenenbaum, E. (1990). Wavelet Solution of Linear and Nonlinear Elliptic, Parabolic and Hyperbolic Problems in One Dimension. In Proceedings of the Ninth SIAM International Conference on Computing Methods in Applied Sciences and Engineering. Philadelphia
  53. Guo, H. and Odegard, J. E. and Lang, M. and Gopinath, R. A. and Selesnick, I. W. and Burrus, C. S. (1994, November 13-16). Wavelet Based Speckle Reduction with Application to SAR Based ATD/R. In Proceedings of the IEEE International Conference on Image Processing. (Vol. I, p. I:75–79). IEEE ICIP-94, Austin, Texas
  54. Guo, Haitao. (1995, April). Theory and Applications of the Shift-Invariant, Time-Varying and Undecimated Wavelet Transform. Masters thesis. ECE Department, Rice University.
  55. Guo, Haitao. (1997, May). Wavelets for Approximate Fourier Transform and Data Compression. Ph. D. Thesis. ECE Department, Rice University, Houston, Tx.
  56. Hossen, A. N. and Heute, U. and Shentov, O. V. and Mitra, S. K. (1995). Subband DFT – Part II: Accuracy, Complexity, and Applications. Signal Processing, 41, 279–295.
  57. Heurtaux, Frédéric and Planchon, Fabrice and Wickerhauser, Mladen V. (1994). Scale Decomposition in Burgers' Equation. In Wavelets: Mathematics and Applications. (p. 505–524). Boca Raton: CRC Press.
  58. Ivanov, Plamen Ch. and Rosenblum, Michael G and Peng, C.-K. and Mietus, Joseph and Havlin, Shlomo and Stanley, H. Eugene and Goldberger, Ary L. (1996, September 26). Scaling Behaviour of Heartbeat Intervals Obtained by Wavelet-Based Time-Series Analysis. Nature, 383, 323–327.
  59. Joshi, R. L. and Crump, V. J. and Fischer, T. R. (1995, December). Image Subband Coding Using Arithmetic Coded Trellis Coded Quantization. IEEE Transactions on Circuits and Systems, 515–523.
  60. Jayant, N. S. and Noll, P. (1984). Digital Coding of Waveforms. (1st). Englewood Cliffs, NJ: Prentice-Hall, Inc.
  61. Johnstone, I. M. and Silverman, B. W. (1994, September). Wavelet Threshold Estimators for Data with Correlated Noise. Technical report. Statistics Dept., University of Bristol.
  62. Lang, M. and Guo, H. and Odegard, J. E. and Burrus, C. S. and Wells, Jr., R. O. (1995, April 17–21). Nonlinear Processing of a Shift-Invariant DWT for Noise Reduction. In Proceedings of SPIE Conference 2491, Wavelet Applications II. (Vol. 2491–60, p. 640–651). Orlando
  63. Lang, M. and Guo, H. and Odegard, J. E. and Burrus, C. S. and Wells, Jr., R. O. (1996, January). Noise Reduction Using an Undecimated Discrete Wavelet Transform. IEEE Signal Processing Letters, 3(1), 10–12.
  64. Lang, M. and Guo, H. and Odegard, J. E. and Burrus, C. S. (1995, June 20–22). Nonlinear Redundant Wavelet Methods for Image Enhancement. In Proceedings of IEEE Workshop on Nonlinear Signal and Image Processing. (p. 754–757). Halkidiki, Greece
  65. Lindsey, A. R. (1995, June). Generalized Orthogonally Multiplexed Communication via Wavelet Packet Bases. Ph. D. Thesis.
  66. Learned, R. E. and Krim, H. and Claus, B. and Willsky, A. S. and Karl, W. C. (1994, July). Wavelet-Packet-Based Multiple Access Communications. In Proceedings of SPIE Conference, Wavelet Applications in Signal and Image Processing, Vol. 2303. (p. 246–259). San Diego
  67. Larsonneur, J. L. and Morlet, J. (1989). Wavelet and Seismic Interpretation. In Wavelets: Time-Frequency Methods and Phase Space. (p. 126–131). [Proceedings of the International Conference, Marseille, Dec. 1987]. Berlin: Springer-Verlag.
  68. Liang, Jie and Parks, Thomas W. (1994, November). A Two-Dimensional Translation Invariant Wavelet Representation and its Applications. In Proceedings of the IEEE International Conference on Image Processing. (Vol. 1, p. I:66–70). Austin
  69. Liandrat, J. and Perrier, V. and Tchamitchian, Ph. (1992). Numerical Resolution of Nonlinear Partial Differential Equations using the Wavelet Approach. In Wavelets and Their Applications. (p. 181–210). [Outgrowth of the NSF/CBMS Conference on Wavelets, Lowell, June 1990]. Boston: Jones and Bartlett.
  70. LoPresto, S. M. and Ramchandran, K. and Orchard, M. T. (1997, March). Image Coding based on Mixture Modeling of Wavelet Coefficients and a Fast Estimation-Quantization Framework. Proc. DCC.
  71. Mallat, S. G. (1989, July). A Theory for Multiresolution Signal Decomposition: The Wavelet Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7), 674–693.
  72. Mallat, S. G. (1989). Multiresolution Approximations and Wavelet Orthonormal Bases of L²(R). Transactions of the American Mathematical Society, 315, 69–87.
  73. Mallat, S. G. (1991, July). Zero-Crossings of a Wavelet Transform. IEEE Transactions on Information Theory, 37(4), 1019–1033.
  74. Massopust, Peter R. (1994). Fractal Functions, Fractal Surfaces, and Wavelets. San Diego: Academic Press.
  75. Meyer, Y. (1987, September). L'analyse par ondelettes. Pour la Science.
  76. Meyer, Y. (1990). Ondelettes et opérateurs. Paris: Hermann.
  77. Meyer, Yves. (1992). Wavelets and Operators. [Translated by D. H. Salinger from the 1990 French edition.]. Cambridge: Cambridge University Press.
  78. Meyer, Yves. (1993). Wavelets, Algorithms and Applications. [Translated by R. D. Ryan based on lectures given for the Spanish Institute in Madrid in Feb. 1991]. Philadelphia: SIAM.
  79. Mallat, Stéphane and Falzon, Frédéric. (1997, April). Understanding Image Transform Codes. In Proceedings of SPIE Conference, Aerosense. Orlando
  80. Misiti, Michel and Misiti, Yves and Oppenheim, Georges and Poggi, Jean-Michel. (1996). Wavelet Toolbox User's Guide. Natick, MA: The MathWorks, Inc.
  81. Moulin, Pierre. (1993). A Wavelet Regularization Method for Diffuse Radar-Target Imaging and Speckle-Noise Reduction. Journal of Mathematical Imaging and Vision, 3, 123–134.
  82. Mallat, S. G. and Zhang, Z. (1993, December). Matching Pursuits with Time-Frequency Dictionaries. IEEE Transactions on Signal Processing, 41(12), 3397–3415.
  83. Odegard, J. E. (1996, May). Moments, smoothness and optimization of wavelet systems. Ph. D. Thesis. Rice University, Houston, TX 77251, USA.
  84. Robertsson, J. O. A. and Blanch, J. O. and Symes, W. W. and Burrus, C. S. (1994, January). Galerkin–Wavelet Modeling of Wave Propagation: Optimal Finite–Difference Stencil Design. Mathematical and Computer Modelling, 19(1), 31–38.
  85. Robinson, E. A. and Durrani, T. S. (1986). Geophysical Signal Processing. Englewood Cliffs, NJ: Prentice Hall.
  86. Ramchandran, K. and Vetterli, M. (1993). Best Wavelet Packet Bases in a Rate-Distortion Sense. IEEE Transactions on Image Processing, 2(2), 160–175.
  87. Resnikoff, H. L. and Wells, Jr., R. O. (1998). Wavelet Analysis: The Scalable Structure of Information. New York: Springer-Verlag.
  88. Saito, Naoki. (1994). Local Feature Extraction and Its Applications Using a Library of Bases. Ph. D. Thesis. Yale University, New Haven, CT.
  89. Saito, Naoki. (1994). Simultaneous Noise Suppression and Signal Compression using a Library of Orthonormal Bases and the Minimum Description Length Criterion. In Wavelets in Geophysics. San Diego: Academic Press.
  90. Sorensen, H. V. and Burrus, C. S. (1993, March). Efficient Computation of the DFT with Only a Subset of Input or Output Points. IEEE Transactions on Signal Processing, 41(3), 1184–1200.
  91. (1995, March). Proceedings of Data Compression Conference. Snowbird, Utah: IEEE Computer Society Press.
  92. (1996, April). Proceedings of Data Compression Conference. Snowbird, Utah: IEEE Computer Society Press.
  93. Scales, John A. (1994). Theory of Seismic Imaging. Golden, CO: Samizdat Press.
  94. Selesnick, I. W. (1996). New Techniques for Digital Filter Design. Ph. D. Thesis. Rice University.
  95. Shapiro, J. M. (1993, December). Embedded Image Coding Using Zerotrees of Wavelet Coefficients. IEEE Transactions on Signal Processing, 41(12), 3445–3462.
  96. Shensa, M. J. (1992, October). The Discrete Wavelet Transform: Wedding the à trous and Mallat Algorithms. IEEE Transactions on Signal Processing, 40(10), 2464–2482.
  97. Sablatash, M. and Lodge, J. H. (1996, June). The Design of Filter Banks with Specified Minimum Stopband Attenuation for Wavelet Packet-Based Multiple Access Communications. In Proceedings of 18th Biennial Symposium on Communications, Queen's University. Kingston, ON, Canada
  98. Shentov, O. V. and Mitra, S. K. and Heute, U. and Hossen, A. N. (1995). Subband DFT – Part I: Definition, Interpretation and Extensions. Signal Processing, 41, 261–278.
  99. Strang, Gilbert and Nguyen, T. (1996). Wavelets and Filter Banks. Wellesley, MA: Wellesley–Cambridge Press.
  100. Said, A. and Pearlman, W. A. (1996, June). A New, Fast, and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees. IEEE Transactions on Circuits and Systems for Video Technology, 6(3), 243–250.
  101. Said, A. and Pearlman, W. A. (1996, September). An Image Multiresolution Representation for Lossless and Lossy Image Compression. IEEE Transactions on Image Processing, 5, 1303–1310.
  102. Taswell, Carl. (1996). Handbook of Wavelet Transform Algorithms. Boston: Birkhäuser.
  103. Tao, Hai and Moorhead, R. J. (1994, November). Lossless Progressive Transmission of Scientific Data Using Biorthogonal Wavelet Transform. In Proceedings of the IEEE Conference on Image Processing. Austin, ICIP-94
  104. Tao, Hai and Moorhead, R. J. (1994, October). Progressive Transmission of Scientific Data Using Biorthogonal Wavelet Transform. In Proceedings of the IEEE Conference on Visualization. Washington
  105. Turner, M. (1986). Texture Discrimination by Gabor Functions. Biological Cybernetics, 55, 71–82.
  106. Tsai, M. J. and Villasenor, J. D. and Chen, F. (1996, October). Stack-Run Image Coding. IEEE Transactions on Circuits and Systems for Video Technology, 519–521.
  107. Tian, J. and Wells, R. O. (1996, April). Image Compression by Reduction of Indices of Wavelet Transform Coefficients. Proc. DCC.
  108. Vaidyanathan, P. P. (1992). Multirate Systems and Filter Banks. Englewood Cliffs, NJ: Prentice-Hall.
  109. Villasenor, J. D. and Belzer, B. and Liao, J. (1995, August). Wavelet Filter Evaluation for Image Compression. IEEE Transactions on Image Processing, 4.
  110. Vetterli, Martin and Kovačević, Jelena. (1995). Wavelets and Subband Coding. Upper Saddle River, NJ: Prentice–Hall.
  111. Wickerhauser, Mladen Victor. (1995). Adapted Wavelet Analysis from Theory to Software. Wellesley, MA: A K Peters.
  112. Witten, I. and Neal, R. and Cleary, J. (1987, June). Arithmetic coding for data compression. Communications of the ACM, 30, 520–540.
  113. Wornell, G. W. and Oppenheim, A. V. (1992, March). Wavelet-Based Representations for a Class of Self-Similar Signals with Application to Fractal Modulation. IEEE Transactions on Information Theory, 38(2), 785–800.
  114. Wornell, G. and Oppenheim, A. V. (1992, March). Estimation of Fractal Signals from Noisy Measurements Using Wavelets. IEEE Transactions on Signal Processing, 40(3), 611–623.
  115. Wornell, Gregory W. (1996). Signal Processing with Fractals: A Wavelet-Based Approach. Upper Saddle River, NJ: Prentice Hall.
  116. Wu, J. and Wong, K. M. and Jin, Q. (1995, April). Multiplexing Based on Wavelet Packets. In Proceedings of SPIE Conference, Aerosense. Orlando
  117. Xiong, Z. and Herley, C. and Ramchandran, K. and Orchard, M. T. (1995, October). Space-Frequency Quantization for a Space-Varying Wavelet Packet Image Coder. Proc. Int. Conf. Image Processing, 1, 614–617.
  118. Yilmaz, Özdoğan. (1987). Seismic Data Processing. [Stephen M. Doherty editor]. Tulsa: Society of Exploration Geophysicists.
