Note: Your browser may not currently support MathML. See our browser support page for additional details. You can always view the correct math in the PDF version.
A major application of the FFT is fast convolution or fast filtering where the DFT of the signal is multiplied term-by-term by the DFT of the impulse (helps to be doing finite impulse response (FIR) filtering) and the time-domain output is obtained by taking the inverse DFT of that product. What is less well-known is the DFT can be calculated by convolution. There are several different approaches to this, each with different application.
In this section a method quite different from the index mapping or polynomial evaluation is developed. Rather than dealing with the DFT directly, it is converted into a cyclic convolution which must then be carried out by some efficient means. Those means will be covered later, but here the conversion will be explained. This method requires use of some number theory, which can be found in an accessible form in [12] or [13] and is easy enough to verify on one's own. A good general reference on number theory is [14].
The DFT and cyclic convolution are defined by
For both, the indices are evaluated modulo
creates a unique, one-to-one map of
the
| r | m= | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 2 | 1 | 2 | 4 | 3 | 1 | 2 | 4 | 3 | |
| 3 | 1 | 3 | 4 | 2 | 1 | 3 | 4 | 2 | |
| 4 | 1 | 4 | 1 | 4 | 1 | 4 | 1 | 4 | |
| 5 | * | 0 | 0 | 0 | * | 0 | 0 | 0 | |
| 6 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 1 is an array of values of
and
where the term with the negative exponent (the inverse) is defined as the integer that satisfies
If
for
New functions are defined, which are simply a permutation in the order of the original functions, as
Equation Equation 7 then becomes
which is cyclic convolution of length N-1 (plus
Applying this change of variables (use of logarithms) to the DFT can best be illustrated from the matrix formulation of the DFT. Equation Equation 1 is written for a length-5 DFT as
where the square matrix should contain the terms of
and
which can be seen to be a reordering of the structure in Equation 12. This is in the form of cyclic convolution as indicated in Equation 10. Rader first showed this in 1968 [12], stating that a prime length-N DFT could be converted into a length-(N-1) cyclic convolution of a permutation of the data with a permutation of the W's. He also stated that a slightly more complicated version of the same idea would work for a DFT with a length equal to an odd prime to a power. The details of that theory can be found in [12], [10].
Until 1976, this conversion approach received little attention since it seemed to offer few advantages. It has specialized applications in calculating the DFT if the cyclic convolution is done by distributed arithmetic table look-up [5] or by use of number theoretic transforms [1], [12], [13]. It and the Goertzel algorithm [16], [3] are efficient when only a few DFT values need to be calculated. It may also have advantages when used with pipelined or vector hardware designed for fast inner products. One example is the TMS320 signal processing microprocessor which is pipelined for inner products. The general use of this scheme emerged when new fast cyclic convolution algorithms were developed by Winograd [21].
The DFT of
The mathematical identity
which substituted into the definition of the DFT in Multidimensional Index Mapping: Equation 1 gives
This equation can be interpreted as first multiplying (modulating) the data
Define the chirp sequence or signal as
We know that convolution can be carried out by multiplying the DFTs of the signals,
here we see that evaluation of the DFT can be carried out by convolution. Indeed,
the convolution represented by
As developed here, the chirp z-transform evaluates the z-transform at equally spaced points on the unit circle. A slight modification allows evaluation on a spiral and in segments [19], [16] and allows savings with only some input values are nonzero or when only some output values are needed. The story of the development of this transform is given in [18].
Two Matlab programs to calculate an arbitrary length DFT using the chirp z-transform is shown in Code 1.
|
Goertzel's algorithm [7], [3], [15] is another methods that calculates the DFT by converting it into a digital filtering problem. The method looks at the calculation of the DFT as the evaluation of a polynomial on the unit circle in the complex plane. This evaluation is done by Horner's method which is implemented recursively by an IIR filter.
The polynomial whose values on the unit circle are the DFT is a slightly modified z-transform of x(n) given by
which for clarity in this development uses a positive exponent . This is illustrated for a length-4 sequence as a third-order polynomial by
The DFT is found by evaluating Equation 18 at
where
The most efficient way of evaluating a general polynomial without any pre-processing is by “Horner's rule" [11] which is a nested evaluation. This is illustrated for the polynomial in Equation 19 by
This nested sequence of operations can be written as a linear difference equation in the form of
with initial condition
Equation Equation 23 can be viewed as a first-order IIR filter with the
input being the data sequence in reverse order and the value of the
polynomial at
with
where
The flowgraph of the algorithm can be found in [3], [15] and a simple FORTRAN program is given in the appendix.
When comparing this program with the direct calculation of Equation 27, it
is seen that the number of floating-point multiplications and additions
are the same. In fact, the structures of the two algorithms look similar,
but close examination shows that the way the sines and cosines enter the
calculations is different. In Equation 27, new sine and cosine values are
calculated for each frequency and for each data value, while for the Goertzel algorithm in Equation 25, they are calculated only for each
frequency in the outer loop. Because of the recursive or feedback nature
of the algorithm, the sine and cosine values are “updated" each loop
rather than recalculated. This results in
It is possible to modify this algorithm to allow entering the data in forward order rather than reverse order. The difference equation Equation 23 becomes
if Equation 24 becomes
for
One of the reasons the first-order Goertzel algorithm does not improve efficiency is that the constant in the feedback or recursive path is complex and, therefore, requires four real multiplications and two real additions. A modification of the scheme to make it second-order removes the complex multiplications and reduces the number of required multiplications by two.
Define the variable
This substituted into the right-hand side of Equation 23 gives
Combining Equation 30 and Equation 31 gives the second order difference equation
which together with the output equation Equation 30, comprise the second-order Goertzel algorithm where
for initial conditions
A similar development starting with Equation 28 gives a second-order algorithm with forward ordered input as
with
and for
Note that both difference equations Equation 32 and Equation 34 are not changed
if
and
Clearly, this allows the DFT of a sequence to be calculated with half
the arithmetic since the outputs are calculated two at a time. The
second-order DE actually produces a solution
Analysis of the various forms of the Goertzel algorithm from their programs gives the following operation count for real multiplications and real additions assuming real data.
| Algorithm | Real Mults. | Real Adds | Trig Eval. |
| Direct DFT |
|
|
|
| First-Order |
|
|
|
| Second-Order |
|
|
|
| Second-Order 2 |
|
|
|
Timings of the algorithms on a PC in milliseconds are given in the following table.
| Algorithm |
|
|
| Direct DFT | 4.90 | 19.83 |
| First-Order | 4.01 | 16.70 |
| Second-Order | 2.64 | 11.04 |
| Second-Order 2 | 1.32 | 5.55 |
These timings track the floating point operation counts fairly well.
Goertzel's algorithm in its first-order form is not particularly interesting, but the two-at-a-time second-order form is significantly faster than a direct DFT. It can also be used for any polynomial evaluation or for the DTFT at unequally spaced values or for evaluating a few DFT terms. A very interesting observation is that the inner-most loop of the Glassman-Ferguson FFT [6] is a first-order Goertzel algorithm even though that FFT is developed in a very different framework.
In addition to floating-point arithmetic counts, the number of trigonometric
function evaluations that must be made or the size of a table to store
precomputed values should be considered. Since the value of the
It may be possible to further improve the efficiency of the second-order
Goertzel algorithm for calculating all of the DFT of a number sequence.
Perhaps a fourth order DE could calculate four output values at a time and
they could be separated by a numerator that would cancel three of the
zeros. Perhaps the algorithm could be arranged in stages to give an
One stage of the QFT can use the symmetries of the sines and cosines to
calculate a DFT more efficiently than directly implementing the definition
Multidimensional Index Mapping: Equation 1. Similar to the Goertzel algorithm, the one-stage QFT is a
better
"The Fast Fourier Transform (FFT) is a landmark algorithm used in fields ranging from signal processing to high-performance computing. First popularized by two American scientists in 1965, the […]"