Summary: This collection of modules is from a Rice University, ECE Department Technical Report written around September 1994. It grew out of the doctoral and post doctoral research of Ivan Selesnick working with Prof. C. Sidney Burrus at Rice. Earlier reports on this work were published in the ICASSP and ISCAS conference proceedings in 1992-94 and a fairly complete report was published in the IEEE Transaction on Signal Processing in January 1996.
Note: Your browser may not currently support MathML. See our browser support page for additional details. You can always view the correct math in the PDF version.
The development of algorithms for the fast computation of the Discrete Fourier Transform in the last 30 years originated with the radix 2 Cooley-Tukey FFT and the theory and variety of FFTs has grown significantly since then. Most of the work has focused on FFTs whose sizes are composite, for the algorithms depend on the ability to factor the length of the data sequence, so that the transform can be found by taking the transform of smaller lengths. For this reason, algorithms for prime length transforms are building blocks for many composite length FFTs - the maximum length and the variety of lengths of a PFA or WFTA algorithm depend upon the availability of prime length FFT modules. As such, prime length Fast Fourier Transforms are a special, important and difficult case.
Fast algorithms designed for specific short prime lengths have been developed and
have been written as straight line code [3], [2].
These dedicated programs rely upon an observation made in Rader's paper [6] in which
he shows that a prime
In this paper we describe a set of programs
for circular convolution and prime length FFTs
that are are short, possess great structure,
share many computational procedures,
and cover a large variety of lengths.
Because the underlying convolution is decomposed into a
set of disjoint operations
they can be performed in parallel
and this parallelism is made clear in the programs.
Moreover, each of these independent operations is made up of a sequence
of sub-operations
of the form
We have also developed a program that automatically generates these programs for circular convolution and prime length DFTs. This code generating program requires information only about a set of modules for computing cyclotomic convolutions. We compute these non-circular convolutions by computing a linear convolution and reducing the result. Furthermore, because these linear convolution algorithms can be built from smaller ones, the only modules needed are ones for the linear convolution of prime length sequences. It turns out that with linear convolution algorithms for only the lengths 2 and 3, we can generate a wide variety of prime length FFT algorithms. In addition, the code we generate is made up of calls to a relatively small set of functions. Accordingly, the subroutines can be designed and optimized to specifically suit a given architecture.
The programs we describe use Rader's conversion of a prime point DFT into a circular convolution, but this convolution we compute using the split nesting algorithm [5]. As Stasinski notes [11], this yields algorithms possessing greater structure and simpler programs and doesn't generally require more computation.
In computing the DFT of an
Earlier reports on this work were published in the conference proceedings [7], [8], [9] and a fairly complete report was published in the IEEE Transaction on Signal Processing [10]. Some parts of this approach appear in the Connexions book, Fast Fourier Transforms. This work is built on and an extension of that in [9] which is also in the Connexions Technical Report.