Inside Collection (Course): Digital Signal Processing: A User's Guide
Summary: FFT structures with different memory layout and interconnection patterns are occasionally useful for certain applications. These alternate structures include decimation-in-frequency FFTs with in-order outputs, decimation-in-time FFTs with in-order inputs, structures with both in-order inputs and outputs, and constant-geometry structures with the same connection pattern in every stage.
Bit-reversing the input in decimation-in-time (DIT) FFTs or the output in decimation-in-frequency (DIF) FFTs can sometimes be inconvenient or inefficient. For such situations, alternate FFT structures have been developed. Such structures involve the same mathematical computations as the standard algorithms, but alter the memory locations in which intermediate values are stored or the order of computation of the FFT butterflies.
The structure in Figure 1 computes a decimation-in-frequency FFT, but remaps the memory usage so that the input is bit-reversed, and the output is in-order as in the conventional decimation-in-time FFT. This alternate structure is still considered a DIF FFT because the twiddle factors are applied as in the DIF FFT. This structure is useful if for some reason the DIF butterfly is preferred but it is easier to bit-reverse the input.
There is a similar structure for the decimation-in-time FFT with in-order inputs and bit-reversed frequencies. This structure can be useful for fast convolution on machines that favor decimation-in-time algorithms because the filter can be stored in bit-reverse order, and then the inverse FFT returns an in-order result without ever bit-reversing any data. As discussed in Efficient FFT Programming Tricks, this may save several percent of the execution time.
The structure in Figure 2 implements a decimation-in-frequency FFT that has both input and output in order. It thus avoids the need for bit-reversing altogether. Unfortunately, it destroys the in-place structure somewhat, making an FFT program more complicated and requiring more memory; on most machines the resulting cost exceeds the benefits. This structure can be computed in place if two butterflies are computed simultaneously.
The structure in Figure 3 has a constant geometry; the connections between memory locations are identical in each FFT stage. Since it is not in-place and requires bit-reversal, it is inconvenient for software implementation, but can be attractive for a highly parallel hardware implementation because the connections between stages can be hardwired. An analogous structure exists that has bit-reversed inputs and in-order outputs.