FFTs of length
N=2M
N
2
M
equal to a power of two are, by far, the most commonly used.
These algorithms are very efficient, relatively simple, and a single program can compute
power-of-two FFTs of different lengths.
As with most FFT algorithms, they gain their efficiency by computing
all DFT points
simultaneously through extensive reuse of intermediate computations; they are thus efficient
when many DFT frequency samples are needed.
The simplest power-of-two FFTs are the decimation-in-time radix-2 FFT
and the decimation-in-frequency radix-2 FFT; they reduce the
length-
N=2M
N
2
M
DFT to a series of length-2 DFT computations with twiddle-factor complex multiplications
between them.
The radix-4 FFT algorithm
similarly reduces a length-
N=4M
N
4
M
DFT to a series of length-4 DFT computations with twiddle-factor multiplies
in between.
Radix-4 FFTs require only 75% as many complex multiplications as the radix-2
algorithms, although the number of complex additions remains the same.
Radix-8 and higher-radix FFT algorithms can be derived using
multi-dimensional index maps to reduce the
computational complexity a bit more.
However, the split-radix algorithm and its recent extensions combine
the best elements of the radix-2 and radix-4 algorithms to obtain lower
complexity than either or than any higher radix, requiring only two-thirds
as many complex multiplies as the radix-2 algorithms.
All of these algorithms obtain huge savings over direct computation of the DFT, reducing the complexity from
ON2
O
N
2
to
ONlogN
O
N
N
.
The efficiency of an FFT implementation depends on more than just the
number of computations.
Efficient FFT programming tricks can make
up to a several-fold difference in the run-time of FFT programs.
Alternate FFT structures can lead to
a more convenient data flow for certain hardware.
As discussed in choosing the best FFT algorithm,
certain hardware is designed for, and thus most efficient for, FFTs of specific lengths or radices.