The radix-2 decimation-in-frequency and
decimation-in-time fast Fourier transforms
(FFTs) are the simplest
FFT algorithms.
Like all FFTs, they compute the
discrete Fourier transform (DFT)
Xk=∑n=0N-1xnⅇ-ⅈ2πnkN=∑n=0N-1xn
W
N
n
k
X
k
n
N
1
0
x
n
2
n
k
N
n
N
1
0
x
n
W
N
n
k
(1)
where for notational convenience
W
N
k
=ⅇ-ⅈ2πkN
W
N
k
2
k
N
.
FFT algorithms gain their speed by reusing the results of smaller,
intermediate computations to compute multiple DFT frequency outputs.
Decimation in frequency
The radix-2 decimation-in-frequency algorithm rearranges the
discrete Fourier transform (DFT) equation
into two parts: computation of the even-numbered discrete-frequency indices
Xk
X
k
for
k=024…N-2
k
0
2
4
…
N
2
(or
X2r
X
2
r
as in
Equation 2)
and computation of the odd-numbered indices
k=135…N-1
k
1
3
5
…
N
1
(or
X2r+1
X
2
r
1
as in
Equation 3)
X2r=∑n=0N-1xn
W
N
2
r
n
=∑n=0N2-1xn
W
N
2
r
n
+∑n=0N2-1xn+N2
W
N
2
r
(
n
+
N
2
)
=∑n=0N2-1xn
W
N
2
r
n
+∑n=0N2-1xn+N2
W
N
2
r
n
1=∑n=0N2-1xn+xn+N2
W
N
2
r
n
=
DFT
N
2
xn+xn+N2
X
2
r
n
N
1
0
x
n
W
N
2
r
n
n
N
2
1
0
x
n
W
N
2
r
n
n
N
2
1
0
x
n
N
2
W
N
2
r
(
n
+
N
2
)
n
N
2
1
0
x
n
W
N
2
r
n
n
N
2
1
0
x
n
N
2
W
N
2
r
n
1
n
N
2
1
0
x
n
x
n
N
2
W
N
2
r
n
DFT
N
2
x
n
x
n
N
2
(2)
X2r+1=∑n=0N-1xn
W
N
(
2
r
+
1
)
n
=∑n=0N2-1xn+
W
N
N
2
xn+N2
W
N
(
2
r
+
1
)
n
=∑n=0N2-1xn-xn+N2
W
N
n
W
N
2
r
n
=
DFT
N
2
xn-xn+N2
W
N
n
X
2
r
1
n
N
1
0
x
n
W
N
(
2
r
+
1
)
n
n
N
2
1
0
x
n
W
N
N
2
x
n
N
2
W
N
(
2
r
+
1
)
n
n
N
2
1
0
x
n
x
n
N
2
W
N
n
W
N
2
r
n
DFT
N
2
x
n
x
n
N
2
W
N
n
(3)
The mathematical simplifications in
Equation 2 and
Equation 3
reveal that both the even-indexed and odd-indexed frequency outputs
Xk
X
k
can each be computed by a
length-
N2
N
2
DFT.
The inputs to these DFTs are sums or differences of the first and second halves of the input signal,
respectively,
where the input to the short DFT producing the odd-indexed frequencies is multiplied by a so-called
twiddle factor
term
W
N
k
=ⅇ-ⅈ2πkN
W
N
k
2
k
N
.
This is called a
decimation in frequency because the
frequency samples are computed separately in alternating groups,
and a
radix-2 algorithm because there
are two groups.
Figure 1 graphically illustrates this form of the DFT computation.
This conversion of the full DFT into a series of shorter DFTs with a simple
preprocessing step
gives the decimation-in-frequency FFT its computational savings.
Whereas direct computation of all
NN DFT frequencies
according to the
DFT equation would require
N2
N
2
complex multiplies and
N2-N
N
2
N
complex additions (for complex-valued data),
by breaking the computation into two short-length DFTs
with some preliminary combining of the data,
as illustrated in
Figure 1,
the computational cost is now
New Operation Counts-
2N22+N=N22+N2
2
N
2
2
N
N
2
2
N
2
complex multiplies
-
2N2N2-1+N=N22
2
N
2
N
2
1
N
N
2
2
complex additions
This simple manipulation has reduced the total computational cost of the DFT by almost a factor of two!
The initial combining operations for both short-length DFTs involve parallel groups of two time samples,
xn
x
n
and
xn+N2
x
n
N
2
.
One of these so-called
butterfly operations is illustrated in
Figure 2.
There are
N2
N
2
butterflies per
stage, each requiring a complex addition and subtraction followed
by one
twiddle-factor multiplication by
W
N
n
=ⅇ-ⅈ2πnN
W
N
n
2
n
N
on the lower output branch.
It is worthwhile to note that the initial add/subtract part of the
DIF butterfly is actually a length-2 DFT!
The theory of
multi-dimensional index maps
shows that this must be the case, and that FFTs of any factorable
length may consist of successive stages of shorter-length FFTs
with twiddle-factor multiplications in between.
It is also worth noting that this butterfly differs from the
decimation-in-time radix-2 butterfly
in that the twiddle factor multiplication occurs
after the combining.
Radix-2 decimation-in-frequency algorithm
The same radix-2 decimation in frequency can be applied recursively to the two
length-
N2
N
2
DFTs to save additional computation.
When successively applied until
the shorter and shorter DFTs reach length-2, the result is the
radix-2 decimation-in-frequency FFT algorithm.
The full radix-2 decimation-in-frequency decomposition illustrated in
Figure 3
requires
M=log2N
M
2
N
stages, each with
N2
N
2
butterflies per stage.
Each butterfly requires
11
complex multiply and
22
adds per butterfly.
The total cost of the algorithm is thus
Computational cost of radix-2 DIF FFT-
N2log2N
N
2
2
N
complex multiplies
-
Nlog2N
N
2
N
complex adds
This is a remarkable savings over direct computation of the DFT.
For example, a length-1024 DFT would require
10485761048576
complex multiplications and
10475521047552 complex additions
with direct computation, but only
51205120 complex multiplications and
1024010240 complex
additions using the radix-2 FFT, a savings by a factor of 100 or more.
The relative savings increase with longer FFT lengths, and are less for shorter lengths.
Modest additional reductions in computation can be achieved by noting that certain twiddle factors,
namely
W
N
0
W
N
0
,
W
N
N
2
W
N
N
2
,
W
N
N
4
W
N
N
4
,
W
N
N
8
W
N
N
8
,
W
N
3
N
8
W
N
3
N
8
,
require no multiplications, or fewer real multiplies than other ones.
By implementing special butterflies for these twiddle factors
as discussed in
FFT algorithm and programming tricks,
the computational cost of the radix-2 decimation-in-frequency FFT can be reduced to
-
2Nlog2N-7N+12
2
N
2
N
7
N
12
real multiplies
-
3Nlog2N-3N+4
3
N
2
N
3
N
4
real additions
The decimation-in-frequency FFT is a flow-graph
reversal of the
decimation-in-time FFT:
it has the same twiddle factors (in reverse pattern) and the same operation counts.
Note:
In a decimation-in-frequency radix-2 FFT as illustrated in
Figure 3,
the output is in
bit-reversed order (hence
"decimation-in-frequency").
That is, if the frequency-sample index
n
n is written as a binary number, the order is that
binary number reversed.
The bit-reversal process is illustrated
here.
It is important to note that, if the input data are in order
before beginning the FFT computations, the outputs of each butterfly throughout the
computation can be placed in the same memory locations from which the inputs were fetched,
resulting in an in-place algorithm that requires no extra memory to perform
the FFT.
Most FFT implementations are in-place, and overwrite the input data with the intermediate
values and finally the output.