One of the main applications of the FFT is to do convolution more
efficiently than the direct calculation from the definition which is:
$$
y(n) = \sum_{m=0}^{n} h(m)\, x(n-m)
\tag{1}
$$
which can also be written as:
$$
y(n) = \sum_{m=0}^{n} x(m)\, h(n-m)
\tag{2}
$$
This is often used to filter a signal $x(n)$ with a filter whose
impulse response is $h(n)$. Each output value $y(n)$ requires $N$
multiplications and $N-1$ additions if the sum for $y(n)$ has $N$ terms. So, for
$N$ output values, on the order of $N^2$ arithmetic operations are
required.
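As a point of reference for the operation count, the sum in Equation 1 can be evaluated directly. The following is a minimal sketch that assumes finite, causal sequences stored as Python lists; the function name `direct_convolution` is illustrative and not taken from the text.

```python
def direct_convolution(h, x):
    """Evaluate y(n) = sum_m h(m) x(n - m) for finite, causal h and x."""
    n_out = len(h) + len(x) - 1
    y = [0.0] * n_out
    for n in range(n_out):
        # Only indices where both h(m) and x(n - m) exist contribute.
        first = max(0, n - len(x) + 1)
        last = min(n, len(h) - 1)
        for m in range(first, last + 1):
            y[n] += h[m] * x[n - m]
    return y


print(direct_convolution([1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]))
# -> [1.0, 3.0, 6.0, 6.0, 5.0, 3.0]
```

The two nested loops make the quadratic growth in work visible: each output sample costs one multiplication per overlapping term.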
Because the DFT converts convolution to multiplication:
$$
\mathrm{DFT}\{y(n)\} = \mathrm{DFT}\{h(n)\}\;\mathrm{DFT}\{x(n)\}
\tag{3}
$$
the convolution can be calculated with the FFT, which brings the order of
arithmetic operations down to $N\log(N)$; the savings can be significant for large $N$.
This approach, which is called “fast convolution”, is a form of
block processing, since a whole block of $x(n)$ must be available to
calculate even one output value, $y(n)$. So, a time delay of one
block length is always required. Another problem is that the filtering
use of convolution is usually non-cyclic, whereas the convolution implemented
with the DFT is cyclic. This is dealt with by appending zeros to $x(n)$
and $h(n)$ such that the output of the cyclic convolution gives one
block of the output of the desired non-cyclic convolution.
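The zero-padding idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a simple recursive radix-2 FFT written for clarity rather than speed, pads both sequences to the next power of two at or above the linear-convolution length, and uses the hypothetical names `fft` and `fast_convolution`. A production routine would normally use an optimized real-data FFT instead.

```python
import cmath


def fft(a, inverse=False):
    """Unscaled recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fast_convolution(h, x):
    """Non-cyclic convolution computed as a zero-padded cyclic convolution."""
    n_out = len(h) + len(x) - 1
    size = 1
    while size < n_out:          # padded length: power of two >= n_out
        size *= 2
    H = fft(list(h) + [0.0] * (size - len(h)))
    X = fft(list(x) + [0.0] * (size - len(x)))
    Y = [a * b for a, b in zip(H, X)]      # Equation 3: multiply the DFTs
    y = fft(Y, inverse=True)
    # Scale the inverse transform and drop the padding.
    return [v.real / size for v in y[:n_out]]
```

Because the padded length is at least `len(h) + len(x) - 1`, the wrap-around of the cyclic convolution falls entirely on zeros, so the first `n_out` values match the non-cyclic result up to rounding error.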
For filtering and some other applications, one wants “ongoing” convolution,
where the filter response $h(n)$ may be finite in length or duration, but
the input $x(n)$ is of arbitrary length. Two methods have traditionally
been used to break the input into blocks and use the FFT to convolve each block
so that the output that would have been calculated by directly implementing
Equation 1 or Equation 3 can be constructed efficiently. These are called
“overlap-add” and “overlap-save”.
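A minimal sketch of overlap-add follows. To keep it short and self-contained, each block is convolved with a plain double loop (`convolve_block`), which stands in for the FFT-based block convolution that would be used in practice; the names `convolve_block`, `overlap_add`, and `block_len` are illustrative assumptions.

```python
def convolve_block(h, x):
    """Linear convolution of two short sequences (stand-in for an FFT routine)."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            y[i + j] += hv * xv
    return y


def overlap_add(h, x, block_len):
    """Filter x with h by convolving input sections and adding overlapping tails."""
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block_len):
        block = x[start:start + block_len]
        partial = convolve_block(h, block)   # len(block) + len(h) - 1 values
        # The last len(h) - 1 values overlap the next block's output and are summed.
        for i, v in enumerate(partial):
            y[start + i] += v
    return y
```

The result equals the convolution of the whole input with `h`, because convolution is linear and the input is the sum of its non-overlapping sections.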
Convolution is intimately related to the DFT. It was shown
in The DFT as Convolution or Filtering that a prime-length DFT could be converted to
cyclic convolution. It has long been known [Entry 48] that
convolution can be calculated by multiplying the DFTs of signals.
An important question is what the fastest method is for
calculating digital convolution. There are several methods, each of which
has some advantage. The earliest method for fast convolution was
the use of sectioning with overlap-add or overlap-save and the FFT
[Entry 48, Entry 53, Entry 10]. In most cases the convolution is of real data and,
therefore, real-data FFTs should be used. That approach is still
probably the fastest method for longer convolutions on a general-purpose
computer or microprocessor. Shorter convolutions should
simply be calculated directly.
The partitioning of long or infinite strings of data into shorter sections
or blocks has been used to allow application of the FFT to realize
ongoing or continuous convolution [Entry 57, Entry 30]. This section
develops the idea of block processing and shows that it is a generalization
of the overlap-add and overlap-save methods [Entry 57, Entry 26]. It
further generalizes the idea to a multidimensional formulation of
convolution [Entry 3, Entry 15]. Moving in the opposite direction, it is
shown that, rather than partitioning a string of scalars into blocks and
then into blocks of blocks, one can partition a scalar number into blocks
of bits and then include the operation of multiplication in the signal
processing formulation. This is called distributed arithmetic [Entry 14]
and, since it describes operations at the bit level, is completely
general. These notes try to present a coherent development of these
ideas.
In this section the usual convolution and recursion that implement FIR
and IIR discrete-time filters are reformulated in terms of vectors and
matrices. Because the same data is partitioned and grouped in a variety
of ways, it is important to have a consistent notation in order to be
clear. The $n$th element of a data sequence is expressed $h(n)$ or, in
some cases to simplify, $h_n$. A block or finite-length column vector is
denoted $\underline{h}_n$, with $n$ indicating the $n$th block or
section of a longer vector. A matrix, square or rectangular, is indicated
by an upper-case letter such as $H$, with a subscript if appropriate.
The operation of a finite impulse response (FIR) filter is described by a
finite convolution as
$$
y(n) = \sum_{k=0}^{L-1} h(k)\, x(n-k)
\tag{4}
$$
where $x(n)$ is causal, $h(n)$ is causal and of length $L$, and the time
index $n$ goes from zero to infinity or some large value. With a change
of index variables this becomes
$$
y(n) = \sum_{k=0}^{n} h(n-k)\, x(k)
\tag{5}
$$
which can be expressed as a matrix operation by
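$$
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
h_0 & 0 & 0 & \cdots & 0 \\
h_1 & h_0 & 0 & & \\
h_2 & h_1 & h_0 & & \\
\vdots & & & & \vdots
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \end{bmatrix}.
\tag{6}
$$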
The $H$ matrix of impulse response values is partitioned into $N$ by $N$
square submatrices, and the $X$ and $Y$ vectors are partitioned into
length-$N$ blocks or sections. This is illustrated for $N=3$ by
$$
H_0 = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \end{bmatrix}
\qquad
H_1 = \begin{bmatrix} h_3 & h_2 & h_1 \\ h_4 & h_3 & h_2 \\ h_5 & h_4 & h_3 \end{bmatrix}
\qquad \text{etc.}
\tag{7}
$$
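$$
\underline{x}_0 = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}
\qquad
\underline{x}_1 = \begin{bmatrix} x_3 \\ x_4 \\ x_5 \end{bmatrix}
\qquad
\underline{y}_0 = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
\qquad \text{etc.}
\tag{8}
$$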
Substituting these definitions into Equation 6 gives
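$$
\begin{bmatrix} \underline{y}_0 \\ \underline{y}_1 \\ \underline{y}_2 \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
H_0 & 0 & 0 & \cdots & 0 \\
H_1 & H_0 & 0 & & \\
H_2 & H_1 & H_0 & & \\
\vdots & & & & \vdots
\end{bmatrix}
\begin{bmatrix} \underline{x}_0 \\ \underline{x}_1 \\ \underline{x}_2 \\ \vdots \end{bmatrix}
\tag{9}
$$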
The general expression for the $n$th output block is
$$
\underline{y}_n = \sum_{k=0}^{n} H_{n-k}\, \underline{x}_k
\tag{10}
$$
which is a vector or block convolution. Since the matrix-vector
multiplication within the block convolution is itself a convolution, Equation 10
is a sort of convolution of convolutions, and the finite-length
matrix-vector multiplication can be carried out using the FFT or other
fast convolution methods.
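The block convolution of Equation 10 can be checked numerically with a short sketch. It assumes a finite impulse response stored as a list, an input whose length is a multiple of the block length $N$, and submatrices with entries $h(kN + i - j)$ as in Equation 7; the helper names `block_matrix`, `mat_vec`, and `block_convolution` are illustrative only.

```python
def block_matrix(h, k, N):
    """N-by-N submatrix H_k with entries h(k*N + i - j); h is zero outside its support."""
    def h_at(m):
        return h[m] if 0 <= m < len(h) else 0.0
    return [[h_at(k * N + i - j) for j in range(N)] for i in range(N)]


def mat_vec(M, v):
    """Plain matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def block_convolution(h, x, N):
    """Compute y block by block: y_n = sum over k = 0..n of H_{n-k} x_k."""
    x_blocks = [x[i:i + N] for i in range(0, len(x), N)]
    y = []
    for n in range(len(x_blocks)):
        y_n = [0.0] * N
        for k in range(n + 1):
            contrib = mat_vec(block_matrix(h, n - k, N), x_blocks[k])
            y_n = [a + b for a, b in zip(y_n, contrib)]
        y.extend(y_n)
    return y
```

The output holds the first `len(x)` samples of the scalar convolution in Equation 5. Each `mat_vec` call is a small Toeplitz product, which is the piece that a fast convolution method would replace.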
The equation for one output block can be written as the product
$$
\underline{y}_2 =
\begin{bmatrix} H_2 & H_1 & H_0 \end{bmatrix}
\begin{bmatrix} \underline{x}_0 \\ \underline{x}_1 \\ \underline{x}_2 \end{bmatrix}
\tag{11}
$$
and the effects of one input block can be written
$$
\begin{bmatrix} H_0 \\ H_1 \\ H_2 \end{bmatrix}
\underline{x}_1
=
\begin{bmatrix} \underline{y}_0 \\ \underline{y}_1 \\ \underline{y}_2 \end{bmatrix}.
\tag{12}
$$
These are generalized statements of overlap-save and overlap-add
[Entry 57, Entry 26]. The block length can be longer than, shorter than, or equal to
the filter length.
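For comparison with the output-block view of Equation 11, here is a minimal sketch of the conventional overlap-save method. It assumes a cyclic-convolution length `P` no smaller than the filter length, and it uses a direct cyclic convolution as a stand-in for multiplying length-`P` DFTs; the names `cyclic_convolution`, `overlap_save`, and `P` are illustrative assumptions.

```python
def cyclic_convolution(a, b):
    """Length-P cyclic convolution (stand-in for multiplying length-P DFTs)."""
    P = len(a)
    return [sum(a[m] * b[(n - m) % P] for m in range(P)) for n in range(P)]


def overlap_save(h, x, P):
    """Filter x with h using length-P cyclic convolutions; requires P >= len(h)."""
    L = len(h)
    step = P - (L - 1)                     # new inputs and valid outputs per block
    h_padded = list(h) + [0.0] * (P - L)
    padded = [0.0] * (L - 1) + list(x)     # L-1 zeros act as history for the first block
    y = []
    for start in range(0, len(x), step):
        block = padded[start:start + P]
        block += [0.0] * (P - len(block))  # zero-fill a final partial block
        cyc = cyclic_convolution(h_padded, block)
        y.extend(cyc[L - 1:])              # first L-1 values are wrapped around; discard
    return y[:len(x)]
```

Each input block overlaps the previous one by `L - 1` samples, and only the outputs unaffected by wrap-around are kept, so no additions between blocks are needed. This sketch returns the first `len(x)` output samples.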
Although less well known, IIR filters can also be implemented with block
processing [Entry 24, Entry 18, Entry 59, Entry 12, Entry 13]. The block form of an IIR
filter is developed in much the same way as the block convolution
implementation of the FIR filter. The general constant-coefficient
difference equation that describes an IIR filter with recursive
coefficients $a_l$, convolution coefficients $b_k$, input signal $x(n)$,
and output signal $y(n)$ is given by
$$
y(n) = \sum_{l=1}^{N-1} a_l\, y_{n-l} + \sum_{k=0}^{M-1} b_k\, x_{n-k}
\tag{13}
$$
using both functional notation and subscripts, depending on which is
easier and clearer. The impulse response $h(n)$ is
$$
h(n) = \sum_{l=1}^{N-1} a_l\, h(n-l) + \sum_{k=0}^{M-1} b_k\, \delta(n-k)
\tag{14}
$$
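which, with the recursive terms moved to the left-hand side (so that the $a_l$ in the matrix are the negatives of the recursive coefficients above), can be written in matrix operator form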
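$$
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
a_1 & 1 & 0 & & \\
a_2 & a_1 & 1 & & \\
a_3 & a_2 & a_1 & & \\
0 & a_3 & a_2 & & \\
\vdots & & & & \vdots
\end{bmatrix}
\begin{bmatrix} h_0 \\ h_1 \\ h_2 \\ h_3 \\ h_4 \\ \vdots \end{bmatrix}
=
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ 0 \\ \vdots \end{bmatrix}
\tag{15}
$$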
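Because the matrix in Equation 15 is lower triangular with ones on the diagonal, the impulse response can be generated by forward substitution. The sketch below takes the matrix at face value, so each row reads $h_n + a_1 h_{n-1} + a_2 h_{n-2} + \cdots = b_n$; this sign convention, the list layout `a = [1, a_1, a_2, ...]`, and the name `impulse_response` are assumptions made for the illustration.

```python
def impulse_response(a, b, num_samples):
    """Solve the lower-triangular Toeplitz system by forward substitution.

    a = [1, a_1, a_2, ...] is the first column of the matrix (assumed layout),
    b = [b_0, b_1, ...] is the right-hand side, taken as zero beyond its length.
    """
    h = []
    for n in range(num_samples):
        b_n = b[n] if n < len(b) else 0.0
        # Sum of the already-known terms a_l * h_{n-l} in row n.
        acc = sum(a[l] * h[n - l] for l in range(1, min(n, len(a) - 1) + 1))
        h.append((b_n - acc) / a[0])
    return h


print(impulse_response([1.0, -0.5], [1.0], 4))
# -> [1.0, 0.5, 0.25, 0.125]
```

In the usage line, the single off-diagonal entry $a_1 = -0.5$ gives $h_n = 0.5\, h_{n-1}$, a decaying geometric sequence.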
In terms of $N$ by $N$ submatrices and length-$N$ blocks, this becomes
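$$
\begin{bmatrix}
A_0 & 0 & 0 & \cdots & 0 \\
A_1 & A_0 & 0 & & \\
0 & A_1 & A_0 & & \\
\vdots & & & & \vdots
\end{bmatrix}
\begin{bmatrix} \underline{h}_0 \\ \underline{h}_1 \\ \underline{h}_2 \\ \vdots \end{bmatrix}
=
\begin{bmatrix} \underline{b}_0 \\ \underline{b}_1 \\ 0 \\ \vdots \end{bmatrix}
\tag{16}
$$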
From this formulation, a block recursive equation can be written that will
generate the impulse response block by block.
$$
A_0\, \underline{h}_n + A_1\, \underline{h}_{n-1} = 0
\qquad \text{for } n \ge 2
\tag{17}
$$
$$
\underline{h}_n = -A_0^{-1} A_1\, \underline{h}_{n-1} = K\, \underline{h}_{n-1}
\qquad \text{for } n \ge 2
\tag{18}
$$
with initial conditions given by
$$
\underline{h}_1 = -A_0^{-1} A_1 A_0^{-1}\, \underline{b}_0 + A_0^{-1}\, \underline{b}_1
$$