Connexions

You are here: Home » Content » Fast Convolution
Content Actions

Fast Convolution

Module by: Douglas L. Jones

Summary: Efficient computation of convolution using FFTs.

Fast Circular Convolution

Since, m=0N-1xmhn-mmodN=yn is equivalent to Yk=XkHk m 0 N 1 x m h n m N y n is equivalent to Y k X k H k yn y n can be computed as yn=IDFTDFTxnDFThn y n IDFT DFT x n DFT h n
    Cost
    • Direct
    • N2 N 2 complex multiplies.
    • NN-1 N N 1 complex adds.
    • Via FFTs
    • 3 FFTs + NN multipies.
    • N+3N2log2N N 3 N 2 2 N complex multiplies.
    • 3Nlog2N 3 N 2 N complex adds.
If Hk H k can be precomputed, cost is only 2 FFts + NN multiplies.

Fast Linear Convolution

DFT produces cicular convolution. For linear convolution, we must zero-pad sequences so that circular wrap-around always wraps over zeros.
figure6.png
Figure 1
To achieve linear convolution using fast circular convolution, we must use zero-padded DFTs of length NL+M-1 N L M 1
Figure7.png
Figure 2
Choose shortest convenient N N (usually smallest power-of-two greater than or equal to L+M-1 L M 1 ) yn= IDFT N DFT N xn DFT N hn y n IDFT N DFT N x n DFT N h n
note: There is some inefficiency when compared to circular convolution due to longer zero-padded DFTs. Still, ONlog2N O N 2 N savings over direct computation.

Running Convolution

Suppose L= L , as in a real time filter application, or LM L M . There are efficient block methods for computing fast convolution.

Overlap-Save (OLS) Method

Note that if a length-MM filter hn h n is circularly convulved with a length-NN segment of a signal xn x n ,
figure4.png
Figure 3
the first M-1 M 1 samples are wrapped around and thus is incorrect. However, for M-1nN-1 M 1 n N 1 ,the convolution is linear convolution, so these samples are correct. Thus N-M+1 N M 1 good outputs are produced for each length-NN circular convolution.
The Overlap-Save Method: Break long signal into successive blocks of N N samples, each block overlapping the previous block by M-1 M 1 samples. Perform circular convolution of each block with filter hm h m . Discard first M-1 M 1 points in each output block, and concatenate the remaining points to create yn y n .
Figure1.png
Figure 4
Computation cost for a length-NN equals 2n 2 n FFT per output sample is (assuming precomputed Hk H k ) 2 FFTs and NN multiplies 2N2log2N+NN-M+1=Nlog2N+1N-M+1 complex multiplies 2 N 2 2 N N N M 1 N 2 N 1 N M 1 complex multiplies 2Nlog2NN-M+1=2Nlog2NN-M+1 complex adds 2 N 2 N N M 1 2 N 2 N N M 1 complex adds
Compare to M M mults, M-1 M 1 adds per output point for direct method. For a given M M, optimal N N can be determined by finding N N minimizing operation counts. Usualy, optimal N N is 4MNopt8M 4 M Nopt 8 M .

Overlap-Add (OLA) Method

Zero-pad length-LL blocks by M-1 M 1 samples.
figure5.png
Figure 5
Add successive blocks, overlapped by M-1 M 1 samples, so that the tails sum to produce the complete linear convolution.
Figure2.png
Figure 6
Computational Cost: Two length N=L+M-1 N L M 1 FFTs and M M mults and M-1 M 1 adds per L L output points; essentially the sames as OLS method.

Comments, questions, feedback, criticisms?

Send feedback