Note: Your browser may not currently support MathML. See our browser support page for additional details. You can always view the correct math in the PDF version.
To write a program that computes the circular convolution of
In Prime Factor Permutations we discussed the permutation pfp()
appears in the appendix.
The reduction operations KRED()
etc, also appear in the appendix.
To carry out the operation of
To facilitate the discussion of the programs we generate,
it is useful to consider an example.
Take as an example the 45 point circular convolution
algorithm listed in the appendix.
From Equation 19 from Bilinear Forms for Circular Convolution we find that we need to compute
We noted above that bilinear forms for linear convolution,
In our approach this is what we have done.
When we use the bilinear forms for convolution given
in the appendix, for which
and since
From the discussion above, we found that the Kronecker products
like ID2I(a,b,x)
and
ID3I(a,b,x)
and are listed in the appendix.
The transposed form, ID2tI(a,b,x)
.
To compute the multiplicative constants we need
The Matlab function KFt
carries out
the operation Kcrot
implements the
operation
By recognizing that the convolution algorithms for different lengths share a lot of the same computations, it is possible to write a set of programs that take advantage of this. The programs we have generated call functions from a relatives small set. Each program calls these functions with different arguments, in differing orders, and a different number of times. By organizing the program structure in a modular way, we are able to generate relatively compact code for a wide variety of lengths.
In the appendix we have listed code for the following functions, from which we create circular convolution algorithms. In the next section we generate FFT programs using this same set of functions.
pfp
implements this permutation of Prime Factor Permutations.
Its transpose is implemented by pfpt
.
KRED
implements the reduction operations of Reduction Operations.
Its transpose is implemented by tKRED
.
Its inverse transpose is implemented by itKRED
and this function
is used only for computing the multiplicative constants.
ID2I
and ID3I
are Matlab functions for the operations
ID2tI
and ID3tI
implement the transposes,
Table 1 lists operation counts for some of the
circular convolution algorithms we have generated.
The operation counts do not include any arithmetic operations
involved in the index variable or loops.
They include only the arithmetic operations that
involve the data sequence
The table in [2] for the split nesting algorithm gives
very similar arithmetic operation counts.
For all lengths not divisible by 9, the algorithms we have developed
use the same number of multiplications and the same number or fewer additions.
For lengths which are divisible by 9, the algorithms described in [2]
require fewer additions than do ours.
This is because the algorithms whose operation counts are
tabulated in the table in [2] use
a special
| N | muls | adds | N | muls | adds | N | muls | adds | N | muls | adds | |||
| 2 | 2 | 4 | 24 | 56 | 244 | 80 | 410 | 1546 | 240 | 1640 | 6508 | |||
| 3 | 4 | 11 | 27 | 94 | 485 | 84 | 320 | 1712 | 252 | 1520 | 7920 | |||
| 4 | 5 | 15 | 28 | 80 | 416 | 90 | 380 | 1858 | 270 | 1880 | 9074 | |||
| 5 | 10 | 31 | 30 | 80 | 386 | 105 | 640 | 2881 | 280 | 2240 | 9516 | |||
| 6 | 8 | 34 | 35 | 160 | 707 | 108 | 470 | 2546 | 315 | 3040 | 13383 | |||
| 7 | 16 | 71 | 36 | 95 | 493 | 112 | 656 | 2756 | 336 | 2624 | 11132 | |||
| 8 | 14 | 46 | 40 | 140 | 568 | 120 | 560 | 2444 | 360 | 2660 | 11392 | |||
| 9 | 19 | 82 | 42 | 128 | 718 | 126 | 608 | 3378 | 378 | 3008 | 16438 | |||
| 10 | 20 | 82 | 45 | 190 | 839 | 135 | 940 | 4267 | 420 | 3200 | 14704 | |||
| 12 | 20 | 92 | 48 | 164 | 656 | 140 | 800 | 3728 | 432 | 3854 | 16430 | |||
| 14 | 32 | 170 | 54 | 188 | 1078 | 144 | 779 | 3277 | 504 | 4256 | 19740 | |||
| 15 | 40 | 163 | 56 | 224 | 1052 | 168 | 896 | 4276 | 540 | 4700 | 21508 | |||
| 16 | 41 | 135 | 60 | 200 | 952 | 180 | 950 | 4466 | 560 | 6560 | 25412 | |||
| 18 | 38 | 200 | 63 | 304 | 1563 | 189 | 1504 | 7841 | 630 | 6080 | 28026 | |||
| 20 | 50 | 214 | 70 | 320 | 1554 | 210 | 1280 | 6182 | 720 | 7790 | 30374 | |||
| 21 | 64 | 317 | 72 | 266 | 1250 | 216 | 1316 | 6328 | 756 | 7520 | 38144 |
It is possible to make further improvements to the operation counts given in Table 1[1], [2]. Specifically, algorithms for prime power cyclotomic convolution based on the polynomial transform, although more complicated, will give improvements for the longer lengths listed [1], [2]. These improvements can be easily included in the code generating program we have developed.