Skip to content Skip to navigation Skip to collection information

OpenStax_CNX

You are here: Home » Content » Automatic Generation of Prime Length FFT Programs » Preliminaries

Navigation

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • Rice Digital Scholarship

    This collection is included in aLens by: Digital Scholarship at Rice University

    Click the "Rice Digital Scholarship" link to see all content affiliated with them.

  • Featured Content display tagshide tags

    This collection is included inLens: Connexions Featured Content
    By: Connexions

    Click the "Featured Content" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

Also in these lenses

  • UniqU content

    This collection is included inLens: UniqU's lens
    By: UniqU, LLC

    Click the "UniqU content" link to see all content selected in this lens.

  • Lens for Engineering

    This module and collection are included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.

Tags

(What is a tag?)

These tags come from the endorsement, affiliation, and other lenses that include this content.
 

Preliminaries

Module by: Ivan Selesnick, C. Sidney Burrus. E-mail the authors

Summary: This collection of modules is from a Rice University, ECE Department Technical Report written around September 1994. It grew out of the doctoral and post doctoral research of Ivan Selesnick working with Prof. C. Sidney Burrus at Rice. Earlier reports on this work were published in the ICASSP and ISCAS conference proceedings in 1992-94 and a fairly complete report was published in the IEEE Transaction on Signal Processing in January 1996.

Preliminaries

Because we compute prime point DFTs by converting them in to circular convolutions, most of this and the next section is devoted to an explanation of the split nesting convolution algorithm. In this section we introduce the various operations needed to carry out the split nesting algorithm. In particular, we describe the prime factor permutation that is used to convert a one-dimensional circular convolution into a multi-dimensional one. We also discuss the reduction operations needed when the Chinese Remainder Theorem for polynomials is used in the computation of convolution. The reduction operations needed for the split nesting algorithm are particularly well organized. We give an explicit matrix description of the reduction operations and give a program that implements the action of these reduction operations.

The presentation relies upon the notions of similarity transformations, companion matrices and Kronecker products. With them, we describe the split nesting algorithm in a manner that brings out its structure. We find that when companion matrices are used to describe convolution, the reduction operations block diagonalizes the circular shift matrix.

The companion matrix of a monic polynomial, M(s)=m0+m1s++mn-1sn-1+snM(s)=m0+m1s++mn-1sn-1+sn is given by

C M = - m 0 1 1 - m 1 1 - m n - 1 . C M = - m 0 1 1 - m 1 1 - m n - 1 .
(1)

Its usefulness in the following discussion comes from the following relation which permits a matrix formulation of convolution. Let

X ( s ) = x 0 + x 1 s + x n - 1 s n - 1 H ( s ) = h 0 + h 1 s + h n - 1 s n - 1 Y ( s ) = y 0 + y 1 s + y n - 1 s n - 1 M ( s ) = m 0 + m 1 s + m n - 1 s n - 1 + s n X ( s ) = x 0 + x 1 s + x n - 1 s n - 1 H ( s ) = h 0 + h 1 s + h n - 1 s n - 1 Y ( s ) = y 0 + y 1 s + y n - 1 s n - 1 M ( s ) = m 0 + m 1 s + m n - 1 s n - 1 + s n
(2)

Then

Y ( s ) = H ( s ) X ( s ) M ( s ) y = k = 0 n - 1 h k C M k x Y ( s ) = H ( s ) X ( s ) M ( s ) y = k = 0 n - 1 h k C M k x
(3)

where y=(y0,,yn-1)ty=(y0,,yn-1)t, x=(x0,,xn-1)tx=(x0,,xn-1)t, and CMCM is the companion matrix of M(s)M(s). In Equation 3, we say yy is the convolution of xx and hh with respect to M(s)M(s). In the case of circular convolution, M(s)=sn-1M(s)=sn-1 and Csn-1Csn-1 is the circular shift matrix denoted by SnSn,

S n = 1 1 1 S n = 1 1 1
(4)

Notice that any circulant matrix can be written as khkSnkkhkSnk.

Similarity transformations can be used to interpret the action of some convolution algorithms. If CM=T-1ATCM=T-1AT for some matrix TT (CMCM and AA are similar, denoted CMACMA), then Equation 3 becomes

y = T - 1 k = 0 n - 1 h k A k T x . y = T - 1 k = 0 n - 1 h k A k T x .
(5)

That is, by employing the similarity transformation given by TT in this way, the action of SnkSnk is replaced by that of AkAk. Many circular convolution algorithms can be understood, in part, by understanding the manipulations made to SnSn and the resulting new matrix AA. If the transformation TT is to be useful, it must satisfy two requirements: (1) TxTx must be simple to compute, and (2) AA must have some advantageous structure. For example, by the convolution property of the DFT, the DFT matrix FF diagonalizes SnSn,

S n = F - 1 w 0 w 1 w n - 1 F S n = F - 1 w 0 w 1 w n - 1 F
(6)

so that it diagonalizes every circulant matrix. In this case, TxTx can be computed by using an FFT and the structure of AA is the simplest possible. So the two above mentioned conditions are met.

The Winograd Structure can be described in this manner also. Suppose M(s)M(s) can be factored as M(s)=M1(s)M2(s)M(s)=M1(s)M2(s) where M1M1 and M2M2 have no common roots, then CMCM1CM2CMCM1CM2 where denotes the matrix direct sum. Using this similarity and recalling Equation 3, the original convolution is decomposed into disjoint convolutions. This is, in fact, a statement of the Chinese Remainder Theorem for polynomials expressed in matrix notation. In the case of circular convolution, sn-1=d|nΦd(s)sn-1=d|nΦd(s), so that SnSn can be transformed to a block diagonal matrix,

S n C Φ 1 C Φ d C Φ n = d | n C Φ d S n C Φ 1 C Φ d C Φ n = d | n C Φ d
(7)

where Φd(s)Φd(s) is the dthdth cyclotomic polynomial. In this case, each block represents a convolution with respect to a cyclotomic polynomial, or a `cyclotomic convolution'. Winograd's approach carries out these cyclotomic convolutions using the Toom-Cook algorithm. Note that for each divisor, dd, of nn there is a corresponding block on the diagonal of size φ(d)φ(d), for the degree of Φd(s)Φd(s) is φ(d)φ(d) where φ(·)φ(·) is the Euler totient function. This method is good for short lengths, but as nn increases the cyclotomic convolutions become cumbersome, for as the number of distinct prime divisors of dd increases, the operation described by khkCΦdkkhkCΦdk becomes more difficult to implement.

The Agarwal-Cooley Algorithm utilizes the fact that

S n = P t S n 1 S n 2 P S n = P t S n 1 S n 2 P
(8)

where n=n1n2n=n1n2, (n1,n2)=1(n1,n2)=1 and PP is an appropriate permutation [1]. This converts the one dimensional circular convolution of length nn to a two dimensional one of length n1n1 along one dimension and length n2n2 along the second. Then an n1n1-point and an n2n2-point circular convolution algorithm can be combined to obtain an nn-point algorithm. In polynomial notation, the mapping accomplished by this permutation PP can be informally indicated by

Y ( s ) = X ( s ) H ( s ) s n - 1 P Y ( s , t ) = X ( s , t ) H ( s , t ) s n 1 - 1 , t n 2 - 1 . Y ( s ) = X ( s ) H ( s ) s n - 1 P Y ( s , t ) = X ( s , t ) H ( s , t ) s n 1 - 1 , t n 2 - 1 .
(9)

It should be noted that Equation 8 implies that a circulant matrix of size n1n2n1n2 can be written as a block circulant matrix with circulant blocks.

The Split-Nesting algorithm [3] combines the structures of the Winograd and Agarwal-Cooley methods, so that SnSn is transformed to a block diagonal matrix as in Equation 7,

S n d | n Ψ ( d ) . S n d | n Ψ ( d ) .
(10)

Here Ψ(d)=p|d,pPCΦHd(p)Ψ(d)=p|d,pPCΦHd(p) where Hd(p)Hd(p) is the highest power of pp dividing dd, and PP is the set of primes.

Example 1

S 45 1 C Φ 3 C Φ 9 C Φ 5 C Φ 3 C Φ 5 C Φ 9 C Φ 5 S 45 1 C Φ 3 C Φ 9 C Φ 5 C Φ 3 C Φ 5 C Φ 9 C Φ 5
(11)

In this structure a multidimensional cyclotomic convolution, represented by Ψ(d)Ψ(d), replaces each cyclotomic convolution in Winograd's algorithm (represented by CΦdCΦd in Equation 7. Indeed, if the product of b1,,bkb1,,bk is dd and they are pairwise relatively prime, then CΦdCΦb1CΦbkCΦdCΦb1CΦbk. This gives a method for combining cyclotomic convolutions to compute a longer circular convolution. It is like the Agarwal-Cooley method but requires fewer additions [3].

Prime Factor Permutations

One can obtain Sn1Sn2Sn1Sn2 from Sn1n2Sn1n2 when (n1,n2)=1(n1,n2)=1, for in this case, SnSn is similar to Sn1Sn2Sn1Sn2, n=n1n2n=n1n2. Moreover, they are related by a permutation. This permutation is that of the prime factor FFT algorithms and is employed in nesting algorithms for circular convolution [1], [2]. The permutation is described by Zalcstein [7], among others, and it is his description we draw on in the following.

Let n=n1n2n=n1n2 where (n1,n2)=1(n1,n2)=1. Define ekek, (0kn-10kn-1), to be the standard basis vector, (0,,0,1,0,,0)t(0,,0,1,0,,0)t, where the 1 is in the kthkth position. Then, the circular shift matrix, SnSn, can be described by

S n e k = e k + 1 n . S n e k = e k + 1 n .
(12)

Note that, by inspection,

( S n 2 S n 1 ) e a + n 1 b = e a + 1 n 1 + n 1 b + 1 n 2 ( S n 2 S n 1 ) e a + n 1 b = e a + 1 n 1 + n 1 b + 1 n 2
(13)

where 0an1-10an1-1 and 0bn2-10bn2-1. Because n1n1 and n2n2 are relatively prime a permutation matrix PP can be defined by

P e k = e k n 1 + n 1 k n 2 . P e k = e k n 1 + n 1 k n 2 .
(14)

With this PP,

P S n e k = P e k + 1 n = e k + 1 n n 1 + n 1 k + 1 n n 2 = e k + 1 n 1 + n 1 k + 1 n 2 P S n e k = P e k + 1 n = e k + 1 n n 1 + n 1 k + 1 n n 2 = e k + 1 n 1 + n 1 k + 1 n 2
(15)

and

( S n 2 S n 1 ) P e k = ( S n 2 S n 1 ) e k n 1 + n 1 k n 2 = e k + 1 n 1 + n 1 k + 1 n 2 . ( S n 2 S n 1 ) P e k = ( S n 2 S n 1 ) e k n 1 + n 1 k n 2 = e k + 1 n 1 + n 1 k + 1 n 2 .
(16)

Since PSnek=(Sn2Sn1)PekPSnek=(Sn2Sn1)Pek and P-1=PtP-1=Pt, one gets, in the multi-factor case, the following.

Lemma 1

If n=n1nkn=n1nk and n1,...,nkn1,...,nk are pairwise relatively prime, then Sn=Pt(SnkSn1)PSn=Pt(SnkSn1)P where PP is the permutation matrix given by P ek =e k n1 +n1 k n2 ++ n1 nk-1k n k P ek =e k n1 +n1 k n2 ++ n1 nk-1k n k .

This useful permutation will be denoted here as Pnk,,n1Pnk,,n1. If n=p1e1p2e2pkekn=p1e1p2e2pkek then this permutation yields the matrix, Sp1e1SpkekSp1e1Spkek. This product can be written simply as i=1kSpieii=1kSpiei, so that one has Sn=Pn1,,nkti=1kSpieiPn1,,nkSn=Pn1,,nkti=1kSpieiPn1,,nk.

It is quite simple to show that

P a , b , c = ( I a P b , c ) P a , b c = ( P a , b I c ) P a b , c . P a , b , c = ( I a P b , c ) P a , b c = ( P a , b I c ) P a b , c .
(17)

In general, one has

P n 1 , , n k = i = 2 k ( P n 1 n i - 1 , n i I n i + 1 n k ) . P n 1 , , n k = i = 2 k ( P n 1 n i - 1 , n i I n i + 1 n k ) .
(18)

A Matlab function for Pa,bIsPa,bIs is pfp2I() in one of the appendices. This program is a direct implementation of the definition. In a paper by Templeton [5], another method for implementing Pa,bPa,b, without `if' statements, is given. That method requires some precalculations, however. A function for Pn1,,nkPn1,,nk is pfp(). It uses Equation 18 and calls pfp2I() with the appropriate arguments.

Reduction Operations

The Chinese Remainder Theorem for polynomials can be used to decompose a convolution of two sequences (the polynomial product of two polynomials evaluated modulo a third polynomial) into smaller convolutions (smaller polynomial products) [6]. The Winograd nn point circular convolution algorithm requires that polynomials are reduced modulo the cyclotomic polynomial factors of sn-1sn-1, Φd(s)Φd(s) for each dd dividing nn.

When nn has several prime divisors the reduction operations become quite complicated and writing a program to implement them is difficult. However, when nn is a prime power, the reduction operations are very structured and can be done in a straightforward manner. Therefore, by converting a one-dimensional convolution to a multi-dimensional one, in which the length is a prime power along each dimension, the split nesting algorithm avoids the need for complicated reductions operations. This is one advantage the split nesting algorithm has over the Winograd algorithm.

By applying the reduction operations appropriately to the circular shift matrix, we are able to obtain a block diagonal form, just as in the Winograd convolution algorithm. However, in the split nesting algorithm, each diagonal block represents multi-dimensional cyclotomic convolution rather than a one-dimensional one. By forming multi-dimensional convolutions out of one-dimensional ones, it is possible to combine algorithms for smaller convolutions (see the next section). This is a second advantage split nesting algorithm has over the Winograd algorithm. The split nesting algorithm, however, generally uses more than the minimum number of multiplications.

Below we give an explicit matrix description of the required reduction operations, give a program that implements them, and give a formula for the number of additions required. (No multiplications are needed.)

First, consider n=pn=p, a prime. Let

X ( s ) = x 0 + x 1 s + + x p - 1 s p - 1 X ( s ) = x 0 + x 1 s + + x p - 1 s p - 1
(19)

and recall sp-1=(s-1)(sp-1+sp-2++s+1)=Φ1(s)Φp(s)sp-1=(s-1)(sp-1+sp-2++s+1)=Φ1(s)Φp(s). The residue X ( s ) Φ 1 ( s ) X ( s ) Φ 1 ( s ) is found by summing the coefficients of X(s)X(s). The residue X ( s ) Φ p ( s ) X ( s ) Φ p ( s ) is given by k=0p-2(xk-xp-1)skk=0p-2(xk-xp-1)sk. Define RpRp to be the matrix that reduces X(s)X(s) modulo Φ1(s)Φ1(s) and Φp(s)Φp(s), such that if X 0 ( s ) = X ( s ) Φ 1 ( s ) X 0 ( s ) = X ( s ) Φ 1 ( s ) and X 1 ( s ) = X ( s ) Φ p ( s ) X 1 ( s ) = X ( s ) Φ p ( s ) then

X 0 X 1 = R p X X 0 X 1 = R p X
(20)

where XX, X0X0 and X1X1 are vectors formed from the coefficients of X(s)X(s), X0(s)X0(s) and X1(s)X1(s). That is,

R p = 1 1 1 1 1 1 - 1 1 - 1 1 - 1 1 - 1 R p = 1 1 1 1 1 1 - 1 1 - 1 1 - 1 1 - 1
(21)

so that Rp= 1 - 1 GpRp= 1 - 1 Gp where GpGp is the Φp(s)Φp(s) reduction matrix of size (p-1)×p(p-1)×p. Similarly, let X(s)=x0+x1s++xpe-1spe-1X(s)=x0+x1s++xpe-1spe-1 and define RpeRpe to be the matrix that reduces X(s)X(s) modulo Φ1(s)Φ1(s), Φp(s)Φp(s), ..., Φpe(s)Φpe(s) such that

X 0 X 1 X e = R p e X , X 0 X 1 X e = R p e X ,
(22)

where as above, XX, X0X0, ..., XeXe are the coefficients of X ( s ) X ( s ) , X ( s ) Φ 1 ( s ) X ( s ) Φ 1 ( s ) , ..., X ( s ) Φ p e ( s ) X ( s ) Φ p e ( s ) .

It turns out that R p e R p e can be written in terms of R p R p . Consider the reduction of X ( s ) = x 0 + + x 8 s 8 X ( s ) = x 0 + + x 8 s 8 by Φ 1 ( s ) = s - 1 Φ 1 ( s ) = s - 1 , Φ 3 ( s ) = s 2 + s + 1 Φ 3 ( s ) = s 2 + s + 1 , and Φ 9 ( s ) = s 6 + s 3 + 1 Φ 9 ( s ) = s 6 + s 3 + 1 . This is most efficiently performed by reducing X ( s ) X ( s ) in two steps. That is, calculate X ' ( s ) = X ( s ) s 3 - 1 X ' ( s ) = X ( s ) s 3 - 1 and X 2 ( s ) = X ( s ) s 6 + s 3 + 1 X 2 ( s ) = X ( s ) s 6 + s 3 + 1 . Then calculate X 0 ( s ) = X ' ( s ) s - 1 X 0 ( s ) = X ' ( s ) s - 1 and X 1 ( s ) = X ' ( s ) s 2 + s + 1 X 1 ( s ) = X ' ( s ) s 2 + s + 1 . In matrix notation this becomes

X ' X 2 = I 3 I 3 I 3 I 3 - I 3 I 3 - I 3 X and X 0 X 1 = 1 1 1 1 - 1 1 - 1 X ' . X ' X 2 = I 3 I 3 I 3 I 3 - I 3 I 3 - I 3 X and X 0 X 1 = 1 1 1 1 - 1 1 - 1 X ' .
(23)

Together these become

X 0 X 1 X 2 = R 3 I 3 I 3 I 3 I 3 I 3 I 3 - I 3 I 3 - I 3 X X 0 X 1 X 2 = R 3 I 3 I 3 I 3 I 3 I 3 I 3 - I 3 I 3 - I 3 X
(24)

or

X 0 X 1 X 2 = ( R 3 I 6 ) ( R 3 I 3 ) X X 0 X 1 X 2 = ( R 3 I 6 ) ( R 3 I 3 ) X
(25)

so that R9=(R3I6)(R3I3)R9=(R3I6)(R3I3) where denotes the matrix direct sum. Similarly, one finds that R27=(R3I24)((R3I3)I18)(R3I9)R27=(R3I24)((R3I3)I18)(R3I9). In general, one has the following.

Lemma 2

RpeRpe is a pe×pepe×pe matrix given by Rpe=k=0e-1((RpIpk)Ipe-pk+1)Rpe=k=0e-1((RpIpk)Ipe-pk+1) and can be implemented with 2(pe-1)2(pe-1) additions.

The following formula gives the decomposition of a circular convolution into disjoint non-circular convolutions when the number of points is a prime power.

R p e S p e R p e - 1 = 1 C Φ p C Φ p e = i = 0 e C Φ p i R p e S p e R p e - 1 = 1 C Φ p C Φ p e = i = 0 e C Φ p i
(26)

Example 2

R 9 S 9 R 9 - 1 = 1 C Φ 3 C Φ 9 R 9 S 9 R 9 - 1 = 1 C Φ 3 C Φ 9
(27)

It turns out that when nn is not a prime power, the reduction of polynomials modulo the cyclotomic polynomial Φn(s)Φn(s) becomes complicated, and with an increasing number of prime factors, the complication increases. Recall, however, that a circular convolution of length p1e1pkekp1e1pkek can be converted (by an appropriate permutation) into a kk dimensional circular convolution of length pieipiei along dimension ii. By employing this one-dimensional to multi-dimensional mapping technique, one can avoid having to perform polynomial reductions modulo Φn(s)Φn(s) for non-prime-power nn.

The prime factor permutation discussed previously is the permutation that converts a one-dimensional circular convolution into a multi-dimensional one. Again, we can use the Kronecker product to represent this. In this case, the reduction operations are applied to each matrix in the following way:

T S p 1 e 1 S p k e k T - 1 = i = 0 e 1 C Φ p 1 i i = 0 e k C Φ p k i T S p 1 e 1 S p k e k T - 1 = i = 0 e 1 C Φ p 1 i i = 0 e k C Φ p k i
(28)

where

T = R p 1 e 1 R p k e k T = R p 1 e 1 R p k e k
(29)

Example 3

T S 9 S 5 T - 1 = 1 C Φ 3 C Φ 9 1 C Φ 5 T S 9 S 5 T - 1 = 1 C Φ 3 C Φ 9 1 C Φ 5
(30)

where T=R9R5T=R9R5.

The matrix Rp1e1RpkekRp1e1Rpkek can be factored using a property of the Kronecker product. Notice that (AB)=(AI)(IB)(AB)=(AI)(IB), and (ABC)=(AI)(IBI)(IC)(ABC)=(AI)(IBI)(IC) (with appropriate dimensions) so that one gets

i = 1 k R p i e i = i = 1 k ( I m i R p i e i I n i ) , i = 1 k R p i e i = i = 1 k ( I m i R p i e i I n i ) ,
(31)

where mi=j=1i-1pjejmi=j=1i-1pjej, ni=j=i+1kpjejni=j=i+1kpjej and where the empty product is taken to be 1. This factorization shows that TT can be implemented basically by implementing copies of RpeRpe. There are many variations on this factorization as explained in [4]. That the various factorization can be interpreted as vector or parallel implementations is also explained in [4].

Example 4

R 9 R 5 = ( R 9 I 5 ) ( I 9 R 5 ) R 9 R 5 = ( R 9 I 5 ) ( I 9 R 5 )
(32)

and

R 9 R 25 R 7 = ( R 9 I 175 ) ( I 9 R 25 I 7 ) ( I 225 R 7 ) R 9 R 25 R 7 = ( R 9 I 175 ) ( I 9 R 25 I 7 ) ( I 225 R 7 )
(33)

When this factored form of RniRni or any of the variations alluded to above, is used, the number of additions incurred is given by

i = 1 k N p i e i A ( R p i e i ) = i = 1 k N p i e i 2 ( p i e i - 1 ) = 2 N i = 1 k 1 - 1 p i e i = 2 N k - i = 1 k 1 p i e i i = 1 k N p i e i A ( R p i e i ) = i = 1 k N p i e i 2 ( p i e i - 1 ) = 2 N i = 1 k 1 - 1 p i e i = 2 N k - i = 1 k 1 p i e i
(34)

where N=p1e1pkekN=p1e1pkek.

Although the use of operations of the form Rp1e1RpkekRp1e1Rpkek is simple, it does not exactly separate the circular convolution into smaller disjoint convolutions. In other words, its use does not give rise in Equation 28 and Equation 30 to block diagonal matrices whose diagonal blocks are the form iCΦpiiCΦpi. However, by reorganizing the arrangement of the operations we can obtain the block diagonal form we seek.

First, suppose AA, BB and CC are matrices of sizes a×aa×a, b×bb×b and c×cc×c respectively. If

T B T - 1 = B 1 B 2 T B T - 1 = B 1 B 2
(35)

where B1B1 and B2B2 are matrices of sizes b1×b1b1×b1 and b2×b2b2×b2, then

Q ( A B C ) Q - 1 = A B 1 C A B 2 C Q ( A B C ) Q - 1 = A B 1 C A B 2 C
(36)

where

Q = I a T ( 1 : b 1 , : ) I c I a T ( b 1 + 1 : b , : ) I c . Q = I a T ( 1 : b 1 , : ) I c I a T ( b 1 + 1 : b , : ) I c .
(37)

Here T(1:b1,:)T(1:b1,:) denotes the first b1b1 rows and all the columns of TT and similarly for T(b1+1:b,:)T(b1+1:b,:). Note that

A B 1 C A B 2 C A B 1 B 2 C . A B 1 C A B 2 C A B 1 B 2 C .
(38)

That these two expressions are not equal explains why the arrangement of operations must be reorganized in order to obtain the desired block diagonal form. The appropriate reorganization is described by the expression in Equation 37. Therefore, we must modify the transformation of Equation 28 appropriately. It should be noted that this reorganization of operations does not change their computational cost. It is still given by Equation 34.

For example, we can use this observation and the expression in Equation 37 to arrive at the following similarity transformation:

Q S p 1 S p 2 Q - 1 = 1 C Φ p 1 C Φ p 2 C Φ p 1 C Φ p 2 Q S p 1 S p 2 Q - 1 = 1 C Φ p 1 C Φ p 2 C Φ p 1 C Φ p 2
(39)

where

Q = I p 1 1 -p 2 t I p 1 G p 2 R p 1 I p 2 Q = I p 1 1 -p 2 t I p 1 G p 2 R p 1 I p 2
(40)

1 -p 1 -p is a column vector of pp 1's

1 -p = 1 1 1 t 1 -p = 1 1 1 t
(41)

and GpGp is the (p-1)×p(p-1)×p matrix:

G p = 1 - 1 1 - 1 1 - 1 = I p - 1 - 1 ̲ p - 1 . G p = 1 - 1 1 - 1 1 - 1 = I p - 1 - 1 ̲ p - 1 .
(42)

In general we have

R S p 1 e 1 S p k e k R - 1 = d | n Ψ ( d ) R S p 1 e 1 S p k e k R - 1 = d | n Ψ ( d )
(43)

where R=Rp1e1,,pkekR=Rp1e1,,pkek is given by

R p 1 e 1 , , p k e k = i = k 1 Q ( m i , p i e i , n i ) R p 1 e 1 , , p k e k = i = k 1 Q ( m i , p i e i , n i )
(44)

with mi=j=1i-1pjejmi=j=1i-1pjej, ni=j=i+1kpjejni=j=i+1kpjej and

Q ( a , p e , c ) = j = 0 e - 1 I a 1 - p t I c p j I a G p I c p j I a c ( p e - p j + 1 ) . Q ( a , p e , c ) = j = 0 e - 1 I a 1 - p t I c p j I a G p I c p j I a c ( p e - p j + 1 ) .
(45)

1 -p 1 -p and G p G p are as given in Equation 41 and Equation 42.

Example 5

R S 9 S 5 R - 1 = 1 C Φ 3 C Φ 9 C Φ 5 C Φ 3 C Φ 5 C Φ 9 C Φ 5 R S 9 S 5 R - 1 = 1 C Φ 3 C Φ 9 C Φ 5 C Φ 3 C Φ 5 C Φ 9 C Φ 5
(46)
where
R = R 9 , 5 = Q ( 9 , 5 , 1 ) Q ( 1 , 9 , 5 ) R = R 9 , 5 = Q ( 9 , 5 , 1 ) Q ( 1 , 9 , 5 )
(47)
and RR can be implemented with 152 additions.

Notice the distinction between this example and example "Reduction Operations". In example "Reduction Operations" we obtained from S9S5S9S5 a Kronecker product of two block diagonal matrices, but here we obtained a block diagonal matrix whose diagonal blocks are the Kronecker product of cyclotomic companion matrices. Each block in Equation 46 represents a multi-dimensional cyclotomic convolution.

A Matlab program that carries out the operation Rp1e1,,pkekRp1e1,,pkek in Equation 43 is KRED() .

function x = KRED(P,E,K,x)
% x = KRED(P,E,K,x);
% P : P = [P(1),...,P(K)];
% E : E = [E(K),...,E(K)];
for i = 1:K
   a = prod(P(1:i-1).^E(1:i-1));
   c = prod(P(i+1:K).^E(i+1:K));
   p = P(i);
   e = E(i);
   for j = e-1:-1:0
      x(1:a*c*(p^(j+1))) = RED(p,a,c*(p^j),x(1:a*c*(p^(j+1))));
   end
end

It calls the Matlab program RED() .

function y = RED(p,a,c,x)
% y = RED(p,a,c,x);
y = zeros(a*c*p,1);
for i = 0:c:(a-1)*c
   for j = 0:c-1
      y(i+j+1) = x(i*p+j+1);
      for k = 0:c:c*(p-2)
         y(i+j+1) = y(i+j+1) + x(i*p+j+k+c+1);
         y(i*(p-1)+j+k+a*c+1) = x(i*p+j+k+1) - x(i*p+j+c*(p-1)+1);
      end
   end
end

These two Matlab programs are not written to run as fast as they could be. They are a `naive' coding of Rp1e1,,pkekRp1e1,,pkek and are meant to serve as a basis for more efficient programs. In particular, the indexing and the loop counters can be modified to improve the efficiency. However, the modifications that minimize the overhead incurred by indexing operations depends on the programming language, the compiler and the computer used. These two programs are written with simple loop counters and complicated indexing operations so that appropriate modifications can be easily made.

Inverses

The inverse of RpRp has the form

R p - 1 = 1 p 1 p - 1 - 1 - 1 - 1 1 - 1 p - 1 - 1 - 1 1 - 1 - 1 p - 1 - 1 1 - 1 - 1 - 1 p - 1 1 - 1 - 1 - 1 - 1 R p - 1 = 1 p 1 p - 1 - 1 - 1 - 1 1 - 1 p - 1 - 1 - 1 1 - 1 - 1 p - 1 - 1 1 - 1 - 1 - 1 p - 1 1 - 1 - 1 - 1 - 1
(48)

and

R p e - 1 = k = 0 e - 1 ( ( R p - 1 I p e - 1 - k ) I p e - p e - k ) . R p e - 1 = k = 0 e - 1 ( ( R p - 1 I p e - 1 - k ) I p e - p e - k ) .
(49)

Because the inverse of QQ in Equation 37 is given by

Q - 1 = I a T - 1 ( : , 1 : b 1 ) I c I a T - 1 ( : , b 1 + 1 : b ) I c Q - 1 = I a T - 1 ( : , 1 : b 1 ) I c I a T - 1 ( : , b 1 + 1 : b ) I c
(50)

the inverse of the matrix RR described by eqs Equation 43, Equation 44 and Equation 45 is given by

R - 1 = i = 1 k Q ( m i , p i e i , n i ) - 1 R - 1 = i = 1 k Q ( m i , p i e i , n i ) - 1
(51)

with mi=j=1i-1pjejmi=j=1i-1pjej, ni=j=i+1kpjejni=j=i+1kpjej and

Q ( a , p e , c ) - 1 = j = e - 1 0 I a 1 -p t I c p j I a V p I c p j I a c ( p e - p j + 1 ) Q ( a , p e , c ) - 1 = j = e - 1 0 I a 1 -p t I c p j I a V p I c p j I a c ( p e - p j + 1 )
(52)

where VpVp denotes the matrix in Equation 48 without the first column. A Matlab program for RtRt is tKRED() , it calls the Matlab program tRED() . A Matlab program for R-tR-t is itKRED() , it calls the Matlab program itRED() . These programs all appear in one of the appendices.

References

  1. Agarwal, R. C. and Cooley, J. W. (1977, October). New Algorithms for Digital Convolution. IEEE Trans. Acoust., Speech, Signal Proc., 25(5), 392-410.
  2. Nussbaumer, H. J. (1980, April). Fast Polynomial Transform Algorithms for Digital Convolution. IEEE Trans. Acoust., Speech, Signal Proc., 28(2), 205-215.
  3. Nussbaumer, H. J. (1982). Fast Fourier Transform and Convolution Algorithms. Sringer-Verlag.
  4. Tolimieri, R. and An, M. and Lu, C. (1989). Algorithms for Discrete Fourier Transform and Convolution. Springer-Verlag.
  5. Temperton, C. (1985). Implementation of a Self-Sorting In-Place Prime Factor FFT Algorithm. Journal of Computational Physics, 58, 283-299.
  6. Winograd, S. (1980). Arithmetic Complexity of Computations. SIAM.
  7. Zalcstein, Y. (1971, June). A Note on Fast Cyclic Convolution. IEEE Trans. Comput., 20, 665-666.

Collection Navigation

Content actions

Download:

Collection as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add:

Collection to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks

Module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks