Skip to content Skip to navigation Skip to collection information

OpenStax-CNX

You are here: Home » Content » Digital Signal Processing Laboratory (ECE 420) » Lab

Navigation

Table of Contents

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • TI DSP display tagshide tags

    This collection is included inLens: Texas Instruments DSP Lens
    By: Texas Instruments

    Comments:

    "Doug course at UIUC using the TI C54x DSP has been adopted by many EE, CE and CS depts Worldwide "

    Click the "TI DSP" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

Also in these lenses

  • Lens for Engineering

    This module and collection are included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.

Tags

(What is a tag?)

These tags come from the endorsement, affiliation, and other lenses that include this content.
 

Lab

Module by: Matt Kleffner. E-mail the author

Based on: Spectrum Analyzer: Processor Exercise Using C Language with C Introduction by Matt Kleffner

Summary: This is a processor exercise in which students implement a spectrum analyzer using mixed C and assembly code. Students are to acquire a block of 1024 samples, apply a Hamming window, compute a length-1024 Discrete Fourier Transform using provided Fast Fourier Transform code, and display the magnitude-squared spectrum on an oscilloscope. Students will also compile and explore a reference implementation of an autocorrelation-based power spectral density (PSD) estimator. This implementation estimates the PSD of an IIR-filtered pseudo-noise generator. This version of the module contains code that is compatible with the Very Portable Optimizer (VPO). This code serves as a basis for introducing VPO and the VISTA viewer in an optimization lab.

Implementation

As this is your first experience with the C environment, you will have the option to add most of the required code in C or assembly. A C skeleton will provide access to input samples, output samples, and interrupt handling code. You will add code to transfer the inputs and outputs (in blocks at a time), apply a hamming window, compute the magnitude-squared spectrum, and include a trigger pulse. After the hamming window is created, either an assembly or C module that bit reverses the input and performs an FFT calculation is called.

As your spectrum analyzer works on a block of samples at a time, you will need to use interrupts to pause your processing while samples are transferred from/to the CODEC (A/D and D/A) buffer. Fortunately, the interrupt handling routines have been written for you in a C shell program available at v:\ece420\54x\dspclib\lab4main.c and the core code.

Interrupt Basics

Interrupts are an essential part of the operation of any microprocessor. They are particularly important in embedded applications where DSPs are often used. Hardware interrupts provide a way for interacting with external devices while the processor executes code. For example, in a key entry system, a key press would generate a hardware interrupt. The system code would then jump to a specified location in program memory where a routine could process the key input. Interrupts provide an alternative to polling. Instead of checking for key presses at a predetermined rate (requires a clock), the system could be busy executing other code. On the TI-C54x DSP, interrupts provide a convenient way to transfer blocks of data to/from the CODEC in a timely fashion.

Interrupt Handling

The lab4main.c code and the core code are intended to make your interaction with the hardware much simpler. As there was a core file for working in the assembly environment (Labs 0-3), there is a core file for the C environment (V:\ece420\54x\dspclib\core.asm) which handles the interrupts from the CODEC (A/D and D/A) and the serial port. Here, we will describe the important aspects of the core code necessary to complete the assignment.

At the heart of the hardware interaction is the auto-buffering serial port. In the auto-buffering serial mode, the TI-C54x processor is able to do processing uninterrupted while samples are transferred to/from a buffer of length BlockLen=64 BlockLen 64 samples. However, the spectrum analyzer to be implemented in this lab works over a block of N=1024 N 1024 samples. If it were possible to compute a 1024-point FFT in the sample time of one BlockLen, then no additional interrupt handling routines would be necessary. Samples could be collected in a 1024-length buffer and a 1024-point FFT could be computed uninterrupted while the auto-buffering buffer fills. Unfortunately, the DSP is not fast enough to accomplish this task.

We now provide an explanation of the shell C program lab4main.c listed in Appendix A. The lab4main.c file contains the function interrupt void irq and a main program. The main program is an infinite loop over blocks of N=1024 N 1024 samples. Note that while the DSP is executing instructions in this loop, interrupts occur every BlockLen samples. Inside the infinite loop, you will insert code to do the operations which follow. Although each of these operations may be performed in C or assembly, we suggest you follow the guidelines suggested.

  1. Transfer inputs and outputs (C)
  2. Apply a Hamming Window (C/assembly)
  3. Bit-reverse the input (C and assembly)
  4. Apply an N N-point FFT (C and assembly)
  5. Compute the magnitude-squared spectrum (C/assembly)
  6. Include a trigger pulse (C/assembly)

The function WaitAudio is an assembly function in the core code which handles the CODEC interrupts. An interrupt from the CODEC occurs every BlockLen samples. The SetAudioInterrupt(irq) call in the main program tells the core code to jump to the irq function when an interrupt occurs. In the irq function, BlockLen samples of the A/D input in Rcvptr (channel 1) are written to a length N N inputs buffer, and BlockLen of the output samples in the outputs buffer are written to the D/A output via Xmitptr on channel 2. In C, pointers may be used as array names so that Xmitptr[0] is the first word pointed to by Xmitptr. As in the assembly core, the input samples are not in consecutive order. The right and left inputs are offset from Rcvptr respectively by 4i 4 i and 4i+2 4 i 2 , i=0 i 0 , …, BlockLen1 BlockLen 1 . The six output channels are accessed consecutively as offsets from Xmitptr. On channel 1 of the output, the input is echoed out. You are to fill the buffer outputs with the windowed magnitude-squared FFT values by performing the operations listed above.

In the main code, the while(!input_full); loop waits for N N samples to collect in the inputs buffer. Next, the N N inputs and outputs must be transferred. You are to write this portion of code. This portion of code is to be done first, within BlockLen sample times; otherwise the first BlockLen of samples of output would not be available on time. Once this loop is finished, the lengthy processing of the FFT can continue. During this processing, the DSP is interrupted every BlockLen samples to transfer samples. Once this processing is over, the infinite loop returns to while(!input_full); to wait for N N samples to finish collecting.

The flow diagram in Figure 1 summarizes the operation of the interrupt handling routine

Figure 1: Overall program flow of the main function and the interrupt handling function.
(a) main(b) interrupt handler
Figure 1(a) (main_flow.png)Figure 1(b) (interrupt_flow.png)

Assembly FFT Routine

As the list of operations indicates, bit-reversal and FFT computation are to be done in both C and assembly. For the assembly version, make sure that the line defining C_FFT is commented in lab4main.c. We are providing you with a shell assembly file, available at v:\ece420\54x\dspclib\c_fft_given.asm and shown in Appendix B, containing many useful declarations and some code. The code for performing bit-reversal and other declarations needed for the FFT routine are also provided in this section. However, we would like you to enter this code manually, as you will be expected to understand its operation.

The assembly file c_fft_given.asm contains two main parts, the data section starting with .sect ".data" and the program section starting with .sect ".text". Every function and variable accessed in C must be preceded by a single underscore _ in assembly and a .global _name must be placed in the assembly file for linking. In this example, bit_rev_fft is an assembly function called from the C program with a label _bit_rev_fft in the text portion of the assembly file and a .global _bit_rev_fft declaration. In each assembly function, the macro ENTER_ASM is called upon entering and LEAVE_ASM is called upon exiting. These macros are defined in v:\ece420\54x\dspclib\core.inc. The ENTER_ASM macro saves the status registers and AR1, AR6, and AR7 when entering a function as required by the register use conventions. The ENTER_ASM macro also sets the status registers to the assembly conventions we have been using (i.e, FRCT=1 for fractional arithmetic and CPL=0 for DP referenced addressing). The LEAVE_ASM macro just restores the saved registers.

Parameter Passing

The parameter passing convention between assembly and C is simple for single input, single output assembly functions. From a C program, the input to an assembly program is in the low part of accumulator A with the output returned in the same place. When more than one parameter is passed to an assembly function, the parameters are passed on the stack (see the core file description for more information). We suggest that you avoid passing or returning more than one parameter. Instead, use global memory addresses to pass in or return more than one parameter. Another alternative is to pass a pointer to the start of a buffer intended for passing and returning parameters.

Registers Modified

When entering and leaving an assembly function, the ENTER_ASM and LEAVE_ASM macros ensure that certain registers are saved and restored. Since the C program may use any and all registers, the state of a register cannot be expected to remain the same between calls to assembly function(s). Therefore, any information that needs to be preserved across calls to an assembly function must be saved to memory!

Now, we explain how to use the FFT routine provided by TI for the C54x. The FFT routine fft.asm located in v:\ece420\54x\dsplib\ computes an in-place, complex FFT. The length of the FFT is defined as a label K_FFT_SIZE and the algorithm assumes that the input starts at data memory location _fft_data. To have your code assemble for an N N-point FFT, you will have to include the following label definitions in your assembly code.


	  
	  N                 .set       1024
	  K_FFT_SIZE        .set       N           ; size of FFT
	  K_LOGN            .set       10          ; number of stages (log_2(N))
	  
        

In addition to defining these constants, you will have to include twiddle-factor tables for the FFT. These tables (twiddle1 and twiddle2) are available in the shared directory v:\ece420\54x\dsplib\. Note that the tables are each N N points long representing values from 0 to just shy of 180 degrees and must be accessed using a circular pointer. To include these tables at the proper location in memory with the appropriate labels referenced by the FFT, use the following


	  
	  .sect  ".data"
	  .align  1024
	  sine         .copy "v:\ece420\54x\dsplib\twiddle1"
	  .align  1024
	  cosine       .copy "v:\ece420\54x\dsplib\twiddle2"
	  
        

The FFT provided requires that the input be in bit-reversed order, with alternating real and imaginary components. Bit-reversed addressing is a convenient way to order input xn x n into a FFT so that the output Xk X k is in sequential order (i.e. X0 X 0 , X1 X 1 , …, XN1 X N 1 for an N N-point FFT). The following table illustrates the bit-reversed order for an eight-point sequence.

Table 1
Input Order Binary Representation Bit-Reversed Representation Output Order
0 000 000 0
1 001 100 4
2 010 010 2
3 011 110 6
4 100 001 1
5 101 101 5
6 110 011 3
7 111 111 7

The following routine performs the bit-reversed reordering of the input data. The routine assumes that the input is stored in data memory starting at the location labeled _bit_rev_data, which must be aligned to the least power of two greater than the input buffer length, and consists of alternating real and imaginary parts. Because our input data is going to be purely real in this lab, you will have to make sure that you set the imaginary parts to zero by zeroing out every other memory location.


          
	  1    bit_rev:
	  2            STM     #_bit_rev_data,AR3          ; AR3 -> original input
	  3            STM     #_fft_data,AR7              ; AR7 -> data processing buffer
	  4            MVMM    AR7,AR2                     ; AR2 -> bit-reversed data
	  5            STM     #K_FFT_SIZE-1,BRC
	  6            RPTBD   bit_rev_end-1
	  7            STM     #K_FFT_SIZE,AR0             ; AR0 = 1/2 size of circ buffer
	  8            MVDD    *AR3+,*AR2+
	  9            MVDD    *AR3-,*AR2+
	  10            MAR     *AR3+0B
	  11    bit_rev_end:
	  12            NOP
          13            RET
          
        

As mentioned, in the above code _bit_rev_data is a label indicating the start of the input data and _fft_data is a label indicating the start of a circular buffer where the bit-reversed data will be written. Note that although AR7 is not used by the bit-reversed routine directly, it is used extensively in the FFT routine to keep track of the start of the FFT data space.

In general, to have a pointer index memory in bit-reversed order, the AR0 register needs to be set to one-half the length of the circular buffer; a statement such as ARx+0B is used to move the ARx pointer to the next location. For more information regarding the bit-reversed addressing mode, refer to page 5-18 in the TI-54x CPU and Peripherals manual [link]. Is it possible to bit-reverse a buffer in place? For a diagram of the ordering of the data expected by the FFT routine, see Figure 4-10 in the TI-54x Applications Guide [link]. Note that the FFT code uses all the pointers available and does not restore the pointers to their original values.

C FFT Routine

A bit-reversing and FFT routine have also been provided in lab4fft.c, listed in Appendix C. Again, make sure you understand how the bit reversal is taking place. In lab4main.c, the line defining C_FFT must not be commented for use of the C FFT routine. The sine tables (twiddle factors) are located in sinetables.h. This fft requires its inputs in two buffers, the real buffer real and the imaginary buffer imag, and the output is placed in the same buffers. The length of the FFT, N, and logN are defined in lab4.h, which is also listed in Appendix C. When experimenting with the C FFT make sure you modify these length values instead of the ones in the assembly code and lab4main.c!

Creating the Window

As mentioned, you will be using the FFT to compute the spectrum of a windowed input. For your implementation you will need to create a 1024-point Hamming window. First, create a Hamming window in matlab using the function hamming. For the assembly FFT, use save_coef to save the window to a file that can then be included in your code with the .copy directive. For the C FFT, use the matlab function write_intvector_headerfile with name set to 'window' and elemperline set to 8 to create the header file that is included in lab4main.c.

Displaying the Spectrum

Once the DFT has been computed, you must calculate the squared magnitude of the spectrum for display.

|Xk|2=Xk2+Xk2 X k 2 X k 2 X k 2
(1)
You may find the assembly instructions squr and squra useful in implementing Equation 1.

Because the squared magnitude is always nonnegative, you can replace one of the magnitude values with a -1.0 as a trigger pulse for display on the oscilloscope. This is easily performed by replacing the DC term ( k=0 k 0 ) with a -1.0 when copying the magnitude values to the output buffer. The trigger pulse is necessary for the oscilloscope to lock to a specific point in the spectrum and keep the spectrum fixed on the scope.

Intrinsics

If you are planning on writing some of the code in C, then you may be forced to use intrinsics. Intrinsic instructions provide a way to use assembly instructions directly in C. An example of an intrinsic instruction is bit_rev_data[0]=_smpyr(bit_rev_data[0],window[0]) which performs the assembly signed multiply round instruction. You may also find the _lsmpy instruction useful. For more information on intrinsics, see page 6-22 of the TI-C54x Optimizing C/C++ Compiler User's Guide [link].

The following lines of code were borrowed from the C FFT to serve as an example of arithmetic operations in C. Save this code in a file called mathex.c and compile this file by typing c_asm mathex.c at a command prompt. Look at the resulting assembly file and investigate the differences between each block. Be sure to reference page 6-10 of the compiler user's guide to find out what the state of the FRCT and OVM bits are. Run this program on the DSP, halt the program, and compare the output values in a memory window. Does each block work properly for all possible values?


          
            int s1, s2;
            int t1, t2;
            int i1, i2;
            int n1 = 16383, n2 = 16382, n3 = 16381, n4 = 16380;

            void main(void)
            {
               /* Code for standard 32-bit hardware, */
               /* with x,y limited to 16 bits        */
               s1 = (n1*n2 + n3*n4) >> 15;
               s2 = (n1 + n2) >> 1;

               /* Code for TI TMS320C54X series */
               t1 = ((long int)(n1*n2) + (long int)(n3*n4)) >> 15;
               t2 = ((long int)n1 + (long int)n2) >> 1;

               /* Intrinsic code for TMS320C54X series */
               i1 = _sadd(_smpy(n1,n2), _smpy(n3,n4));
               i2 = _sshl(_sadd(n1, n2),-1);

               while(1);
            }
          
        

Compiling and Linking

A working program can be produced by compiling the C code and linking assembly modules and the core module. The compiler translates C code to a relocatable assembly form. The linker assigns physical addresses on the DSP to the relocatable data and code segments, resolves .global references and links runtime libraries.

The procedure for compiling C code and linking assembly modules has been automated for you in the batch file v:\ece420\54x\dsptools\c_asm.bat. The name of the first file becomes the name of the executable. Once you have completed lab4main.c and c_fft_given.asm, type c_asm lab4main.c c_fft_given.asm to produce a lab4main.out file to be loaded onto the DSP. For the C FFT type c_asm lab4main.c lab4fft.c to produce lab4main.out. Load the output file onto the DSP as usual and confirm that valid FFTs are calculated. Once valid output is obtained, measure how many clock cycles it takes to compute both the assembly and C FFT.

Reference Implementation of a PSD estimator

We provide for you in Appendix D and E a complete C implementation of a PSD estimator. The input is an IIR-filtered pseudo-noise (PN) sequence generator and the PSD estimate is based on windowing the autocorrelation with a rectangular window. The code consists of the files lab4bmain.c, lab4b.h, intrinsics.h, pn.c, iirfilt.c, autocorr.c, c_fft_given_iirc.asm, and the previously-given TI FFT routine. The assembly file c_fft_given_iirc.asm differs from c_fft_given.asm in that the window array has been removed and variables and arrays associated with IIR filtering have been added. Note that the multiply functions in the functions are actually compiler directives contained in intrinsics.h. Make sure you know which ones are used and why; note that VPO is not defined by the TI compiler, therefore the corresponding section of the #ifdef statement is not used. Compile and link these files by issuing c_asm lab4bmain.c pn.c iirfilt.c autocorr.c c_fft_given_iirc.asm at the command line. Load lab4bmain.out onto the DSP and run the code. Make sure that an IIR-filtered PN sequence appears on channel 1 and its PSD estimate appears on channel 2.

Does the output match your expectations based on the theory? Does this application illustrate any limitations of the FFT implementation? (Hint: note that most of the values in the FFT input are zero.) The previously-given C implementation uses a similar algorithm as the TI FFT; take a look at the C code for help. What are the limitation(s) of the FFT that show up in this application?

In lab4b.h M sets the number of autocorrelation points that are calculated. What is the maximum value of M that allows the reference routines to run in real time? In determining this value you may find it useful to connect a wave-function generator to the input and copy inputs into display_inputs. You may limit M to powers of 2 minus 1.

Quiz Information

From your prelab experiments, you should be able to describe the effect of windowing and zero-padding on FFT spectral analysis. In your DSP system, experiment with different inputs, changing N N and the type of window. Can you explain what happens as the input frequency is increased beyond the Nyquist rate? Does the |Xk|2 X k 2 coincide with what you expect from Matlab? What is the relationship between the observed spectrum and the DTFT? What would happen if the FFT calculation takes longer than it takes to fill inputs with N N samples? How long does it take to compute each FFT? What are the tradeoffs between writing code in C versus assembly?

Appendix A:

lab4main.c


        
1    /* v:/ece420/54x/dspclib/lab4main.c */
2    /* dgs - 9/14/2001                  */
3    /* mdk - 2/10/2004   C FFT update   */
4
5    #include "v:/ece420/54x/dspclib/core.h"
6
7    /* comment the next line to use assembly fft */
8    #define C_FFT
9
10    #ifdef C_FFT /* Use C FFT */
11
12       #include "window.h"
13       #include "lab4.h" /* Number of C FFT points defined here */
14
15       /* function defined in lab4fft.c */
16       void fft(void);
17
18       /* FFT data buffers */
19       int real[N]; /* Real part of data      */
20       int imag[N]; /* Imaginary part of data */
21
22    #else        /* Use assembly FFT */
23
24       #define N 1024   /* Number of assembly FFT points */
25
26       /* Function defined by c_fft_given.asm */
27       void bit_rev_fft(void);
28
29       /* FFT data buffers (declared in c_fft_given.asm) */
30       extern int bit_rev_data[N*2]; /* Data input for bit-reverse function */
31       extern int fft_data[N*2];     /* In-place FFT & Output array         */
32       extern int window[N];         /* The Hamming window                  */
33
34    #endif       /* C_FFT */
35
36
37    /* Our input/output buffers */
38    int inputs[N];
39    int outputs[N];
40
41    volatile int input_full = 0;  /* volatile means interrupt changes it */
42    int count = 0;
43
44
45    interrupt void irq(void)
46    {
47      int *Xmitptr,*Rcvptr;                      /* pointers to Xmit & Rcv Bufs   */
48      int i;
49
50      static int in_irq = 0;           /* Flag to prevent reentrance */
51
52      /* Make sure we're not in the interrupt (should never happen) */
53      if( in_irq )
54        return;
55
56      /* Mark we're processing, and enable interrupts */
57      in_irq = 1;
58      enable_irq();
59
60      /* The following waitaudio call is guaranteed not to
61         actually wait; it will simply return the pointers. */
62      WaitAudio(&Rcvptr,&Xmitptr);
63
64      /* input_full should never be true... */
65      if( !input_full )
66      {
67        for (i=0; i<BlockLen; i++)
68        {
69          /* Save input, and echo to channel 1 */
70          inputs[count] = Xmitptr[6*i] = Rcvptr[4*i];
71
72          /* Send FFT output to channel 2 */
73          Xmitptr[6*i+1] = outputs[count];
74
75          count++;
76        }
77      }
78
79      /* Have we collected enough data yet? */
80      if( count >= N )
81        input_full = 1;
82
83      /* We're not in the interrupt anymore... */
84      disable_irq();
85      in_irq = 0;
86    }
87
88
89    main()
90    {
91      /* Initialize IRQ stuff */
92      count = 0;
93      input_full = 0;
94      SetAudioInterrupt(irq);        /* Set up interrupts */
95
96      while (1)
97      {
98        while( !input_full );     /* Wait for a data buffer to collect */
99
100        /* From here until we clear input_full can only take *
101         * BlockLen sample times, so don't do too much here. */
102
103        /* First, transfer inputs and outputs */
104
105    #ifdef C_FFT /* Use C FFT */
106        /* I n s e r t   c o d e   t o   f i l l   */
107        /* C   F F T   b u f f e r s               */
108
109    #else        /* Use assembly FFT */
110        /* I n s e r t   c o d e   t o   f i l l   */
111        /* a s s e m b l y   F F T   b u f f e r s */
112
113    #endif       /* C_FFT */
114
115        /* Done with that... ready for new data collection */
116        count = 0;      /* Need to reset the count                */
117        input_full = 0; /* Mark we're ready to collect more data  */
118
119        /**********************************************************/
120        /* Now that we've gotten the data moved, we can do the    */
121        /* more lengthy processing.                               */
122
123    #ifdef C_FFT /* Use C FFT */
124
125        /* Multiply the input signal by the Hamming window.       */
126        /* . . . i n s e r t   C / a s m   code . . .             */
127
128        /* Bit-reverse and compute FFT in C                       */
129        fft();
130
131        /* Now, take absolute value squared of FFT                */
132        /* . . . i n s e r t   C / a s m   code . . .             */
133
134        /* Last, set the DC coefficient to -1 for a trigger pulse */
135        /* . . . i n s e r t   C / a s m   code . . .             */
136
137        /* done, wait for next time around!                       */
138
139
140    #else        /* Use assembly FFT */
141
142        /* Multiply the input signal by the Hamming window.       */
143        /* . . . i n s e r t   C / a s m   code . . .             */
144
145        /* Bit-reverse and compute FFT in assembly                */
146        bit_rev_fft();
147
148        /* Now, take absolute value squared of FFT                */
149        /* . . . i n s e r t   C / a s m   code . . .             */
150
151        /* Last, set the DC coefficient to -1 for a trigger pulse */
152        /* . . . i n s e r t   C / a s m   code . . .             */
153
154        /* done, wait for next time around!                       */
155
156
157    #endif       /* C_FFT */
158
159      }
160    }
        
      

Appendix B:

c_fft_given.asm


        
1    ; v:\ece420\54x\dspclib\c_fft_given.asm
2    ; dgs - 9/14/2001
3        .copy "v:\ece420\54x\dspclib\core.inc"
4
5        .global	_bit_rev_data
6        .global _fft_data
7        .global _window
8	
9        .global _bit_rev_fft
10
11        .sect	".data"
12
13        .align 4*N
14    _bit_rev_data .space 16*2*N	; Input to _bit_rev_fft
15
16        .align	4*N
17    _fft_data .space 16*2*N		; FFT output buffer
18
19
20    ; Copy in the Hamming window
21    _window					; The Hamming window
22        .copy	"window.asm"	
23
24        .sect ".text"
25
26    _bit_rev_fft
27        ENTER_ASM
28                                                  
29        call bit_rev                    ; Do the bit-reversal.
30
31        call fft		        ; Do the FFT
32
33        LEAVE_ASM
34        RET                                     
35
36    bit_rev:
37        STM     #_bit_rev_data,AR3          ; AR3 -> original input
38        STM     #_fft_data,AR7              ; AR7 -> data processing buffer
39        MVMM    AR7,AR2                     ; AR2 -> bit-reversed data
40        STM     #K_FFT_SIZE-1,BRC
41        RPTBD   bit_rev_end-1
42        STM     #K_FFT_SIZE,AR0             ; AR0 = 1/2 size of circ buffer
43        MVDD    *AR3+,*AR2+
44        MVDD    *AR3-,*AR2+
45        MAR     *AR3+0B
46    bit_rev_end:
47        NOP
48        RET
49
50    ; Copy the actual FFT subroutine.
51    fft_data  .set	_fft_data	; FFT code needs this.
52        .copy 	"v:\ece420\54x\dsplib\fft.asm"
53
54
55    ; If you need any more assembly subroutines, make sure you name them
56    ; _name, and include a ".global _name" directive at the top. Also,
57    ; don't forget to use ENTER_ASM at the beginning, and LEAVE_ASM
58    ; and RET at the end!
        
      

Appendix C:

lab4.h


        
1    #define N 1024       /* Number of FFT points */
2    #define logN 10
        
      

lab4fft.c


        
1    /*****************************************************************/
2    /* lab4fft.c                                                     */
3    /* Douglas L. Jones                                              */
4    /* University of Illinois at Urbana-Champaign                    */
5    /* January 19, 1992                                              */
6    /* Changed for use w/ short integers and lookup table for ECE420 */
7    /* Matt Kleffner                                                 */
8    /* February 10, 2004                                             */
9    /*                                                               */
10    /*   fft: in-place radix-2 DIT DFT of a complex input            */
11    /*                                                               */
12    /*   Permission to copy and use this program is granted          */
13    /*   as long as this header is included.                         */
14    /*                                                               */
15    /* WARNING:                                                      */
16    /*   This file is intended for educational use only, since most  */
17    /*   manufacturers provide hand-tuned libraries which typically  */
18    /*   include the fastest fft routine for their DSP/processor     */
19    /*   architectures. High-quality, open-source fft routines       */
20    /*   written in C (and included in MATLAB) can be found at       */
21    /*   http://www.fftw.org                                         */
22    /*                                                               */
23    /*   #defines expected in lab4.h                                 */
24    /*         N:   length of FFT: must be a power of two            */
25    /*      logN:   N = 2**logN                                      */
26    /*                                                               */
27    /*   16-bit-limited input/output (must be defined elsewhere)     */
28    /*   real:   integer array of length N with real part of data    */
29    /*   imag:   integer array of length N with imag part of data    */
30    /*                                                               */
31    /*   sinetables.h must                                           */
32    /*   1) #define Nt to an equal or greater power of two than N    */
33    /*   2) contain the following integer arrays with                */
34    /*      element magnitudes bounded by M = 2**15-1:               */
35    /*         costable:   M*cos(-2*pi*n/Nt), n=0,1,...,Nt/2-1       */
36    /*         sintable:   M*sin(-2*pi*n/Nt), n=0,1,...,Nt/2-1       */
37    /*                                                               */
38    /*****************************************************************/
39
40    #include "lab4.h"
41    #include "sinetables.h"
42
43    extern int real[N];
44    extern int imag[N];
45
46    void fft(void)
47    {
48       int   i,j,k,n1,n2,n3;
49       int   c,s,a,t,Wr,Wi;
50
51       j = 0;            /* bit-reverse */
52       n2 = N >> 1;
53       for (i=1; i < N - 1; i++)
54       {
55          n1 = n2;
56          while ( j >= n1 )
57          {
58             j = j - n1;
59             n1 = n1 >> 1;
60          }
61          j = j + n1;
62
63          if (i < j)
64          {
65             t = real[i];
66             real[i] = real[j];
67             real[j] = t;
68             t = imag[i];
69             imag[i] = imag[j];
70             imag[j] = t;
71          }
72       }
73
74       /* FFT */
75       n2 = 1; n3 = Nt;
76
77       for (i=0; i < logN; i++)
78       {
79          n1 = n2;      /* n1 = 2**i     */
80          n2 = n2 + n2; /* n2 = 2**(i+1) */
81          n3 = n3 >> 1; /* cos/sin arg of -6.283185307179586/n2 */
82          a = 0;
83
84          for (j=0; j < n1; j++)
85          {
86             c = costable[a];
87             s = sintable[a];
88             a = a + n3;
89
90             for (k=j; k < N; k=k+n2)
91             {
92                /* Code for standard 32-bit hardware, */
93                /* with real,imag limited to 16 bits  */
94                /*
95                Wr = (c*real[k+n1] - s*imag[k+n1]) >> 15;
96                Wi = (s*real[k+n1] + c*imag[k+n1]) >> 15;
97                real[k+n1] = (real[k] - Wr) >> 1;
98                imag[k+n1] = (imag[k] - Wi) >> 1;
99                real[k] = (real[k] + Wr) >> 1;
100                imag[k] = (imag[k] + Wi) >> 1;
101                */
102                /* End standard 32-bit code */
103
104                /* Code for TI TMS320C54X series */
105
106                Wr = ((long int)(c*real[k+n1]) - (long int)(s*imag[k+n1])) >> 15;
107                Wi = ((long int)(s*real[k+n1]) + (long int)(c*imag[k+n1])) >> 15;
108                real[k+n1] = ((long int)real[k] - (long int)Wr) >> 1;
109                imag[k+n1] = ((long int)imag[k] - (long int)Wi) >> 1;
110                real[k] = ((long int)real[k] + (long int)Wr) >> 1;
111                imag[k] = ((long int)imag[k] + (long int)Wi) >> 1;
112
113                /* End code for TMS320C54X series */
114
115                /* Intrinsic code for TMS320C54X series */
116                /*
117                Wr = _ssub(_smpy(c, real[k+n1]), _smpy(s, imag[k+n1]));
118                Wi = _sadd(_smpy(s, real[k+n1]), _smpy(c, imag[k+n1]));
119                real[k+n1] = _sshl(_ssub(real[k], Wr),-1);
120                imag[k+n1] = _sshl(_ssub(imag[k], Wi),-1);
121                real[k] = _sshl(_sadd(real[k], Wr),-1);
122                imag[k] = _sshl(_sadd(imag[k], Wi),-1);
123                */
124                /* End intrinsic code for TMS320C54X series */
125             }
126          }
127       }
128       return;
129    }
        
      

Appendix D: Main routine, header files for PSD estimator

lab4b.h

intrinsics.h

lab4bmain.c


        
1    #define N 1024    /* Length of output buffers */
2    #define L N       /* Length of input data     */
3    #define logL 10   /* log base 2 of L          */
4    #define M 31      /* Compute 2*M+1 autocorrelation points */
5
6    /* #define M (L/2-1) */ /* Be sure to use ()'s in this case */
7                            /* or algebraic substitution bugs   */
8                            /* can be introduced                */
        
      

        
1    /* Compiler intrinsics for the TI compiler        */
2    /* and the Very Portable Optimizer (VPO) port     */
3    /* to TMS320C54X series DSPs                      */
4    /*                                                */
5    /* Use compile option -DVPO when using VPO        */
6    /*                                                */
7    /* Copyright September 2005 by Matt Kleffner      */
8    /* under the Creative Commons Attribution License */
9
10    #ifndef INTRINSICS_H
11
12    #define INTRINSICS_H
13
14    #ifdef VPO
15
16    long int vpo_l_mul_ii(int w0, int w1);
17
18    /* fractional multiply without fractional mode (long result) */
19    #define _l_mul_fract_fb0_ii(w0,w1) \
20             (vpo_l_mul_ii(w0,w1) << 1)
21
22    /* fractional multiply with fractional mode already on (long result) */
23    #define _l_mul_fract_fb1_ii(w0,w1) \
24             (vpo_l_mul_ii(w0,w1))
25
26    /* fractional multiply without fractional mode (int result) */
27    #define _i_mul_fract_fb0_ii(w0,w1) \
28             (vpo_l_mul_ii(w0,w1) >> 15)
29
30    /* fractional multiply with fractional mode already on (int result) */
31    #define _i_mul_fract_fb1_ii(w0,w1) \
32             (vpo_l_mul_ii(w0,w1) >> 16)
33
34    #define _set_fract_bit() vpo_set_fract()
35    #define _reset_fract_bit() vpo_reset_fract()
36    #define _set_ovm_bit() vpo_set_ovm()
37    #define _reset_ovm_bit() vpo_reset_ovm()
38
39    #define _l_add_shiftl_li(w0,w1) (((int32)(w0))+(((int32)(int16)(w1))<<16) )
40    #define _l_sub_shiftl_li(w0,w1) (((int32)(w0))-(((int32)(int16)(w1))<<16) )
41
42    #else
43
44    /* fractional multiply without fractional mode (long result) */
45    #define _l_mul_fract_fb0_ii(w0,w1) \
46             (((long int)w0 * (long int)w1) << 1)
47
48    /* fractional multiply with fractional mode already on (long result) */
49    #define _l_mul_fract_fb1_ii(w0,w1) \
50             (((long int)w0 * (long int)w1))
51
52    /* fractional multiply without fractional mode (int result) */
53    #define _i_mul_fract_fb0_ii(w0,w1) \
54             (((long int)w0 * (long int)w1) >> 15)
55
56    /* fractional multiply with fractional mode already on (int result) */
57    #define _i_mul_fract_fb1_ii(w0,w1) \
58             (((long int)w0 * (long int)w1) >> 16)
59
60    #define _set_fract_bit() asm("   ssbx frct")
61    #define _reset_fract_bit() asm(" rsbx frct")
62    #define _set_ovm_bit() asm("  ssbx ovm")
63    #define _reset_ovm_bit() asm("   rsbx ovm")
64
65    #endif /* VPO */
66
67    #endif /* INTRINSICS_H */
        
      

        
1    /* v:\ece420\54x\dspclib\lab4bmain.c */
2    /* PN generation, IIR filtering, and autocorrelation added */
3    /* by Matt Kleffner - 9/2004 */
4    /* Original by dgs  - 9/14/2001 */
5    /* Use governed by the Creative Commons Attribution License */
6
7    #include "v:\ece420\54x\dspclib\core.h"
8
9    /* #define N 1024 */            /* Number of FFT points */
10    #include "lab4b.h"              /* Define N here in header file */
11
12    /* function defined in pn.c */
13    void rand_fillbuffer(void);
14    unsigned int *iseed;            /* seed for rand_fillbuffer() and randbit() */
15
16    /* IIR values and buffers (declared in c_fft_given_iirc.asm) */
17    #define IIR_order 4
18    extern int scale;
19    extern int coef[IIR_order];
20    extern int state[IIR_order];
21
22    /* Pointer to state buffer location */
23    int iirptr;
24
25    /* function defined in iirfilt.c */
26    void iirfilt(void);
27
28    /* function defined in autocorr.c */
29    void autocorr(void);
30
31    /* Function defined by c_fft_given_iirc.asm */
32    void bit_rev_fft(void);
33
34    /* FFT data buffers (declared in c_fft_given_iirc.asm) */
35    extern int bit_rev_data[N*2];   /* Data input for bit-reverse function */
36    extern int fft_data[N*2];       /* In-place FFT & Output array */
37
38    /* Our input/output buffers */
39    int inputs[N];
40    int outputs[N];
41    int display_inputs[N];
42    int autocorr_in[N];
43    int autocorr_out[N];
44
45    volatile int input_full = 0;    /* volatile means interrupt changes it */
46    int count = 0;
47
48    interrupt void irq(void)
49    {
50      int *Xmitptr,*Rcvptr;         /* pointers to Xmit & Rcv Bufs */
51      int i;
52
53      static int in_irq = 0;        /* Flag to prevent reentrance */
54
55      /* Make sure we're not in the interrupt (should never happen) */
56      if( in_irq )
57        return;
58
59      /* Mark we're processing, and enable interrupts */
60      in_irq = 1;
61      enable_irq();
62
63      /* The following waitaudio call is guaranteed not to
64         actually wait; it will simply return the pointers. */
65      WaitAudio(&Rcvptr,&Xmitptr);
66
67      /* input_full should never be true... */
68      if( !input_full )
69      {
70        for (i=0; i<BlockLen; i++)
71        {
72          /* Save input, send display_inputs to channel 1 */
73          inputs[count] = Rcvptr[4*i];
74          Xmitptr[6*i] = display_inputs[count];
75          /* inputs[count] = Xmitptr[6*i] = Rcvptr[4*i];  */
76
77          /* Send FFT output to channel 2 */
78          Xmitptr[6*i+1] = outputs[count];
79
80          count++;
81        }
82        /* Have we collected enough data yet? */
83      }
84        if( count >= N ) input_full = 1;
85
86      /* We're not in the interrupt anymore... */
87      disable_irq();
88      in_irq = 0;
89    }
90
91    main()
92    {
93      int i;
94      /* Initialize IRQ stuff */
95      count = 0;
96      input_full = 0;
97
98      /* Initialize autocorr_out to zero since some values will remain zero */
99      for (i = 0; i < N; ++i)
100      {
101         autocorr_out[i] = 0;
102         display_inputs[i] = 0;
103      }
104
105      /* Initialize PN-sequence generator */
106      *iseed = 1;
107
108      /* Initialize IIR filter states to zero */
109      iirptr = 0;
110      for (i = 0; i < IIR_order; ++i) state[i] = 0;
111
112      SetAudioInterrupt(irq);       /* Set up interrupts */
113
114      while (1)
115      {
116        while( !input_full );       /* Wait for a data buffer to collect */
117
118        /* From here until we clear input_full can only take *
119         * BlockLen sample times, so don't do too much here. */
120
121        /* First, transfer inputs and outputs */
122
123        for (i = 0; i < N; i++) {
124            display_inputs[i] = autocorr_in[i];
125            outputs[i] = fft_data[i*2] << 8;
126
127            /* Some statements useful in debugging */
128            /* display_inputs[i] = inputs[i]; */
129            /* Be sure to comment out PN-sequence generation */
130            /* when using the next line */
131            /* autocorr_in[i] = 32767; inputs[i];*/
132        }
133        /* Last, set the DC coefficient to -1 for a trigger pulse */
134        outputs[0] = -32768;
135
136        /* Done with that... ready for new data collection */
137        count = 0;            /* Need to reset the count               */
138        input_full = 0;     /* Mark we're ready to collect more data */
139
140        /*************************************************************/
141        /* Now that we've gotten the data moved, we can do the      */
142        /* more lengthy processing.                                 */
143
144        /* Generate PN input */
145        rand_fillbuffer();
146
147        /* Filter input */
148        iirfilt();
149
150        /* Calculate autocorrelation */
151        autocorr();
152
153        /* Transfer autocorr output to FFT input buffer */
154        for (i = 0; i < N; i++) {
155            bit_rev_data[i*2] = autocorr_out[i];
156            bit_rev_data[i*2+1] = 0;
157        }
158        /* Bit-reverse and compute FFT */
159        bit_rev_fft();
160
161        /* Done, wait for next time around! */
162      }
163    }

        
      

Appendix E: Additional routines for PSD estimator

pn.c

iirfilt.c

autocorr.c

c_fft_given_iirc.asm


        
1    /* ECE420, Lab 4, Reference PN Generator Implementation (Non-Optimized) */
2    /* Matt Kleffner 08/04                                                  */
3    /* Original by Michael Frutiger 02/24/04                                */
4    /* Use governed by the Creative Commons Attribution License             */
5
6    #include "lab4b.h"
7
8    extern unsigned int *iseed;
9    extern int autocorr_in[N];
10
11    /* Returns as an integer a random bit, based on the 15 lowest significant
12       bits in iseed (which is modified for the next call). */
13    int randbit()
14    {
15       int newbit;
16       /* XOR bits 15, 1 and 0 of iseed */
17       newbit =  (*iseed >> 15) & 1 ^ (*iseed >> 1) & 1 ^ *iseed & 1;
18       /* Leftshift the seed and put the result of the XOR's in bit 1. */
19       *iseed=(*iseed << 1) | newbit;
20       return(newbit);
21    }
22
23    void rand_fillbuffer(void)
24    {
25       int i;
26
27       for (i = 0; i < N; ++i)
28       {
29          if (randbit()) autocorr_in[i] =  32767;
30          else           autocorr_in[i] = -32767;
31       }
32    }
        
      

        
1    /* Simple, unoptimized IIR filter (feedback only) */
2    /* for TMS320C54X series DSPs                     */
3    /* Copyright September 2005 by Matt Kleffner      */
4    /* under the Creative Commons Attribution License */
5
6    #include "lab4b.h"
7    #include "intrinsics.h"
8
9    /* IIR values and buffers (declared in c_fft_given_iirc.asm) */
10    #define IIR_order 4
11    extern int scale;
12    extern int coef[IIR_order];
13    extern int state[IIR_order];
14
15    /* Arrays declared in main routine */
16    extern int autocorr_in[N];
17    extern int autocorr_out[N];
18
19    /* Pointer to state buffer location */
20    extern int iirptr;
21
22    void iirfilt()
23    {
24        int i, j;
25
26        _set_fract_bit();
27        /* Filter PN input */
28        for (i = 0; i < N; ++i)
29        {
30           int sum = 0;
31           /* Calculate and sum all feedback terms except the "oldest" one */
32           for (j = 0; j < (IIR_order-1); ++j)
33           {
34              sum += _i_mul_fract_fb1_ii(coef[j],state[iirptr]);
35              /* Avoid usage of "modulo" routine */
36              iirptr++;
37              if (iirptr == IIR_order) iirptr = 0;
38           }
39           /* Calculate and sum oldest feedback term without incrementing iirptr */
40           sum += _i_mul_fract_fb1_ii(coef[IIR_order-1],state[iirptr]);
41
42           /* Calculate direct input contribution */
43           sum += _i_mul_fract_fb1_ii(scale,autocorr_in[i]);
44           autocorr_in[i] = sum;
45           state[iirptr] = autocorr_in[i];
46        }
47        _reset_fract_bit();
48    }
       
      

        
1    /***********************************************************/
2    /* autocorr.c                                              */
3    /* Copyright August 2004 by Matt Kleffner                  */
4    /* under the Creative Commons Attribution License          */
5    /*                                                         */
6    /* Simple, unoptimized autocorrelation function            */
7    /* for ECE 420 (TMS320C54X series)                         */
8    /*                                                         */
9    /* #defines expected in lab4b.h                            */
10    /*    L: length of data in autocorr_in buffer              */
11    /*    N: length of data in autocorr_out buffer             */
12    /* logL: log base 2 of L (used for scaling output)         */
13    /*    M: Largest positive lag of autocorrelation desired   */
14    /*       (must be < L and < N/2)                           */
15    /*                                                         */
16    /* 16-bit-limited input/output (must be defined elsewhere) */
17    /*  autocorr_in: buffer for input data  (L pts)            */
18    /* autocorr_out: buffer for output data (N pts)            */
19    /*               N must be >= 2*M+1                        */
20    /*               assumed to be full of zeros at start      */
21    /*               output in zero-phase form                 */
22    /***********************************************************/
23
24    #include "lab4b.h"
25    #include "intrinsics.h"
26
27    extern int autocorr_in[L];
28    extern int autocorr_out[N];
29
30    void autocorr(void)
31    {
32       int i,j,temp;
33
34       _set_fract_bit();
35       for(i=0;i<=M;++i)
36       {
37          long int sum=0;
38          for(j=0;j<(L-i);++j)
39          {
40             temp = _i_mul_fract_fb1_ii(autocorr_in[j],autocorr_in[j+i]);
41             sum += temp;
42          }
43          autocorr_out[i]=(int)(sum >> logL);
44       }
45       _reset_fract_bit();
46
47       /* Copy values for negative indeces at end of buffer */
48       for(i=1,j=N-1;i<=M;++i,--j)
49       {  autocorr_out[j]=autocorr_out[i]; }
50    }
        
      

Collection Navigation

Content actions

Download:

Collection as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Module as:

PDF | More downloads ...

Add:

Collection to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks

Module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks