Summary: This is a processor exercise in which students implement a spectrum analyzer using mixed C and assembly code. Students are to acquire a block of 1024 samples, apply a Hamming window, compute a length-1024 Discrete Fourier Transform using provided Fast Fourier Transform code, and display the magnitude-squared spectrum on an oscilloscope. Students will also compile and explore a reference implementation of an autocorrelation-based power spectral density (PSD) estimator. This implementation estimates the PSD of an IIR-filtered pseudo-noise generator.
As this is your first experience with the C environment, you will have the option to add most of the required code in C or assembly. A C skeleton will provide access to input samples, output samples, and interrupt handling code. You will add code to transfer the inputs and outputs (in blocks at a time), apply a hamming window, compute the magnitude-squared spectrum, and include a trigger pulse. After the hamming window is created, either an assembly or C module that bit reverses the input and performs an FFT calculation is called.
As your spectrum analyzer works on a block of samples at a
time, you will need to use interrupts to pause your processing
while samples are transferred from/to the CODEC (A/D and D/A)
buffer. Fortunately, the interrupt handling routines have
been written for you in a C shell program available at
v:\ece420\54x\dspclib\lab4main.c and the core
code.
Interrupts are an essential part of the operation of any microprocessor. They are particularly important in embedded applications where DSPs are often used. Hardware interrupts provide a way for interacting with external devices while the processor executes code. For example, in a key entry system, a key press would generate a hardware interrupt. The system code would then jump to a specified location in program memory where a routine could process the key input. Interrupts provide an alternative to polling. Instead of checking for key presses at a predetermined rate (requires a clock), the system could be busy executing other code. On the TI-C54x DSP, interrupts provide a convenient way to transfer blocks of data to/from the CODEC in a timely fashion.
The lab4main.c code and the core code are intended
to make your interaction with the hardware much simpler. As
there was a core file for working in the assembly environment
(Labs 0-3), there is a core file for the C environment
(V:\ece420\54x\dspclib\core.asm) which handles the interrupts
from the CODEC (A/D and D/A) and the serial port. Here, we
will describe the important aspects of the core code necessary
to complete the assignment.
At the heart of the hardware interaction is the auto-buffering
serial port. In the auto-buffering serial mode, the TI-C54x
processor is able to do processing
uninterrupted while samples are
transferred to/from a buffer of length
BlockLen, then no additional interrupt handling
routines would be necessary. Samples could be collected in
a 1024-length buffer and a 1024-point FFT could be computed
uninterrupted while the auto-buffering buffer fills.
Unfortunately, the DSP is not fast enough to accomplish this
task.
We now provide an explanation of the shell C program
lab4main.c listed in Appendix A. The
lab4main.c file contains the function
interrupt void irq and a main program. The
main program is an infinite loop over blocks of
BlockLen samples. Inside the infinite loop,
you will insert code to do the operations which
follow. Although each of these operations may be performed
in C or assembly, we suggest you follow the guidelines
suggested.
The function WaitAudio is an assembly function in the core code
which handles the CODEC interrupts. An interrupt from the CODEC
occurs every BlockLen samples. The
SetAudioInterrupt(irq) call in the main program
tells the core code to jump to the irq function
when an interrupt occurs. In the irq function,
BlockLen samples of the A/D input in
Rcvptr (channel 1) are written to a length
inputs buffer, and
BlockLen of the output samples in the
outputs buffer are written to the D/A output
via Xmitptr on channel 2. In C, pointers may be
used as array names so that Xmitptr[0] is the
first word pointed to by Xmitptr. As in the
assembly core, the input samples are not in
consecutive order. The right and left inputs are offset from
Rcvptr respectively by
Xmitptr. On channel 1 of the
output, the input is echoed out. You are to fill
the buffer outputs with the windowed
magnitude-squared FFT values by performing the operations
listed above.
In the main code, the while(!input_full);
loop waits for
inputs
buffer. Next, the
BlockLen sample times;
otherwise the first BlockLen of samples of
output would not be available on time. Once this loop is
finished, the lengthy processing of the FFT can continue.
During this processing, the DSP is interrupted every
BlockLen samples to transfer samples. Once
this processing is over, the infinite loop returns to
while(!input_full); to wait for
The flow diagram in Figure 1 summarizes the operation of the interrupt handling routine
|
As the list of operations indicates, bit-reversal and FFT
computation are to be done in both C and assembly. For the
assembly version, make sure that the line defining
C_FFT is commented in lab4main.c.
We are providing you with a shell assembly file, available at
v:\ece420\54x\dspclib\c_fft_given.asm and shown
in Appendix B, containing many
useful declarations and some code. The code for performing
bit-reversal and other declarations needed for the FFT
routine are also provided in this section.
However, we would like you to enter this code
manually, as you will be expected to understand its
operation.
The assembly file c_fft_given.asm
contains two main parts, the data section
starting with .sect ".data" and the
program section starting with
.sect ".text". Every function and
variable accessed in C must be preceded by a single underscore
_ in assembly and a
.global _name must be placed in the
assembly file for linking. In this example,
bit_rev_fft is an assembly function called from
the C program with a label _bit_rev_fft in the
text portion of the assembly file and a
.global _bit_rev_fft declaration. In each
assembly function, the macro ENTER_ASM is
called upon entering and LEAVE_ASM is
called upon exiting. These macros are defined in
v:\ece420\54x\dspclib\core.inc. The
ENTER_ASM macro saves the status registers
and AR1, AR6, and
AR7 when entering a function as required
by the register use conventions. The ENTER_ASM
macro also sets the status registers to the assembly conventions we
have been using (i.e, FRCT=1 for fractional
arithmetic and CPL=0 for
DP referenced addressing). The
LEAVE_ASM macro just restores the saved
registers.
The parameter passing convention between assembly and C is
simple for single input, single output assembly functions. From
a C program, the input to an assembly program is in the low part
of accumulator A with the output returned
in the same place. When more
than one parameter is passed to an assembly function, the
parameters are passed on the stack (see the core file
description for more information). We suggest that you avoid
passing or returning more than one parameter. Instead, use global
memory addresses to pass in or return more than one parameter.
Another alternative is to pass a pointer to the start of a buffer
intended for passing and returning parameters.
When entering and leaving an assembly function, the
ENTER_ASM and LEAVE_ASM
macros ensure that certain registers are saved and restored.
Since the C program may use any and all registers, the state of
a register cannot be expected to remain the same between calls to
assembly function(s). Therefore, any information that
needs to be preserved across calls to an assembly function must be
saved to memory!
Now, we explain how to use the FFT routine provided by TI
for the C54x. The FFT routine fft.asm located
in v:\ece420\54x\dsplib\ computes an in-place,
complex FFT. The length of the FFT is defined as a label
K_FFT_SIZE and the algorithm assumes that the
input starts at data memory location _fft_data.
To have your code assemble for an
N .set 1024
K_FFT_SIZE .set N ; size of FFT
K_LOGN .set 10 ; number of stages (log_2(N))
In addition to defining these constants, you will have to
include twiddle-factor tables for the FFT. These tables
(twiddle1 and twiddle2) are available in the shared
directory v:\ece420\54x\dsplib\. Note that the
tables are each
.sect ".data"
.align 1024
sine .copy "v:\ece420\54x\dsplib\twiddle1"
.align 1024
cosine .copy "v:\ece420\54x\dsplib\twiddle2"
The FFT provided requires that the input be in bit-reversed
order, with alternating real and imaginary
components. Bit-reversed addressing is a convenient way to
order input
| Input Order | Binary Representation | Bit-Reversed Representation | Output Order |
|---|---|---|---|
| 0 | 000 | 000 | 0 |
| 1 | 001 | 100 | 4 |
| 2 | 010 | 010 | 2 |
| 3 | 011 | 110 | 6 |
| 4 | 100 | 001 | 1 |
| 5 | 101 | 101 | 5 |
| 6 | 110 | 011 | 3 |
| 7 | 111 | 111 | 7 |
The following routine performs the bit-reversed reordering
of the input data. The routine assumes that the input is
stored in data memory starting at the location labeled
_bit_rev_data, which must be aligned to the
least power of two greater than the input buffer length, and
consists of alternating real and imaginary parts. Because
our input data is going to be purely real in this lab, you
will have to make sure that you set the imaginary parts to
zero by zeroing out every other memory location.
1 bit_rev:
2 STM #_bit_rev_data,AR3 ; AR3 -> original input
3 STM #_fft_data,AR7 ; AR7 -> data processing buffer
4 MVMM AR7,AR2 ; AR2 -> bit-reversed data
5 STM #K_FFT_SIZE-1,BRC
6 RPTBD bit_rev_end-1
7 STM #K_FFT_SIZE,AR0 ; AR0 = 1/2 size of circ buffer
8 MVDD *AR3+,*AR2+
9 MVDD *AR3-,*AR2+
10 MAR *AR3+0B
11 bit_rev_end:
12 NOP
13 RET
As mentioned, in the above code _bit_rev_data
is a label indicating the start of the input data and
_fft_data is a label indicating the start of a
circular buffer where the bit-reversed data will be
written. Note that although AR7 is not used by
the bit-reversed routine directly, it is used extensively in
the FFT routine to keep track of the start of the FFT data
space.
In general, to have a pointer index memory in bit-reversed
order, the AR0 register needs to be set to
one-half the length of the circular buffer; a statement such
as ARx+0B is used to move the ARx
pointer to the next location. For more information regarding
the bit-reversed addressing mode, refer to page
5-18 in the TI-54x
CPU and Peripherals manual. Is it possible to
bit-reverse a buffer in place? For a diagram of the
ordering of the data expected by the FFT routine, see
Figure 4-10 in the TI-54x
Applications Guide. Note that the FFT code uses all
the pointers available and does not restore the pointers to
their original values.
A bit-reversing and FFT routine have also been provided in
lab4fft.c, listed in
Appendix C. Again,
make sure you understand how the bit reversal is taking
place.
In lab4main.c, the line defining C_FFT
must not be commented for use of the C FFT routine. The sine
tables (twiddle factors) are located in
sinetables.h.
This fft requires its inputs in two buffers, the real buffer
real and the imaginary buffer imag,
and the output is placed in the same buffers.
The length of the FFT, N, and logN are
defined in lab4.h, which is also listed in
Appendix C.
When experimenting with the C
FFT make sure you modify these length values instead of the ones
in the assembly code and lab4main.c!
As mentioned, you will be using the FFT to compute the
spectrum of a windowed input. For your implementation you
will need to create a 1024-point Hamming window. First,
create a Hamming window in matlab using the function
hamming. For the assembly FFT, use
save_coef to save the window to a file
that can then be included in your code with the
.copy directive. For the C FFT, use the matlab
function
write_intvector_headerfile
with name set to 'window' and
elemperline set to 8 to create
the header file that is included in lab4main.c.
Once the DFT has been computed, you must calculate the squared magnitude of the spectrum for display.
squr and squra useful in
implementing Equation 1.
Because the squared magnitude is always nonnegative, you can
replace one of the magnitude values with a -1.0 as a trigger
pulse for display on the oscilloscope. This is easily
performed by replacing the DC term (
If you are planning on writing some of the code in C, then
you may be forced to use intrinsics. Intrinsic instructions
provide a way to use assembly instructions directly in C.
An example of an intrinsic instruction is
bit_rev_data[0]=_smpyr(bit_rev_data[0],window[0])
which performs the assembly signed multiply round
instruction. You may also find the _lsmpy
instruction useful. For more information on intrinsics, see
page 6-22 of the TI-C54x
Optimizing C/C++ Compiler User's Guide.
The following lines of code were borrowed from the C FFT to serve
as an example of arithmetic operations in C. Save
this code in a file called mathex.c and compile this file by typing
c_asm mathex.c at a command prompt. Look at
the resulting assembly file and investigate the differences between
each block. Be sure to reference page 6-10 of the
compiler user's guide to find
out what the state of the FRCT and OVM bits are. Run
this program on the DSP, halt the program, and compare the
output values in a memory window. Does each block work properly
for all possible values?
int s1, s2;
int t1, t2;
int i1, i2;
int n1 = 16383, n2 = 16382, n3 = 16381, n4 = 16380;
void main(void)
{
/* Code for standard 32-bit hardware, */
/* with x,y limited to 16 bits */
s1 = (n1*n2 + n3*n4) >> 15;
s2 = (n1 + n2) >> 1;
/* Code for TI TMS320C54X series */
t1 = ((long int)(n1*n2) + (long int)(n3*n4)) >> 15;
t2 = ((long int)n1 + (long int)n2) >> 1;
/* Intrinsic code for TMS320C54X series */
i1 = _sadd(_smpy(n1,n2), _smpy(n3,n4));
i2 = _sshl(_sadd(n1, n2),-1);
while(1);
}
A working program can be produced by compiling the C code and
linking assembly modules and the core module. The compiler
translates C code to a relocatable assembly form. The linker
assigns physical addresses on the DSP to the relocatable data
and code segments, resolves .global
references and links runtime libraries.
The procedure for compiling C code and linking assembly modules
has been automated for you in the batch file
v:\ece420\54x\dsptools\c_asm.bat. The name of the
first file becomes the name of the executable. Once you have
completed lab4main.c and c_fft_given.asm,
type c_asm lab4main.c c_fft_given.asm to produce a
lab4main.out file to be loaded onto the DSP. For the
C FFT type c_asm lab4main.c lab4fft.c to produce
lab4main.out.
Load the output file onto the DSP as usual and confirm that valid
FFTs are calculated. Once valid output is obtained, measure how
many clock cycles it takes to compute both the assembly and C FFT.
We provide for you in Appendix D
and E a complete C implementation
of a PSD estimator. The input is an IIR-filtered pseudo-noise (PN)
sequence generator and the PSD estimate is based on windowing the
autocorrelation with a rectangular window.
The code consists of the files lab4bmain.c,
lab4b.h, pn.c, autocorr.c,
c_fft_given_iirc.asm, and the previously-given TI FFT
routine. The assembly file c_fft_given_iirc.asm differs
from c_fft_given.asm in that the window array has been
removed and variables and arrays associated with IIR filtering have
been added. Compile and link these by issuing
c_asm lab4bmain.c pn.c autocorr.c c_fft_given_iirc.asm
at the command line. Load lab4bmain.out onto the DSP and
run the code. Make sure that an
IIR-filtered PN sequence appears on channel 1 and its estimate
appears on channel 2.
Does the output match your expectations based on the theory? Does this application illustrate any limitations of the FFT implementation? (Hint: note that most of the values in the FFT input are zero.) The previously-given C implementation uses a similar algorithm as the TI FFT; take a look at the C code for help. What are the limitation(s) of the FFT that show up in this application?
In lab4b.h M sets the number of
autocorrelation points that are calculated. What is the maximum value
of M that allows the reference routines to run in real
time? In determining this value you may find it useful to connect a
wave-function generator to the input and copy inputs
into display_inputs. You may limit M to
powers of 2 minus 1.
From your prelab experiments, you should be able to describe
the effect of windowing and zero-padding on FFT spectral
analysis. In your DSP system, experiment with different
inputs, changing
inputs with
1 /* v:/ece420/54x/dspclib/lab4main.c */
2 /* dgs - 9/14/2001 */
3 /* mdk - 2/10/2004 C FFT update */
4
5 #include "v:/ece420/54x/dspclib/core.h"
6
7 /* comment the next line to use assembly fft */
8 #define C_FFT
9
10 #ifdef C_FFT /* Use C FFT */
11
12 #include "window.h"
13 #include "lab4.h" /* Number of C FFT points defined here */
14
15 /* function defined in lab4fft.c */
16 void fft(void);
17
18 /* FFT data buffers */
19 int real[N]; /* Real part of data */
20 int imag[N]; /* Imaginary part of data */
21
22 #else /* Use assembly FFT */
23
24 #define N 1024 /* Number of assembly FFT points */
25
26 /* Function defined by c_fft_given.asm */
27 void bit_rev_fft(void);
28
29 /* FFT data buffers (declared in c_fft_given.asm) */
30 extern int bit_rev_data[N*2]; /* Data input for bit-reverse function */
31 extern int fft_data[N*2]; /* In-place FFT & Output array */
32 extern int window[N]; /* The Hamming window */
33
34 #endif /* C_FFT */
35
36
37 /* Our input/output buffers */
38 int inputs[N];
39 int outputs[N];
40
41 volatile int input_full = 0; /* volatile means interrupt changes it */
42 int count = 0;
43
44
45 interrupt void irq(void)
46 {
47 int *Xmitptr,*Rcvptr; /* pointers to Xmit & Rcv Bufs */
48 int i;
49
50 static int in_irq = 0; /* Flag to prevent reentrance */
51
52 /* Make sure we're not in the interrupt (should never happen) */
53 if( in_irq )
54 return;
55
56 /* Mark we're processing, and enable interrupts */
57 in_irq = 1;
58 enable_irq();
59
60 /* The following waitaudio call is guaranteed not to
61 actually wait; it will simply return the pointers. */
62 WaitAudio(&Rcvptr,&Xmitptr);
63
64 /* input_full should never be true... */
65 if( !input_full )
66 {
67 for (i=0; i<BlockLen; i++)
68 {
69 /* Save input, and echo to channel 1 */
70 inputs[count] = Xmitptr[6*i] = Rcvptr[4*i];
71
72 /* Send FFT output to channel 2 */
73 Xmitptr[6*i+1] = outputs[count];
74
75 count++;
76 }
77 }
78
79 /* Have we collected enough data yet? */
80 if( count >= N )
81 input_full = 1;
82
83 /* We're not in the interrupt anymore... */
84 disable_irq();
85 in_irq = 0;
86 }
87
88
89 main()
90 {
91 /* Initialize IRQ stuff */
92 count = 0;
93 input_full = 0;
94 SetAudioInterrupt(irq); /* Set up interrupts */
95
96 while (1)
97 {
98 while( !input_full ); /* Wait for a data buffer to collect */
99
100 /* From here until we clear input_full can only take *
101 * BlockLen sample times, so don't do too much here. */
102
103 /* First, transfer inputs and outputs */
104
105 #ifdef C_FFT /* Use C FFT */
106 /* I n s e r t c o d e t o f i l l */
107 /* C F F T b u f f e r s */
108
109 #else /* Use assembly FFT */
110 /* I n s e r t c o d e t o f i l l */
111 /* a s s e m b l y F F T b u f f e r s */
112
113 #endif /* C_FFT */
114
115 /* Done with that... ready for new data collection */
116 count = 0; /* Need to reset the count */
117 input_full = 0; /* Mark we're ready to collect more data */
118
119 /**********************************************************/
120 /* Now that we've gotten the data moved, we can do the */
121 /* more lengthy processing. */
122
123 #ifdef C_FFT /* Use C FFT */
124
125 /* Multiply the input signal by the Hamming window. */
126 /* . . . i n s e r t C / a s m code . . . */
127
128 /* Bit-reverse and compute FFT in C */
129 fft();
130
131 /* Now, take absolute value squared of FFT */
132 /* . . . i n s e r t C / a s m code . . . */
133
134 /* Last, set the DC coefficient to -1 for a trigger pulse */
135 /* . . . i n s e r t C / a s m code . . . */
136
137 /* done, wait for next time around! */
138
139
140 #else /* Use assembly FFT */
141
142 /* Multiply the input signal by the Hamming window. */
143 /* . . . i n s e r t C / a s m code . . . */
144
145 /* Bit-reverse and compute FFT in assembly */
146 bit_rev_fft();
147
148 /* Now, take absolute value squared of FFT */
149 /* . . . i n s e r t C / a s m code . . . */
150
151 /* Last, set the DC coefficient to -1 for a trigger pulse */
152 /* . . . i n s e r t C / a s m code . . . */
153
154 /* done, wait for next time around! */
155
156
157 #endif /* C_FFT */
158
159 }
160 }
1 ; v:\ece420\54x\dspclib\c_fft_given.asm
2 ; dgs - 9/14/2001
3 .copy "v:\ece420\54x\dspclib\core.inc"
4
5 .global _bit_rev_data
6 .global _fft_data
7 .global _window
8
9 .global _bit_rev_fft
10
11 .sect ".data"
12
13 .align 4*N
14 _bit_rev_data .space 16*2*N ; Input to _bit_rev_fft
15
16 .align 4*N
17 _fft_data .space 16*2*N ; FFT output buffer
18
19
20 ; Copy in the Hamming window
21 _window ; The Hamming window
22 .copy "window.asm"
23
24 .sect ".text"
25
26 _bit_rev_fft
27 ENTER_ASM
28
29 call bit_rev ; Do the bit-reversal.
30
31 call fft ; Do the FFT
32
33 LEAVE_ASM
34 RET
35
36 bit_rev:
37 STM #_bit_rev_data,AR3 ; AR3 -> original input
38 STM #_fft_data,AR7 ; AR7 -> data processing buffer
39 MVMM AR7,AR2 ; AR2 -> bit-reversed data
40 STM #K_FFT_SIZE-1,BRC
41 RPTBD bit_rev_end-1
42 STM #K_FFT_SIZE,AR0 ; AR0 = 1/2 size of circ buffer
43 MVDD *AR3+,*AR2+
44 MVDD *AR3-,*AR2+
45 MAR *AR3+0B
46 bit_rev_end:
47 NOP
48 RET
49
50 ; Copy the actual FFT subroutine.
51 fft_data .set _fft_data ; FFT code needs this.
52 .copy "v:\ece420\54x\dsplib\fft.asm"
53
54
55 ; If you need any more assembly subroutines, make sure you name them
56 ; _name, and include a ".global _name" directive at the top. Also,
57 ; don't forget to use ENTER_ASM at the beginning, and LEAVE_ASM
58 ; and RET at the end!
1 #define N 1024 /* Number of FFT points */
2 #define logN 10
1 /*****************************************************************/
2 /* lab4fft.c */
3 /* Douglas L. Jones */
4 /* University of Illinois at Urbana-Champaign */
5 /* January 19, 1992 */
6 /* Changed for use w/ short integers and lookup table for ECE420 */
7 /* Matt Kleffner */
8 /* February 10, 2004 */
9 /* */
10 /* fft: in-place radix-2 DIT DFT of a complex input */
11 /* */
12 /* Permission to copy and use this program is granted */
13 /* as long as this header is included. */
14 /* */
15 /* WARNING: */
16 /* This file is intended for educational use only, since most */
17 /* manufacturers provide hand-tuned libraries which typically */
18 /* include the fastest fft routine for their DSP/processor */
19 /* architectures. High-quality, open-source fft routines */
20 /* written in C (and included in MATLAB) can be found at */
21 /* http://www.fftw.org */
22 /* */
23 /* #defines expected in lab4.h */
24 /* N: length of FFT: must be a power of two */
25 /* logN: N = 2**logN */
26 /* */
27 /* 16-bit-limited input/output (must be defined elsewhere) */
28 /* real: integer array of length N with real part of data */
29 /* imag: integer array of length N with imag part of data */
30 /* */
31 /* sinetables.h must */
32 /* 1) #define Nt to an equal or greater power of two than N */
33 /* 2) contain the following integer arrays with */
34 /* element magnitudes bounded by M = 2**15-1: */
35 /* costable: M*cos(-2*pi*n/Nt), n=0,1,...,Nt/2-1 */
36 /* sintable: M*sin(-2*pi*n/Nt), n=0,1,...,Nt/2-1 */
37 /* */
38 /*****************************************************************/
39
40 #include "lab4.h"
41 #include "sinetables.h"
42
43 extern int real[N];
44 extern int imag[N];
45
46 void fft(void)
47 {
48 int i,j,k,n1,n2,n3;
49 int c,s,a,t,Wr,Wi;
50
51 j = 0; /* bit-reverse */
52 n2 = N >> 1;
53 for (i=1; i < N - 1; i++)
54 {
55 n1 = n2;
56 while ( j >= n1 )
57 {
58 j = j - n1;
59 n1 = n1 >> 1;
60 }
61 j = j + n1;
62
63 if (i < j)
64 {
65 t = real[i];
66 real[i] = real[j];
67 real[j] = t;
68 t = imag[i];
69 imag[i] = imag[j];
70 imag[j] = t;
71 }
72 }
73
74 /* FFT */
75 n2 = 1; n3 = Nt;
76
77 for (i=0; i < logN; i++)
78 {
79 n1 = n2; /* n1 = 2**i */
80 n2 = n2 + n2; /* n2 = 2**(i+1) */
81 n3 = n3 >> 1; /* cos/sin arg of -6.283185307179586/n2 */
82 a = 0;
83
84 for (j=0; j < n1; j++)
85 {
86 c = costable[a];
87 s = sintable[a];
88 a = a + n3;
89
90 for (k=j; k < N; k=k+n2)
91 {
92 /* Code for standard 32-bit hardware, */
93 /* with real,imag limited to 16 bits */
94 /*
95 Wr = (c*real[k+n1] - s*imag[k+n1]) >> 15;
96 Wi = (s*real[k+n1] + c*imag[k+n1]) >> 15;
97 real[k+n1] = (real[k] - Wr) >> 1;
98 imag[k+n1] = (imag[k] - Wi) >> 1;
99 real[k] = (real[k] + Wr) >> 1;
100 imag[k] = (imag[k] + Wi) >> 1;
101 */
102 /* End standard 32-bit code */
103
104 /* Code for TI TMS320C54X series */
105
106 Wr = ((long int)(c*real[k+n1]) - (long int)(s*imag[k+n1])) >> 15;
107 Wi = ((long int)(s*real[k+n1]) + (long int)(c*imag[k+n1])) >> 15;
108 real[k+n1] = ((long int)real[k] - (long int)Wr) >> 1;
109 imag[k+n1] = ((long int)imag[k] - (long int)Wi) >> 1;
110 real[k] = ((long int)real[k] + (long int)Wr) >> 1;
111 imag[k] = ((long int)imag[k] + (long int)Wi) >> 1;
112
113 /* End code for TMS320C54X series */
114
115 /* Intrinsic code for TMS320C54X series */
116 /*
117 Wr = _ssub(_smpy(c, real[k+n1]), _smpy(s, imag[k+n1]));
118 Wi = _sadd(_smpy(s, real[k+n1]), _smpy(c, imag[k+n1]));
119 real[k+n1] = _sshl(_ssub(real[k], Wr),-1);
120 imag[k+n1] = _sshl(_ssub(imag[k], Wi),-1);
121 real[k] = _sshl(_sadd(real[k], Wr),-1);
122 imag[k] = _sshl(_sadd(imag[k], Wi),-1);
123 */
124 /* End intrinsic code for TMS320C54X series */
125 }
126 }
127 }
128 return;
129 }
1 #define N 1024 /* Length of output buffers */
2 #define L N /* Length of input data */
3 #define logL 10 /* log base 2 of L */
4 #define M 31 /* Compute 2*M+1 autocorrelation points */
5
6 /* #define M (L/2-1) */ /* Be sure to use ()'s in this case */
7 /* or algebraic substitution bugs */
8 /* can be introduced */
1 /* v:\ece420\54x\dspclib\lab4bmain.c */
2 /* PN generation, IIR filtering, and autocorrelation added */
3 /* by Matt Kleffner - 9/2004 */
4 /* Original by dgs - 9/14/2001 */
5 /* Use governed by the Creative Commons Attribution License */
6
7 #include "v:\ece420\54x\dspclib\core.h"
8
9 /* #define N 1024 */ /* Number of FFT points */
10 #include "lab4b.h" /* Define N here in header file */
11
12 /* function defined in pn.c */
13 int randsample(unsigned int *iseed);
14
15 /* IIR values and buffers (declared in c_fft_given_iirc.asm) */
16 #define IIR_order 4
17 extern int scale;
18 extern int coef[IIR_order];
19 extern int state[IIR_order];
20 int iirptr;
21
22 /* function defined in autocorr.c */
23 void autocorr(void);
24
25 /* Function defined by c_fft_given_iirc.asm */
26 void bit_rev_fft(void);
27
28 /* FFT data buffers (declared in c_fft_given_iirc.asm) */
29 extern int bit_rev_data[N*2]; /* Data input for bit-reverse function */
30 extern int fft_data[N*2]; /* In-place FFT & Output array */
31
32 /* Our input/output buffers */
33 int inputs[N];
34 int outputs[N];
35 int display_inputs[N];
36 int autocorr_in[N];
37 int autocorr_out[N];
38
39 unsigned int *iseed; /* seed for randsample() */
40
41 volatile int input_full = 0; /* volatile means interrupt changes it */
42 int count = 0;
43
44 interrupt void irq(void)
45 {
46 int *Xmitptr,*Rcvptr; /* pointers to Xmit & Rcv Bufs */
47 int i;
48
49 static int in_irq = 0; /* Flag to prevent reentrance */
50
51 /* Make sure we're not in the interrupt (should never happen) */
52 if( in_irq )
53 return;
54
55 /* Mark we're processing, and enable interrupts */
56 in_irq = 1;
57 enable_irq();
58
59 /* The following waitaudio call is guaranteed not to
60 actually wait; it will simply return the pointers. */
61 WaitAudio(&Rcvptr,&Xmitptr);
62
63 /* input_full should never be true... */
64 if( !input_full )
65 {
66 for (i=0; i<BlockLen; i++)
67 {
68 /* Save input, send display_inputs to channel 1 */
69 inputs[count] = Rcvptr[4*i];
70 Xmitptr[6*i] = display_inputs[count];
71 /* inputs[count] = Xmitptr[6*i] = Rcvptr[4*i]; */
72
73 /* Send FFT output to channel 2 */
74 Xmitptr[6*i+1] = outputs[count];
75
76 count++;
77 }
78 /* Have we collected enough data yet? */
79 }
80 if( count >= N ) input_full = 1;
81
82 /* We're not in the interrupt anymore... */
83 disable_irq();
84 in_irq = 0;
85 }
86
87 main()
88 {
89 int i,j;
90 *iseed = 1;
91 iirptr = 0;
92 /* Initialize IRQ stuff */
93 count = 0;
94 input_full = 0;
95
96 /* Initialize autocorr_out to zero since some values will remain zero */
97 for (i = 0; i < N; ++i)
98 {
99 autocorr_out[i] = 0;
100 display_inputs[i] = 0;
101 }
102 for (i = 0; i < IIR_order; ++i) state[i] = 0;
103
104 SetAudioInterrupt(irq); /* Set up interrupts */
105
106 while (1)
107 {
108 while( !input_full ); /* Wait for a data buffer to collect */
109
110 /* From here until we clear input_full can only take *
111 * BlockLen sample times, so don't do too much here. */
112
113 /* First, transfer inputs and outputs */
114
115 for (i = 0; i < N; i++) {
116 display_inputs[i] = autocorr_in[i];
117 outputs[i] = fft_data[i*2] << 8;
118
119 /* Some statements useful in debugging */
120 /* display_inputs[i] = inputs[i]; */
121 /* autocorr_in[i] = inputs[i] */
122 }
123 /* Last, set the DC coefficient to -1 for a trigger pulse */
124 outputs[0] = -32768;
125
126 /* Done with that... ready for new data collection */
127 count = 0; /* Need to reset the count */
128 input_full = 0; /* Mark we're ready to collect more data */
129
130 /*************************************************************/
131 /* Now that we've gotten the data moved, we can do the */
132 /* more lengthy processing. */
133
134 /* Generate PN input */
135 for (i = 0; i < N; ++i)
136 autocorr_in[i] = randsample(iseed);
137
138 /* Filter PN input */
139 for (i = 0; i < N; ++i)
140 {
141 int sum = 0;
142 /* Calculate and sum all feedback terms except the "oldest" one */
143 for (j = 0; j < (IIR_order-1); ++j)
144 {
145 sum += ((long int)coef[j] * (long int)state[iirptr]) >> 15;
146 /* Avoid usage of "modulo" routine */
147 iirptr++;
148 if (iirptr == IIR_order) iirptr = 0;
149 }
150 /* Calculate and sum oldest feedback term without incrementing iirptr */
151 sum += ((long int)coef[IIR_order-1] * (long int)state[iirptr]) >> 15;
152
153 autocorr_in[i] = ((long int)scale * (long int)autocorr_in[i]) >> 15;
154 autocorr_in[i] += sum;
155 state[iirptr] = autocorr_in[i];
156 }
157
158 /* Calculate autocorrelation */
159 autocorr();
160
161 /* Transfer autocorr output to FFT input buffer */
162 for (i = 0; i < N; i++) {
163 bit_rev_data[i*2] = autocorr_out[i];
164 bit_rev_data[i*2+1] = 0;
165 }
166 /* Bit-reverse and compute FFT */
167 bit_rev_fft();
168
169 /* Done, wait for next time around! */
170 }
171 }
1 /* ECE420, Lab 4, Reference PN Generator Implementation (Non-Optimized) */
2 /* Matt Kleffner 08/04 */
3 /* Original by Michael Frutiger 02/24/04 */
4 /* Use governed by the Creative Commons Attribution License */
5
6 /* Returns as an integer a random bit, based on the 15 lowest significant
7 bits in iseed (which is modified for the next call). */
8 int randbit(unsigned int *iseed)
9 {
10 int newbit;
11 /* XOR bits 15, 1 and 0 of iseed */
12 newbit = (*iseed >> 15) & 1 ^ (*iseed >> 1) & 1 ^ *iseed & 1;
13 /* Leftshift the seed and put the result of the XOR's in bit 1. */
14 *iseed=(*iseed << 1) | newbit;
15 return(newbit);
16 }
17
18 int randsample(unsigned int *iseed)
19 {
20 if (randbit(iseed)) return( 32767);
21 else return(-32767);
22 }
1 /***********************************************************/
2 /* autocorr.c */
3 /* Copyright August 2004 by Matt Kleffner */
4 /* under the Creative Commons Attribution License */
5 /* */
6 /* Simple, unoptimized autocorrelation function */
7 /* for ECE 420 (TMS320C54X series) */
8 /* */
9 /* #defines expected in lab4b.h */
10 /* L: length of data in autocorr_in buffer */
11 /* N: length of data in autocorr_out buffer */
12 /* logL: log base 2 of L (used for scaling output) */
13 /* M: Largest positive lag of autocorrelation desired */
14 /* (must be < L and < N/2) */
15 /* */
16 /* 16-bit-limited input/output (must be defined elsewhere) */
17 /* autocorr_in: buffer for input data (L pts) */
18 /* autocorr_out: buffer for output data (N pts) */
19 /* N must be >= 2*M+1 */
20 /* assumed to be full of zeros at start */
21 /* output in zero-phase form */
22 /***********************************************************/
23
24 #include "lab4b.h"
25
26 extern int autocorr_in[L];
27 extern int autocorr_out[N];
28
29 void autocorr(void)
30 {
31 int i,j,temp;
32
33 for(i=0;i<=M;++i)
34 {
35 long int sum=0;
36 for(j=0;j<(L-i);++j)
37 {
38 temp = ((long int)autocorr_in[j] * (long int)autocorr_in[j+i]) >> 15;
39 sum += temp;
40 }
41 autocorr_out[i]=(int)(sum >> logL);
42 }
43
44 /* Copy values for negative indeces at end of buffer */
45 for(i=1,j=N-1;i<=M;++i,--j)
46 { autocorr_out[j]=autocorr_out[i]; }
47 }