# Compiler Optimization Flags and Intrinsics

## Compiler Optimization Flags

We began our optimization process by exploring the performance gains available from the compiler's optimization flags. We used the GNU Compiler Collection (GCC) with the -O3 optimization flag. -O3 makes the compiled code harder to debug, but it enables some very important built-in optimizations, including inline function expansion and loop unrolling.

• Function inlining tells the compiler to insert the entire body of a function at the point where the function is called, instead of generating the code to call it.
• Loop unrolling reduces or eliminates the instructions that control the loop by rewriting the loop as a repeated sequence of similar code, improving the program's execution speed at the expense of its binary size.
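To make the two transformations concrete, the sketch below shows a small loop as it might be written by hand, followed by a version with the helper expanded in place and the loop unrolled by four. The function names (`scale`, `weighted_sum`, `weighted_sum_unrolled`) are hypothetical and are not taken from our filter code; the second function only illustrates the kind of code an optimizer may produce, not what GCC actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Small helper: a natural candidate for inlining. */
static inline float scale(float x, float gain)
{
    return x * gain;
}

/* Plain version: one call to scale() and one loop test per element. */
float weighted_sum(const float *x, size_t n, float gain)
{
    assert(x != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += scale(x[i], gain);
    return acc;
}

/* Hand-written illustration of inlining plus unrolling by four:
 * scale() is expanded in place and the loop test runs once per
 * four elements. A short tail loop handles any leftover elements. */
float weighted_sum_unrolled(const float *x, size_t n, float gain)
{
    assert(x != NULL);
    float acc = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc += x[i]     * gain;
        acc += x[i + 1] * gain;
        acc += x[i + 2] * gain;
        acc += x[i + 3] * gain;
    }
    for (; i < n; i++)
        acc += x[i] * gain;
    return acc;
}
```

Both functions add the terms in the same order, so they return the same value; the unrolled version simply executes fewer branch and counter instructions while occupying more space in the binary.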

## SSE3 Intrinsic Instruction Set

Utilizing different instruction sets provides another opportunity for significant speed gains. We utilized the Intel Streaming SIMD (single-instruction, multiple-data) Extensions 3 technology (SSE3) to improve the efficiency of the floating-point operations of addition and multiplication. SSE3 instructions are designed for parallel mathematical operations: each processor core contains eight 128-bit SSE registers (sixteen in 64-bit mode), each of which can hold up to four single-precision floating-point numbers. An SSE operation acts on all four floats at the same time, providing a considerable increase in computation speed for banks of numbers. Because digital filtering is essentially a large number of floating-point multiplications and additions, we felt that SSE would be a perfect addition to the project.
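As a minimal sketch of what "four floats at the same time" means in practice, the function below computes a multiply-add over arrays using the basic SSE intrinsics from `<xmmintrin.h>`. The function name `mul_add_sse` and the requirement that the length be a multiple of four are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* out[i] = a[i] * b[i] + c[i], processed four floats per SSE operation.
 * Assumption for this sketch: n is a multiple of 4. */
void mul_add_sse(const float *a, const float *b, const float *c,
                 float *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* load 4 floats (unaligned ok) */
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        __m128 r  = _mm_add_ps(_mm_mul_ps(va, vb), vc); /* 4 multiplies, 4 adds */
        _mm_storeu_ps(out + i, r);         /* store 4 results */
    }
}
```

Each loop iteration issues one multiply and one add instruction for four array elements, where scalar code would need four of each.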

We applied the SSE3 instruction set using two different methods:

• Compiler Optimized: We included the header file <xmmintrin.h> and compiled with the -O3 optimization flag, which lets GCC automatically apply SSE instructions to speed up code where it deems fit. The automatic vectorization comes from the optimization flag; the header is only required for the intrinsic functions described next.
• User-Defined SSE Instructions with Intrinsic Functions: The SSE header file also grants access to intrinsic functions, which allow us to specifically indicate to the compiler how SSE should be used in our program. We wrote a custom SSE implementation of our filter processing code. Each SSE operation performs calculations on 4 channels simultaneously.
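The second method can be sketched as follows. This is not our actual filter code: it assumes a simple FIR filter, a hypothetical interleaved data layout in which the four channels of one time sample are adjacent in memory (`x[4*t + ch]`), and coefficients shared by all four channels. The name `fir4_sse` is likewise made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* FIR filter applied to 4 interleaved channels at once.
 * Assumed layout: x[4*t + ch] holds channel ch at time t, same for y.
 * h[0..n_taps-1] are coefficients shared by all 4 channels.
 * Samples before t = 0 are treated as zero. */
void fir4_sse(const float *x, float *y, size_t n_samples,
              const float *h, size_t n_taps)
{
    assert(n_taps >= 1);
    for (size_t t = 0; t < n_samples; t++) {
        __m128 acc = _mm_setzero_ps();
        for (size_t k = 0; k < n_taps && k <= t; k++) {
            __m128 coeff = _mm_set1_ps(h[k]);            /* broadcast h[k] */
            __m128 in    = _mm_loadu_ps(x + 4 * (t - k)); /* 4 channels at t-k */
            acc = _mm_add_ps(acc, _mm_mul_ps(coeff, in));
        }
        _mm_storeu_ps(y + 4 * t, acc); /* one output sample for 4 channels */
    }
}
```

With this layout a single load fetches the same time sample for four channels, so every multiply and add in the inner loop advances four channels together. A larger filter bank would call the routine once per group of four channels.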

## Comparison of Initial Optimizations

The following results were generated on an AMD A6-3400M quad-core processor. We filtered 256 channels with 600,000 time samples. We selected a large number of samples so that the data set would not fit in the processor's cache, which emulates the behavior of real-time data. The entire program was run 100 times so that the total run time was long enough to show changes in performance clearly.
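A repeated-run measurement of this kind can be sketched with the standard C `clock()` function. The helper below is an illustration only: the names `time_repeated`, `workload_fn`, and `count_call` are hypothetical, and `clock()` reports processor time, which is just one of several reasonable ways to time a run.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Signature assumed for the code being timed. */
typedef void (*workload_fn)(void *ctx);

/* Run fn(ctx) reps times and return the total CPU time in seconds. */
double time_repeated(workload_fn fn, void *ctx, int reps)
{
    assert(fn != NULL && reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn(ctx);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Trivial stand-in workload: counts how many times it was called. */
void count_call(void *ctx)
{
    ++*(int *)ctx;
}
```

In a real benchmark the stand-in workload would be replaced by one full pass of the filter bank over all channels and samples, with `reps` set to 100.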

Adding -O3 optimization resulted in a speed increase of about 2 binary orders of magnitude (roughly a factor of 4). Adding SSE optimizations yielded a speed increase by a factor of more than 3. Utilizing compiler optimization and specialized instruction sets provided a major boost in our filter bank's performance.

Note that we performed tests with filter coefficients uniquely defined for each channel, and also with filter coefficients held the same for all channels. Using the same coefficients for all channels yielded significant speed gains. Most filter banks for neural signals will perform the same bandpass filtering on all channels, so this is an acceptable change for optimization.
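The difference between the two coefficient schemes shows up in how each coefficient reaches an SSE register. The sketch below performs one multiply-accumulate step for four channels in both styles; the function names are hypothetical and the code is an illustration rather than our implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Shared coefficient: one scalar h is broadcast to all four lanes. */
void mac4_shared(float *acc, const float *in, float h)
{
    assert(acc != NULL && in != NULL);
    __m128 coeff = _mm_set1_ps(h);
    __m128 sum = _mm_add_ps(_mm_loadu_ps(acc),
                            _mm_mul_ps(coeff, _mm_loadu_ps(in)));
    _mm_storeu_ps(acc, sum);
}

/* Per-channel coefficients: four distinct values are loaded from memory. */
void mac4_per_channel(float *acc, const float *in, const float *h4)
{
    assert(acc != NULL && in != NULL && h4 != NULL);
    __m128 coeff = _mm_loadu_ps(h4);
    __m128 sum = _mm_add_ps(_mm_loadu_ps(acc),
                            _mm_mul_ps(coeff, _mm_loadu_ps(in)));
    _mm_storeu_ps(acc, sum);
}
```

One plausible reason for the gain from shared coefficients is that a broadcast value can be reused for every group of four channels, whereas per-channel coefficients need a separate load per group and a coefficient table that grows with the channel count.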
