Recall that the basic operation of a C compiler is to translate C
source code into assembly instructions and then into an executable.
"Compiler optimization is used to improve the
efficiency (in terms of
running time or resource usage) of the executables output by a
compiler. These techniques allow programmers to write source code
in a straightforward manner, expressing their intentions clearly,
while allowing the computer to make choices about implementation
details that lead to efficient execution. Contrary to what the term
might imply, this rarely results in executables that are perfectly
"optimal" by any measure, only executables that are much improved
compared to direct translation of the programmer's original source."
[1]
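As a small illustration of the kind of transformation meant here, the sketch below shows a straightforward loop next to a hand-transformed equivalent in which a loop-invariant product is computed once. The function names are hypothetical and chosen only for this example. An optimizing compiler may perform a rewrite like this automatically, but whether it does depends on the compiler and the optimization level.

```c
#include <stddef.h>

/* Straightforward version: gain * offset is recomputed on every pass,
   even though neither value changes inside the loop. */
long sum_plain(const int *x, size_t n, int gain, int offset)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)x[i] + (long)gain * offset;
    return total;
}

/* Equivalent version with the loop-invariant product hoisted out of
   the loop. This is written by hand here to show the idea; it is not
   the output of any particular compiler. */
long sum_hoisted(const int *x, size_t n, int gain, int offset)
{
    const long bias = (long)gain * offset; /* computed once */
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)x[i] + bias;
    return total;
}
```

Both functions return the same value for the same inputs, so the programmer can keep the first, more direct form and leave the rewrite to the compiler.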
An optimizing compiler traditionally groups optimizations into
phases. Each phase contains a series of
optimizations (or transformations) that are performed in a fixed order.
These phases are usually turned on with command-line flags such as
-O1, -O2, etc. Each flag indicates an optimization "level", where
each level includes all of the optimizations in the lower levels.
Higher optimization levels sometimes change program behavior, either
by exposing latent bugs in the source (such as reliance on undefined
behavior) or, less often, through defects in the compiler itself, so
it is important to check the behavior of a compiler-optimized program
against the reference implementation. Keep the highest optimization
level that produces correct output.
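One way to carry out this check is a small test harness that is rebuilt at each optimization level and compares the routine's output with values from the reference implementation. The sketch below is only an illustration: scale_q15 stands in for whatever routine is under test, count_mismatches is a hypothetical helper, and the expected values are assumed to come from the reference implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine under test: multiply each Q15 sample by a Q15
   gain. This is the code that gets rebuilt at each optimization level. */
void scale_q15(const int16_t *in, int16_t *out, size_t n, int16_t gain)
{
    for (size_t i = 0; i < n; i++) {
        int32_t q = ((int32_t)in[i] * gain) / 32768; /* Q30 back to Q15 */
        if (q > 32767)
            q = 32767; /* saturate the single overflow case (-1.0 * -1.0) */
        out[i] = (int16_t)q;
    }
}

/* Count the samples whose difference from the reference output exceeds
   tol. A result of zero means the build matches the reference. */
size_t count_mismatches(const int16_t *actual, const int16_t *expected,
                        size_t n, int32_t tol)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t diff = (int32_t)actual[i] - (int32_t)expected[i];
        if (diff < 0)
            diff = -diff;
        if (diff > tol)
            bad++;
    }
    return bad;
}
```

Running the same harness on the same test vectors at every level makes it easy to see which level first produces a nonzero mismatch count.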
At this point the compiled code should be checked against the budget
constraints: is it fast enough, and does it fit in available memory?
Total memory usage is reported in a file produced by the compiler
(sometimes a command-line flag is needed for this). Speed can be
measured in a couple of ways. The most common method is the use of a
profiler. A profiler tracks the performance of the program,
providing data on how many times each function is called, as well as
how much time each function takes in terms of cycles and percentages
of total program cycles. A simulator also allows clock cycles to be
measured, typically by allowing the user to place breakpoints around
sections of code to be measured. If the speed and memory properties of
the compiled code fit the budget, optimization is finished. If not,
some of the routines must be hand-written in assembly.
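When neither a profiler nor a simulator is at hand, a rough speed check can be sketched with the standard C clock() function. The example below assumes a hosted C environment in which clock() is available and has usable resolution, which may not be the case on an embedded DSP target. The names time_per_call, fits_budget, workload and sink are hypothetical.

```c
#include <time.h>

/* Volatile so the compiler cannot discard the work as unused. */
volatile unsigned long sink;

/* Hypothetical routine to be timed: sums the integers 0..999. */
void workload(void)
{
    unsigned long total = 0;
    for (unsigned long i = 0; i < 1000; i++)
        total += i;
    sink = total;
}

/* Average processor time in seconds for one call of fn, measured over
   reps calls. Returns 0.0 when reps is zero. */
double time_per_call(void (*fn)(void), unsigned long reps)
{
    if (reps == 0)
        return 0.0;
    clock_t start = clock();
    for (unsigned long i = 0; i < reps; i++)
        fn();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / (double)reps;
}

/* Returns 1 if the measured time is within the time budget, else 0. */
int fits_budget(double seconds, double budget_seconds)
{
    return seconds <= budget_seconds;
}
```

A call such as time_per_call(workload, 100000UL) averages over many repetitions because clock() is often too coarse to time a single call. The result is only an estimate, and the cycle counts from a profiler or simulator remain the more precise measure.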
"Doug course at UIUC using the TI C54x DSP has been adopted by many EE, CE and CS depts Worldwide "