Summary: A basic understanding of human perception of sound is vital if you wish to design music synthesis algorithms to achieve your goals. In this module you will learn about pitch and frequency, intensity and amplitude, harmonics, and tuning systems. The treatment of these concepts is oriented to the creation of music synthesis algorithms.
This module refers to LabVIEW, a software development environment that features a graphical programming language. Please see the LabVIEW QuickStart Guide module for tutorials and documentation that will help you:
• Apply LabVIEW to Audio Signal Processing
• Get started with LabVIEW
• Obtain a fully functional evaluation edition of LabVIEW
A basic understanding of human perception of sound is vital if you wish to design music synthesis algorithms to achieve your goals. Human hearing and other senses operate quite well in a relative sense. That is, people perceive properties of sound such as pitch and intensity and make relative comparisons. Moreover, people make these comparisons over an enormous dynamic range: they can listen to two people whispering in a quiet auditorium and determine which person is whispering louder. During a rock concert in the same auditorium, attendees can determine which vocalist is singing the loudest. However, once the rock concert is in progress, they can no longer hear someone whispering! Senses can adapt to a wide range of conditions, but at any one time they can make relative comparisons only over a fairly narrow range.
In this module you will learn about pitch and frequency, intensity and amplitude, harmonics and overtones, and tuning systems. The treatment of these concepts is oriented to creating music synthesis algorithms. Connexions offers many excellent modules authored by Catherine Schmidt-Jones that treat these concepts in a music theory context, and some of these documents are referenced in the discussion below. For example, Acoustics for Music Theory describes acoustics in a musical setting, and is a good refresher on audio signals.
Pitch is the human perception of frequency. Often the terms are used interchangeably, but they are actually distinct concepts. Musicians normally refer to the pitch of a signal rather than its frequency; see Pitch: Sharp, Flat, and Natural Notes and The Circle of Fifths.
Perception of frequency is logarithmic in nature. For example, a change in frequency from 400 Hz to 600 Hz will not sound the same as a change from 200 Hz to 400 Hz, even though the difference between each of these frequency pairs is 200 Hz. Instead, you perceive changes in pitch based on the ratio of the two frequencies; in the previous example, the ratios are 1.5 and 2.0, respectively, and the latter pitch pair would sound like a greater change in frequency. Musical Intervals, Frequency, and Ratio offers additional insights.
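The comparison above can be checked numerically. The following is a minimal Python sketch, assuming that the size of a pitch change is measured as the base-2 logarithm of the frequency ratio (that is, in octaves); the function name `interval_in_octaves` is hypothetical and chosen only for illustration.

```python
import math

def interval_in_octaves(f_start_hz, f_end_hz):
    """Return the size of a frequency change in octaves (log base 2 of the ratio)."""
    return math.log2(f_end_hz / f_start_hz)

# Both pairs differ by 200 Hz, but their ratios (1.5 and 2.0) are different.
small = interval_in_octaves(400.0, 600.0)
large = interval_in_octaves(200.0, 400.0)

print(f"400 Hz -> 600 Hz: {small:.3f} octaves")
print(f"200 Hz -> 400 Hz: {large:.3f} octaves")
```

The second pair spans a full octave while the first spans a little more than half an octave, which matches the claim that the 200 Hz to 400 Hz change sounds larger.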
Often it is desirable to synthesize an audio signal so that its perceived pitch follows a specific trajectory. For example, suppose that the pitch should begin at a low frequency, gradually increase to a high frequency, and then gradually decrease back to the original. Furthermore, suppose that you should perceive a uniform rate of change in the frequency.
The screencast video of Figure 1 illustrates two different approaches to this problem, and demonstrates the perceptual effects that result from treating pitch perception as linear instead of logarithmic.
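The two approaches can be sketched roughly as follows. This is not the screencast's implementation, only an illustration under stated assumptions: the function names, the 100 Hz to 1600 Hz range, and the five-point resolution are all hypothetical, and only the rising half of the trajectory is shown (the falling half would be the same list reversed).

```python
def linear_sweep(f_low_hz, f_high_hz, num_steps):
    """Frequencies separated by a constant difference in Hz."""
    step = (f_high_hz - f_low_hz) / (num_steps - 1)
    return [f_low_hz + k * step for k in range(num_steps)]

def logarithmic_sweep(f_low_hz, f_high_hz, num_steps):
    """Frequencies separated by a constant ratio, i.e. equal steps in pitch."""
    ratio = (f_high_hz / f_low_hz) ** (1.0 / (num_steps - 1))
    return [f_low_hz * ratio ** k for k in range(num_steps)]

lin = linear_sweep(100.0, 1600.0, 5)
log = logarithmic_sweep(100.0, 1600.0, 5)

print("linear:     ", [round(f, 1) for f in lin])
print("logarithmic:", [round(f, 1) for f in log])
```

In the linear sweep the first step (100 Hz to 475 Hz) covers more than two octaves while the last step (1225 Hz to 1600 Hz) covers less than half an octave, so the pitch appears to rise quickly and then stall. In the logarithmic sweep every step is exactly one octave.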
Perception of sound intensity is also logarithmic. When you judge one sound to be twice as loud as another, you actually perceive the ratio of the two sound intensities. For example, consider the case of two people talking with one another. You may decide that one person talks twice as loud as the other, and then measure the acoustic power emanating from each person; call these two measurements $I_1$ and $I_2$. Your judgment of relative loudness tracks the ratio $I_2/I_1$, not the difference $I_2 - I_1$.
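The decibel (abbreviated dB) is normally used to describe ratios of acoustic intensity. The decibel is defined in Equation 1:
$$ L_{\mathrm{dB}} = 10 \log_{10}\left(\frac{I_2}{I_1}\right) \qquad (1) $$
where $I_1$ and $I_2$ are the two acoustic intensities being compared.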
Acoustic intensity measures power per unit area, with a unit of watts per square meter. The operative word here is power. When designing or manipulating audio signals, however, you normally think in terms of amplitude. The power of a signal is proportional to the square of its amplitude. Therefore, when considering the ratio of two amplitudes $A_1$ and $A_2$, the ratio in decibels is defined as in Equation 2:
$$ L_{\mathrm{dB}} = 20 \log_{10}\left(\frac{A_2}{A_1}\right) \qquad (2) $$
Can you explain why "10" becomes "20"? Recall that $\log(a^b) = b \log(a)$.
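A short Python sketch can confirm the relationship between the two forms of the decibel. It assumes the standard definitions, 10 times the base-10 logarithm of an intensity ratio and 20 times the base-10 logarithm of an amplitude ratio; the function and argument names are hypothetical.

```python
import math

def intensity_ratio_to_db(i_ref, i):
    """Decibel value of the intensity (power) ratio i / i_ref."""
    return 10.0 * math.log10(i / i_ref)

def amplitude_ratio_to_db(a_ref, a):
    """Decibel value of the amplitude ratio a / a_ref."""
    return 20.0 * math.log10(a / a_ref)

# Doubling the amplitude quadruples the power, because power is
# proportional to amplitude squared. Both views give the same dB value.
db_from_amplitude = amplitude_ratio_to_db(1.0, 2.0)
db_from_intensity = intensity_ratio_to_db(1.0, 2.0 ** 2)

print(f"amplitude x2 : {db_from_amplitude:.2f} dB")
print(f"intensity x4 : {db_from_intensity:.2f} dB")
```

Both lines report about 6.02 dB, which is the "10 becomes 20" rule in action: squaring the ratio inside the logarithm is the same as doubling the multiplier outside it.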
Often it is desirable to synthesize an audio signal so that its perceived intensity will follow a specific trajectory. For example, suppose that the intensity should begin at silence, gradually increase to a maximum value, and then gradually decrease back to silence. Furthermore, suppose that you should perceive a uniform rate of change in intensity.
The screencast video of Figure 2 illustrates two different approaches to this problem, and demonstrates the perceptual effects that result from treating intensity perception as linear instead of logarithmic.
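The difference between the two approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the screencast's method: true silence corresponds to negative infinity in decibels, so the sketch assumes a floor of -60 dB stands in for silence, and the function names and four-point resolution are hypothetical. Only the rising half of the envelope is shown.

```python
def linear_ramp(num_steps):
    """Amplitudes rising from 0 to 1 in equal amplitude steps."""
    return [k / (num_steps - 1) for k in range(num_steps)]

def decibel_ramp(floor_db, num_steps):
    """Amplitudes rising from floor_db up to 0 dB in equal decibel steps."""
    step_db = -floor_db / (num_steps - 1)
    levels_db = [floor_db + k * step_db for k in range(num_steps)]
    # Convert each level back to amplitude: a = 10^(dB / 20).
    return [10.0 ** (level / 20.0) for level in levels_db]

lin = linear_ramp(4)
db = decibel_ramp(-60.0, 4)

print("linear amplitude :", [round(a, 4) for a in lin])
print("equal dB steps   :", [round(a, 4) for a in db])
```

The decibel ramp multiplies the amplitude by the same factor (here 10) at every step, so each step is an equal change in level. The linear ramp makes most of its change in level during the first step and very little afterwards.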
Musical instruments produce sound composed of a fundamental frequency and harmonics or overtones. The relative strength and number of harmonics produced by an instrument is called timbre, a property that allows the listener to distinguish between a violin, an oboe, and a trumpet that all sound the same pitch. See Timbre: The Color of Music for further discussion.
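You have perhaps studied the concept of Fourier series, which states that any periodic signal can be expressed as a sum of sinusoids, where the frequency of each sinusoid is an exact integer multiple of the fundamental frequency; refer to Equation 3:
$$ f(t) = a_0 + \sum_{n=1}^{\infty} a_n \cos(2\pi n f_0 t + \phi_n) \qquad (3) $$
where $f_0$ is the fundamental frequency (in Hz), $n$ denotes the harmonic number, $a_0$ is the DC (constant) offset, and $a_n$ and $\phi_n$ are the amplitude and phase of the $n$th harmonic.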
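A finite version of this sum is the basis of additive synthesis. The sketch below is a minimal Python illustration, assuming a sum of cosines at integer multiples of a fundamental frequency with no constant offset; the function name, the 220 Hz fundamental, and the three harmonic amplitudes are hypothetical choices, not values from this module.

```python
import math

def harmonic_sum(t, f0_hz, amplitudes, phases=None):
    """Evaluate the sum over n of a_n * cos(2*pi*n*f0*t + phi_n), n = 1..N."""
    if phases is None:
        phases = [0.0] * len(amplitudes)
    return sum(
        a * math.cos(2.0 * math.pi * n * f0_hz * t + phi)
        for n, (a, phi) in enumerate(zip(amplitudes, phases), start=1)
    )

f0 = 220.0                 # fundamental frequency in Hz (assumed)
amps = [1.0, 0.5, 0.25]    # amplitudes of harmonics 1, 2, 3 (assumed)
period = 1.0 / f0

# Every harmonic completes a whole number of cycles in one fundamental
# period, so the sum repeats with that period.
t = 0.001
x_now = harmonic_sum(t, f0, amps)
x_next = harmonic_sum(t + period, f0, amps)

print(f"x(t)     = {x_now:.6f}")
print(f"x(t + T) = {x_next:.6f}")
```

Changing the list of amplitudes changes the timbre while leaving the period, and therefore the perceived pitch, unchanged.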
When an instrument produces overtones whose frequencies are essentially integer multiples of the fundamental, you do not perceive all of the overtones as distinct frequencies. Instead, you perceive a single tone; the harmonics fuse together into a single sound. When the overtones follow some other arrangement, you perceive multiple tones. Consider the screencast video in Figure 3 which explains why physical instruments tend to produce overtones at approximately integer multiples of a fundamental frequency.
Musicians broadly categorize combinations of tones as either harmonious (also called consonant) or inharmonious (also called dissonant). Harmonious combinations seem to "fit well" together, while inharmonious combinations can sound "rough" and exhibit beating. The screencast video in Figure 4 demonstrates these concepts using sinusoidal tones played by a synthesizer.
Please refer to the documents Consonance and Dissonance and Harmony for more information.
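Beating between two sinusoidal tones can be illustrated numerically. The sketch below is not taken from the screencast; it assumes two equal-amplitude sine tones and relies on the trigonometric identity sin A + sin B = 2 sin((A + B)/2) cos((A - B)/2). The function names and the 440 Hz and 444 Hz test tones are hypothetical.

```python
import math

def two_tone(t, f1_hz, f2_hz):
    """Sum of two equal-amplitude sine tones."""
    return math.sin(2.0 * math.pi * f1_hz * t) + math.sin(2.0 * math.pi * f2_hz * t)

def beat_form(t, f1_hz, f2_hz):
    """The same signal written as a tone at the average frequency
    multiplied by a slowly varying envelope."""
    envelope = 2.0 * math.cos(math.pi * (f1_hz - f2_hz) * t)
    carrier = math.sin(math.pi * (f1_hz + f2_hz) * t)
    return envelope * carrier

def beat_frequency_hz(f1_hz, f2_hz):
    """Rate at which the loudness rises and falls."""
    return abs(f1_hz - f2_hz)

f1, f2 = 440.0, 444.0
max_error = max(
    abs(two_tone(k / 8000.0, f1, f2) - beat_form(k / 8000.0, f1, f2))
    for k in range(8000)
)

print(f"beat frequency: {beat_frequency_hz(f1, f2):.1f} Hz")
print(f"largest difference between the two forms: {max_error:.2e}")
```

When the two frequencies are only a few hertz apart, the envelope is slow enough to be heard as a periodic swelling of loudness, which is the beating described above.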
A tuning system defines a relatively small number of pitches that can be combined into a wide variety of harmonic combinations; see Tuning Systems for an excellent treatment of this subject.
The vast majority of Western music is based on the tuning system called equal temperament in which the octave interval (a 2:1 ratio in frequency) is equally subdivided into 12 subintervals called semitones.
Consider the 88-key piano keyboard below. Each adjacent pair of keys is one semitone apart (you perhaps are more familiar with the equivalent term half step). Select some pitches and octave numbers and view the corresponding frequency. In particular, try pitches that are an octave apart (e.g., A3, A4, and A5) and note how the frequency doubles as you go towards the higher-frequency side of the keyboard. Also try some single semitone intervals like A0 and A#0, and A7 and A#7.
The frequency values themselves may seem rather mysterious. For example, "middle C" (C4) is 261.6 Hz. Why "261.6" exactly? Would "262" work just as well? Humans can perceive frequency differences of less than 1 Hz, so even a fraction of a hertz can be noticeable. Fortunately, an elegantly simple equation exists to calculate any frequency you like. The screencast video of Figure 5 explains how to derive this equation, which you can use in your own music synthesis algorithms. Watch the video, then try the exercises to confirm that you understand how to use the equation.
What is the frequency seven semitones above concert A (440 Hz)?
659.3 Hz (n=7)
What is the frequency six semitones below concert A (440 Hz)?
311.1 Hz (n=-6)
1 kHz is approximately how many semitones away from concert A (440 Hz)? Hint: $\log_2(x) = \log_{10}(x) / \log_{10}(2)$.
14
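The exercise answers can be checked with a few lines of Python. The equation derived in the screencast is not reproduced in the text, so this sketch assumes the standard equal-temperament relation: each semitone multiplies the frequency by the twelfth root of 2, giving f = 440 × 2^(n/12) for a pitch n semitones away from concert A. The function names are hypothetical.

```python
import math

CONCERT_A_HZ = 440.0

def semitones_to_frequency(n, reference_hz=CONCERT_A_HZ):
    """Frequency n semitones above (n > 0) or below (n < 0) the reference."""
    return reference_hz * 2.0 ** (n / 12.0)

def frequency_to_semitones(f_hz, reference_hz=CONCERT_A_HZ):
    """Signed distance in semitones from the reference to f_hz."""
    return 12.0 * math.log2(f_hz / reference_hz)

print(round(semitones_to_frequency(7), 1))     # seven semitones above A4
print(round(semitones_to_frequency(-6), 1))    # six semitones below A4
print(round(frequency_to_semitones(1000.0)))   # nearest semitone to 1 kHz
print(round(semitones_to_frequency(-9), 1))    # middle C, nine semitones below A4
```

The printed values are 659.3, 311.1, 14, and 261.6, in agreement with the exercise answers and with the frequency of middle C quoted earlier.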
"Developed by Rose Hulman Prof Ed Doering, this collection is a multimedia educational resource for students and faculty that augments traditional DSP courses and courses that cover music […]"