Time Delays for Audio Effects

Module by: Davide Rocchesso

Summary: Delay lines and comb filters are the key elements of many digital audio effects. This module explains what they are and how they are typically implemented.

Most acoustic systems have some component where waves can propagate, such as a membrane, a string, or the air in an enclosure. If propagation in these media is ideal, i.e., free of losses, dispersion, and nonlinearities, it can be simulated by delay lines. A delay line is a linear time-invariant, single-input single-output system whose output signal is a copy of the input signal delayed by $\tau$ seconds. In continuous time, the frequency response of such a system (see footnote 1) is

$$H_{D_s}(i\Omega) = e^{-i\Omega\tau}$$
(1)

Equation 1 tells us that the magnitude response is unity, and that the phase is linear with slope $-\tau$.

The Circular Buffer

A discrete-time realization of Equation 1 is given by a system that implements the transfer function

$$H_D(z) = z^{-\tau F_s} = z^{-m}$$
(2)
where $m$ is the number of samples of delay. When the delay $\tau$ is an integer multiple of the sampling period $1/F_s$, $m$ is an integer and Equation 2 is straightforward to implement by means of a memory buffer.

In fact, an $m$-sample delay line can be implemented by means of a circular buffer, that is, a set of $M$ contiguous memory cells accessed by a write pointer IN and a read pointer OUT, such that

$$\mathrm{IN} = (\mathrm{OUT} + m) \mathbin{\%} M$$
(3)
where the symbol % denotes the remainder of the division by $M$. At each sampling instant, the input is written to the location pointed to by IN, the output is taken from the location pointed to by OUT, and the two pointers are updated with
$$\mathrm{IN} = (\mathrm{IN} + 1) \mathbin{\%} M, \qquad \mathrm{OUT} = (\mathrm{OUT} + 1) \mathbin{\%} M$$
(4)
In words, the pointers are incremented respecting the circularity of the buffer.
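The pointer arithmetic of Equations 3 and 4 can be sketched in a few lines of Python. This is only an illustration: the class name `DelayLine`, the method `tick`, and the default choice of $M = m + 1$ cells are assumptions made here, not part of the original module. The sketch requires $M > m$, so that the cell being read is never the one that has just been overwritten.

```python
class DelayLine:
    """Integer-length delay line on a circular buffer (Equations 3 and 4)."""

    def __init__(self, m, size=None):
        # M cells in total; assumed default is the smallest usable size, m + 1.
        self.M = size if size is not None else m + 1
        if not 0 <= m < self.M:
            raise ValueError("delay m must satisfy 0 <= m < M")
        self.buf = [0.0] * self.M
        self.out = 0                          # read pointer OUT
        self.inp = (self.out + m) % self.M    # write pointer IN, Equation 3

    def tick(self, x):
        """Write one input sample and return the sample written m ticks ago."""
        self.buf[self.inp] = x
        y = self.buf[self.out]
        self.inp = (self.inp + 1) % self.M    # Equation 4
        self.out = (self.out + 1) % self.M
        return y


line = DelayLine(3)
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = [line.tick(x) for x in impulse]   # the impulse reappears 3 samples later
```

Feeding a unit impulse is a quick way to check the delay length: the single nonzero output sample should appear exactly $m$ ticks after the input.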

In some architectures dedicated to sound processing, memory organization is optimized for wavetable synthesis, where a stored waveform is read with variable increments of the reading pointer. In these architectures, $2^r$ memory locations are available, and $M = 2^s$ of them (with $s < r$) are chosen, uniformly spaced, among the $2^r$ available cells. In this case the locations of the circular buffer are not contiguous, and the update of the pointers is done with the operations

$$\mathrm{IN} = (\mathrm{IN} + 2^{r-s}) \mathbin{\%} 2^r, \qquad \mathrm{OUT} = (\mathrm{OUT} + 2^{r-s}) \mathbin{\%} 2^r$$
(5)

In practice, since the addresses are $r$ bits long, there is no need to compute the modulo explicitly. It is sufficient to do the sum neglecting any possible overflow.

Of course, Equation 3 is also replaced by

$$\mathrm{IN} = (\mathrm{OUT} + m\,2^{r-s}) \mathbin{\%} 2^r$$
(6)
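As a hedged illustration of Equations 5 and 6, the following Python sketch emulates $r$-bit address arithmetic with a bit mask, which plays the role of "neglecting the overflow". The values $r = 8$ and $s = 4$ and the helper names `advance` and `write_pointer` are arbitrary choices made for this example.

```python
R, S = 8, 4                  # assumed sizes: r = 8 address bits, M = 2**s = 16 cells
STRIDE = 1 << (R - S)        # 2**(r-s): address spacing between cells of the line
MASK = (1 << R) - 1          # keeping only r bits is the same as taking % 2**r


def advance(pointer):
    """Equation 5: step to the next cell, discarding any overflow bit."""
    return (pointer + STRIDE) & MASK


def write_pointer(out, m):
    """Equation 6: IN is m cells, i.e. m * 2**(r-s) addresses, ahead of OUT."""
    return (out + m * STRIDE) & MASK


out = 0
visited = []
for _ in range(1 << S):      # one full trip around the circular buffer
    visited.append(out)
    out = advance(out)
```

After $2^s$ steps the pointer is back where it started, having touched every $2^{r-s}$-th address once.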

Fractional-Length Delay Lines

It might be thought that, by choosing a sufficiently high sampling rate, it is always possible to use delay lines having an integer number of samples. In fact, there are good reasons why this is not the case in sound synthesis and processing. In sound synthesis, the models have to be carefully tuned without resorting to very high sample rates. In particular, it is easy to verify that using integer-length delays in physical models we get errors in fundamental frequencies that go well beyond the just noticeable difference in pitch (see footnote 2).

For instance, for a pressure wave propagating in air at normal temperature conditions, the spatial discretization given by the sampling rate $F_s = 44100$ Hz gives intervals of 0.0075 m, a distance that can produce clearly perceivable pitch differences in a wind instrument. Another reason for using fractional delays is that we often want to vary the delay lengths continuously, in order to reproduce effects such as glissando or vibrato. The adoption of integer-length delays would produce annoying discontinuities.

The most widely used techniques for implementing fractional delays are interpolation by FIR filters or by allpass filters. These two techniques are, in some sense, complementary. The choice between them has to be made according to the peculiarities of the system to be simulated or of the architecture chosen for the implementation. In any case, a delay of length $m$ is obtained by means of a delay line whose length is equal to the integer part of $m$, cascaded with a block capable of approximating a constant phase delay equal to the fractional part of $m$.

We recall that the phase delay at a given frequency $\omega$ is the delay, in samples, experienced by the sinusoidal component at frequency $\omega$. For instance, consider a linear filtering block enclosed in a feedback loop (see Section 6): the frequency $f_k$ of the $k$-th resonance of the whole feedback system is found at the points where the phase response equals a multiple of $2\pi$. At these frequencies, the components reappear in phase at every round trip in the loop, thus reinforcing their amplitude at the output. The phase delay at frequency $f_k$ is therefore the effective delay length at that frequency, that is, the length of an ideal (linear phase) delay line that gives the same $k$-th resonance. Figure 1 shows a phase curve and its crossings with multiples of $2\pi$, giving a distribution of resonances.
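To make the tuning argument of this section concrete, the short Python sketch below rounds the ideal loop length $F_s / f_0$ to the nearest integer and reports the resulting pitch error in cents, using the relation $f_0 = F_s / m$ from footnote 2. The function name and the two test frequencies are assumptions chosen for illustration.

```python
import math


def integer_delay_error_cents(f0, fs=44100.0):
    """Pitch error, in cents, when the ideal loop length fs / f0 is rounded."""
    m = round(fs / f0)           # nearest integer delay length, in samples
    actual = fs / m              # fundamental actually produced by the loop
    return 1200.0 * math.log2(actual / f0)


error_440 = integer_delay_error_cents(440.0)     # ideal length is about 100.2 samples
error_3000 = integer_delay_error_cents(3000.0)   # ideal length is 14.7 samples
```

The error grows as the loop gets shorter, because one sample then represents a larger fraction of the period: here it is a few cents at 440 Hz and tens of cents at 3 kHz.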

FIR Interpolation Filters

The easiest and most intuitive way to obtain a variable-length delay is to linearly interpolate the output of the line with the content of its preceding cell in the memory buffer. This corresponds to using the first-order FIR filter

$$H_l(z) = c_0 + c_1 z^{-1}$$
(7)
Figure 1: Graphical construction to find the series of resonances produced by a linear block in a feedback loop. The slope of the dashed lines indicates the phase delay at each resonance frequency.
Figure 1 (fase.gif)

Given a certain phase delay

$$\tau_{ph_0} = \frac{1}{\omega_0} \arctan\left(\frac{c_1 \sin\omega_0}{c_0 + c_1 \cos\omega_0}\right)$$
(8)
that has to be obtained at a given frequency $\omega_0$, the following formulas give the coefficient values:
$$c_0 + c_1 = 1, \qquad c_1 = \frac{1}{1 + \dfrac{\sin\omega_0}{\tan(\tau_{ph_0}\,\omega_0)} - \cos\omega_0} \approx \tau_{ph_0}$$
(9)
where the approximation is valid in the low-frequency range. The first part of Equation 9 is needed in order to normalize the low-frequency response to one. In the special case $c_0 = c_1 = 1/2$ (averaging filter) the phase is linear and the delay is half a sample. Unfortunately, the magnitude response of this interpolator is lowpass with a zero at the Nyquist frequency. Figure 2 shows the magnitude, phase, and phase delay responses for several first-order linear interpolators. We can see that the phase is linear in most of the audio range, but the magnitude varies from allpass to lowpass with a zero at the Nyquist frequency. When the interpolator is inserted within a feedback loop, its lowpass behavior can be treated as an additional frequency-dependent loss, which should be taken into account in some way.
Figure 2: Magnitude, phase, and phase delay responses of a linear interpolation filter $(1-\alpha) + \alpha z^{-1}$ for values of $\alpha$ indexed by $k = 0, \ldots, 16$
Figure 2 (linintresp.gif)
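Equations 8 and 9 can be checked numerically with a small Python sketch. The function names and the example values (a delay of 0.3 samples at 1 kHz, with an assumed 44.1 kHz sampling rate) are illustrative choices; the sketch assumes a fractional delay strictly between 0 and 1.

```python
import cmath
import math


def linear_interp_coeffs(tau, w0):
    """Equation 9: coefficients giving phase delay tau (samples) at w0 (rad/sample)."""
    c1 = 1.0 / (1.0 + math.sin(w0) / math.tan(tau * w0) - math.cos(w0))
    return 1.0 - c1, c1


def phase_delay(c0, c1, w):
    """Phase delay of c0 + c1 * z**-1, computed from its frequency response."""
    h = c0 + c1 * cmath.exp(-1j * w)
    return -cmath.phase(h) / w


w0 = 2.0 * math.pi * 1000.0 / 44100.0      # 1 kHz at the assumed sampling rate
c0, c1 = linear_interp_coeffs(0.3, w0)     # c1 comes out close to 0.3
achieved = phase_delay(c0, c1, w0)         # matches the requested delay at w0
gain_nyquist = abs(c0 + c1 * cmath.exp(-1j * math.pi))   # equals |c0 - c1|
```

The Nyquist gain $|c_0 - c_1|$ shows the lowpass effect discussed above: it drops to zero for the averaging filter $c_0 = c_1 = 1/2$.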

Interpolation filters can be of order higher than the first. We can do quadratic, cubic, or other polynomial interpolations. In general, the problem of designing an interpolator can be turned into the design of an $l$-th order FIR filter approximating a constant-magnitude and linear-phase frequency response. Several criteria can be adopted to drive the approximation problem. One approach is to impose that the first $L$ derivatives of the error function be zero at zero frequency. In this way we obtain maximally-flat filters whose coefficients are the same as those used in Lagrange interpolation, as taught in numerical analysis courses. For a thorough treatment of interpolation filters we suggest reading the article [1]. Here we only point out that using high orders allows us to keep the magnitude response close to unity and the phase response close to linear in a wide frequency band. Of course, this is paid for in terms of computational complexity.

In special architectures, where the access to delay lines is governed by Equation 5 and Equation 6, linear interpolation is implemented very efficiently by using the $r - s$ bits that are not used to access the $2^s$-sample delay line. In fact, if the address is computed using $r$ bits, the $r - s$ least significant bits represent the fractional part of the delay or, equivalently, the coefficient $c_1$ of the interpolator. Therefore, it is sufficient to access two consecutive delay cells and keep the values $c_0$ and $c_1 = 1 - c_0$ in two registers. The implementation of a glissando with these architectures is immediate and free from complications.
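For the maximally-flat (Lagrange) design mentioned in this section, the taps can be computed with the classical Lagrange product formula. The Python sketch below is a minimal illustration under that assumption; the function name `lagrange_fd_coeffs` is hypothetical and the formula is the textbook one, not code taken from the module or from [1].

```python
def lagrange_fd_coeffs(delay, order):
    """FIR taps h[0..order] of the Lagrange interpolator for a delay in samples."""
    taps = []
    for k in range(order + 1):
        h = 1.0
        for j in range(order + 1):
            if j != k:
                h *= (delay - j) / (k - j)    # Lagrange basis polynomial at `delay`
        taps.append(h)
    return taps


linear = lagrange_fd_coeffs(0.3, 1)   # order 1 reduces to linear interpolation
cubic = lagrange_fd_coeffs(1.3, 3)    # four taps, delay placed between taps 1 and 2
```

For order 1 the taps are $c_0 = 1 - d$ and $c_1 = d$, i.e. the linear interpolator of Equation 7. Higher orders behave best when the total delay is kept near the middle of the tap span, which is why the cubic example uses 1.3 rather than 0.3.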

Allpass Interpolation Filters

Another widely used technique to obtain the fractional part of a desired delay length makes use of unit-magnitude IIR filters, i.e., allpass filters. Since the magnitude of these filters is constant, there is no frequency-dependent attenuation, a property that FIR interpolators cannot ensure. The simplest allpass filter has order one, and it has the following transfer function:

$$H_a(z) = \frac{c + z^{-1}}{1 + c\,z^{-1}}$$
(10)

In order to make sure that the filter is stable, the coefficient $c$ has to stay within the unit circle. Moreover, if we stick with real coefficients, $c$ belongs to the real axis. The phase delay given by the filter of Equation 10 is shown in Figure 3 for several values of the coefficient $c$. It is clear that the phase delay is not as flat as in the case of the FIR interpolator, depicted in Figure 2.

Figure 3: Phase response and phase delay of a first-order allpass filter for the values of the coefficient $c = 1.998\,k/17 - 0.999$, $k = 0, \ldots, 16$.
Figure 3 (apresp.gif)

It is easy to verify that, at frequencies close to dc, the phase response of Equation 10 takes the approximate form

$$\angle H(\omega) \approx -\frac{\sin\omega}{c + \cos\omega} + \frac{c\,\sin\omega}{1 + c\cos\omega} \approx -\omega\,\frac{1-c}{1+c}$$
(11)
where the first approximation is obtained by replacing each arctangent with its argument and the second approximation, valid in an even smaller neighborhood, is obtained by approximating $\sin x$ with $x$ and $\cos x$ with 1. The phase and group delay around dc are
$$\tau_{ph}(\omega) \approx \tau_{gr}(\omega) \approx \frac{1-c}{1+c}$$
(12)

Therefore, the filter coefficient $c$ can be easily determined from the desired low-frequency delay as

$$c = \frac{1 - \tau_{ph_0}}{1 + \tau_{ph_0}}$$
(13)
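A quick numerical check of Equations 10 to 13 is sketched below in Python. The function names and the test delay of 0.5 samples are assumptions made for this example; frequencies are expressed in radians per sample, so $2\pi \cdot 0.2$ corresponds to $F_s/5$.

```python
import cmath
import math


def allpass_coeff(tau):
    """Equation 13: coefficient for a low-frequency delay of tau samples."""
    return (1.0 - tau) / (1.0 + tau)


def allpass_response(c, w):
    """Frequency response of Equation 10 at w rad/sample."""
    z1 = cmath.exp(-1j * w)                 # z**-1 evaluated on the unit circle
    return (c + z1) / (1.0 + c * z1)


def allpass_phase_delay(c, w):
    """Phase delay, in samples, of the first-order allpass at w rad/sample."""
    return -cmath.phase(allpass_response(c, w)) / w


c = allpass_coeff(0.5)                                  # (1 - 0.5) / (1 + 0.5) = 1/3
low = allpass_phase_delay(c, 2.0 * math.pi * 0.001)     # near dc: close to 0.5
fifth = allpass_phase_delay(c, 2.0 * math.pi * 0.2)     # at Fs/5: drifts above 0.5
magnitude = abs(allpass_response(c, 0.7))               # unity at any frequency
```

Comparing `low` and `fifth` gives a feel for how far the delay has drifted by $F_s/5$, while `magnitude` confirms that no attenuation is introduced.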

Figure 3 shows that the delay of the allpass filter is approximately constant only in a narrow frequency range. We can reasonably assume that such a range, for positive values of $c$ smaller than one, extends from 0 to $F_s/5$. With $F_s = 50$ kHz we see that at $F_s/5 = 10$ kHz we have an error of about 0.05 samples. In a note at that frequency produced by a feedback delay line, such an error produces a pitch deviation smaller than 1%. For lower fundamental frequencies, such as those found in actual musical instruments, the error is smaller than the just noticeable difference measured with slow pitch modulations.

While the first-order filter is an elegant and efficient solution to the problem of tuning a delay line, it also has the relevant side effect of detuning the upper partials, due to its marked phase nonlinearity. Such detuning can be tolerated in most cases, but has to be taken into account in some other contexts. If a phase response closer to linear is needed, we can use higher-order allpass filters [1]. In some cases, especially in sound synthesis by physical modeling, a specific inharmonic distribution of resonances has to be approximated. This can be obtained by designing allpass filters that approximate a given phase response along the whole frequency axis. In these cases the problem of tuning is superseded by the more difficult problem of accurate partial positioning [2].

With allpass interpolators it is more complicated to handle continuous delay length variations, since the recursive structure of the filter does not offer an obvious way of transferring memory cells to and from the delay line, as was possible with the FIR interpolator, which is built on the delay line by means of a certain number of taps. In fact, the glissando can be implemented with the allpass filter by adding a new cell to the delay line whenever the filter coefficient becomes one and, at the same time, zeroing out the filter state variable and the coefficient. What is really more complicated with allpass filters is handling sudden variations of the delay length, as found, for instance, when a finger hole is opened in a wind instrument. In this case, the recursive nature of allpass filters causes annoying transients in the output signal. Ad hoc structures have been devised to cancel these transients [1].

The Non-Recursive Comb Filter

Sounds propagating in the air come into contact with surfaces and objects of various kinds, and this interaction produces physical phenomena such as reflection, refraction, and diffraction. A simple and very important phenomenon is the reflection of sound off a planar surface. Due to such a reflection, a listener receives two copies of the same signal, delayed with respect to each other. If the delay is larger than about a hundred milliseconds, the second copy is perceived as a distinct echo, while if the delay is smaller than about ten milliseconds, the effect of a single reflection is perceived as a spectral coloration.

A simple model of a single reflection can be constructed starting from the basic blocks described in this and in the preceding chapters. It is constructed as an $m$-sample delay line, with the possible fractional part of $m$ obtained by FIR interpolation or allpass filtering, cascaded with an attenuation coefficient $g$, possibly replaced by a filter if a frequency-dependent absorption has to be simulated. The output of this lossy delay line is summed to the direct signal. Let us analyze the structure in the case that $m$ is an integer and $g$ is a positive constant not exceeding 1. The difference equation is expressed as

$$y(n) = x(n) + g\,x(n - m)$$
(14)
and, therefore, the transfer function is
$$H(z) = 1 + g\,z^{-m}$$
(15)

In the case that $g = 1$, it is easy to see by using the De Moivre formula that the frequency response of the comb filter has the following magnitude and group delay:

$$|H(\omega)| = \sqrt{2\,(1 + \cos(\omega m))}, \qquad \tau_{gr,H}(\omega) = \frac{m}{2}$$
(16)
and it is straightforward to verify that the frequency band ranging from dc to the sampling rate $F_s$ comprises $m$ zeros (antiresonances), equally spaced by $F_s/m$ Hz, so that about half of them fall between dc and the Nyquist frequency. The phase response is piecewise linear with discontinuities of $\pi$ at the odd multiples of $F_s/(2m)$. If $g < 1$, it is easy to see that the amplitude of the resonances is
$$P = 1 + g$$
(17)
while the amplitude of the points of minimum (halfway between contiguous resonances) is
$$V = 1 - g$$
(18)

An important parameter of this filtering structure, called non-recursive comb filter (or FIR comb), is the peak-to-valley ratio

$$\frac{P}{V} = \frac{1 + g}{1 - g}$$
(19)

Figure 4 shows the response of a non-recursive comb filter having length $m = 11$ samples and a reflection attenuation $g = 0.9$. The shape of the frequency response justifies the name comb given to the filter.

Figure 4: Magnitude of the frequency response of the comb FIR filter having coefficient $g = 0.9$ and delay length $m = 11$
Figure 4 (combfirresp.gif)
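The values of Equations 17 to 19 for the filter of Figure 4 can be reproduced with the following Python sketch, which evaluates Equation 15 on the unit circle and also runs the difference Equation 14 on a unit impulse. The function names are illustrative assumptions.

```python
import cmath
import math


def fir_comb(x, m, g):
    """Equation 14: y(n) = x(n) + g * x(n - m), with zero initial conditions."""
    return [x[n] + (g * x[n - m] if n >= m else 0.0) for n in range(len(x))]


def fir_comb_magnitude(w, m, g):
    """Magnitude of Equation 15 at w rad/sample."""
    return abs(1.0 + g * cmath.exp(-1j * w * m))


m, g = 11, 0.9
peak = fir_comb_magnitude(2.0 * math.pi / m, m, g)   # resonance at Fs/m: 1 + g
valley = fir_comb_magnitude(math.pi / m, m, g)       # midway between dc and Fs/m: 1 - g
impulse_response = fir_comb([1.0] + [0.0] * 15, m, g)
```

With $g = 0.9$ the peak-to-valley ratio is $1.9 / 0.1 = 19$, and the impulse response consists of the direct sample followed, $m$ samples later, by a copy scaled by $g$.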

The zeros of the comb filter are evenly distributed along a circle of radius $g^{1/m}$ (the unit circle when $g = 1$), at the $m$-th roots of $-g$, as shown in Figure 5.

Figure 5: Zeros and poles of an FIR comb filter
Figure 5 (pzcombfir.gif)

The Recursive Comb Filter

A simple model of a one-dimensional resonator can be constructed using the basic blocks presented in this and in the preceding chapters. It is composed of an $m$-sample delay line, with the possible fractional part of $m$ obtained by FIR interpolation or allpass filtering, in a feedback loop with an attenuation coefficient $g$, possibly replaced by a filter in order to give different decay times at different frequencies. Let us analyze the whole filtering structure in the case that $m$ is an integer and $g$ is a positive constant not exceeding 1. The difference equation is expressed as

$$y(n) = x(n - m) + g\,y(n - m)$$
(20)
and the transfer function is
$$H(z) = \frac{z^{-m}}{1 - g\,z^{-m}}$$
(21)

Whenever $g < 1$, stability is ensured. In the case that $g = 1$, the frequency response of the filter has the following magnitude and group delay:

$$|H(\omega)| = \frac{1}{2\,|\sin(\omega m / 2)|}, \qquad \tau_{gr,H}(\omega) = \frac{m}{2}$$
(22)
and it is easy to verify that the frequency band ranging from dc to the sampling rate $F_s$ comprises $m$ vertical asymptotes (resonances), equally spaced by $F_s/m$ Hz, so that about half of them fall between dc and the Nyquist frequency. If $g = 1$ the filter is at the limit of stability, and this is the only case in which the phase response is piecewise linear, starting with the value $-\pi/2$ at dc, with discontinuities of $\pi$ at the even multiples of $F_s/(2m)$. If $g < 1$, it is easy to verify that the amplitude of the resonances is
$$P = \frac{1}{1 - g}$$
(23)
while the amplitude of the points of minimum (halfway between contiguous resonances) is
$$V = \frac{1}{1 + g}$$
(24)

An important parameter of this filtering structure, called recursive comb filter (or IIR comb), is the peak-to-valley ratio

$$\frac{P}{V} = \frac{1 + g}{1 - g}$$
(25)

Figure 6 shows the frequency response of a recursive comb filter having a delay line of $m = 11$ samples and feedback attenuation $g = 0.9$. The shape of the magnitude response justifies the name comb given to the filter.

Figure 6: Magnitude and phase delay response of the recursive comb filter having coefficient $g = 0.9$ and delay length $m = 11$
Figure 6 (combresp.gif)
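The same kind of check can be done for the recursive comb. The Python sketch below, with hypothetical function names, runs the difference Equation 20 on a unit impulse and evaluates Equation 21 at a resonance and at a valley for the filter of Figure 6.

```python
import cmath
import math


def iir_comb(x, m, g):
    """Equation 20: y(n) = x(n - m) + g * y(n - m), with zero initial conditions."""
    y = [0.0] * len(x)
    for n in range(m, len(x)):
        y[n] = x[n - m] + g * y[n - m]
    return y


def iir_comb_magnitude(w, m, g):
    """Magnitude of Equation 21 at w rad/sample."""
    zm = cmath.exp(-1j * w * m)               # z**-m on the unit circle
    return abs(zm / (1.0 - g * zm))


m, g = 11, 0.9
peak = iir_comb_magnitude(0.0, m, g)              # resonance at dc: 1 / (1 - g)
valley = iir_comb_magnitude(math.pi / m, m, g)    # midway to the next one: 1 / (1 + g)
echoes = iir_comb([1.0] + [0.0] * 39, m, g)       # impulses at n = 11, 22, 33, ...
```

The impulse response is a train of echoes spaced by $m$ samples, each one scaled by $g$ with respect to the previous one, which is the time-domain counterpart of the resonances.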

The poles of the comb filter are evenly distributed along a circle of radius $g^{1/m}$ (which approaches the unit circle as $g$ approaches 1), at the $m$-th roots of $g$, as shown in Figure 7.

Figure 7: Zeros and poles of an IIR comb filter
Figure 7 (pzcombiir.gif)

In sound synthesis by physical modeling, a recursive comb filter can be interpreted as a simple model of a lossy one-dimensional resonator, like a string or a tube. This model can be used to simulate several instruments whose resonator is not persistently excited. In fact, if the input is a short burst of filtered noise, we obtain the basic structure of the plucked string synthesis algorithm due to Karplus and Strong [3].
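As a rough sketch of this idea, the Python code below excites the recursive comb of Equation 20 with a burst of $m$ noise samples. It only illustrates the basic structure described above: the noise here is unfiltered, the loop contains a plain gain $g$ rather than a frequency-dependent filter, and the function name, seed, and parameter values are assumptions made for the example.

```python
import random


def plucked_string(m, g, n_samples, seed=0):
    """Recursive comb (Equation 20) excited by a burst of m noise samples."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    burst = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    x = burst + [0.0] * (n_samples - m)            # excitation: noise, then silence
    y = [0.0] * n_samples
    for n in range(m, n_samples):
        y[n] = x[n - m] + g * y[n - m]             # each period is the previous one times g
    return y


m, g = 100, 0.99          # at an assumed Fs = 44100 Hz the fundamental is Fs / m = 441 Hz
tone = plucked_string(m, g, 4 * 44100 // 10)       # 0.4 seconds at that rate
```

Once the burst is over, the output repeats with period $m$ and decays by a factor $g$ per period, which is what gives the sound a pitch of $F_s / m$ and a finite decay time.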

The Comb-Allpass Filter

The filter given by the difference Equation 20 has a frequency response characterized by evenly-distributed resonances.

With a slight modification of its structure, such a filter can be made allpass. In other words, the magnitude response of the filter can be made flat even though the impulse response remains almost the same as that of Equation 20. The modification is just a direct path connecting the input of the delay line to the filter output, as depicted in Figure 8.

Figure 8: Allpass comb filter
Figure 8 (combap.gif)

It is easy to see that the transfer function of the filter of Figure 8, called the allpass comb filter, can be written as

$$H(z) = \frac{-g + z^{-m}}{1 - g\,z^{-m}}$$
(26)
which has the structure of an allpass filter. It is interesting to note that the direct path introduces a nonzero sample at time instant zero in the impulse response. All the following samples are just a scaled version of those of the impulse response of the comb filter, with a scaling factor equal to $1 - g^2$. The time properties, such as the decay time, are substantially unchanged. The allpass comb filter does not introduce any coloration in stationary signals. On the other hand, its effect is evident on signals exhibiting rapid transients, and for these signals we cannot state that the filter is transparent.
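The two properties just stated, a flat magnitude response and an impulse response equal to that of the comb scaled by $1 - g^2$ after the first sample, can be verified with a short Python sketch. The function names are hypothetical, and the difference equation used is the one that follows directly from Equation 26.

```python
import cmath


def allpass_comb(x, m, g):
    """y(n) = -g x(n) + x(n - m) + g y(n - m), the difference form of Equation 26."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        delayed_x = x[n - m] if n >= m else 0.0
        delayed_y = y[n - m] if n >= m else 0.0
        y[n] = -g * x[n] + delayed_x + g * delayed_y
    return y


def allpass_comb_magnitude(w, m, g):
    """Magnitude of Equation 26 at w rad/sample."""
    zm = cmath.exp(-1j * w * m)               # z**-m on the unit circle
    return abs((-g + zm) / (1.0 - g * zm))


m, g = 11, 0.9
h = allpass_comb([1.0] + [0.0] * 24, m, g)    # -g at n = 0, then (1 - g**2) * g**k
flat = [allpass_comb_magnitude(w, m, g) for w in (0.1, 0.5, 1.0, 2.0, 3.0)]
```

Every entry of `flat` is 1 up to rounding, while the impulse response still shows echoes every $m$ samples that decay by a factor $g$.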

Audio Effects Based on Delay Lines

Many of the effects commonly used in electroacoustic music are obtained by composition of time-varying delay lines, i.e., lines whose length is modulated by slowly-varying signals. In order to avoid discontinuities in the signals, it is necessary to interpolate the delay lines in some way. Interpolation by means of allpass filters is applicable only for very slow or narrow-width modulations, since sudden changes in the state of allpass filters give rise to transients that can be perceived as signal distortions [4]. On the other hand, linear (or, more generally, polynomial) interpolation introduces frequency-dependent losses whose magnitude depends on the fractional length of the delay line. As the delay length is varied, these variable losses give an amplitude distortion due to amplitude modulation of the various frequency components. Coupled to amplitude modulation, there is also phase modulation due to the phase nonlinearity of the interpolator, with both FIR and IIR interpolation.

The terminology used for audio effects is not consistent, as terms such as flanger, chorus, and phaser are often associated with a large variety of effects that can be quite different from each other. A flanger is usually defined as an FIR comb filter whose delay length is sinusoidally modulated between a minimum and a maximum value. This has the effect of expanding and contracting the harmonic series of notches of the frequency response. The name flanger derives from the old practice, used long ago in analog recording studios, of alternately slowing down two tape recorders or two turntables playing the same music track by pressing a finger on the flanges.

The name phaser is most often reserved for structures similar to the comb FIR filter, with the difference that the notches are not harmonically distributed. Orfanidis [5] proposes replacing the delay line with a set of parametric notch filters, each controllable in its frequency position and width. Smith [6] proposes instead replacing the delay line with a large allpass filter. If this allpass filter is obtained as a cascade of second-order allpass sections, it becomes possible to control and modulate the position of each pole pair, and each pole pair corresponds to a single notch of the overall response.

A common feature of flangers and phasers is the relatively large distance between the notches. Conversely, if the notches are very dense, the term chorus is preferred. Orfanidis [5] suggests implementing a chorus as a parallel connection of FIR comb filters, where the delay lengths are randomly modulated around values that are slightly different from each other. This should simulate the deviations in time and pitch that are found in performances of a choir singing in unison. Dattorro [4], on the other hand, says that a chorus can be obtained with the same structure used for the flanger, the difference being that the delay lengths have to be set to larger values than for the flanger. In this way, the notches are made more dense. For the flanger the suggested nominal delay is 1 ms and for the chorus it is 5 ms. If the objective is to recreate the effect of a choir singing in unison, having many notches in the spectrum is generally disliked. Dattorro [4] proposes a partial solution that makes use of a recursive allpass filter, where the delay line is read by two pointers: one is kept fixed and produces the feedback signal, the other is varied to pick up the signal that is fed directly to the output. In this way, when both pointers are at the nominal position, the structure does not introduce any coloration for stationary signals.

A final remark is reserved for the spatialization of these comb-based effects. In general, flanging, phasing, and chorusing effects can be obtained from two different time-varying allpass chains, whose outputs feed different loudspeakers [6]. In this case, sums and subtractions between signals at the different frequencies happen in the air, in a way that depends on position. Therefore, the spatial sensation is largely due to the different spectral coloration found at different points of the listening area.
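The usual definition of a flanger given above (an FIR comb whose delay is sinusoidally modulated) can be sketched as follows, using linear interpolation to read the fractional delay. This is a minimal offline illustration, not a reference implementation: the function name, the 1 to 3 ms sweep range, the 0.5 Hz modulation rate, and $g = 0.7$ are all assumptions, and the whole input list is used as delay memory instead of a circular buffer.

```python
import math


def flanger(x, fs, min_delay_ms=1.0, max_delay_ms=3.0, rate_hz=0.5, g=0.7):
    """FIR comb with sinusoidally swept delay, read by linear interpolation."""
    lo = min_delay_ms * 1e-3 * fs                 # shortest delay, in samples
    hi = max_delay_ms * 1e-3 * fs                 # longest delay, in samples
    y = []
    for n, sample in enumerate(x):
        lfo = 0.5 * (1.0 + math.sin(2.0 * math.pi * rate_hz * n / fs))
        delay = lo + (hi - lo) * lfo              # fractional delay in samples
        i = int(delay)                            # integer part of the delay
        frac = delay - i                          # fractional part, used as c1
        a = x[n - i] if n - i >= 0 else 0.0
        b = x[n - i - 1] if n - i - 1 >= 0 else 0.0
        delayed = (1.0 - frac) * a + frac * b     # c0 * x(n - i) + c1 * x(n - i - 1)
        y.append(sample + g * delayed)            # Equation 14 with a moving delay
    return y


dc_in = [1.0] * 2000
dc_out = flanger(dc_in, fs=8000.0)                # a constant input settles at 1 + g
```

Setting `min_delay_ms` equal to `max_delay_ms` freezes the sweep and turns the structure back into the fixed FIR comb, which is a convenient way to test it.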

A comprehensive book on the topic of digital audio effects was edited by Zölzer [7].

Footnotes

  1. The subscript $s$ is used here for continuous-time systems.
  2. To figure this out, the reader can consider an $m$-sample delay line in a feedback loop. It gives a harmonic series of partials whose fundamental is $f_0 = F_s/m$ (see Section 6). The set of integer delay lengths that give the best approximation to a tempered scale can be found and the curve of fundamental frequency errors can be drawn.

References

  1. Timo I. Laakso, Vesa Välimäki, Matti Karjalainen, and Unto K. Laine. (1996, Jan). Splitting the Unit Delay - Tools for Fractional Delay Filter Design. IEEE Signal Processing Magazine.
  2. Davide Rocchesso and Francesco Scalcon. (1999, Sep). Bandwidth of perceived inharmonicity for physical modeling of dispersive strings. IEEE Transactions on Speech and Audio Processing.
  3. K. Karplus and A. Strong. (1983). Digital Synthesis of Plucked String and Drum Timbres. Computer Music J.
  4. Jon Dattorro. (1997, Oct). Effect design - part 2: Delay-line modulation and chorus. J. Audio Eng. Soc.
  5. S. J. Orfanidis. (1996). Introduction to Signal Processing. Prentice Hall.
  6. Julius O. Smith. (1984). An allpass approach to digital phasing and flanging. [http://ccrma.stanford.edu/STANM/stanms/stanm21/index.html]. In Proc. International Computer Music Conference - Also available as Rep. STAN-M-21, CCRMA, Stanford University.
  7. Udo Zölzer (Ed.). (2002). DAFX - Digital Audio Effects. John Wiley and Sons.
