Most acoustic systems have some component where waves can propagate, such as a membrane, a string, or the air in an enclosure.
If propagation in these media is ideal, i.e., free of losses, dispersion, and nonlinearities, it can be simulated by delay lines.
A delay line is a linear time-invariant, single-input single-output system whose output signal is a copy of the input signal delayed by $\tau$ seconds. In continuous time, the frequency response of such a system is
$$H_{D_s}(i\Omega) = e^{-i\Omega\tau} \tag{1}$$
Equation 1 tells us that the magnitude response is
unity, and that the phase is linear with slope $-\tau$.
A discrete-time realization of Equation 1 is given by a system that implements the transfer function
$$H_D(z) = z^{-\tau F_s} \triangleq z^{-m} \tag{2}$$
where $m$ is the number of samples of delay. When the delay $\tau$ is an integer multiple of the sampling period, $m$ is an integer and it is straightforward
to implement
Equation 2 by means of a memory buffer.
In fact, an $m$-sample delay line can be implemented by means of a circular buffer, that is, a set of $M$ contiguous memory cells accessed by a write pointer $IN$ and a read pointer $OUT$, such that
$$IN = (OUT + m) \,\%\, M \tag{3}$$
where the symbol $\%$ denotes the remainder of the integer division by $M$ (the modulo operation). At each sampling instant, the input is written in the location pointed to by $IN$, the output is taken
from the location pointed to by $OUT$, and the two pointers are updated with
$$\begin{aligned} IN &= (IN + 1) \,\%\, M \\ OUT &= (OUT + 1) \,\%\, M \end{aligned} \tag{4}$$
In words, the pointers are incremented respecting the circularity of the buffer.
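The pointer bookkeeping of Equations 3 and 4 can be sketched in a few lines of Python. This is only an illustrative sketch: the class name `DelayLine`, the method name `tick`, and the default buffer size `M = m + 1` are assumptions made here, not part of the original text.

```python
class DelayLine:
    """Integer-length delay line built on a circular buffer of M cells."""

    def __init__(self, m, M=None):
        # Assumption: if M is not given, use m + 1 cells, which is enough
        # to hold a delay of m samples when writing precedes reading.
        self.M = M if M is not None else m + 1
        if not 0 <= m < self.M:
            raise ValueError("need 0 <= m < M")
        self.buf = [0.0] * self.M
        self.out_ptr = 0
        self.in_ptr = (self.out_ptr + m) % self.M    # Equation 3

    def tick(self, x):
        """Process one sample: write at IN, read at OUT, advance both."""
        self.buf[self.in_ptr] = x
        y = self.buf[self.out_ptr]
        self.in_ptr = (self.in_ptr + 1) % self.M     # Equation 4
        self.out_ptr = (self.out_ptr + 1) % self.M
        return y
```

Feeding the samples one at a time through `tick` returns each input `m` calls later, with zeros coming out while the buffer is still filling.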
In some architectures dedicated to sound processing, memory organization
is optimized for wavetable synthesis, where a stored waveform is read with
variable increments of the reading pointer. In these architectures, a quantity of $2^r$ memory locations is available, and from these, $M = 2^s$ locations (with $s < r$) are uniformly chosen among the $2^r$ available cells. In this case the locations of the circular
buffer are not contiguous, and the update of the pointers is
done with the operations
$$\begin{aligned} IN &= (IN + 2^{r-s}) \,\%\, 2^r \\ OUT &= (OUT + 2^{r-s}) \,\%\, 2^r \end{aligned} \tag{5}$$
In practice, since the addresses are $r$ bits long, there is no need to compute
the modulo explicitly. It is sufficient to do the sum neglecting any possible
overflow.
Of course, Equation 3 is also replaced by
$$IN = (OUT + m\,2^{r-s}) \,\%\, 2^r \tag{6}$$
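As a rough illustration of Equations 5 and 6, the following sketch emulates an $r$-bit address register in Python, where the bit mask plays the role of the discarded overflow. The function names `advance` and `write_pointer` are hypothetical and chosen only for this example.

```python
def advance(ptr, r, s):
    """Equation 5: step a pointer to the next of the 2**s selected cells."""
    step = 1 << (r - s)                 # distance between selected cells
    # Masking to r bits models an r-bit register that drops the overflow,
    # which gives the same result as taking the sum modulo 2**r.
    return (ptr + step) & ((1 << r) - 1)


def write_pointer(out_ptr, m, r, s):
    """Equation 6: write address placed m samples ahead of the read address."""
    return (out_ptr + m * (1 << (r - s))) & ((1 << r) - 1)
```

With `r = 8` and `s = 3`, for instance, repeated calls to `advance` starting from address 0 step through eight equally spaced addresses and then wrap around to 0.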
It might be thought that, choosing a sufficiently high sampling rate, it is
always possible to use delay lines having an integer number of samples. Actually,
there are some good reasons that lead us to state that this is not the case in
sound synthesis and processing.
In sound synthesis, the models have to be carefully tuned without resorting
to very high sample rates. In particular, it is easy to verify that using integer-length
delays in physical models we get errors in fundamental frequencies that
go well beyond the just noticeable difference in pitch.
For instance, for a pressure wave propagating in air at normal temperature conditions,
the spatial discretization given by the sampling rate $F_s = 44100$ Hz gives intervals of $0.0075$ m, a distance that can produce well-perceivable pitch
differences in a wind instrument.
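The size of the tuning error can be estimated with a small calculation. The sketch below assumes, for illustration only, a resonator made of a single delay loop whose fundamental is $F_s / m$; the function name and the choice of expressing the error in cents are assumptions, not part of the original text.

```python
import math


def integer_delay_pitch_error_cents(f0, fs=44100.0):
    """Pitch error (cents) when a loop of fs/f0 samples is rounded to an integer.

    Assumes the fundamental of the loop is fs divided by its length.
    """
    ideal = fs / f0                    # ideal, generally fractional, length
    actual = max(1, round(ideal))      # nearest integer number of samples
    return 1200.0 * math.log2((fs / actual) / f0)
```

Under these assumptions, a target of 1975 Hz at 44100 Hz calls for about 22.33 samples; rounding to 22 samples raises the pitch by roughly 26 cents, while a target of 441 Hz (exactly 100 samples) shows no error.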
Another reason for using fractional delays is that we often want to vary
the delay lengths continuously, in order to reproduce effects such as glissando
or vibrato. The adoption of integer-length delays would produce annoying discontinuities.
The most widely used techniques for implementing fractional delays are
interpolation by FIR filters or by allpass filters. These two techniques are, in
some sense, complementary. The choice of one of the two has to be made according
to the peculiarities of the system to be simulated or of the architecture
chosen for the implementation. In any case, a delay of length $m$ is obtained by means of a delay line whose length is equal to the integer part of $m$, cascaded with a block capable of approximating a constant phase delay equal to the fractional part of $m$. We recall that the phase delay at a given frequency $\omega$ is the delay, in samples, experienced by the sinusoidal component at frequency $\omega$. For instance, consider a linear filtering block enclosed in a feedback loop
(see Section 6): the frequency $f_k$ of the $k$-th resonance of the whole feedback system is found at the points where the phase response equals a multiple of $2\pi$. At these frequencies, the components reappear in phase at every round trip
in the loop, thus reinforcing their amplitude at the output. The phase delay at frequency $f_k$ is therefore the effective delay length at that frequency, that is,
the length of an ideal (linear-phase) delay line that gives the same $k$-th resonance. Figure 1 shows a phase curve and its crossings with multiples of $2\pi$, giving a distribution of resonances.
The easiest and most intuitive way to obtain a variable-length delay is to
linearly interpolate the output of the line with the content of its preceding cell
in the memory buffer. This corresponds to using the first-order FIR filter
$$H_l(z) = c_0 + c_1 z^{-1} \tag{7}$$
Given a certain phase delay
$$\tau_{ph_0} = -\frac{1}{\omega_0} \arctan\left( \frac{-c_1 \sin\omega_0}{c_0 + c_1 \cos\omega_0} \right) \tag{8}$$
that has to be obtained at a given frequency $\omega_0$, the following formulas give the coefficient values:
$$\begin{aligned} c_0 + c_1 &= 1 \\ c_1 &= \frac{1}{1 + \dfrac{\sin\omega_0}{\tan(\tau_{ph_0}\,\omega_0)} - \cos\omega_0} \approx \tau_{ph_0} \end{aligned} \tag{9}$$
where the approximation is valid in the low-frequency range. The first part of
Equation 9
is needed in order to normalize the low-frequency response to one. In the special case that $c_0 = c_1 = 1/2$ (averaging filter) the phase is linear and the delay
is half a sample. Unfortunately, the magnitude response of this interpolator
is lowpass with a zero at the Nyquist frequency.
Figure 2
shows the magnitude, phase, and phase delay responses for several first-order linear interpolators. We
can see that the phase is linear in most of the audio range, but the magnitude
varies from the allpass to the lowpass with a zero at the Nyquist frequency. When
the interpolator is inserted within a feedback loop, its lowpass behavior can be
treated as an additional frequency-dependent loss, which should be somewhat
taken into account.
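The relations among Equations 7, 8, and 9 can be checked numerically. The following is a minimal sketch with hypothetical function names; it assumes a fractional delay strictly between 0 and 1 and frequencies expressed in radians per sample.

```python
import cmath
import math


def linear_interp_coeffs(tau, w0=None):
    """Return (c0, c1) of H(z) = c0 + c1 z^-1 for a phase delay tau (0 < tau < 1).

    With w0=None the low-frequency approximation c1 = tau is used;
    otherwise c1 follows the exact expression of Equation 9 at w0.
    """
    if w0 is None:
        c1 = tau
    else:
        c1 = 1.0 / (1.0 + math.sin(w0) / math.tan(tau * w0) - math.cos(w0))
    return 1.0 - c1, c1


def interp_response(c0, c1, w):
    """Frequency response of Equation 7 at w rad/sample."""
    return c0 + c1 * cmath.exp(-1j * w)


def interp_phase_delay(c0, c1, w):
    """Phase delay in samples, i.e. minus the phase divided by w."""
    return -cmath.phase(interp_response(c0, c1, w)) / w
```

Evaluating `interp_phase_delay` at the design frequency should return the requested `tau`, and `abs(interp_response(0.5, 0.5, math.pi))` should be numerically zero, in line with the remark on the averaging filter.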
Interpolation filters can be of order higher than the first. We can do quadratic,
cubic, or other polynomial interpolations. In general, the problem of
designing an interpolator can be turned into the design of an $L$-th order FIR filter approximating a constant-magnitude and linear-phase frequency response. Several
criteria can be adopted to drive the approximation problem. One approach is
to impose that the first $L$ derivatives of the error function be zero at zero
frequency. In this way we obtain maximally flat filters whose coefficients are
the same as those used in Lagrange interpolation, as it is taught in numerical analysis
courses. For a thorough treatment of interpolation filters we suggest reading the article
[1]. Here we only point out that using high orders allows us to keep
the magnitude response close to unity and the phase response close to linear in
a wide frequency band. Of course, this is paid for in terms of computational complexity.
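As an illustration of the Lagrange approach, the coefficients can be computed with the classical product formula $h(n) = \prod_{k \neq n} (D - k)/(n - k)$, where $D$ is the total delay in samples. The function name below is hypothetical, and the sketch makes no attempt at efficiency.

```python
def lagrange_fd_coeffs(delay, order):
    """FIR coefficients h[0..order] of a Lagrange fractional-delay filter.

    delay is the total delay in samples; the approximation is usually
    best when delay lies near order / 2.
    """
    h = []
    for n in range(order + 1):
        c = 1.0
        for k in range(order + 1):
            if k != n:
                c *= (delay - k) / (n - k)
        h.append(c)
    return h
```

For `order=1` the result reduces to the linear interpolator, with coefficients `1 - delay` and `delay`. For any order the coefficients sum to one, so the response at dc is unity.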
In special architectures, where the access to delay lines is governed by
Equation 5 and
Equation 6, linear interpolation is implemented very efficiently by using the $r - s$ bits that are not used to access the $2^s$-sample delay line. In fact, if the address is computed using $r$ bits, the $r - s$ least significant bits represent the fractional part of the delay or, equivalently, the coefficient $c_1$ of the interpolator. Therefore, it is sufficient to access two consecutive delay cells and keep the values $c_0$ and $c_1 = 1 - c_0$ in two registers. The implementation of a glissando with these architectures is immediate and free from complications.
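The idea of splitting an $r$-bit address into a cell index and a fractional weight can be sketched as follows. The class name, the write-before-read ordering, and the way the low bits are mapped to the weight of the newer cell are assumptions of this sketch; a real architecture may adopt a different convention.

```python
class FractionalAddressDelay:
    """Delay line of 2**s cells addressed with r bits (r > s).

    The top s bits of an address select a cell; the r - s low bits are
    read as a fraction and used as a linear-interpolation weight.
    """

    def __init__(self, r, s, delay):
        self.frac_bits = r - s
        self.step = 1 << self.frac_bits          # one sample, in address units
        self.addr_mask = (1 << r) - 1
        self.cell_mask = (1 << s) - 1
        self.buf = [0.0] * (1 << s)
        self.w = 0                               # write address, low bits zero
        self.offset = round(delay * self.step)   # delay, in address units

    def tick(self, x):
        self.buf[self.w >> self.frac_bits] = x
        rd = (self.w - self.offset) & self.addr_mask
        idx = rd >> self.frac_bits                   # older of the two cells
        frac = (rd & (self.step - 1)) / self.step    # weight of the newer cell
        nxt = (idx + 1) & self.cell_mask
        y = (1.0 - frac) * self.buf[idx] + frac * self.buf[nxt]
        self.w = (self.w + self.step) & self.addr_mask   # Equation 5
        return y
```

Because linear interpolation is exact for a ramp, feeding the sequence 0, 1, 2, ... with `delay=2.25` should return `n - 2.25` once the buffer holds enough samples. Changing `offset` by small amounts between calls gives a simple glissando.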
Another widely used technique to obtain the fractional part of a desired
delay length makes use of unit-magnitude IIR filters, i.e., allpass filters. Since
the magnitude of these filters is constant there is no frequency-dependent attenuation,
a property that can never be ensured by FIR interpolators. The simplest allpass
filter has order one, and it has the following transfer function:
$$H_a(z) = \frac{c + z^{-1}}{1 + c\,z^{-1}} \tag{10}$$
In order to make sure that the filter is stable, the coefficient $c$ has to stay within the unit circle ($|c| < 1$). Moreover, if we stick with real coefficients, $c$ belongs to the real axis. The phase delay given by the filter of Equation 10
is shown in Figure 3 for several values of the coefficient $c$. It is clear that the phase delay is not as flat as in the case of the FIR interpolator, depicted in
Figure 2.
It is easy to verify that, at frequencies close to dc, the phase response of
Equation 10 takes the approximate form
$$\angle H(\omega) \approx -\frac{\sin\omega}{c + \cos\omega} + \frac{c \sin\omega}{1 + c \cos\omega} \approx -\omega\,\frac{1 - c}{1 + c} \tag{11}$$
where the first approximation is obtained by replacing each arctan
with its argument and the second approximation, valid in an even
smaller neighborhood, is obtained by approximating $\sin x$ with $x$ and $\cos x$ with $1$. The phase and group delay around dc are
$$\tau_{ph}(\omega) \approx \tau_{gr}(\omega) \approx \frac{1 - c}{1 + c} \tag{12}$$
Therefore, the filter coefficient $c$ can be easily determined from the desired low-frequency delay as
$$c = \frac{1 - \tau_{ph_0}}{1 + \tau_{ph_0}} \tag{13}$$
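A short sketch may help connect Equations 10, 12, and 13. The names below are hypothetical; the difference equation $y(n) = c\,x(n) + x(n-1) - c\,y(n-1)$ follows directly from Equation 10.

```python
import cmath


def allpass_coeff(tau):
    """Equation 13: coefficient giving a low-frequency delay of tau samples."""
    return (1.0 - tau) / (1.0 + tau)


def allpass_response(c, w):
    """Frequency response of Equation 10 at w rad/sample."""
    z1 = cmath.exp(-1j * w)
    return (c + z1) / (1.0 + c * z1)


def allpass_phase_delay(c, w):
    """Phase delay in samples at w rad/sample."""
    return -cmath.phase(allpass_response(c, w)) / w


class Allpass1:
    """First-order allpass: y(n) = c x(n) + x(n-1) - c y(n-1)."""

    def __init__(self, c):
        self.c = c
        self.x1 = 0.0    # previous input sample
        self.y1 = 0.0    # previous output sample

    def tick(self, x):
        y = self.c * x + self.x1 - self.c * self.y1
        self.x1, self.y1 = x, y
        return y
```

For a desired delay of half a sample, `allpass_coeff(0.5)` gives `c = 1/3`. The magnitude of `allpass_response` stays at one for every frequency, while `allpass_phase_delay` approaches 0.5 only as the frequency approaches dc.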
Figure 3 shows that the delay of the allpass filter is approximately constant
only in a narrow frequency range. We can reasonably assume that such a range,
for positive values of $c$ smaller than one, extends from $0$ to $F_s/5$. With $F_s = 50$ kHz we see that at $F_s/5 = 10$ kHz we have an error of about $0.05$ samples. In a note at that frequency produced by a feedback delay line, such an
error produces a pitch deviation smaller than $1\%$.
For lower fundamental frequencies, such as those found in actual musical instruments, the error is smaller
than the just noticeable difference measured with slow pitch modulations.
While the first-order filter is an elegant and efficient solution to the
problem of tuning a delay line, it also has the relevant side effect of detuning
the upper partials, due to its marked phase nonlinearity. Such detuning can be
tolerated in most cases, but has to be taken into account in some other contexts.
If a phase response closer to linear is needed, we can use higher-order allpass filters
[1].
In some cases, especially in sound synthesis by physical modeling,
a specific inharmonic distribution of resonances has to be approximated. This
can be obtained by designing allpass filters that approximate a given phase
response along the whole frequency axis. In these cases the problem of tuning
is superseded by the more difficult problem of accurate partial positioning
[2].
With allpass interpolators it is more complicated to handle continuous delay
length variations, since the recursive structure of the filter does not show an obvious
way of transferring memory cells from and to the delay line, as it was in
the case of the FIR interpolator, which is constructed on the delay line by a
certain number of taps. Indeed, the glissando can be implemented with the allpass
filter by adding a new cell to the delay line whenever the filter coefficient
becomes one and, at the same time, zeroing out the filter state variable and the
coefficient. What is really more complicated with allpass filters is to handle
sudden variations of the delay length, as they are found, for instance, when a
finger hole is opened in a wind instrument. In this case, the recursive nature of
allpass filters causes annoying transients in the output signal. Ad hoc structures
have been devised to cancel these transients
[1].
Sounds, propagating in the air, come into contact with surfaces and objects
of various kinds and this interaction produces physical phenomena such as reflection, refraction, and diffraction. A simple and very important phenomenon
is the reflection of sound off a planar surface. Due to such a reflection,
a listener receives two delayed copies of the same signal. If the delay is
larger than about a hundred milliseconds, the second copy is perceived as a
distinct echo, while if the delay is smaller than about ten milliseconds,
the effect of a single reflection is perceived as a spectral coloration.
A simple model of single reflection can be constructed starting from the
basic blocks described in this and in the preceding chapters. It is constructed as an $m$-sample delay line, with the incidental fractional part of $m$ obtained by FIR interpolation or allpass filtering, cascaded with an attenuation coefficient $g$, possibly replaced by a filter if a frequency-dependent absorption has to be
simulated. The output of this lossy delay line is summed to the direct signal.
Let us analyze the structure in the case that $m$ is an integer and $g$ is a positive constant not exceeding $1$. The difference equation is expressed as
$$y(n) = x(n) + g\,x(n - m) \tag{14}$$
and, therefore, the transfer function is
$$H(z) = 1 + g\,z^{-m} \tag{15}$$
In the case that $g = 1$, it is easy to see by using the De Moivre formula
that the frequency response of the comb filter has the following magnitude and group delay:
$$|H(\omega)| = \sqrt{2\,(1 + \cos(\omega m))}, \qquad \tau_{gr,H}(\omega) = \frac{m}{2} \tag{16}$$
and it is straightforward to verify that the frequency band ranging from dc to the sampling rate $F_s$ comprises $m$ zeros (antiresonances), equally spaced by $F_s/m$ Hz, so that about half of them lie below the Nyquist frequency. The phase response is piecewise linear with discontinuities of $\pi$ at the odd multiples of $F_s/(2m)$. If $g < 1$, it is easy to see that the amplitude of the resonances is
$$P = 1 + g \tag{17}$$
while the amplitude of the points of minimum (halfway between contiguous resonances) is
$$V = 1 - g \tag{18}$$
An important parameter of this filtering structure, called non-recursive comb filter (or FIR comb), is the peak-to-valley ratio
$$\frac{P}{V} = \frac{1 + g}{1 - g} \tag{19}$$
Figure 4
shows the response of a non-recursive comb filter having length $m = 11$ samples and a reflection attenuation $g = 0.9$.
The shape of the frequency response justifies the name comb given to the filter.
The zeros of the comb filter are evenly distributed along a circle of radius $g^{1/m}$ (the unit circle when $g = 1$), at the $m$-th roots of $-g$, as shown in Figure 5.
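The non-recursive comb of Equations 14 and 15 is easy to try out numerically. The sketch below uses hypothetical function names and assumes an integer delay `m` and zero initial conditions.

```python
import cmath
import math


def fir_comb(x, m, g):
    """Equation 14: y(n) = x(n) + g x(n - m), with zero initial conditions."""
    return [x[n] + (g * x[n - m] if n >= m else 0.0) for n in range(len(x))]


def fir_comb_magnitude(w, m, g):
    """Magnitude of Equation 15, H(z) = 1 + g z^-m, at w rad/sample."""
    return abs(1.0 + g * cmath.exp(-1j * w * m))
```

With `m = 11` and `g = 0.9`, evaluating `fir_comb_magnitude` at `2 * math.pi / 11` and at `math.pi / 11` should reproduce the peak $1 + g$ and the valley $1 - g$ of Equations 17 and 18.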
A simple model of one-dimensional resonator can be constructed using the
basic blocks presented in this and in the preceding chapters. It is composed of an $m$-sample delay line, with the incidental fractional part of $m$ obtained by FIR interpolation or allpass filtering, in a feedback loop with an attenuation coefficient $g$, possibly replaced by a filter in order to give different decay times
at different frequencies. Let us analyze the whole filtering structure in the case that $m$ is an integer and $g$ is a positive constant not exceeding $1$.
The difference equation is expressed as
$$y(n) = x(n - m) + g\,y(n - m) \tag{20}$$
and the transfer function is
$$H(z) = \frac{z^{-m}}{1 - g\,z^{-m}} \tag{21}$$
Whenever $g < 1$, stability is ensured. In the case that $g = 1$, the frequency response of the filter has the following magnitude and group delay:
$$|H(\omega)| = \frac{1}{2\left|\sin\left(\frac{\omega m}{2}\right)\right|}, \qquad \tau_{gr,H}(\omega) = \frac{m}{2} \tag{22}$$
and it is easy to verify that the frequency band ranging from dc to the sampling rate $F_s$ comprises $m$ vertical asymptotes (resonances), equally spaced by $F_s/m$ Hz, so that about half of them lie below the Nyquist frequency. If $g = 1$ the filter is at the limit of stability, and this is the only case when the phase response is piecewise linear, starting with the value $-\pi/2$ at dc, with discontinuities of $\pi$ at the even multiples of $F_s/(2m)$. If $g < 1$, it is easy to verify that the amplitude of the resonances is
$$P = \frac{1}{1 - g} \tag{23}$$
while the amplitude of the points of minimum (halfway between contiguous resonances) is
$$V = \frac{1}{1 + g} \tag{24}$$
An important parameter of this filtering structure, called recursive comb filter (or IIR comb), is the peak-to-valley ratio
$$\frac{P}{V} = \frac{1 + g}{1 - g} \tag{25}$$
Figure 6
shows the frequency response of a recursive comb filter having a delay line of $m = 11$ samples and feedback attenuation $g = 0.9$.
The shape of the magnitude response justifies the name comb given to the filter.
The poles of the comb filter are evenly distributed along a circle of radius $g^{1/m}$ (the unit circle when $g = 1$), at the $m$-th roots of $g$, as shown in Figure 7.
In sound synthesis by physical modeling, a recursive comb filter can be interpreted
as a simple model of lossy one-dimensional resonator, like a string,
or a tube. This model can be used to simulate several instruments whose resonator
is not persistently excited. In fact, if the input is a short burst of filtered
noise, we obtain the basic structure of the plucked string synthesis algorithm
due to Karplus and Strong [3].
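The recursive comb of Equation 20, excited by a short noise burst, can be sketched as follows. The function names, the uniform noise, and the fixed seed are assumptions made for illustration. Note that this sketch contains only the bare comb with a constant loop gain; the Karplus-Strong algorithm additionally places a lowpass (averaging) filter in the loop, which is not reproduced here.

```python
import random


def iir_comb(x, m, g):
    """Equation 20: y(n) = x(n - m) + g y(n - m), zero initial conditions."""
    y = [0.0] * len(x)
    for n in range(m, len(x)):
        y[n] = x[n - m] + g * y[n - m]
    return y


def plucked_tone(m, g, n_samples, seed=0):
    """Excite the comb with a burst of m noise samples (n_samples >= m)."""
    rng = random.Random(seed)
    burst = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    x = burst + [0.0] * (n_samples - m)
    return iir_comb(x, m, g)
```

Once the burst is over, every period of `m` samples is a copy of the previous one scaled by `g`, which gives a pitch of about $F_s/m$ and an exponential decay.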
The filter given by the difference Equation 20
has a frequency response characterized by evenly distributed resonances.
With a slight modification of its structure, such a filter can be made allpass. In other words, the magnitude response
of the filter can be made flat even though the impulse response remains almost the same as that of
Equation 20.
The modification is just a direct path connecting the input
of the delay line to the filter output, as depicted in Figure 8.
It is easy to see that the transfer function of the filter of
Figure 8, called the allpass comb filter, can be written as
$$H(z) = \frac{-g + z^{-m}}{1 - g\,z^{-m}} \tag{26}$$
which has the structure of an allpass filter. It is interesting to note that the
direct path introduces a nonzero sample at the time instant zero in the impulse
response. All the following samples are just a scaled version of those of the
impulse response of the comb filter, with a scaling factor equal to $1 - g^2$.
The time properties, such as the time decay, are substantially unvaried. The allpass
comb filter does not introduce any coloration in stationary signals. On the other
hand, its effect is evident on signals exhibiting rapid transients, and for these
signals we cannot state that the filter is transparent.
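Both properties of the allpass comb, the flat magnitude and the scaled impulse response, can be verified with a short sketch. The function names are hypothetical; the difference equation $y(n) = -g\,x(n) + x(n-m) + g\,y(n-m)$ is read off Equation 26.

```python
import cmath


def allpass_comb(x, m, g):
    """y(n) = -g x(n) + x(n - m) + g y(n - m), zero initial conditions."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        xd = x[n - m] if n >= m else 0.0    # delayed input
        yd = y[n - m] if n >= m else 0.0    # delayed output
        y[n] = -g * x[n] + xd + g * yd
    return y


def allpass_comb_magnitude(w, m, g):
    """Magnitude of Equation 26 at w rad/sample."""
    zm = cmath.exp(-1j * w * m)
    return abs((-g + zm) / (1.0 - g * zm))
```

The impulse response starts with $-g$ at time zero, followed by $1 - g^2$ at time $m$, $g(1 - g^2)$ at time $2m$, and so on, while `allpass_comb_magnitude` returns one at any frequency.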
Many of the effects commonly used in electroacoustic music are obtained
by composition of time-varying delay lines, i.e., by lines whose length is modulated
by slowly-varying signals. In order to avoid discontinuities in the signals,
it is necessary to interpolate the delay lines in some way. The interpolation
by means of allpass filters is applicable only for very slow modulations or
for narrow-width modulations, since sudden changes in the state of allpass filters
give rise to transients that can be perceived as signal distortions [4]. On
the other hand, linear (or, more generally, polynomial) interpolation introduces
frequency-dependent losses whose magnitude is dependent on the fractional
length of the delay line. As the delay length is varied, these variable losses give
an amplitude distortion due to amplitude modulation of the various frequency
components. Coupled to amplitude modulation, there is also phase modulation
due to phase nonlinearity of the interpolator, in both cases of FIR and IIR interpolation.
The terminology used for audio effects is not consistent, as terms such as
flanger, chorus, and phaser are often associated with a large variety of effects,
that can be quite different from each other. A flanger is usually defined as an
FIR comb filter whose delay length is sinusoidally modulated between a minimum
and a maximum value. This has the effect of expanding and contracting
the harmonic series of notches of the frequency response. The name flanger
derives from the old practice, used in analog recording studios,
of alternately slowing down two tape recorders or two turntables
playing the same music track by pressing a finger on the flanges.
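A flanger along these lines can be sketched as an FIR comb whose delay follows a sinusoid, with linear interpolation for the fractional part. The function name, the parameter names, and the choice of expressing delays in samples are assumptions of this sketch, which processes a whole list offline rather than sample by sample.

```python
import math


def flanger(x, fs, min_delay, max_delay, rate_hz, g=1.0):
    """FIR comb y(n) = x(n) + g x(n - d(n)) with sinusoidally varying d(n).

    min_delay and max_delay are in samples (0 <= min_delay <= max_delay);
    rate_hz is the modulation frequency and fs the sampling rate.
    """
    mid = 0.5 * (max_delay + min_delay)
    depth = 0.5 * (max_delay - min_delay)
    y = []
    for n in range(len(x)):
        d = mid + depth * math.sin(2.0 * math.pi * rate_hz * n / fs)
        pos = n - d                     # fractional read position
        i = math.floor(pos)
        frac = pos - i
        # Samples before the start of the signal are taken as zero.
        x0 = x[i] if 0 <= i < len(x) else 0.0
        x1 = x[i + 1] if 0 <= i + 1 < len(x) else 0.0
        delayed = (1.0 - frac) * x0 + frac * x1
        y.append(x[n] + g * delayed)
    return y
```

Setting `min_delay` equal to `max_delay` removes the modulation and reduces the structure to the static comb of Equation 14. At 44100 Hz, a delay of about 44 samples corresponds to 1 ms.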
The name phaser is most often reserved for structures similar to the comb
FIR filter, with the difference that the notches are not harmonically distributed.
Orfanidis [5] proposes to use, instead of the delay line, a bank of parametric
notch filters. Each notch is controllable in its frequency position and width. Smith
[6], instead, proposes to use
a large allpass filter in place of the delay line. If this allpass filter is obtained as
a cascade of second-order allpass sections, it becomes possible to control and
modulate the position of each pole pair, and each pair determines one of the
notches of the overall response. A common feature of flangers and phasers is
the relatively large distance between the notches. Conversely, if the notches are
very dense, the term chorus is preferred. Orfanidis [5] suggests implementing
a chorus as a parallel connection of FIR comb filters, where the delay lengths are randomly
modulated around values that are slightly different from each other. This should
simulate the deviations in time and pitch that are found in performances of a
choir singing in unison. Dattorro [4], on the other hand, says that a chorus can be obtained
by the same structure used for the flanger, with the difference that the delay
lengths have to be set to larger values than for the flanger. In this way, the
notches are made denser. For the flanger the suggested nominal delay is
1 ms and for the chorus it is 5 ms. If the objective is to recreate the effect of
a choir singing in unison, the fact of having many notches in the spectrum is
generally disliked. Dattorro [4] proposes a partial solution that makes use of a
recursive allpass filter, where the delay line is read by two pointers, one is kept
fixed and produces the feedback signal, the other is varied to pick up the signal
that is fed directly to the output. In this way, when both the pointers are at the
nominal position, the structure does not introduce any coloration for stationary
signals. A final remark is reserved to the spatialization of these comb-based effects.
In general, flanging, phasing, and chorusing effects can be obtained from two
different time-varying allpass chains, whose outputs feed different loudspeakers [6].
In this case, sums and subtractions between signals at the different frequencies
happen “on air” in a way that depends on the listening position. Therefore, the spatial
sensation is largely due to the different spectral coloration found in different
points of the listening area.
A comprehensive book on the topic of digital audio effects was edited by Zölzer [7].
[1] Timo I. Laakso, Vesa Välimäki, Matti Karjalainen, and Unto K. Laine. (1996, Jan). Splitting the Unit Delay - Tools for Fractional Delay Filter Design. IEEE Signal Processing Magazine.
[2] Davide Rocchesso and Francesco Scalcon. (1999, Sep). Bandwidth of perceived inharmonicity for physical modeling of dispersive strings. IEEE Transactions on Speech and Audio Processing.
[3] K. Karplus and A. Strong. (1983). Digital Synthesis of Plucked String and Drum Timbres. Computer Music J.
[4] Jon Dattorro. (1997, Oct). Effect design - part 2: Delay-line modulation and chorus. J. Audio Eng. Soc.
[5] S. J. Orfanidis. (1996). Introduction to Signal Processing. Prentice Hall.
[6] Julius O. Smith. (1984). An allpass approach to digital phasing and flanging. [http://ccrma.stanford.edu/STANM/stanms/stanm21/index.html]. In Proc. International Computer Music Conference - Also available as Rep. STAN-M-21, CCRMA, Stanford University.
[7] Udo Zölzer (Ed.). (2002). DAFx - Digital Audio Effects. John Wiley and Sons.