One aspect of discrete-time signal processing that has no
counterpart in the analog world is developing models of systems
to describe how signals arise. For example, the fundamental model of the physics of speech production amounts to
passing a sequence of pulses through a linear system. Meaning is
conveyed by the properties of the linear system, which describes
how the vocal tract responds to pitch pulses. Physically and
mathematically, the speech production system is analog. However,
suppose we sample the speech signal, obeying the Sampling
Theorem of course. The discrete-time model for sampled speech
would again be a sequence of pulses, this time a periodic
sequence of unit samples, passing through a linear discrete-time
system governed by some transfer function. It is this transfer
function that we can estimate from the
sampled speech signal, which means that we have the potential
for extracting from the sampled speech signal what is being
said. In short, we could develop a speech recognition
system.
The key notion we need to develop is how to estimate a
discrete-time system's input-output relation by observing its
output with only partial knowledge of the input (in the speech
case we know it's a sequence of pulses, but we don't know when
they occur). Assume that the input-output relation we seek has
the form of a difference equation with no "
bb " terms other than the first.
This case -- having only one term in the difference equation
involving the input -- turns out to be the simplest. If you try
to include more and follow the subsequent estimation procedure,
you will discover that the situation is not nearly as nice.
sn=
a
1
sn-1+…+
a
p
sn-p+Gxn
sn
a
1
s
n
1
…
a
p
s
n
p
G
x
n
(1)
What we develop in sequel is a way of estimating
a1…ap
a1
…
ap
and
GG from the observed signal
sn
s
n
.
Let's begin with the simplest case:
p=1
p
1
. If the signal is indeed governed by the relation
sn=
a
1
sn-1+Gxn
sn
a
1
s
n
1
G
x
n
, with the input signal
xn
x
n
having mostly zero values save when unit samples
occur, then we should be able to predict
each sample from the previous one. Letting
s
^
n
s
^
n
denote the predicted signal, it is given by
s
^
n=
a
1
sn
s
^
n
a
1
s
n
. However, we don't know the value of
a
1
a
1
; that's what we want to find by minimizing the
mean-squared prediction error. Mathematically, we
want to solve the problem
min
a
1
{∑nsn-
s
^
n2}
a
1
n
n
s
n
s
^
n
2
(2)
Substituting the form of prediction made by our simple linear system,
the problem becomes finding the value of
a1a1
to minimize
∑nsn-
a
1
sn-12
n
n
s
n
a
1
s
n
1
2
. This minimization is easily accomplished by evaluating the
derivative with respect to
a1a1, setting the result
to zero, and solving for
a1a1.
dd
a
1
∑nsn-
a
1
sn-12=-2∑nsn-1sn-
a
1
sn-1=0
a
1
n
n
s
n
a
1
s
n
1
2
2
n
n
s
n
1
s
n
a
1
s
n
1
0
(3)
?a1=∑nsnsn-1∑ns2n
?
a1
n
n
s
n
s
n
1
n
n
s
n
2
(4)
Therefore, we can find the best model having the assumed form
from the signal itself . For our simple
case, the quantities that must be computed and the calculation
of the model's parameters can easily be made.
More generally, we need to use more terms; we solve for the
parameters
a
1
…
a
n
a
1
…
a
n
by minimizing the prediction error as before.
min
a
1
…
a
n
{∑nsn-∑m=1p
a
m
sn-m2}
a
1
…
a
n
n
n
sn
m
1
p
a
m
s
n
m
2
(5)
We must calculate the partial derivative with respect to each
parameter, and set each derivative to zero, yielding a set of
pp equations to solve.
∀i,i∈1…p:∂∂
a
i
∑nsn-∑m=1p
a
m
sn-m2=-2∑nsn-isn-∑m=1p
a
m
sn-m
i
i
1
…
p
a
i
n
n
sn
m
1
p
a
m
s
n
m
2
2
n
n
s
n
i
sn
m
1
p
a
m
s
n
m
(6)
Simplifications lead to the following set of
linear equations that must be solved to find
a
1
…
a
p
a
1
…
a
p
.
a
1
∑ns2n-1+
a
2
∑nsn-1sn-2+…+ap∑nsn-1sn-p=∑nsnsn-1
a
1
n
n
s
n
1
2
a
2
n
n
s
n
1
s
n
2
…
ap
n
n
s
n
1
s
n
p
n
n
s
n
s
n
1
(7)
a1∑nsn-2sn-1+a2∑ns2n-2+…+ap∑nsn-2sn-p=∑nsnsn-2
a1
n
n
s
n
2
s
n
1
a2
n
n
s
n
2
2
…
ap
n
n
s
n
2
s
n
p
n
n
s
n
s
n
2
(8)
?
?
(9)
a1∑nsn-psn-1+a2∑nsn-psn-2+…+ap∑ns2n-p=∑nsnsn-p
a1
n
n
s
n
p
s
n
1
a2
n
n
s
n
p
s
n
2
…
ap
n
n
s
n
p
2
n
n
s
n
s
n
p
(10)
The coefficients in this equation set are known as
autocorrelation functions, and are succinctly
expressed as
rij=∑nsn-isn-j
r
i
j
n
n
s
n
i
s
n
j
(11)
To use matrix notation, let
RR denote the matrix formed from
these autocorrelations (
R
i
,
j
=rij
R
i
,
j
r
ij
),
a=
a
1
…
a
P
a
a
1
…
a
P
, and
r=r01…r0pT
r
r
0
1
…
r
0
p
. The equations that must be solved for our model's
parameters now take the form
Ra=r
R
a
r
(12)
We have
pp
equations in
pp
unknowns, and the solution is mathematically expressed using the
matrix inverse.
a=R-1r
a
R
r
(13)
Analytically calculating this matrix inverse is quite
difficult. In MATLAB, numerically solving such sets of linear
equations is easy.
Once solved, the parameters thus obtained provide us with a
system model, expressible with both the difference equation and
the frequency response, for how the signal came to be. This
technique has many important applications, all of which revolve
around the fact that a few numbers, the filter parameters,
summarize all the signal's essential features. Focusing on
speech, we can use the parameters to provide a spectrum that
reflects the vocal tract's characteristics. We can
use these parameters to concisely represent an entire speech
section. Such a succinct, accurate representation means that we
can efficiently represent essential speech features to
communicate what is being said or to develop speech recognition
systems.