A training sequence used for equalization is often chosen to be a noise-like sequence, because such a sequence is what is needed to estimate the channel frequency response.

In the simplest sense, the training sequence might be a single narrow pulse, but a pseudonoise (PN) signal is preferred in practice because the PN signal has larger average power, and hence larger SNR, for the same peak transmitted power.

Consider that a single pulse was transmitted over a system designed to have a raised-cosine transfer function $H_{RC}(f) = H_t(f)\,H_r(f)$. Also consider that the channel induces ISI, so that the received demodulated pulse exhibits distortion, as shown in Figure 1, such that the pulse sidelobes do not go through zero at the sample times. To achieve the desired raised-cosine transfer function, the equalizing filter should have a frequency response

$$H_e(f) = \frac{1}{H_c(f)} = \frac{1}{\lvert H_c(f)\rvert}\, e^{-j\theta_c(f)} \qquad \text{(1)}$$
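As a concrete sketch of equation (1), the inverse-filter relationship can be checked numerically with an assumed discrete channel response; the channel taps below are illustrative, not from the text:

```python
import numpy as np

# Hypothetical discrete channel impulse response (assumed for illustration):
# a main tap plus two echo taps that cause ISI.
h_c = np.array([1.0, 0.3, -0.1])

# Channel frequency response on an N-point frequency grid.
N = 64
H_c = np.fft.fft(h_c, N)

# Equalizer frequency response per equation (1): H_e(f) = 1 / H_c(f).
H_e = 1.0 / H_c

# Their product is flat (unity), i.e. the equalizer inverts the channel.
print(np.allclose(H_c * H_e, np.ones(N)))  # True
```

In practice a pure inverse filter like this can strongly amplify noise at frequencies where $\lvert H_c(f)\rvert$ is small, which motivates the MSE criterion discussed later.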

In other words, we would like the equalizing filter to generate a set of canceling echoes. The transversal filter, illustrated in Figure 2, is the most popular form of an easily adjustable equalizing filter; it consists of a delay line with taps spaced $T$ seconds apart (where $T$ is the symbol duration). The tap weights can be chosen to force the system impulse response to zero at all but one of the sampling times, thus making $H_e(f)$ correspond exactly to the inverse of the channel transfer function $H_c(f)$.

Consider that there are $2N+1$ taps with weights $c_{-N}, c_{-N+1}, \ldots, c_N$. The output samples $z(k)$ are the convolution of the input samples $x(k)$ and the tap weights $c_n$, as follows:

$$z(k) = \sum_{n=-N}^{N} x(k-n)\, c_n, \qquad k = -2N, \ldots, 2N \qquad \text{(2)}$$
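Equation (2) is an ordinary discrete convolution, so it can be sketched directly; the sample and tap values below are assumed for illustration only:

```python
import numpy as np

# Equation (2) as a direct convolution: z(k) = sum_n x(k-n) c_n.
N = 1                                    # 2N+1 = 3 taps
x = np.array([0.1, 1.0, -0.2])           # received samples x(-N), ..., x(N)
c = np.array([0.05, 0.9, 0.1])           # tap weights c_{-N}, ..., c_N

# Full convolution yields the 4N+1 output samples z(-2N), ..., z(2N).
z = np.convolve(x, c)
print(len(z))  # 4N+1 = 5 samples
```

Note that $2N+1$ input samples convolved with $2N+1$ tap weights produce $4N+1$ output samples, which is why the index $k$ runs from $-2N$ to $2N$.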

By defining the vectors $\mathbf{z}$ and $\mathbf{c}$ and the matrix $\mathbf{x}$ as, respectively,

$$
\mathbf{z} = \begin{bmatrix} z(-2N) \\ \vdots \\ z(0) \\ \vdots \\ z(2N) \end{bmatrix}
\qquad
\mathbf{c} = \begin{bmatrix} c_{-N} \\ \vdots \\ c_0 \\ \vdots \\ c_N \end{bmatrix}
\qquad
\mathbf{x} = \begin{bmatrix}
x(-N) & 0 & 0 & \cdots & 0 & 0 \\
x(-N+1) & x(-N) & 0 & \cdots & \cdots & \cdots \\
\vdots & & & \vdots & & \vdots \\
x(N) & x(N-1) & x(N-2) & \cdots & x(-N+1) & x(-N) \\
\vdots & & & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & x(N) & x(N-1) \\
0 & 0 & 0 & \cdots & 0 & x(N)
\end{bmatrix}
$$
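The matrix form can be checked against the direct convolution of equation (2); the construction below builds the $(4N+1) \times (2N+1)$ matrix column by column, using assumed sample values for illustration:

```python
import numpy as np

# Build the (4N+1) x (2N+1) convolution matrix from the samples x(-N)..x(N),
# matching the displayed definition; the values are illustrative.
N = 1
x_samples = np.array([0.1, 1.0, -0.2])   # x(-N), ..., x(N)
c = np.array([0.05, 0.9, 0.1])           # c_{-N}, ..., c_N

rows, cols = 4 * N + 1, 2 * N + 1
X = np.zeros((rows, cols))
for j in range(cols):
    # Each column is a copy of the samples, shifted down by one row.
    X[j:j + len(x_samples), j] = x_samples

# Equation (3a): z = x c reproduces the convolution of equation (2).
z = X @ c
print(np.allclose(z, np.convolve(x_samples, c)))  # True
```

Each column of the matrix is a shifted replica of the received pulse, which is exactly the Toeplitz structure visible in the displayed definition.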

We can describe the relationship among $z(k)$, $x(k)$, and $c_n$ more compactly as

$$\mathbf{z} = \mathbf{x}\,\mathbf{c} \qquad \text{(3a)}$$

Whenever the matrix x is square, we can find c by solving the following equation:

$$\mathbf{c} = \mathbf{x}^{-1}\mathbf{z} \qquad \text{(3b)}$$

Notice that the index $k$ was arbitrarily chosen to allow for $4N+1$ sample points. The vectors $\mathbf{z}$ and $\mathbf{c}$ have dimensions $4N+1$ and $2N+1$, respectively, so these equations form an overdetermined set. The problem can be solved in a deterministic way, known as the zero-forcing solution, or in a statistical way, known as the minimum mean-square error (MSE) solution.

First, by discarding the top $N$ rows and the bottom $N$ rows, the matrix $\mathbf{x}$ is transformed into a square matrix of dimension $2N+1$ by $2N+1$. Then equation (3b), $\mathbf{c} = \mathbf{x}^{-1}\mathbf{z}$, is used to solve the $2N+1$ simultaneous equations for the set of $2N+1$ weights $c_n$. This solution minimizes the peak ISI distortion by selecting the $c_n$ weights so that the equalizer output is forced to zero at $N$ sample points on either side of the desired pulse:

$$z(k) = \begin{cases} 1 & k = 0 \\ 0 & k = \pm 1, \pm 2, \pm 3 \end{cases} \qquad \text{(4)}$$

For such an equalizer with finite length, the peak distortion is guaranteed to be minimized only if the eye pattern is initially open. However, for high-speed transmission and channels introducing much ISI, the eye is often closed before equalization. Since the zero-forcing equalizer neglects the effect of noise, it is not always the best system solution.

A more robust equalizer is obtained if the $c_n$ tap weights are chosen to minimize the mean-square error (MSE) of all the ISI terms plus the noise power at the output of the equalizer. MSE is defined as the expected value of the squared difference between the desired data symbol and the estimated data symbol.

By multiplying both sides of equation (3a) by $\mathbf{x}^T$, we have

$$\mathbf{x}^T\mathbf{z} = \mathbf{x}^T\mathbf{x}\,\mathbf{c} \qquad \text{(5)}$$

And

$$R_{xz} = R_{xx}\,\mathbf{c} \qquad \text{(6)}$$

where $R_{xz} = \mathbf{x}^T\mathbf{z}$ is called the cross-correlation vector and $R_{xx} = \mathbf{x}^T\mathbf{x}$ is called the autocorrelation matrix of the input noisy signal. In practice, $R_{xz}$ and $R_{xx}$ are unknown, but they can be approximated by transmitting a test signal and using time-average estimates to solve for the tap weights from equation (6) as follows:

$$\mathbf{c} = R_{xx}^{-1}\, R_{xz}$$
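The tap computation $\mathbf{c} = R_{xx}^{-1} R_{xz}$ can be sketched with assumed data; here the desired response is a single pulse at $k = 0$, and the normal equations of (5) are solved directly (values illustrative):

```python
import numpy as np

# Assumed received pulse samples x(-N), ..., x(N).
N = 1
x_samples = np.array([0.1, 1.0, -0.2])

# Full (4N+1) x (2N+1) convolution matrix, one shifted copy per column.
X = np.zeros((4 * N + 1, 2 * N + 1))
for j in range(2 * N + 1):
    X[j:j + len(x_samples), j] = x_samples

# Desired overall response: a single pulse at k = 0.
z = np.zeros(4 * N + 1)
z[2 * N] = 1.0

# R_xz = x^T z (cross-correlation vector), R_xx = x^T x (autocorrelation matrix).
R_xz = X.T @ z
R_xx = X.T @ X

# c = R_xx^{-1} R_xz: the minimum mean-square-error tap weights.
c = np.linalg.solve(R_xx, R_xz)
print(c.shape)  # (3,) i.e. 2N+1 weights
```

This is the least-squares solution of the full overdetermined system $\mathbf{z} = \mathbf{x}\mathbf{c}$, so all $4N+1$ output samples contribute to the error being minimized, not just the central $2N+1$.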

Most high-speed telephone-line modems use an MSE weight criterion because it is superior to a zero-forcing criterion; it is more robust in the presence of noise and large ISI.