Consider the equation error residual function
$$
\begin{aligned}
e(\omega_k) &= B(\omega_k) - D(\omega_k)\,A(\omega_k)
= \sum_{n=0}^{M} b_n e^{-j\omega_k n} \;-\; D(\omega_k)\!\left[1+\sum_{n=1}^{N} a_n e^{-j\omega_k n}\right]\\[4pt]
&= b_0 + b_1 e^{-j\omega_k} + \cdots + b_M e^{-j\omega_k M} - D_k - D_k a_1 e^{-j\omega_k} - \cdots - D_k a_N e^{-j\omega_k N}
\end{aligned}
\qquad (93)
$$
with $D_k = D(\omega_k)$. The last equation indicates that one can represent the equation error in matrix form as follows,
$$
e = F h - D
\qquad (94)
$$

where
$$
F = \begin{bmatrix}
1 & e^{-j\omega_0} & \cdots & e^{-j\omega_0 M} & -D_0\, e^{-j\omega_0} & \cdots & -D_0\, e^{-j\omega_0 N}\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
1 & e^{-j\omega_{L-1}} & \cdots & e^{-j\omega_{L-1} M} & -D_{L-1}\, e^{-j\omega_{L-1}} & \cdots & -D_{L-1}\, e^{-j\omega_{L-1} N}
\end{bmatrix}
\qquad (95)
$$

and
$$
h = \begin{bmatrix} b_0\\ b_1\\ \vdots\\ b_M\\ a_1\\ \vdots\\ a_N \end{bmatrix}
\qquad\text{and}\qquad
D = \begin{bmatrix} D_0\\ \vdots\\ D_{L-1} \end{bmatrix}
\qquad (96)
$$

Consider now the solution error residual function
$$
s(\omega_k) = H(\omega_k) - D(\omega_k)
= \frac{B(\omega_k)}{A(\omega_k)} - D(\omega_k)
= \frac{1}{A(\omega_k)}\Big[B(\omega_k) - D(\omega_k)\,A(\omega_k)\Big]
= W(\omega_k)\, e(\omega_k)
\qquad (97)
$$
Therefore one can write the solution error in matrix form as follows
$$
s = W\,(F h - D)
\qquad (98)
$$

where $W$ is a diagonal matrix with $1/A(\omega)$ on its diagonal. From Equation 98, the least-squared solution error $\varepsilon_s = s^{*} s$ can be minimized by
$$
h = (F^{*} W^{2} F)^{-1} F^{*} W^{2} D
\qquad (99)
$$

From Equation 99 an iteration could be defined as follows,
$$
h_{i+1} = (F^{*} W_i^{2} F)^{-1} F^{*} W_i^{2} D
\qquad (100)
$$

by setting the weights $W_i$ in Equation 98 equal to $1/A_i(\omega)$, where $A_i(\omega)$ is the Fourier transform of the current solution for $a$.
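A minimal NumPy sketch of this iteration follows; the grid size, filter orders, and delayed-lowpass target below are illustrative choices, not values from the text.

```python
import numpy as np

# Illustrative problem sizes and desired response (assumptions, not from the text)
L, M, N = 64, 3, 3
w = np.pi * np.arange(L) / L                       # frequency grid w_0 ... w_{L-1}
D = np.where(w < np.pi / 2, np.exp(-3j * w), 0)    # desired response D_k

E = np.exp(-1j * np.outer(w, np.arange(max(M, N) + 1)))     # E[k, n] = e^{-j w_k n}
F = np.hstack([E[:, :M + 1], -D[:, None] * E[:, 1:N + 1]])  # F from Eq. 95

h = np.linalg.lstsq(F, D, rcond=None)[0]           # equation-error start (Eq. 94)
for _ in range(10):
    a = np.concatenate(([1.0], h[M + 1:]))         # current denominator coeffs
    A = E[:, :N + 1] @ a                           # A_i(w_k) on the grid
    W = 1 / np.abs(A)                              # diagonal weights W_i
    # Weighted LS solve, equivalent to h = (F* W^2 F)^{-1} F* W^2 D (Eq. 100)
    h = np.linalg.lstsq(W[:, None] * F, W * D, rcond=None)[0]

b, a = h[:M + 1], np.concatenate(([1.0], h[M + 1:]))
```

Solving the weighted system with `lstsq` avoids forming the normal equations explicitly, which is numerically safer than inverting $F^{*} W^{2} F$ directly.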
A more formal approach to minimizing $\varepsilon_s$ consists in using a gradient method (these approaches are often referred to as Newton-like methods). First one needs to compute the Jacobian matrix $J$ of $s$, where the $pq$-th term of $J$ is given by $J_{pq} = \partial s_p / \partial h_q$, with $s$ as defined in Equation 98. Note that the $p$-th element of $s$ is given by
$$
s_p = H_p - D_p = \frac{B_p}{A_p} - D_p
\qquad (101)
$$

For simplicity one can consider these reduced-form expressions for the independent components of $h$,
$$
\begin{aligned}
\frac{\partial s_p}{\partial b_q} &= \frac{1}{A_p}\,\frac{\partial}{\partial b_q}\sum_{n=0}^{M} b_n e^{-j\omega_p n} = W_p\, e^{-j\omega_p q}\\[6pt]
\frac{\partial s_p}{\partial a_q} &= B_p\,\frac{\partial}{\partial a_q}\frac{1}{A_p}
= -\frac{B_p}{A_p^{2}}\,\frac{\partial}{\partial a_q}\!\left[1+\sum_{n=1}^{N} a_n e^{-j\omega_p n}\right]
= -\frac{1}{A_p}\cdot\frac{B_p}{A_p}\cdot e^{-j\omega_p q}
= -W_p H_p\, e^{-j\omega_p q}
\end{aligned}
\qquad (102)
$$
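These partial derivatives are easy to verify numerically with finite differences; the coefficient values, grid, and target below are arbitrary test data chosen for the check.

```python
import numpy as np

# Arbitrary test data (assumptions, not from the text)
M, N, L = 2, 2, 8
w = np.pi * (np.arange(L) + 0.5) / L
b = np.array([1.0, -0.5, 0.25])                    # b_0 ... b_M
a = np.array([1.0, 0.3, -0.1])                     # 1, a_1 ... a_N
D = np.exp(-2j * w)                                # any desired response

def s_of(b, a):
    """Solution error residual s_p = B_p / A_p - D_p (Eq. 101)."""
    B = np.exp(-1j * np.outer(w, np.arange(M + 1))) @ b
    A = np.exp(-1j * np.outer(w, np.arange(N + 1))) @ a
    return B / A - D

A = np.exp(-1j * np.outer(w, np.arange(N + 1))) @ a
B = np.exp(-1j * np.outer(w, np.arange(M + 1))) @ b
Wp, Hp = 1 / A, B / A                              # W_p and H_p on the grid

q, eps = 1, 1e-7
db = b.copy(); db[q] += eps
ds_db = (s_of(db, a) - s_of(b, a)) / eps           # numeric ds_p / db_q
da = a.copy(); da[q] += eps
ds_da = (s_of(b, da) - s_of(b, a)) / eps           # numeric ds_p / da_q

# Analytic values from Eq. 102
an_db = Wp * np.exp(-1j * w * q)                   #  W_p e^{-j w_p q}
an_da = -Wp * Hp * np.exp(-1j * w * q)             # -W_p H_p e^{-j w_p q}
```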
Therefore one can express the Jacobian $J$ as follows,

$$
J = W G
\qquad (103)
$$

where
$$
G = \begin{bmatrix}
1 & e^{-j\omega_0} & \cdots & e^{-j\omega_0 M} & -H_0\, e^{-j\omega_0} & \cdots & -H_0\, e^{-j\omega_0 N}\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
1 & e^{-j\omega_{L-1}} & \cdots & e^{-j\omega_{L-1} M} & -H_{L-1}\, e^{-j\omega_{L-1}} & \cdots & -H_{L-1}\, e^{-j\omega_{L-1} N}
\end{bmatrix}
\qquad (104)
$$

Consider the solution error least-squares problem given by
$$
\min_{h}\; f(h) = s^{T} s
\qquad (105)
$$

where $s$ is the solution error residual vector as defined in Equation 98 and depends on $h$. It can be shown [7] that the gradient of the squared error $f(h)$ (namely $\nabla f$) is given by
$$
\nabla f = J^{*} s
\qquad (106)
$$

A necessary condition for a vector $h$ to be a local minimizer of $f(h)$ is that the gradient $\nabla f$ be zero at such a vector. With this in mind and combining Equation 98 and Equation 103 in Equation 106 one gets
$$
\nabla f = G^{*} W^{2} (F h - D) = 0
\qquad (107)
$$

Solving the system in Equation 107 gives
$$
h = (G^{*} W^{2} F)^{-1} G^{*} W^{2} D
\qquad (108)
$$

An iteration can be defined as follows,
$$
h_{i+1} = (G_i^{*} W_i^{2} F)^{-1} G_i^{*} W_i^{2} D
\qquad (109)
$$

where the matrices $W$ and $G$ reflect their dependency on the current values of $a$ and $b$.
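A NumPy sketch of this mode-2 update follows; as before, the grid size, orders, and target response are illustrative assumptions. The system matrix $G_i^{*} W_i^{2} F$ is square but not Hermitian, so it is solved directly rather than by least squares.

```python
import numpy as np

# Illustrative problem sizes and desired response (assumptions, not from the text)
L, M, N = 64, 3, 3
w = np.pi * np.arange(L) / L
D = np.where(w < np.pi / 2, np.exp(-3j * w), 0)
E = np.exp(-1j * np.outer(w, np.arange(max(M, N) + 1)))     # E[k, n] = e^{-j w_k n}

F = np.hstack([E[:, :M + 1], -D[:, None] * E[:, 1:N + 1]])  # F from Eq. 95
h = np.linalg.lstsq(F, D, rcond=None)[0]           # equation-error start

for _ in range(10):
    a = np.concatenate(([1.0], h[M + 1:]))
    A = E[:, :N + 1] @ a                           # A_i(w_k)
    H = (E[:, :M + 1] @ h[:M + 1]) / A             # current model H_i(w_k)
    G = np.hstack([E[:, :M + 1], -H[:, None] * E[:, 1:N + 1]])  # G from Eq. 104
    GW = G.conj().T * (1 / np.abs(A) ** 2)         # G_i* W_i^2
    h = np.linalg.solve(GW @ F, GW @ D)            # Eq. 109 (square system)
```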
Atmadji Soewito [30] expanded the method of quasilinearization of Bellman and Kalaba [1] to the design of IIR filters. To understand his method, consider the first-order Taylor expansion of $H_{i+1}(z)$ near $H_i(z)$, given by
$$
\begin{aligned}
H_{i+1}(z) &= H_i(z) + \frac{\left[B_{i+1}(z)-B_i(z)\right] A_i(z) - \left[A_{i+1}(z)-A_i(z)\right] B_i(z)}{A_i^{2}(z)}\\[6pt]
&= H_i(z) + \frac{B_{i+1}(z)-B_i(z)}{A_i(z)} - \frac{B_i(z)\left[A_{i+1}(z)-A_i(z)\right]}{A_i^{2}(z)}
\end{aligned}
\qquad (110)
$$
Using the last result in the solution error residual function $s(\omega)$ and simplifying leads to
$$
s(\omega) = \frac{B_{i+1}(\omega)}{A_i(\omega)} - \frac{H_i(\omega)\,A_{i+1}(\omega)}{A_i(\omega)} + \frac{B_i(\omega)}{A_i(\omega)} - D(\omega)
= \frac{1}{A_i(\omega)}\Big[B_{i+1}(\omega) - H_i(\omega)\,A_{i+1}(\omega) + B_i(\omega) - A_i(\omega)\,D(\omega)\Big]
\qquad (111)
$$
Equation 111 can be expressed (dropping the use of $\omega$ for simplicity) as
$$
s = W\Big[\big(B_{i+1} - H_i\,(A_{i+1}-1)\big) - H_i + \big(B_i - D\,(A_i-1)\big) - D\Big]
\qquad (112)
$$

One can recognize the two terms in parentheses as $G h_{i+1}$ and $F h_i$ respectively. Therefore Equation 112 can be represented in matrix notation as follows,
$$
s = W\big[\,G h_{i+1} - (D + H_i - F h_i)\,\big]
\qquad (113)
$$

with $H = [H_0, H_1, \cdots, H_{L-1}]^{T}$. Therefore one can minimize $s^{T} s$ from Equation 113 with
$$
h_{i+1} = (G_i^{*} W_i^{2} G_i)^{-1} G_i^{*} W_i^{2}\,(D + H_i - F h_i)
\qquad (114)
$$

since all the terms inside the parentheses in Equation 114 are constant at the $(i+1)$-th iteration. In a sense, Equation 114 is similar to Equation 109, except that the desired function is updated from iteration to iteration.
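The quasilinearization update can be sketched in NumPy as follows; the sizes and target response are again illustrative assumptions, and the effective desired vector $D + H_i - F h_i$ is recomputed on every pass.

```python
import numpy as np

# Illustrative problem sizes and desired response (assumptions, not from the text)
L, M, N = 64, 3, 3
w = np.pi * np.arange(L) / L
D = np.where(w < np.pi / 2, np.exp(-3j * w), 0)
E = np.exp(-1j * np.outer(w, np.arange(max(M, N) + 1)))     # E[k, n] = e^{-j w_k n}

F = np.hstack([E[:, :M + 1], -D[:, None] * E[:, 1:N + 1]])  # F from Eq. 95
h = np.linalg.lstsq(F, D, rcond=None)[0]           # equation-error start

for _ in range(10):
    a = np.concatenate(([1.0], h[M + 1:]))
    A = E[:, :N + 1] @ a                           # A_i(w_k)
    H = (E[:, :M + 1] @ h[:M + 1]) / A             # H_i(w_k)
    G = np.hstack([E[:, :M + 1], -H[:, None] * E[:, 1:N + 1]])  # G from Eq. 104
    W = 1 / np.abs(A)                              # W_i on the diagonal
    rhs = D + H - F @ h                            # updated desired vector (Eq. 113)
    # Weighted LS solve of (G* W^2 G) h = G* W^2 rhs (Eq. 114)
    h = np.linalg.lstsq(W[:, None] * G, W * rhs, rcond=None)[0]
```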
It is important to note that any of the three algorithms can be modified to solve a weighted $l_2$ IIR approximation using a weighting function $W(\omega)$ by defining

$$
V(\omega) = \frac{W(\omega)}{A(\omega)}
\qquad (115)
$$

Taking Equation 115 into account, the following is a summary of the three different updates discussed so far:
$$
\begin{aligned}
\text{SMB frequency mode-1:}\quad & h_{i+1} = (F^{*} V_i^{2} F)^{-1} F^{*} V_i^{2} D\\[6pt]
\text{SMB frequency mode-2:}\quad & h_{i+1} = (G_i^{*} V_i^{2} F)^{-1} G_i^{*} V_i^{2} D\\[6pt]
\text{Soewito's quasilinearization:}\quad & h_{i+1} = (G_i^{*} V_i^{2} G_i)^{-1} G_i^{*} V_i^{2}\,(D + H_i - F h_i)
\end{aligned}
\qquad (116)
$$
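All three updates share the pattern $h = (X^{*} V^{2} Y)^{-1} X^{*} V^{2} d$, differing only in the choice of $X$, $Y$, and the right-hand side $d$; a user weighting $W(\omega)$ enters simply by using $V_i = W(\omega)/A_i(\omega)$ as the diagonal weights. A hypothetical helper capturing this shared structure (the function name and the random data are illustrative):

```python
import numpy as np

def weighted_update(X, Y, V, d):
    """One step of the common pattern h = (X* V^2 Y)^{-1} X* V^2 d (Eq. 116)."""
    XV = X.conj().T * np.abs(V) ** 2               # X* V_i^2 (diagonal weights)
    return np.linalg.solve(XV @ Y, XV @ d)

# mode 1:  weighted_update(F, F, V, D)
# mode 2:  weighted_update(G, F, V, D)
# Soewito: weighted_update(G, G, V, D + H - F @ h)
rng = np.random.default_rng(0)
L, P = 32, 7                                       # arbitrary demo sizes
F = rng.normal(size=(L, P)) + 1j * rng.normal(size=(L, P))
V = 1 / (1 + np.abs(rng.normal(size=L)))           # stand-in for W(w)/A_i(w)
D = rng.normal(size=L) + 1j * rng.normal(size=L)
h1 = weighted_update(F, F, V, D)                   # one mode-1 step on demo data
```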