Stochastic $L_2$ optimal (least squares) FIR filter design problem: Given a wide-sense stationary (WSS) input signal $x_k$ and desired signal $d_k$ (WSS $\Leftrightarrow$ $E[y_k] = E[y_{k+d}]$, $r_{yz}(l) = E[y_k z_{k+l}]$, $\forall k, l$: $r_{yy}(0) < \infty$)
The Wiener filter is the linear, time-invariant filter minimizing $E[\varepsilon^2]$, the variance of the error.
As posed, this problem seems slightly silly, since $d_k$ is already available! However, this idea is useful in a wide variety of applications.
Active suspension system design: the optimal system may change with different road conditions or mass in the car, so an adaptive system might be desirable.
System identification (radar, non-destructive testing, adaptive control systems): usually one desires that the input signal $x_k$ be "persistently exciting," which, among other things, implies non-zero energy in all frequency bands. Why is this desirable?
For convenience, we will analyze only the causal, real-data case; extensions are straightforward.
$$y_k = \sum_{l=0}^{M-1} w_l x_{k-l}$$
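The FIR sum above can be sketched numerically as a truncated convolution. This is a minimal illustration, assuming NumPy is available and that $x_k = 0$ for $k < 0$ (an assumption made here for the finite-length example, not part of the original text). The function name `fir_output` and the sample values are hypothetical.

```python
import numpy as np

def fir_output(w, x):
    """Return y[k] = sum_{l=0}^{M-1} w[l] * x[k-l], taking x[k] = 0 for k < 0."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    # The full convolution has len(x) + len(w) - 1 samples; keeping the first
    # len(x) of them aligns y[k] with x[k] (causal filtering).
    return np.convolve(x, w)[: len(x)]

w = [0.5, 0.25]        # hypothetical tap weights, M = 2
x = [1.0, 2.0, 3.0]    # hypothetical input samples
y = fir_output(w, x)
print(y)               # y[1] = 0.5 * 2.0 + 0.25 * 1.0 = 1.25
```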
$$\underset{w_l}{\operatorname{argmin}}\; E[\varepsilon^2] = E[(d_k - y_k)^2] = E\left[\left(d_k - \sum_{l=0}^{M-1} w_l x_{k-l}\right)^2\right] = E[d_k^2] - 2\sum_{l=0}^{M-1} w_l E[d_k x_{k-l}] + \sum_{l=0}^{M-1}\sum_{m=0}^{M-1} w_l w_m E[x_{k-l} x_{k-m}]$$
$$E[\varepsilon^2] = r_{dd}(0) - 2\sum_{l=0}^{M-1} w_l r_{dx}(l) + \sum_{l=0}^{M-1}\sum_{m=0}^{M-1} w_l w_m r_{xx}(l-m)$$
where
$$r_{dd}(0) = E[d_k^2]$$
$$r_{dx}(l) = E[d_k x_{k-l}]$$
$$r_{xx}(l-m) = E[x_k x_{k+l-m}]$$
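In practice the expectations in these definitions are rarely known exactly. One common approach, sketched here under the assumption that the signals are ergodic so that time averages approximate the expectations (an assumption not stated in the original), is to estimate them from finite records. The function name `estimate_correlations` and the sample data are hypothetical.

```python
import numpy as np

def estimate_correlations(x, d, M):
    """Sample estimates of r_dd(0), r_dx(l), and r_xx(l) for l = 0..M-1."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    N = len(x)
    r_dd0 = np.mean(d * d)
    # r_dx(l) = E[d_k x_{k-l}]: pair d[k] with x[k-l] for k = l..N-1
    r_dx = np.array([np.dot(d[l:], x[: N - l]) / (N - l) for l in range(M)])
    # r_xx(l) = E[x_k x_{k+l}]: pair x[k] with x[k+l] for k = 0..N-1-l
    r_xx = np.array([np.dot(x[: N - l], x[l:]) / (N - l) for l in range(M)])
    return r_dd0, r_dx, r_xx

x = [1.0, 2.0, 3.0, 4.0]   # hypothetical input record
d = [2.0, 4.0, 6.0, 8.0]   # hypothetical desired record (here d = 2x)
r_dd0, r_dx, r_xx = estimate_correlations(x, d, M=2)
print(r_dd0, r_dx, r_xx)
```

Each lag is averaged over the number of overlapping samples, `N - l`. Dividing by `N` instead is another common choice with slightly different bias and variance.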
This can be written in matrix form as
$$E[\varepsilon^2] = r_{dd}(0) - 2P^T W + W^T R W$$
where
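$$P = \begin{pmatrix} r_{dx}(0) \\ r_{dx}(1) \\ \vdots \\ r_{dx}(M-1) \end{pmatrix}$$
$$R = \begin{pmatrix} r_{xx}(0) & r_{xx}(1) & \cdots & \cdots & r_{xx}(M-1) \\ r_{xx}(1) & r_{xx}(0) & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & r_{xx}(0) & r_{xx}(1) \\ r_{xx}(M-1) & \cdots & \cdots & r_{xx}(1) & r_{xx}(0) \end{pmatrix}$$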
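To make the matrix form concrete, the following sketch builds $P$ and the symmetric Toeplitz matrix $R$ from correlation sequences, then checks numerically that the double-sum expression and the matrix expression for the mean-squared error agree. It assumes NumPy. The function names and all correlation values are hypothetical, chosen only for illustration.

```python
import numpy as np

def build_P_R(r_dx, r_xx):
    """Build P (length M) and the M x M Toeplitz matrix R from lags 0..M-1."""
    r_dx = np.asarray(r_dx, dtype=float)
    r_xx = np.asarray(r_xx, dtype=float)
    M = len(r_dx)
    # Entry (l, m) of R is r_xx(l - m) = r_xx(|l - m|) for real WSS data.
    lags = np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])
    return r_dx.copy(), r_xx[lags]

def mse_double_sum(w, r_dd0, r_dx, r_xx):
    """Mean-squared error written with explicit sums over l and m."""
    M = len(w)
    total = r_dd0 - 2.0 * sum(w[l] * r_dx[l] for l in range(M))
    total += sum(w[l] * w[m] * r_xx[abs(l - m)]
                 for l in range(M) for m in range(M))
    return total

def mse_matrix(W, r_dd0, P, R):
    """Mean-squared error in matrix form: r_dd(0) - 2 P^T W + W^T R W."""
    return r_dd0 - 2.0 * P @ W + W @ R @ W

r_dd0 = 3.0                 # hypothetical r_dd(0)
r_dx = [1.0, 0.5, 0.25]     # hypothetical r_dx(0..2)
r_xx = [2.0, 1.0, 0.5]      # hypothetical r_xx(0..2)
P, R = build_P_R(r_dx, r_xx)
W = np.array([0.3, -0.2, 0.1])
print(R)
print(mse_double_sum(W, r_dd0, r_dx, r_xx), mse_matrix(W, r_dd0, P, R))
```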
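To solve for the optimum filter, compute the gradient with respect to the tap weights vector $W$:
$$\nabla \doteq \begin{pmatrix} \frac{\partial}{\partial w_0} E[\varepsilon^2] \\ \frac{\partial}{\partial w_1} E[\varepsilon^2] \\ \vdots \\ \frac{\partial}{\partial w_{M-1}} E[\varepsilon^2] \end{pmatrix}$$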
$$\nabla = -2P + 2RW$$
(Recall $\frac{d}{dW}\left(A^T W\right) = A$ and $\frac{d}{dW}\left(W^T M W\right) = 2MW$ for symmetric $M$.) Setting the gradient equal to zero $\Rightarrow$
$$R W_{opt} = P \Rightarrow W_{opt} = R^{-1} P$$
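As a small numerical illustration of this solution, the sketch below solves the linear system for a hypothetical $2 \times 2$ case and checks that the gradient vanishes at the result. It assumes NumPy. The values of `R`, `P`, and `r_dd0` are made up for the example. Using `np.linalg.solve` rather than forming $R^{-1}$ explicitly is a choice made here for numerical reasons, not something prescribed by the text.

```python
import numpy as np

R = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # hypothetical input autocorrelation matrix
P = np.array([1.0, 0.5])     # hypothetical cross-correlation vector
r_dd0 = 1.0                  # hypothetical r_dd(0)

def mse(W):
    """Mean-squared error r_dd(0) - 2 P^T W + W^T R W."""
    return r_dd0 - 2.0 * P @ W + W @ R @ W

# Solve R W = P without forming the inverse of R explicitly.
W_opt = np.linalg.solve(R, P)

# The gradient -2P + 2RW should vanish at the optimum.
grad = -2.0 * P + 2.0 * R @ W_opt

print(W_opt)       # optimum tap weights
print(grad)        # approximately zero
print(mse(W_opt))  # minimum mean-squared error
```

Perturbing `W_opt` in any direction increases `mse`, since this particular `R` is positive definite.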
Since $R$ is a correlation matrix, it must be non-negative definite, so this is a minimizer. For $R$ positive definite, the minimizer is unique.
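The distinction between non-negative definite and positive definite $R$ can be illustrated with two textbook inputs. The sketch below assumes NumPy and uses two standard autocorrelation sequences: a unit-amplitude sinusoid with uniformly random phase, for which $r_{xx}(l) = \frac{1}{2}\cos(\omega l)$, and unit-variance white noise, for which $r_{xx}(l)$ is 1 at $l = 0$ and 0 otherwise. The helper name `toeplitz_R`, the filter length, and the frequency are hypothetical choices.

```python
import numpy as np

def toeplitz_R(r_xx):
    """Symmetric Toeplitz autocorrelation matrix from lags 0..M-1."""
    r_xx = np.asarray(r_xx, dtype=float)
    M = len(r_xx)
    lags = np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])
    return r_xx[lags]

M = 3
omega = np.pi / 4

# Single sinusoid with random phase: energy at only one frequency.
R_sine = toeplitz_R(0.5 * np.cos(omega * np.arange(M)))
# White noise: energy in all frequency bands.
R_white = toeplitz_R([1.0, 0.0, 0.0])

rank_sine = np.linalg.matrix_rank(R_sine, tol=1e-10)
rank_white = np.linalg.matrix_rank(R_white, tol=1e-10)
print(rank_sine, rank_white)  # 2 3
```

With the sinusoidal input, $R$ has rank 2 for $M = 3$, so it is non-negative definite but singular and $R W = P$ does not determine $W$ uniquely. With white noise, $R$ is the identity and the minimizer is unique.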