The matrix inversion lemma gives us a framework for efficiently updating the solution to a system of equations.
Let MM, PP, QQ, and CC be matrices where
- MM is N×NN×N and invertible,
- PP and QQ are R×NR×N, and
- CC is R×RR×R and invertible.
Then the following identity holds:
(
M
+
P
T
C
Q
)
-
1
=
M
-
1
-
M
-
1
P
T
(
C
-
1
+
Q
M
-
1
P
T
)
-
1
Q
M
-
1
(
M
+
P
T
C
Q
)
-
1
=
M
-
1
-
M
-
1
P
T
(
C
-
1
+
Q
M
-
1
P
T
)
-
1
Q
M
-
1
(1)
This is also known as the Sherman-Morrison-Woodbury identity.
The proof of Equation 1 is straightforward. Given any right-hand-side v∈RNv∈RN, we would like to solve
(
M
+
P
T
C
Q
)
w
=
v
(
M
+
P
T
C
Q
)
w
=
v
(2)
for ww. Set
z
=
C
Q
w
⇒
C
-
1
z
=
Q
w
.
z
=
C
Q
w
⇒
C
-
1
z
=
Q
w
.
(3)
Now we have two sets of equations
M
w
+
P
T
z
=
v
Q
w
-
C
-
1
z
=
0
.
M
w
+
P
T
z
=
v
Q
w
-
C
-
1
z
=
0
.
(4)
Manipulating the first equation yields
w
=
M
-
1
(
v
-
P
T
z
)
.
w
=
M
-
1
(
v
-
P
T
z
)
.
(5)
Plugging this into the second equation above gives us
Q
M
-
1
v
-
Q
M
-
1
P
T
z
-
C
-
1
z
=
0
⇒
z
=
C
-
1
+
Q
M
-
1
P
T
-
1
Q
M
-
1
v
⇒
w
=
M
-
1
v
-
M
-
1
P
T
C
-
1
+
Q
M
-
1
P
T
-
1
Q
M
-
1
v
.
Q
M
-
1
v
-
Q
M
-
1
P
T
z
-
C
-
1
z
=
0
⇒
z
=
C
-
1
+
Q
M
-
1
P
T
-
1
Q
M
-
1
v
⇒
w
=
M
-
1
v
-
M
-
1
P
T
C
-
1
+
Q
M
-
1
P
T
-
1
Q
M
-
1
v
.
(6)
As this holds for any vv, this establishes Equation 1.
The Matrix Inversion Lemma is the basis for the Recursive Least Squares algorithm (RLS) and the Kalman Filter. Both of these algorithms rely on the same essential fact: with M-1M-1 already computed, we have reduced the task of inverting the N×NN×N system (M+PTCQ)(M+PTCQ) to the easier task of inverting the R×RR×R system C-1+QM-1PTC-1+QM-1PT (much easier if R≪NR≪N).
As an example, let us consider the fundamental “rank-one update” case: R=1R=1, P=QP=Q and C=1C=1. We will set b=QT∈RNb=QT∈RN just for notational convenience. With a=M-1ba=M-1b, Equation 1 becomes
(
M
+
b
b
T
)
-
1
=
M
-
1
+
1
1
+
b
T
a
a
a
T
.
(
M
+
b
b
T
)
-
1
=
M
-
1
+
1
1
+
b
T
a
a
a
T
.
(7)
This reduces the main computations involved solving a system of equations of the form (M+bbT)w=v(M+bbT)w=v to two matrix-vector multiplies: after calculating u=M-1vu=M-1v and a=M-1ba=M-1b, we have
w
=
(
M
+
b
b
T
)
-
1
v
=
u
+
a
T
u
1
+
b
T
a
·
u
.
w
=
(
M
+
b
b
T
)
-
1
v
=
u
+
a
T
u
1
+
b
T
a
·
u
.
(8)
The computations involved are now O(N2)O(N2) rather than O(N3)O(N3).