One way to apply MM to TV denoising
is to majorize TV (x) TV (x) by a quadratic function of xx,
as described in ref. [9].
Then the TVD cost function F(x)F(x) can be majorized by a quadratic
function, which can in turn be minimized by solving a system of linear equations.
To that end, using Equation 13, we can write
1
2
|
t
k
|
t
2
+
1
2
|
t
k
|
≥
|
t
|
∀
t
∈
R
1
2
|
t
k
|
t
2
+
1
2
|
t
k
|
≥
|
t
|
∀
t
∈
R
(16)Using v(n)v(n) for tt and summing over nn gives
∑
n
1
2
|
v
k
(
n
)
|
v
2
(
n
)
+
1
2
|
v
k
(
n
)
|
≥
∑
n
|
v
(
n
)
|
∑
n
1
2
|
v
k
(
n
)
|
v
2
(
n
)
+
1
2
|
v
k
(
n
)
|
≥
∑
n
|
v
(
n
)
|
(17)which can be written compactly as
1
2
v
t
Λ
k
-
1
v
+
1
2
∥
v
k
∥
1
≥
∥
v
∥
1
1
2
v
t
Λ
k
-
1
v
+
1
2
∥
v
k
∥
1
≥
∥
v
∥
1
(18)where ΛkΛk is the diagonal matrix
Λ
k
:
=
|
v
k
(
1
)
|
|
v
k
(
2
)
|
⋱
|
v
k
(
N
)
|
=
diag
(
|
v
k
|
)
.
Λ
k
:
=
|
v
k
(
1
)
|
|
v
k
(
2
)
|
⋱
|
v
k
(
N
)
|
=
diag
(
|
v
k
|
)
.
(19)In the notation, diag (|v|) diag (|v|),
the absolute value is applied element-wise to the vector vv.
Using DxDx for vv, we can write
1
2
x
t
D
t
Λ
k
-
1
D
x
+
1
2
∥
D
x
k
∥
1
≥
∥
D
x
∥
1
1
2
x
t
D
t
Λ
k
-
1
D
x
+
1
2
∥
D
x
k
∥
1
≥
∥
D
x
∥
1
(20)where
Λ
k
:
=
diag
(
|
D
x
k
|
)
.
Λ
k
:
=
diag
(
|
D
x
k
|
)
.
(21)Note in Equation 20 that the majorizor of ∥Dx∥1∥Dx∥1 is a quadratic function of xx.
Also note that the term ∥Dxk∥1∥Dxk∥1 in Equation 20 should be considered a constant — it is fixed
as xkxk is the value of xx at the previous iteration.
Similarly, ΛkΛk in Equation 20 is also not a function of xx.
A majorizor of the TV cost function Equation 9 can be obtained from Equation 20 by
adding ∥y-x∥22∥y-x∥22 to both sides,
∥
y
-
x
∥
2
2
+
λ
1
2
x
t
D
t
Λ
k
-
1
D
x
+
λ
1
2
∥
D
x
k
∥
1
≥
∥
y
-
x
∥
2
2
+
λ
∥
D
x
∥
1
.
∥
y
-
x
∥
2
2
+
λ
1
2
x
t
D
t
Λ
k
-
1
D
x
+
λ
1
2
∥
D
x
k
∥
1
≥
∥
y
-
x
∥
2
2
+
λ
∥
D
x
∥
1
.
(22)Therefore a majorizor Gk(x)Gk(x) for the TV cost function is given by
G
k
(
x
)
=
∥
y
-
x
∥
2
2
+
λ
1
2
x
t
D
t
Λ
k
-
1
D
x
+
λ
1
2
∥
D
x
k
∥
1
,
Λ
k
=
diag
(
|
D
x
k
|
)
G
k
(
x
)
=
∥
y
-
x
∥
2
2
+
λ
1
2
x
t
D
t
Λ
k
-
1
D
x
+
λ
1
2
∥
D
x
k
∥
1
,
Λ
k
=
diag
(
|
D
x
k
|
)
(23)which satisfies Gk(xk)=F(xk)Gk(xk)=F(xk) by design.
Using MM, we obtain xkxk by minimizing Gk(x)Gk(x),
x
k
+
1
=
arg
min
x
∥
y
-
x
∥
2
2
+
λ
1
2
x
t
D
t
Λ
k
-
1
D
x
+
λ
1
2
∥
D
x
k
∥
1
.
x
k
+
1
=
arg
min
x
∥
y
-
x
∥
2
2
+
λ
1
2
x
t
D
t
Λ
k
-
1
D
x
+
λ
1
2
∥
D
x
k
∥
1
.
(24)An explicit solution to Equation 24 is given by
x
k
+
1
=
I
+
λ
2
D
t
Λ
k
-
1
D
-
1
y
.
x
k
+
1
=
I
+
λ
2
D
t
Λ
k
-
1
D
-
1
y
.
(25)A problem with update Equation 25 is that
as the iterations progress, some values of DxkDxk will generally go to zero,
and therefore some entries of Λk-1Λk-1 in Equation 25 will go to infinity.
This issue is addressed in Ref. [9] by rewriting the equation using the matrix inverse lemma.
By the matrix inverse lemma,
I
+
λ
2
D
t
Λ
k
-
1
D
-
1
=
I
-
D
t
2
λ
Λ
k
+
D
D
t
-
1
D
I
+
λ
2
D
t
Λ
k
-
1
D
-
1
=
I
-
D
t
2
λ
Λ
k
+
D
D
t
-
1
D
(26)where
Λ
k
=
diag
(
|
D
x
k
|
)
.
Λ
k
=
diag
(
|
D
x
k
|
)
.
(27)Now the update equation Equation 25 becomes
x
k
+
1
=
y
-
D
t
2
λ
diag
(
|
D
x
k
|
)
+
D
D
t
-
1
D
y
.
x
k
+
1
=
y
-
D
t
2
λ
diag
(
|
D
x
k
|
)
+
D
D
t
-
1
D
y
.
(28)Observe that even if some elements of DxkDxk are zero, no division by zero arises in Equation 28.
The update Equation 28 calls for the solution to a linear system of equations.
In general, it is desirable to avoid such a computation in an iterative filtering algorithm
due to the high computational cost of solving linear systems (especially when the
signal yy is very long and the system is very large).
However, the matrix
[2λ diag (|Dxk|)+DDt][2λ diag (|Dxk|)+DDt]
in Equation 28 is a sparse banded matrix;
it consists of only three diagonals — the main diagonal, one upper diagonal,
and one lower diagonal.
This is because DDtDDt is tridiagonal as shown in Equation 7.
Therefore, the linear system in Equation 28 can be solved very
efficiently [11].
Further, the whole matrix need not be stored in memory, only the
three diagonals.
The MATLAB function TVD_mm implements
TV denoising based on the update Equation 28.
The function uses the sparse matrix structure in MATLAB so as to avoid high memory
requirements and so as to invoke sparse banded system solvers.
MATLAB uses LAPACK [1]
to solve the sparse banded system in the program TVD_mm.
The algorithm used by MATLAB to solve a sparse linear system can be
monitored using the command spparms('spumoni',3).
Listing 1: MATLAB program for TV denoising using majorization-minimization.
The program is based on update Equation 28.
function [x, cost] = TVD_mm(y, lam, Nit)
% [x, cost] = TVD_mm(y, lam, Nit)
% Total variation denoising using majorization-minimization
% and a banded linear systems.
%
% INPUT
% y - noisy signal
% lam - regularization parameter
% Nit - number of iterations
%
% OUTPUT
% x - denoised signal
% cost - cost function history
%
% Reference
% 'On total-variation denoising: A new majorization-minimization
% algorithm and an experimental comparison with wavalet denoising.'
% M. Figueiredo, J. Bioucas-Dias, J. P. Oliveira, and R. D. Nowak.
% Proc. IEEE Int. Conf. Image Processing, 2006.
% Ivan Selesnick, selesi@poly.edu, 2011
y = y(:); % Ensure column vector
cost = zeros(1, Nit); % Cost function history
N = length(y);
e = ones(N-1, 1);
DDT = spdiags([-e 2*e -e], [-1 0 1], N-1, N-1); % D*D' (sparse matrix)
D = @(x) diff(x); % D (operator)
DT = @(x) [-x(1); -diff(x); x(end)]; % D'
x = y; % Initialization
Dx = D(x);
for k = 1:Nit
F = 2/lam * spdiags(abs(Dx), 0, N-1, N-1) + DDT; % F : Sparse matrix structure
% F = 2/lam * diag(abs(D(x))) + DDT; % Not stored as sparse matrix
x = y - DT(F\D(y)); % Solve sparse linear system
Dx = D(x);
cost(k) = sum(abs(x-y).^2) + lam * sum(abs(Dx)); % Save cost function history
end
|
An example of TV denoising is shown in Figure 3.
The history of the cost function through the progression of the algorithm is shown in the figure.
It can be seen that after 20 iterations the cost function has leveled out,
suggesting that the algorithm has practically converged.
Another algorithm for 1-D TV denoising is Chambolle's algorithm [5],
a variant of which is the `iterative clipping' algorithm [14].
This algorithm is computationally simpler than the MM algorithm
because it does not call for the solution to a linear system at each iteration.
However, it may converge slowly.
For the denoising problem illustrated in Figure 3,
the convergence of both the iterative clipping and MM algorithms
are shown in Figure 4.
It can be seen that the MM algorithm converges in fewer iterations.