As opposed to solving the constrained formulation, an alternate approach is to solve the unconstrained formulation in Equation 3. A widely used method for solving ℓ1ℓ1-minimization problems of the form
min
x
μ
∥
x
∥
1
+
H
(
x
)
,
min
x
μ
∥
x
∥
1
+
H
(
x
)
,
(5)for a convex and differentiable HH, is an iterative procedure based on shrinkage (also called soft thresholding; see Equation 6 below). In the context of solving Equation 5 with a quadratic HH, this method was independently proposed and analyzed in [1], [9], [13], [14], and then further studied or extended in [4], [5], [6], [7], [11], [16].
Shrinkage is a classic method used in wavelet-based image denoising. The shrinkage operator on any scalar component can be defined as follows:
shrink
(
t
,
α
)
=
t
-
α
if
t
>
α
,
0
if
-
α
≤
t
≤
α
,
and
t
+
α
if
t
<
-
α
.
shrink
(
t
,
α
)
=
t
-
α
if
t
>
α
,
0
if
-
α
≤
t
≤
α
,
and
t
+
α
if
t
<
-
α
.
(6)This concept can be used effectively to solve Equation 5. In particular, the basic algorithm can be written as following the fixed-point iteration: for i=1,...,Ni=1,...,N, the i th i th coefficient of xx at the (k+1) th (k+1) th time step is given by
x
i
k
+
1
=
shrink
(
(
x
k
-
τ
▽
H
(
x
k
)
)
i
,
μ
τ
)
x
i
k
+
1
=
shrink
(
(
x
k
-
τ
▽
H
(
x
k
)
)
i
,
μ
τ
)
(7)where τ>0τ>0 serves as a step-length for gradient descent (which may vary with kk) and μμ is as specified by the user. It is easy to see that the larger μμ is, the larger the allowable distance between xk+1xk+1 and xkxk. For a quadratic penalty term H(·)H(·), the gradient ▽H▽H can be easily computed as a linear function of xkxk; thus each iteration of Equation 7 essentially boils down to a small number of matrix-vector multiplications.
The simplicity of the iterative approach is quite appealing, both from a computational, as well as a code-design standpoint. Various modifications, enhancements, and generalizations to this approach have been proposed, both to improve the efficiency of the basic iteration in Equation 7, and to extend its applicability to various kinds of JJ [8], [10], [16]. In principle, the basic iteration in Equation 7 would not be practically effective without a continuation (or path-following) strategy [11], [16] in which we choose a gradually decreasing sequence of values for the parameter μμ to guide the intermediate iterates towards the final optimal solution.
This procedure is known as continuation; in [11], the performance of an algorithm known as Fixed-Point Continuation (FPC) has been compared favorably with another similar method known as Gradient Projection for Sparse Reconstruction (GPSR) [10] and “l1_ls” [12]. A key aspect to solving the unconstrained optimization problem is the choice of the parameter μμ. As discussed above, for CS recovery, μμ may be chosen by trial and error; for the noiseless constrained formulation, we may solve the corresponding unconstrained minimization by choosing a large value for μμ.
In the case of recovery from noisy compressive measurements, a commonly used choice for the convex cost function H(x)H(x) is the square of the norm of the residual. Thus we have:
H
(
x
)
=
∥
y
-
Φ
x
∥
2
2
▽
H
(
x
)
=
2
Φ
⊤
(
y
-
Φ
x
)
.
H
(
x
)
=
∥
y
-
Φ
x
∥
2
2
▽
H
(
x
)
=
2
Φ
⊤
(
y
-
Φ
x
)
.
(8)
For this particular choice of penalty function, Equation 7 reduces to the following iteration:
x
i
k
+
1
=
shrink
(
(
x
k
-
τ
▽
H
(
y
-
Φ
x
k
)
i
,
μ
τ
)
x
i
k
+
1
=
shrink
(
(
x
k
-
τ
▽
H
(
y
-
Φ
x
k
)
i
,
μ
τ
)
(9)which is run until convergence to a fixed point. The algorithm is detailed in pseudocode form below.
Inputs: CS matrix ΦΦ, signal measurements yy, parameter sequence μnμn
Outputs: Signal estimate x^x^
initialize: x^0=0x^0=0, r=yr=y, k=0k=0.
while ħalting criterion false do
1. k←k+1k←k+1
2. x←x^-τΦTrx←x^-τΦTr {take a gradient step}
3. x^← shrink (x,μkτ)x^← shrink (x,μkτ) {perform soft thresholding}
4. r←y-Φx^r←y-Φx^ {update measurement residual}
end while
return x^←x^x^←x^