While the spark, null space property (NSP), and restricted isometry property (RIP) all provide guarantees for the recovery of sparse signals, verifying that a general matrix ΦΦ satisfies any of these properties has a combinatorial computational complexity, since in each case one must essentially consider NKNK submatrices. In many settings it is preferable to use properties of ΦΦ that are easily computable to provide more concrete recovery guarantees. The coherence of a matrix is one such property [3], [9].
- Definition 1:
The coherence of a matrix ΦΦ, μ(Φ)μ(Φ), is the largest absolute inner product between any two columns φiφi, φjφj of
ΦΦ:
μ
(
Φ
)
=
max
1
≤
i
<
j
≤
N
φ
i
,
φ
j
∥
φ
i
∥
2
∥
φ
j
∥
2
.
μ
(
Φ
)
=
max
1
≤
i
<
j
≤
N
φ
i
,
φ
j
∥
φ
i
∥
2
∥
φ
j
∥
2
.
(1)
It is possible to show that the coherence of a matrix is always in the range μ(Φ)∈N-MM(N-1),1μ(Φ)∈N-MM(N-1),1; the lower bound is known as the Welch bound [7], [8], [13]. Note that when N≫MN≫M, the lower bound is approximately μ(Φ)≥1/Mμ(Φ)≥1/M.
One can sometimes relate coherence to the spark, NSP, and RIP. For example, the coherence and spark properties of a matrix can be related by employing the Gershgorin circle theorem [5], [12].
The eigenvalues of an N×NN×N matrix MM with entries mijmij, 1≤i,j≤N1≤i,j≤N, lie in the union of NN discs di=di(ci,ri)di=di(ci,ri), 1≤i≤N1≤i≤N, centered at ci=miici=mii and with radius ri=∑j≠i|mij|ri=∑j≠i|mij|.
Applying this theorem on the Gram matrix G=ΦΛTΦΛG=ΦΛTΦΛ leads to the following straightforward result.
For any matrix ΦΦ,
spark
(
Φ
)
≥
1
+
1
μ
(
Φ
)
.
spark
(
Φ
)
≥
1
+
1
μ
(
Φ
)
.
(2)Since spark (Φ) spark (Φ) does not depend on the scaling of the columns, we can assume without loss of generality that ΦΦ has unit-norm columns. Let Λ⊆{1,...,N}Λ⊆{1,...,N} with |Λ|=p|Λ|=p determine a set of indices. We consider the restricted Gram matrix G=ΦΛTΦΛG=ΦΛTΦΛ, which satisfies the following properties:
- gii=1gii=1, 1≤i≤p1≤i≤p;
- |gij|≤μ(Φ)|gij|≤μ(Φ), 1≤i,j≤p1≤i,j≤p, i≠ji≠j.
From Theorem 1, if ∑j≠i|gij|<|gii|∑j≠i|gij|<|gii| then the matrix GG is positive definite, so that the columns of ΦΛΦΛ are linearly independent. Thus, the spark condition implies (p-1)μ(Φ)<1(p-1)μ(Φ)<1 or, equivalently, p<1+1/μ(Φ)p<1+1/μ(Φ) for all p< spark (Φ)p< spark (Φ), yielding spark (Φ)≥1+1/μ(Φ) spark (Φ)≥1+1/μ(Φ).
By merging Theorem 1 from "Null space conditions" with Lemma 1, we can pose the following condition on ΦΦ that guarantees uniqueness.
If
K
<
1
2
1
+
1
μ
(
Φ
)
,
K
<
1
2
1
+
1
μ
(
Φ
)
,
(3)then for each measurement vector y∈RMy∈RM there exists at most one signal x∈ΣKx∈ΣK such that y=Φxy=Φx.
Theorem 2, together with the Welch bound, provides an upper bound on the level of sparsity KK that guarantees uniqueness using coherence: K=O(M)K=O(M). Another straightforward application of the Gershgorin circle theorem (Theorem 1) connects the RIP to the coherence property.
If ΦΦ has unit-norm columns and coherence μ=μ(Φ)μ=μ(Φ), then ΦΦ satisfies the RIP of order KK with δ=(K-1)μδ=(K-1)μ for all K<1/μK<1/μ.
The proof of this lemma is similar to that of Lemma 1.
The results given here emphasize the need for small coherence μ(Φ)μ(Φ) for the matrices used in CS. Coherence bounds have been studied both for deterministic and randomized matrices. For example, there are known matrices ΦΦ of size M×M2M×M2 that achieve the coherence lower bound μ(Φ)=1/Mμ(Φ)=1/M, such as the Gabor frame generated from the Alltop sequence [6] and more general equiangular tight frames [8]. These constructions restrict the number of measurements needed to recover a KK-sparse signal to be M=O(K2logN)M=O(K2logN). Furthermore, it can be shown that when the distribution used has zero mean and finite variance, then in the asymptotic regime (as MM and NN grow) the coherence converges to μ(Φ)=(2logN)/Mμ(Φ)=(2logN)/M [1], [2], [4]. Such constructions would allow KK to grow asymptotically as M=O(K2logN)M=O(K2logN), matching the known finite-dimensional bounds.
The measurement bounds dependent on coherence are handicapped by the squared dependence on the sparsity KK, but it is possible to overcome this bottleneck by shifting the types of guarantees from worst-case/deterministic behavior, to average-case/probabilistic behavior [10], [11]. In this setting, we pose a probabilistic prior on the set of KK-sparse signals x∈ΣKx∈ΣK. It is then possible to show that if ΦΦ has low coherence μ(Φ)μ(Φ) and spectral norm ∥Φ∥2∥Φ∥2, and if K=O(μ-2(Φ)logN)K=O(μ-2(Φ)logN), then the signal xx can be recovered from its CS measurements y=Φxy=Φx with high probability. Note that if we replace the Welch bound, then we obtain K=O(MlogN)K=O(MlogN), which returns to the linear dependence of the measurement bound on the signal sparsity that appears in RIP-based results.
-
Cai, T. and Jiang, T. (2010). Limiting Laws of Coherence of Random Matrices with Applications to Testing Covariance Structure and Construction of Compressed Sensing Matrices. [Preprint].
-
Candès, E. and Plan, Y. (2009). Near-Ideal Model Selection by minimization. Ann. Stat., 37(5A), 2145–2177.
-
Donoho, D. and Elad, M. (2003). Optimally sparse representation in general (nonorthogonal) dictionaries via minimization. Proc. Natl. Acad. Sci., 100(5), 2197–2202.
-
Donoho, D. (2006). For most large underdetermined systems of linear equations, the minimal -norm solution is also the sparsest solution. Comm. Pure Appl. Math., 59(6), 797–829.
-
Gervsgorin, S. (1931). Über die Abgrenzung der Eigenwerte einer Matrix. Izv. Akad. Nauk SSSR Ser. Fiz.-Mat., 6, 749–754.
-
Herman, M. and Strohmer, T. (2009). High-Resolution Radar via Compressed Sensing. IEEE Trans. Signal Processing, 57(6), 2275–2284.
-
Rosenfeld, M. (1996). In praise of the Gram matrix. The Mathematics of Paul Erdős II. (p. 318–323). Berlin, Germany: Springer.
-
Strohmer, T. and Heath, R. (2003, Nov.). Grassmanian frames with applications to coding and communication. Appl. Comput. Harmon. Anal., 14(3), 257–275.
-
Tropp, J. and Gilbert, A. (2007). Signal recovery from partial information via orthogonal matching pursuit. IEEE Trans. Inform. Theory, 53(12), 4655–4666.
-
Tropp, J. A. (2008). Norms of random submatrices and sparse approximation. C. R. Acad. Sci. Paris, Ser. I, 346(23–24), 1271–1274.
-
Tropp, J. A. (2008). On the conditioning of random subdictionaries. Appl. Comput. Harmon. Anal., 25, 1-24.
-
Varga, R. (2004). Geršgorin and His Circles. Berlin, Germany: Springer.
-
Welch, L. (1974). Lower bounds on the maximum cross correlation of signals. IEEE Trans. Inform. Theory, 20(3), 397–399.