We now wish to consider new model classes for signals.
Towards this end, let $\{\psi_j\}_{j=1}^{\infty}$ be an orthonormal basis for $L_2(-T, T)$.
Thus for $f \in L_2$ we can write
$$f = \sum_{j=1}^{\infty} c_j(f)\, \psi_j$$
where $(c_j(f)) \in \ell_2$.
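The expansion above can be illustrated in a finite-dimensional setting. The following is a minimal sketch, assuming $\mathbb{R}^4$ as a stand-in for $L_2$ and a hand-picked orthonormal basis; the names `psi`, `coefficients`, and `synthesize` are hypothetical and not part of the original text.

```python
import math

# Hypothetical stand-in for L_2: vectors in R^4 with an orthonormal
# basis psi[0..3] (a discrete Haar-type basis chosen for illustration).
s = 1 / math.sqrt(2)
psi = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [s, -s, 0.0, 0.0],
    [0.0, 0.0, s, -s],
]

def inner(u, v):
    """Euclidean inner product, playing the role of the L_2 inner product."""
    return sum(a * b for a, b in zip(u, v))

def coefficients(f):
    """c_j(f) = <f, psi_j> for each basis vector."""
    return [inner(f, p) for p in psi]

def synthesize(c):
    """Rebuild f = sum_j c_j psi_j from its coefficients."""
    return [sum(c[j] * psi[j][i] for j in range(len(psi)))
            for i in range(len(psi[0]))]

f = [3.0, 1.0, -2.0, 4.0]
c = coefficients(f)
f_rec = synthesize(c)
```

Because the basis is orthonormal, `f_rec` agrees with `f` up to rounding, and the sum of the squared coefficients equals the squared norm of `f`, which is the finite analogue of $(c_j(f)) \in \ell_2$.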
We will now build an encoder and decoder and analyze their performance on compact sets $K$.
For example, we might want to encode signals in the space
$$X_p = \{ f : (c_j(f)) \in \ell_p \}, \quad 0 \le p \le 2,$$
with norm
$$\|f\|_{X_p} := \|(c_j(f))\|_{\ell_p}.$$
However, in this space the unit ball $U(X_p)$ is not compact. To get a compact set we need more structure on the sequence $(c_j)$.
Hence we define
$$Y^\alpha := \{ f : |c_n(f)| \le n^{-\alpha},\ n = 1, 2, \dots \}$$
and we define the norm in this space as $\|f\|_{Y^\alpha} :=$ the smallest $c$ such that $|c_n(f)| \le c\, n^{-\alpha}$ holds for all $n$. We now take
$$K = U(X_p) \cap U(Y^\alpha)$$
to get a compact set. Notice that when $\alpha > 0$ is small the requirement for membership in $Y^\alpha$ is very mild.
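As a small illustration of the two constraints that define $K$, the sketch below tests membership for a finite coefficient sequence. It assumes $p > 0$ and a finite list of coefficients indexed from $n = 1$; the function names are hypothetical.

```python
def lp_quasinorm(c, p):
    """(sum |c_j|^p)^(1/p); a norm for p >= 1, only a quasi-norm for 0 < p < 1."""
    return sum(abs(x) ** p for x in c) ** (1.0 / p)

def y_alpha_norm(c, alpha):
    """Smallest constant C with |c_n| <= C * n^(-alpha) over the given finite list."""
    return max(abs(x) * (n ** alpha) for n, x in enumerate(c, start=1))

def in_K(c, p, alpha):
    """True when the sequence lies in both unit balls U(X_p) and U(Y^alpha)."""
    return lp_quasinorm(c, p) <= 1.0 and y_alpha_norm(c, alpha) <= 1.0

# A decaying sequence c_n = 0.5 * n^(-2), n = 1..1000.
c = [0.5 * n ** -2.0 for n in range(1, 1001)]
decaying_ok = in_K(c, p=1.0, alpha=0.1)

# A single large coefficient placed far out: small in l_1, but it violates
# the decay condition even for the mild choice alpha = 0.1.
late_spike = [0.0] * 99 + [0.9]
spike_ok = in_K(late_spike, p=1.0, alpha=0.1)
```

The second sequence shows what the $Y^\alpha$ condition adds: it rules out large coefficients at arbitrarily high indices, which the $\ell_p$ condition alone does not.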
Next, suppose that we choose a target distortion level $\varepsilon = 2^{-m}$.
Given $f$, let
$$\Lambda_k := \Lambda_k(f) = \{ j \in \{1, \dots, N\} : 2^{-k-1} \le |c_j(f)| < 2^{-k} \}$$
for $0 \le k \le M$, where $M := \lceil 2m/(2-p) \rceil$ (which requires $p < 2$) and $N$ is chosen below.
We then choose $N$ as the smallest integer so that $N^{-\alpha} \le 2^{-M}$, and thus $\log N \le C m$ with $C$ depending only on $p$ and $\alpha$. It follows from the requirement that $f \in Y^\alpha$ that $\Lambda_k \subset \{1, \dots, N\}$ for each $0 \le k \le M$.
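The choice of $M$ and $N$ and the sorting of coefficients into the sets $\Lambda_k$ can be sketched as follows. This is only an illustration: indices are taken as $1, \dots, N$ (an assumption matching the basis indexing), the function names are hypothetical, and `math.frexp` is used to read off the dyadic level of each coefficient.

```python
import math

def parameters(m, p, alpha):
    """Return (M, N): M = ceil(2m/(2-p)), N = smallest integer with N^(-alpha) <= 2^(-M)."""
    M = math.ceil(2 * m / (2 - p))
    # N^(-alpha) <= 2^(-M) is equivalent to N >= 2^(M/alpha).
    # Floating-point rounding could shift N by one in borderline cases.
    N = math.ceil(2.0 ** (M / alpha))
    return M, N

def dyadic_buckets(c, M, N):
    """Lambda_k = {j in 1..N : 2^(-k-1) <= |c_j| < 2^(-k)} for k = 0..M."""
    buckets = {k: [] for k in range(M + 1)}
    for j, x in enumerate(c[:N], start=1):
        a = abs(x)
        if a == 0.0:
            continue
        # frexp gives a = mant * 2**e with 0.5 <= mant < 1,
        # so 2^(e-1) <= a < 2^e and the level is k = -e.
        k = -math.frexp(a)[1]
        if 0 <= k <= M:
            buckets[k].append(j)
    return buckets

# Example sequence c_n = 0.5 * n^(-2), with m = 3, p = 1, alpha = 2.
c = [0.5 * n ** -2.0 for n in range(1, 101)]
M, N = parameters(m=3, p=1.0, alpha=2.0)
buckets = dyadic_buckets(c, M, N)
```

For this example the sketch yields $M = 6$ and $N = 8$, so only the first eight coefficients are ever considered for encoding.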
Recall that
$$\#\Lambda_k \, 2^{(-k-1)p} \le \sum_{j \in \Lambda_k} |c_j|^p \le \|f\|_{X_p}^p.$$
Since $f \in U(X_p) \cap U(Y^\alpha)$,
$$\#\Lambda_k \le \|f\|_{X_p}^p \, 2^{(k+1)p} \le 2^{(k+1)p}.$$
Hence, the total number of indices in all of the $\Lambda_k$, $0 \le k \le M$, is $O(2^{Mp})$.
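The $O(2^{Mp})$ count comes from summing the per-level bounds $2^{(k+1)p}$ as a geometric series. The sketch below makes the constant explicit for $p > 0$; the helper names are hypothetical, and the constant shown is one valid choice rather than the sharpest one.

```python
def bucket_count_bound(k, p):
    """Upper bound 2^((k+1)p) on the size of Lambda_k."""
    return 2.0 ** ((k + 1) * p)

def total_index_bound(M, p):
    """Sum of the per-level bounds over k = 0..M."""
    return sum(bucket_count_bound(k, p) for k in range(M + 1))

def geometric_constant(p):
    """A constant C with total_index_bound(M, p) <= C * 2^(M p) for all M >= 0.

    It follows from sum_{k=0}^{M} 2^((k+1)p) <= 2^p * 2^((M+1)p) / (2^p - 1).
    """
    return 2.0 ** (2 * p) / (2.0 ** p - 1.0)

# Ratio of the summed bound to 2^(M p) for p = 1; it stays below C = 4.
ratios = [total_index_bound(M, 1.0) / 2.0 ** (M * 1.0) for M in range(12)]
```

For $p = 1$ the ratios increase towards 4, so the total number of encoded indices is at most $4 \cdot 2^{M}$ in that case.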
To encode, for each $f$, we can send the following bits:
- Send $\log N$ bits to identify each index in $\Lambda_k$, for $0 \le k \le M$. This will require a total of $O(\log N \, 2^{Mp})$ bits.
- Send one bit to identify the sign of $c_j(f)$ for each $j \in \Lambda_k$, $0 \le k \le M$. This will require $O(2^{Mp})$ bits.
- Send $m$ bits to describe each $c_j(f)$, $j \in \Lambda_k$, for $0 \le k \le M$. This will require $O(m 2^{Mp})$ bits.
Thus the total number of bits used in the encoding is $O(m 2^{Mp})$, since $\log N \le C m$.
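The three contributions to the bit count can be tallied directly. The following sketch assumes that each index costs $\lceil \log_2 N \rceil$ bits and that the sizes of the sets $\Lambda_k$ are known to the decoder (or signalled at negligible extra cost); both are assumptions made for the illustration, and `bit_budget` is a hypothetical name.

```python
import math

def bit_budget(bucket_sizes, N, m):
    """Return (index_bits, sign_bits, magnitude_bits) for the three-part encoding.

    bucket_sizes[k] is the number of indices in Lambda_k.
    """
    n_coeffs = sum(bucket_sizes)
    index_bits = n_coeffs * math.ceil(math.log2(N))  # log N bits per index
    sign_bits = n_coeffs                             # one sign bit per coefficient
    magnitude_bits = n_coeffs * m                    # m bits per coefficient
    return index_bits, sign_bits, magnitude_bits

# Bucket sizes for levels k = 0..6 with N = 8 and m = 3.
index_bits, sign_bits, magnitude_bits = bit_budget([1, 0, 1, 0, 2, 1, 3], N=8, m=3)
total_bits = index_bits + sign_bits + magnitude_bits
```

Here eight coefficients cost 24 index bits, 8 sign bits, and 24 magnitude bits, so the index cost is comparable to the magnitude cost, in line with $\log N \le C m$.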
Notice that for each $j \in \Lambda_k$, $0 \le k \le M$, we can recover each $c_j(f)$ by
$$\bar{c}_j = \pm \sum_{i=0}^{m} b_i \, 2^{-k-i}$$
where the sign is given by the sign bit. It follows that $|c_j(f) - \bar{c}_j| \le 2^{-m-k}$ for every such coefficient. Here we have used the fact that knowing that $j \in \Lambda_k$ locates the first nonzero binary digit of $|c_j(f)|$: since $2^{-k-1} \le |c_j(f)| < 2^{-k}$, we have $b_0 = 0$ and the leading digit has weight $2^{-k-1}$, so only the $m$ bits $b_1, \dots, b_m$ need to be sent.
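The quantization step can be sketched as a truncation of the binary expansion at weight $2^{-k-m}$. The function names below are hypothetical, and truncation (rather than rounding) is an assumption chosen to match the error bound $2^{-m-k}$.

```python
import math

def quantize(c, k, m):
    """Truncate |c| to binary digits of weight >= 2^(-k-m) and restore the sign."""
    step = 2.0 ** (-(k + m))
    sign = -1.0 if c < 0 else 1.0
    return sign * math.floor(abs(c) / step) * step

def magnitude_bits(c, k, m):
    """Bits b_1..b_m with |c_bar| = sum_i b_i 2^(-k-i); b_0 = 0 since |c| < 2^(-k)."""
    q = math.floor(abs(c) / 2.0 ** (-(k + m)))
    return [(q >> (m - i)) & 1 for i in range(1, m + 1)]

# Example: c = -0.3 lies in Lambda_1 because 2^(-2) <= 0.3 < 2^(-1).
c_bar = quantize(-0.3, k=1, m=4)
bits = magnitude_bits(-0.3, k=1, m=4)
```

With $m = 4$ the coefficient $-0.3$ is reproduced as $-(2^{-2} + 2^{-5})$, and the error is below $2^{-5} = 2^{-m-k}$.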
To decode we simply set
$$\bar{f} = \sum_{k=0}^{M} \sum_{j \in \Lambda_k} \bar{c}_j \, \psi_j.$$
We now analyze the error we have incurred in such an encoding. The square of the error will consist of two parts. The first corresponds to the $j \in \Lambda_k$, $0 \le k \le M$. For each such $j$ we have $|c_j(f) - \bar{c}_j| \le 2^{-m-k}$ and so the total square error for this is
$$\le C \sum_{k=0}^{M} 2^{kp} \, 2^{-2m} \, 2^{-2k} \le c \, 2^{-2m}$$
because $p < 2$, so that the geometric series $\sum_k 2^{-k(2-p)}$ converges. The second part of the error corresponds to all the coefficients which have magnitude $\le 2^{-M}$.
We have that this sum does not exceed
$$\sum_{|c_j| \le 2^{-M}} |c_j|^2 \le 2^{-M(2-p)} \sum_{j=1}^{\infty} |c_j|^p \le 2^{-2m},$$
where the last step uses $M(2-p) \ge 2m$ and $\|f\|_{X_p} \le 1$.
Thus the total error we incur is $O(2^{-m})$.
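The whole scheme can be checked numerically on a single example. The sketch below works directly with coefficient sequences and relies on orthonormality, so the $L_2$ error equals the $\ell_2$ distance between the coefficient sequences. The test sequence $c_n = 0.5\, n^{-2}$ and the parameters $p = 1$, $\alpha = 2$ are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def encode_decode(c, m, p, alpha):
    """Quantize the coefficients in Lambda_0..Lambda_M and zero out the rest."""
    M = math.ceil(2 * m / (2 - p))
    N = math.ceil(2.0 ** (M / alpha))  # smallest N with N^(-alpha) <= 2^(-M), up to rounding
    c_bar = [0.0] * len(c)
    for j, x in enumerate(c[:N]):
        a = abs(x)
        if a == 0.0:
            continue
        k = -math.frexp(a)[1]          # level k with 2^(-k-1) <= a < 2^(-k)
        if 0 <= k <= M:
            step = 2.0 ** (-(k + m))   # keep binary digits down to weight 2^(-k-m)
            c_bar[j] = math.copysign(math.floor(a / step) * step, x)
    return c_bar

def l2_error(c, c_bar):
    """l_2 distance between coefficient sequences, equal to ||f - f_bar|| in L_2."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c, c_bar)))

c = [0.5 * n ** -2.0 for n in range(1, 2001)]
errors = {m: l2_error(c, encode_decode(c, m, p=1.0, alpha=2.0)) for m in range(1, 6)}
```

For this particular sequence the measured error stays below $2^{-m}$ for $m = 1, \dots, 5$; this is a single example consistent with the $O(2^{-m})$ bound, not a verification of the constant in general.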
In summary, by allocating $O\!\left(m \, 2^{m/(1/p - 1/2)}\right)$ bits we achieve distortion $C 2^{-m}$. Equivalently, by allocating $n \log n$ bits, we achieve distortion $C n^{-(1/p - 1/2)}$.
This is within a logarithmic factor of the optimal encoding given by Kolmogorov entropy of the class $U(X_p) \cap Y^\alpha$. A slightly more careful argument can remove this logarithm.
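The two forms of the summary are related by the substitution $n = 2^{m/(1/p - 1/2)}$, which comes from $Mp \approx 2mp/(2-p) = m/(1/p - 1/2)$. The sketch below simply evaluates both expressions with all constants set to 1, which is an assumption made only to show the arithmetic; the function names are hypothetical.

```python
def bits_for_distortion(m, p):
    """m * 2^(m / (1/p - 1/2)): order of the bit count for distortion 2^(-m)."""
    s = 1.0 / p - 0.5
    return m * 2.0 ** (m / s)

def distortion_for_n(n, p):
    """n^(-(1/p - 1/2)): order of the distortion when about n log n bits are spent."""
    s = 1.0 / p - 0.5
    return n ** (-s)

# With p = 1 and m = 3: n = 2^(3 / 0.5) = 64 and m * n = 192 bits.
bits = bits_for_distortion(3, 1.0)
distortion = distortion_for_n(64, 1.0)
```

For $p = 1$ and $m = 3$ this gives $n = 64$, a bit count of order $3 \cdot 64 = 192$, and distortion $64^{-1/2} = 2^{-3}$, so the two statements agree.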
In the method above we failed to achieve the optimal performance because of the cost involved in identifying which indices were in each $\Lambda_k$. We will now describe a method that can do better, using the Haar basis for $L_2[0,1]$. Thus, we first define the scaling function
$$\varphi := \chi_{[0,1]}.$$
Next, we define the mother wavelet
$$\psi := \chi_{[0, \frac{1}{2}]} - \chi_{[\frac{1}{2}, 1]}.$$
We then define the remaining wavelets recursively. They are obtained by dilations and shifts of the mother wavelet on dyadic intervals:
$$\psi_J(x) := 2^{k/2} \, \psi(2^k x - j)$$
where $J = [j 2^{-k}, (j+1) 2^{-k}]$ are dyadic intervals. We denote by $D_+$ the collection of all dyadic intervals contained in $[0,1]$. Then, the collection of functions
$$\{\varphi\} \cup \{\psi_J\}_{J \in D_+}$$
forms an orthonormal basis for $L_2[0,1]$.
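The orthonormality of the first few Haar functions can be checked numerically. The sketch below uses half-open intervals for the characteristic functions (they differ from the closed intervals only on a set of measure zero) and a midpoint rule on a dyadic grid, which is exact for these piecewise-constant functions up to rounding. The function names and the choice of levels $k = 0, 1, 2$ are assumptions for illustration.

```python
def phi(x):
    """Scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def psi(x):
    """Mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def psi_J(k, j, x):
    """Haar wavelet on J = [j 2^-k, (j+1) 2^-k]: 2^(k/2) psi(2^k x - j)."""
    return 2.0 ** (k / 2.0) * psi(2.0 ** k * x - j)

def inner(u, v, levels=6):
    """Midpoint rule on 2^levels cells of [0, 1]; exact here for wavelet levels k < levels."""
    n = 2 ** levels
    h = 1.0 / n
    return sum(u((i + 0.5) * h) * v((i + 0.5) * h) for i in range(n)) * h

# phi together with all wavelets on levels k = 0, 1, 2 (8 functions in total).
basis = [phi] + [
    (lambda x, k=k, j=j: psi_J(k, j, x))
    for k in range(3)
    for j in range(2 ** k)
]
gram = [[inner(u, v) for v in basis] for u in basis]
```

The resulting Gram matrix is the $8 \times 8$ identity up to rounding, which is the orthonormality property restricted to these levels.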
A key property of wavelets is that a tree structure can be placed on the coefficients due to the use of dyadic intervals in their construction. Thus, let
$$T_k := \{ j : |c_j| \ge 2^{-k} \}$$
and
$$T_{k+1} \setminus T_k = \Lambda_k.$$
We define $\bar{T}_k$ as the smallest tree containing $T_k$