In the equation from our discussion of the Haar transform:
$$y = T x T^T$$
and to invert:
$$x = T^T y T$$
we saw how a 1-D transform could be extended to 2-D by pre- and
post-multiplication of a square matrix $x$ to give a matrix result
$y$. Our example then used $2 \times 2$ matrices, but this technique
applies to square matrices of any size.
Hence the DCT may be extended into 2-D by this method.
E.g. the $8 \times 8$ DCT transforms a subimage of $8 \times 8$ pels
into a matrix of $8 \times 8$ DCT coefficients.
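As a minimal sketch of this 2-D extension, the NumPy code below builds an $8 \times 8$ matrix $T$ and applies $y = T x T^T$ and $x = T^T y T$ to a single block. It assumes that $T$ is the orthonormal DCT-II matrix; the helper names `dct_matrix`, `dct2` and `idct2` and the random test block are illustrative only and are not taken from the text.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal n x n DCT-II matrix T (assumed definition); rows are 1-D basis functions."""
    k = np.arange(n).reshape(-1, 1)   # row (frequency) index
    i = np.arange(n).reshape(1, -1)   # column (sample) index
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)        # the flat (dc) row needs a different scale
    return T

def dct2(x, T):
    """Forward 2-D transform: pre-multiply by T, post-multiply by T^T."""
    return T @ x @ T.T

def idct2(y, T):
    """Inverse 2-D transform: pre-multiply by T^T, post-multiply by T."""
    return T.T @ y @ T

T = dct_matrix(8)
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(8, 8)).astype(float)  # hypothetical 8 x 8 block of pels
y = dct2(x, T)
x_rec = idct2(y, T)
```

Because this $T$ is orthonormal, the inverse needs only transposes, and the sum of squares of `y` equals that of `x`.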
The 2-D basis functions, from which $x$ may be reconstructed, are given
by the $n^2$ separate products of the columns of $T^T$ with the rows
of $T$. These are shown for $n = 8$ in (a) of Figure 1 as 64 subimages
of size $8 \times 8$ pels.
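The basis functions can be sketched as outer products, again assuming the orthonormal DCT-II form of $T$. The array name `basis` and the random block `x` are illustrative. The last line reconstructs $x$ as a sum of basis functions weighted by the coefficients in $y$.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal n x n DCT-II matrix T (assumed definition)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

n = 8
T = dct_matrix(n)

# basis[k, l] is the n x n product of column k of T^T (= row k of T,
# taken as a column vector) with row l of T.
basis = np.einsum('ki,lj->klij', T, T)

rng = np.random.default_rng(0)
x = rng.standard_normal((n, n))      # hypothetical block of pels
y = T @ x @ T.T                      # its DCT coefficients

# x = T^T y T written out as a weighted sum of the n^2 basis functions
x_rec = np.einsum('kl,klij->ij', y, basis)
```

The top left basis function `basis[0, 0]` is a flat surface, which is why its coefficient carries the block mean.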
The result of applying the $8 \times 8$ DCT to the Lenna image is shown
in (b) of Figure 1. Here each $8 \times 8$ block of pels $x$ is replaced
by the $8 \times 8$ block of DCT coefficients $y$. This shows the
$8 \times 8$ block structure clearly but is not very meaningful
otherwise.
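A block-by-block transform of this kind might be sketched as follows. The function name `block_dct` is hypothetical, $T$ is assumed to be the orthonormal DCT-II matrix, and a random $256 \times 256$ array stands in for the Lenna image, which is not loaded here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal n x n DCT-II matrix T (assumed definition)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

def block_dct(image, n=8):
    """Replace each n x n block of pels x by its block of coefficients y = T x T^T."""
    T = dct_matrix(n)
    rows, cols = image.shape
    if rows % n or cols % n:
        raise ValueError("image dimensions must be multiples of the block size")
    out = np.empty((rows, cols), dtype=float)
    for r in range(0, rows, n):
        for c in range(0, cols, n):
            out[r:r + n, c:c + n] = T @ image[r:r + n, c:c + n] @ T.T
    return out

rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(256, 256)).astype(float)  # stand-in image
coeffs = block_dct(image)
```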
Part (c) of Figure 1 shows the same
data, reordered into 64 subimages of $32 \times 32$ coefficients each,
so that each subimage contains all the coefficients of a given type,
e.g. the top left subimage contains all the coefficients for the top
left basis function from (a) of Figure 1. The other
subimages and basis functions correspond in the same way.
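The reordering can be sketched with a reshape and a transpose. The function name `regroup` is hypothetical, and the input is assumed to be a coefficient array laid out block by block, as in (b) of Figure 1. An index ramp is used as input so that the mapping is easy to inspect.

```python
import numpy as np

def regroup(coeffs, n=8):
    """Gather coefficient (k, l) of every n x n block into subimage (k, l)."""
    rows, cols = coeffs.shape
    br, bc = rows // n, cols // n                 # number of blocks each way
    blocks = coeffs.reshape(br, n, bc, n)         # axes: block row, k, block col, l
    # Reorder the axes to k, block row, l, block col, then flatten back to 2-D.
    return blocks.transpose(1, 0, 3, 2).reshape(rows, cols)

coeffs = np.arange(256 * 256, dtype=float).reshape(256, 256)  # index ramp for checking
grouped = regroup(coeffs)

# Subimage (k, l) occupies rows 32k..32k+31 and columns 32l..32l+31.
top_left = grouped[0:32, 0:32]   # all the (0, 0) coefficients, one per block
```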
We see that the energy is concentrated mainly in the subimages in the
top left corner. Part (d) of Figure 1 is
an enlargement of the top left 4 subimages of (c) of Figure 1 and bears a strong similarity to
the group of third level Haar subimages in (b) of
this figure. To
emphasise this, the
histograms and entropies of these 4 subimages are shown in Figure 2.
Comparing Figure 2 with this
figure, the Haar
transform equivalent, we see that the Lo-Lo bands have identical
energies and entropies. This is because the basis functions are
identical flat surfaces in both cases. Comparing the other 3
bands, we see that the DCT bands contain more energy and entropy
than their Haar equivalents. Since the total energy is fixed (the
transforms all preserve total energy), this means less energy, and
so hopefully less entropy, in the higher DCT bands (not shown). The
mean entropy for all 64 subimages is 1.3622 bit/pel, which compares
favourably with the 1.6103 bit/pel for the 4-level Haar
transformed subimages using the same $Q_\mathrm{step} = 15$.
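An entropy measurement of this kind might be sketched as below. The text gives only the step size, so the uniform rounding quantiser is an assumption, as are the function names `entropy_bits` and `mean_subimage_entropy`. Random data stands in for the reordered DCT coefficients, so the result will not match the figures quoted above.

```python
import numpy as np

def entropy_bits(values):
    """First-order entropy, in bits per sample, of the histogram of values."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mean_subimage_entropy(grouped, n=8, qstep=15):
    """Mean entropy over the n x n equal-sized subimages of a reordered coefficient array."""
    q = np.round(grouped / qstep).astype(int)   # assumed uniform quantiser
    rows, cols = q.shape
    sr, sc = rows // n, cols // n               # subimage size
    entropies = [
        entropy_bits(q[k * sr:(k + 1) * sr, l * sc:(l + 1) * sc])
        for k in range(n) for l in range(n)
    ]
    return float(np.mean(entropies))

rng = np.random.default_rng(2)
grouped = rng.normal(scale=40.0, size=(256, 256))  # stand-in for reordered coefficients
mean_entropy = mean_subimage_entropy(grouped)
```

Since the subimages all have the same size, the mean of their entropies is also the entropy per pel of the whole image when each subimage is coded separately.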
Choosing the DCT size is a similar question to: What is the optimum
number of levels for the Haar transform?
We have analysed Lenna using DCT sizes from $2 \times 2$ to
$16 \times 16$ to investigate this. Figure 4 shows the $4 \times 4$
and $16 \times 16$ sets of DCT subimages. The $2 \times 2$ DCT is
identical to the level 1 Haar transform (so see (b) of Figure 1) and
the $8 \times 8$ set is in (c) of Figure 1.
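The equivalence of the $2 \times 2$ DCT and the level 1 Haar transform can be checked numerically. This sketch assumes the orthonormal DCT-II matrix and takes the Haar matrix to be $\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$; the names `T2` and `haar` are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal n x n DCT-II matrix T (assumed definition)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

T2 = dct_matrix(2)
haar = np.array([[1.0, 1.0],
                 [1.0, -1.0]]) / np.sqrt(2.0)   # assumed 2 x 2 Haar matrix

same = np.allclose(T2, haar)   # True if the two matrices agree
```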
Figure 5 and
Figure 6 show the mesh plots of
the entropies of the subimages in Figure 4.
Figure 7 compares the total
entropy per pel for the 4 DCT sizes with the equivalent 4 Haar
transform sizes. We see that the DCT is significantly better
than the rather simpler Haar transform.
As regards the optimum DCT size, from Figure 7, the $16 \times 16$
DCT seems to be marginally better than the $8 \times 8$ DCT, but
subjectively this is not the case since
quantisation artefacts become more visible as the block size
increases. In practice, for a wide range of images and viewing
conditions, $8 \times 8$ has been found to be the optimum DCT block
size and is specified in most current coding standards.