Connexions

The 2-dimensional DCT

Module by: Nick Kingsbury.

Summary: This module introduces the 2-dimensional DCT.

In the equation from our discussion of the Haar transform, $y = T x T^T$, and its inverse, $x = T^T y T$, we saw how a 1-D transform could be extended to 2-D by pre- and post-multiplication of a square matrix $x$ to give a matrix result $y$. Our example then used $2 \times 2$ matrices, but this technique applies to square matrices of any size.

Hence the DCT may be extended into 2-D by this method.

E.g. the $8 \times 8$ DCT transforms a subimage of $8 \times 8$ pels into a matrix of $8 \times 8$ DCT coefficients.
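The pre- and post-multiplication can be sketched in a few lines of Python. This is only a minimal illustration: it assumes that $T$ is the orthonormal DCT-II matrix (the 1-D definition is not repeated in this module), it uses plain lists so that it has no dependencies, and the block `x` is a made-up ramp of pel values rather than real image data. The helper names (`dct_matrix`, `dct2`, `idct2`) are hypothetical.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT-II matrix (assumed form of T), as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def dct2(x, t):
    """Forward 2-D transform y = T x T^T."""
    return matmul(matmul(t, x), transpose(t))

def idct2(y, t):
    """Inverse 2-D transform x = T^T y T."""
    return matmul(matmul(transpose(t), y), t)

T = dct_matrix(8)
# Hypothetical 8 x 8 block of pel values (a smooth ramp), for illustration only.
x = [[float(8 * r + c) for c in range(8)] for r in range(8)]
y = dct2(x, T)
x_rec = idct2(y, T)

# Reconstruction error and total energy before and after the transform.
max_err = max(abs(x[r][c] - x_rec[r][c]) for r in range(8) for c in range(8))
energy_x = sum(v * v for row in x for v in row)
energy_y = sum(v * v for row in y for v in row)
```

Under the orthonormal assumption, `x_rec` matches `x` to rounding error and `energy_x` equals `energy_y`, since an orthonormal transform preserves total energy.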

The 2-D basis functions, from which $x$ may be reconstructed, are given by the $n^2$ separate products of the columns of $T^T$ with the rows of $T$. These are shown for $n = 8$ in (a) of Figure 1 as 64 subimages of size $8 \times 8$ pels.
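As a rough sketch of this construction, the code below forms each basis function as the outer product of column $i$ of $T^T$ (which is row $i$ of $T$) with row $j$ of $T$, then rebuilds a block from its 64 weighted basis functions. It again assumes the orthonormal DCT-II form of $T$; the block `x` and the helper names are hypothetical.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT-II matrix (assumed form of T), as a list of rows."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def basis_function(t, i, j):
    """Outer product of column i of T^T (row i of T) with row j of T."""
    n = len(t)
    return [[t[i][r] * t[j][c] for c in range(n)] for r in range(n)]

def inner(a, b):
    """Sum of elementwise products of two equally sized square matrices."""
    n = len(a)
    return sum(a[r][c] * b[r][c] for r in range(n) for c in range(n))

n = 8
T = dct_matrix(n)
# The n^2 = 64 basis functions, indexed by coefficient position (i, j).
basis = {(i, j): basis_function(T, i, j) for i in range(n) for j in range(n)}

# Hypothetical block of pels, for illustration only.
x = [[float((3 * r + 5 * c) % 17) for c in range(n)] for r in range(n)]

# Coefficient y[i][j] of y = T x T^T is the inner product of x with basis (i, j).
y = {ij: inner(x, b) for ij, b in basis.items()}

# Reconstruct x as the weighted sum of the basis functions.
x_rec = [[sum(y[ij] * basis[ij][r][c] for ij in basis) for c in range(n)]
         for r in range(n)]
max_err = max(abs(x[r][c] - x_rec[r][c]) for r in range(n) for c in range(n))
```

With an orthonormal $T$ the basis functions are themselves orthonormal, which is why each coefficient can be read off as a simple inner product.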

The result of applying the $8 \times 8$ DCT to the Lenna image is shown in (b) of Figure 1. Here each $8 \times 8$ block of pels $x$ is replaced by the $8 \times 8$ block of DCT coefficients $y$. This shows the $8 \times 8$ block structure clearly but is not very meaningful otherwise.

Part (c) of Figure 1 shows the same data, reordered into 64 subimages of $32 \times 32$ coefficients each, so that each subimage contains all the coefficients of a given type. For example, the top left subimage contains all the coefficients for the top left basis function from (a) of Figure 1. The other subimages and basis functions correspond in the same way.
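The reordering can be sketched as an index permutation: coefficient $(i, j)$ of block $(p, q)$ moves to subimage $(i, j)$, position $(p, q)$. The code below is a minimal illustration under stated assumptions: $T$ is taken to be the orthonormal DCT-II matrix, the image is square with a side that is a multiple of the block size, and the $32 \times 32$ ramp image is invented for the example (so its subimages are $4 \times 4$, not the $32 \times 32$ of the text). The function names are hypothetical.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT-II matrix (assumed form of T), as a list of rows."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def block_dct(image, n):
    """Replace each n x n block x of a square image by y = T x T^T."""
    t = dct_matrix(n)
    size = len(image)
    out = [[0.0] * size for _ in range(size)]
    for r0 in range(0, size, n):
        for c0 in range(0, size, n):
            # tx = T x for this block
            tx = [[sum(t[i][k] * image[r0 + k][c0 + c] for k in range(n))
                   for c in range(n)] for i in range(n)]
            # y = (T x) T^T, written back in place of the block
            for i in range(n):
                for j in range(n):
                    out[r0 + i][c0 + j] = sum(tx[i][c] * t[j][c]
                                              for c in range(n))
    return out

def reorder(coeffs, n):
    """Group coefficients by type: coefficient (i, j) of block (p, q)
    moves to row i * blocks + p, column j * blocks + q."""
    size = len(coeffs)
    blocks = size // n
    out = [[0.0] * size for _ in range(size)]
    for r in range(size):
        p, i = divmod(r, n)
        for c in range(size):
            q, j = divmod(c, n)
            out[i * blocks + p][j * blocks + q] = coeffs[r][c]
    return out

# Hypothetical 32 x 32 test image (a diagonal ramp), for illustration only.
image = [[float(r + c) for c in range(32)] for r in range(32)]
coeffs = block_dct(image, 8)          # block layout, as in (b) of Figure 1
subimages = reorder(coeffs, 8)        # grouped layout, as in (c) of Figure 1
dc_subimage = [row[:4] for row in subimages[:4]]   # top left subimage
```

The top left subimage then holds one DC coefficient per block, so it looks like a small, scaled copy of the image.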

We see that most of the energy is concentrated in the subimages in the top left corner. (d) of Figure 1 is an enlargement of the top left 4 subimages of (c) of Figure 1 and bears a strong similarity to the group of third level Haar subimages in (b) of the corresponding Haar transform figure. To emphasise this, the histograms and entropies of these 4 subimages are shown in Figure 2.

Comparing Figure 2 with its Haar transform equivalent, we see that the Lo-Lo bands have identical energies and entropies. This is because the basis functions are identical flat surfaces in both cases. Comparing the other 3 bands, we see that the DCT bands contain more energy and entropy than their Haar equivalents. Since the total energy is fixed (the transforms all preserve total energy), this means less energy, and so hopefully less entropy, in the higher DCT bands (not shown). The mean entropy for all 64 subimages is 1.3622 bit/pel, which compares favourably with the 1.6103 bit/pel for the 4-level Haar transformed subimages using the same $Q_{step} = 15$.
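A minimal sketch of how such entropy figures might be computed is given below. It assumes a uniform quantiser that rounds each coefficient to the nearest multiple of the step size (the exact quantiser rule is not stated in this module) and a first-order entropy computed from the histogram of quantised values. The two short coefficient lists are invented stand-ins for subimages, and the function names are hypothetical.

```python
import math
from collections import Counter

def quantise(values, qstep):
    """Uniform quantiser: index of the nearest multiple of qstep (assumed rule)."""
    return [math.floor(v / qstep + 0.5) for v in values]

def entropy_bits(values, qstep=15.0):
    """First-order entropy, in bits per coefficient, of the quantised values."""
    counts = Counter(quantise(values, qstep))
    total = len(values)
    probs = [k / total for k in counts.values()]     # probabilities p_i
    return sum(-p * math.log2(p) for p in probs)     # sum of -p_i * log2(p_i)

def mean_entropy(subimages, qstep=15.0):
    """Mean entropy over equally sized subimages, in bit/pel."""
    return sum(entropy_bits(s, qstep) for s in subimages) / len(subimages)

# Hypothetical coefficient lists standing in for two subimages.
low_band = [0.0, 15.0, 30.0, 45.0]    # four distinct quantiser levels
high_band = [1.0, -3.0, 7.0, 2.0]     # everything quantises to zero
h_low = entropy_bits(low_band)
h_high = entropy_bits(high_band)
h_mean = mean_entropy([low_band, high_band])
```

Here `h_low` is 2 bits, `h_high` is 0 bits and `h_mean` is 1 bit/pel. Averaging over subimages gives bits per pel of the original image because every subimage holds the same number of coefficients.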

Figure 1: (a) Basis functions of the $8 \times 8$ DCT; (b) Lenna transformed by the $8 \times 8$ DCT; (c) reordered into subimages grouped by coefficient type; (d) top left 4 subimages from (c).
Figure 2: The probabilities $p_i$ and entropies $h_i$ for the 4 subimages from the top left of the $8 \times 8$ DCT ((d) of Figure 1).
Figure 3: (a) Mesh and (b) row plots of the entropies of the subimages of (c) of Figure 1.
Figure 4: Lenna transformed by the $4 \times 4$ DCT (a) and $16 \times 16$ DCT (b).

What is the optimum DCT size?

This is a similar question to: What is the optimum number of levels for the Haar transform?

We have analysed Lenna using DCT sizes from $2 \times 2$ to $16 \times 16$ to investigate this. Figure 4 shows the $4 \times 4$ and $16 \times 16$ sets of DCT subimages. The $2 \times 2$ DCT is identical to the level 1 Haar transform (so see (b) of Figure 1 in the Haar transform discussion) and the $8 \times 8$ set is in (c) of Figure 1.
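The equivalence of the $2 \times 2$ DCT and the level 1 Haar transform can be checked numerically. The sketch below assumes the orthonormal DCT-II form of $T$ and takes the Haar matrix to be $\frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$; both are assumptions, since neither definition is repeated in this module.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT-II matrix (assumed form of T), as a list of rows."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

# Level 1 Haar matrix, assumed here to be (1 / sqrt(2)) * [[1, 1], [1, -1]].
s = 1.0 / math.sqrt(2.0)
haar = [[s, s], [s, -s]]

T2 = dct_matrix(2)
# Largest elementwise difference between the two matrices.
max_diff = max(abs(T2[r][c] - haar[r][c]) for r in range(2) for c in range(2))
```

Under these assumptions `max_diff` is zero to rounding error, so the two transforms give the same coefficients for any $2 \times 2$ block.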

Figure 5 and Figure 6 show the mesh plots of the entropies of the subimages in Figure 4.

Figure 7 compares the total entropy per pel for the 4 DCT sizes with the equivalent 4 Haar transform sizes. We see that the DCT is significantly better than the rather simpler Haar transform.

As regards the optimum DCT size, from Figure 7, the $16 \times 16$ DCT seems to be marginally better than the $8 \times 8$ DCT, but subjectively this is not the case since quantisation artefacts become more visible as the block size increases. In practice, for a wide range of images and viewing conditions, $8 \times 8$ has been found to be the optimum DCT block size and is specified in most current coding standards.
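The entropy comparison across block sizes could be reproduced along the following lines. This is a sketch under several assumptions, not the procedure used to generate Figure 7: $T$ is taken to be the orthonormal DCT-II matrix, the quantiser rounds to the nearest multiple of the step size, and the $32 \times 32$ test image is synthetic, so the numbers it produces say nothing about Lenna. It measures entropy only and cannot capture the subjective visibility of artefacts.

```python
import math
from collections import Counter

def dct_matrix(n):
    """n x n orthonormal DCT-II matrix (assumed form of T), as a list of rows."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def coefficient_groups(image, n):
    """Map (i, j) to the list of that coefficient over all n x n blocks."""
    t = dct_matrix(n)
    size = len(image)
    groups = {(i, j): [] for i in range(n) for j in range(n)}
    for r0 in range(0, size, n):
        for c0 in range(0, size, n):
            # tx = T x, then y[i][j] = sum over c of tx[i][c] * T[j][c]
            tx = [[sum(t[i][k] * image[r0 + k][c0 + c] for k in range(n))
                   for c in range(n)] for i in range(n)]
            for i in range(n):
                for j in range(n):
                    groups[(i, j)].append(
                        sum(tx[i][c] * t[j][c] for c in range(n)))
    return groups

def entropy_bits(values, qstep):
    """First-order entropy of values quantised to the nearest multiple of qstep."""
    counts = Counter(math.floor(v / qstep + 0.5) for v in values)
    total = len(values)
    return sum(-(k / total) * math.log2(k / total) for k in counts.values())

def mean_entropy(image, n, qstep=15.0):
    """Mean entropy in bit/pel over the n * n coefficient subimages."""
    groups = coefficient_groups(image, n)
    return sum(entropy_bits(v, qstep) for v in groups.values()) / (n * n)

# Hypothetical synthetic 32 x 32 image, for illustration only.
image = [[128.0 + 60.0 * math.sin(r / 5.0) + 40.0 * math.cos(c / 7.0)
          for c in range(32)] for r in range(32)]
results = {n: mean_entropy(image, n) for n in (2, 4, 8, 16)}
for n, h in results.items():
    print(f"{n} x {n} DCT: {h:.4f} bit/pel")
```

To compare with the text, the synthetic image would need to be replaced by the actual test image, which is not included here.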

Figure 5: (a) Mesh and (b) row plots of the entropies of the $4 \times 4$ DCT in (a) of Figure 4.
Figure 6: (a) Mesh and (b) row plots of the entropies of the $16 \times 16$ DCT in (b) of Figure 4.
Figure 7: Comparison of the mean entropies of the Haar transform of Lenna at levels 1 to 4, and of the DCT for sizes from $2 \times 2$ to $16 \times 16$ pels with $Q_{step} = 15$.
