Skip to content Skip to navigation

OpenStax-CNX

You are here: Home » Content » JPEG Entropy Coding

Navigation

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

In these lenses

  • eScience, eResearch and Computational Problem Solving

    This module is included inLens: eScience, eResearch and Computational Problem Solving
    By: Jan E. OdegardAs a part of collection: "Image Coding"

    Click the "eScience, eResearch and Computational Problem Solving" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.
 

JPEG Entropy Coding

Module by: Nick Kingsbury. E-mail the author

Summary: This module gives a clever method of coding, JPEG entropy coding.

The entropy plots of the Quantisation of the DCT coefficients show the theoretical entropies of each DCT sub-band. In practise this would be a poor way to code the data because:

  • 64 separate entropy codes would be required (each requiring many extra states to represent run-length coding of zeros).
  • The statistics for each code are likely to vary significantly from image to image.
  • To transmit the code table for each sub-band as header information would involve a large coding overhead (many extra bits).
  • Coding the sub-bands separately does not take account of the correlations which exist between the positions of the non-zero coefs in one sub-band with those of nearby sub-bands (see subfigures (c) and (d) from a previous module).
JPEG uses a clever alternative method of coding, based on combining run-length and amplitude information into a single Huffman code for the whole of the image (except the DC sub-band which is coded separately because its statistics are so different).

The code is applied to each block of 8×8 8 8 quantised DCT coefs from a single 8×8 8 8 pel region. The blocks are the coefs before reordering as shown in subfigure (b) of a previous module and comprise one coef from each of the 64 sub-bands.

Each block of 8×8 8 8 quantised coefs is formed into a 1-D vector by zig-zag scanning in the sequence: ( 015614152728 2471316262942 38121725304143 911182431404453 1019233239455254 2022333846515560 2134374750565961 3536484957586263 ) 0 1 5 6 14 15 27 28 2 4 7 13 16 26 29 42 3 8 12 17 25 30 41 43 9 11 18 24 31 40 44 53 10 19 23 32 39 45 52 54 20 22 33 38 46 51 55 60 21 34 37 47 50 56 59 61 35 36 48 49 57 58 62 63

The JPEG Code for DC coefs

The first coefficient (0) of each block (vector) is the DC coef, which represents the mean value of the pels in the block (see the top left basis function in subfigure (a) from previous discussion).

The DC coefs still exhibit significant local correlations (top left of subfigure (b)), so differential coding is used in which the value to be coded is the difference between the current DC coef and that of the previous block - the blocks are scanned from left to right, row by row. The first block in each row is coded with respect to zero.

The histogram of entropies of the DC coef differences is compared in Figure 1 with that of the raw DC coefs from this previous figure. We note the histogram peak around zero and see that the entropy is reduced from 6.42 bits to 6.07 bits.

Figure 1: Histograms of the DC coefficients from the 8×8 8 8 DCT of Lenna, showing the entropy reduction with differential coding.
Figure 1 (figure11.png)

The size of the differences can in theory be up to ±255×8=±2040 ± 255 8 ± 2040 if the input pels occupy the range -128-128 to +127 + 127 (the DCT has a gain of 8 at very low frequencies). Hence the Huffman code table would have to be quite large. JPEG adopts a much smaller code by using a form of floating-point representation, where Size is the base-2 exponent and Additional Bits are used to code the polarity and precise amplitude as follows:

Table 1
DC Coef Difference Size Typical Huffman codes for Size Additional Bits (in binary)
0 0 00 -
-1,1 1 010 0,1
-3,-2,2,3 2 011 00,01,10,11
-7,…,-4,4,…,7 3 100 000,…,011,100,…111
-15,…-8,8,…,15 4 101 0000,…,0111,1000,…,1111
-1023,…-512,512,…,1023 10 1111 1110 00 0000 0000,…,11 1111 1111
-2047,…-1024,1024,…2047 11 1 1111 1110 000 0000 0000,…,111 1111 1111

Only Size needs to be Huffman coded in the above scheme, since, within a given Size, all the input values have sufficiently similar probabilities for there to be little gain from entropy coding the Additional Bits (hence they are coded in simple binary as listed). Each coded Size is followed by the appropriate number of Additional Bits (equal to Size) to define the sign and magnitude of the coefficient difference exactly.

There are only 12 Sizes to be Huffman coded, so specifying the code table can be very simple and require relatively few bits in the header.

In JPEG all Huffman code tables are defined in the image header. Each table requires 16+n 16 n bytes, where nn is the number of codewords in the table.

The first 16 bytes list the number of codewords of each length from 1 to 16 bits (codewords longer than 16 bits are forbidden). The remaining nn bytes list the decoded output values of the nn codewords in ascending codeword order ( n<256 n 256 ).

Hence 16+12=28 16 12 28 bytes are needed to specify the code table for DC coefficients.

The JPEG Run-Amplitude Code

The remaining 63 coefs (the AC coefs) of each 64-element vector usually contain many zeros and so are coded with a combined run-amplitude Huffman code.

The codeword represents the run-length of zeros before a non-zero coef and the Size of that coef. This is then followed by the Additional Bits which define the coef amplitude and sign precisely. Size and Additional Bits are defined just as for DC coefs.

This 2-dimensional Huffman code (Run, Size) is efficient because there is a strong correlation between the Size of a coef and the expected Run of zeros which precedes it - small coefs usually follow long runs; larger coefs tend to follow shorter runs. No single 2-D event is so probable that the Huffman code becomes inefficient.

In order to keep the code table size nn below 256, only the following Run and Size values are coded: Run=015 Run 0 15 Size=110 Size 1 10 These require 160 codes. Two extra codes, corresponding to (Run,Size) = (0,0) and (15,0) are used for EOB (End-of-block) and ZRL (Zero run length).

EOB is transmitted after the last non-zero coef in a 64-vector. It is the most efficient way of coding the final run of zeros. It is omitted in the rare case that the final element of the vector is non-zero.

ZRL is transmitted whenever Run>15 Run 15 , and represents a run of 16 zeros (15 zeros and a zero amplitude coef) which can be part of a longer run of any length. Hence a run of 20 zeros followed by -5 would be coded as


	
	                   (ZRL) (4,3) 010
	
      

When the code tables are defined in the image header, each codeword is assigned to a given (Run,Size) pair by making the decoded output byte Code Byte equal to ( 16Run+Size 16 Run Size ).

The default JPEG code for (Run,Size) of AC luminance DCT coefficients is summarised below in order of decreasing code probability:

Table 2
(Run,Size) Code Byte (hex) Code Word (binary) (Run,Size) Code Byte (hex) Code Word (binary)
(0,1) 01 00 (0,6) 06 1111000
(0,2) 02 01 (1,3) 13 1111001
(0,3) 03 100 (5,1) 51 1111010
(EOB) 00 1010 (6,1) 61 1111011
(0,4) 04 1011 (0,7) 07 11111000
(1,1) 11 1100 (2,2) 22 11111001
(0,5) 05 11010 (7,1) 71 11111010
(1,2) 12 11011 (1,4) 14 111110110
(2,1) 21 11100
(3,1) 31 111010 (ZRL) F0 11111111001
(4,1) 41 111011

As an example, let us code the following 8×8 8 8 block: ( -13-3200010 60000000 00000000 -10000000 00000000 00000000 00000000 00000000 ) -13 -3 2 0 0 0 1 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Concerting this to (DC Size) or (Run,Size) and values for the Additional Bits gives:


	
	(4) -13 (0,2) -3 (0,3) 6  (2,2)  2  (3,1)  -1  (ZRL) (1,1) 1  (EOB)
	101 0010 01 00 100 110 11111001 10 111010 0 11111111001 1100 1 1010
	
      

The compressed bitstream for this block is listed on the lower line, assuming that the default Huffman code tables, given above, are used.

Figure 2 shows the histogram of probabilities for the (Run,Size) codewords used to code Lenna using the Qlum Qlum quantisation matrix. The bin number represents the decoded byte value.

Figure 3 shows the equivalent histogram when the quantisation matrix is 2 Qlum 2 Qlum .

Figure 2: Histogram of the (Run,Size) codewords for the DCT of Lenna, quantised using Qlum Qlum .
Figure 2 (figure12.png)
Figure 3: Histogram of the (Run,Size) codewords for the DCT of Lenna, quantised using 2 Qlum 2 Qlum .
Figure 3 (figure13.png)

Note the strong similarity between these histograms, despite the fact that Figure 3 represents only 23 2 3 as many events. Only the EOB probability changes significantly, because its probability goes up as the number of events (non-zero coefs) per block goes down.

It turns out that the (Run,Size) histogram remains relatively constant over a wide range of image material and across different regions of each image. This is because of the strong correlation between the run lengths and expected coef sizes. The number of events per block varies considerably depending on the local activity in the image, but the probability distribution of those events (except for EOB) changes much less.

Figure 2 and Figure 3 also give the mean bit rates to code Lenna for the two quantisation matrices. Comparing these with the theoretical entropies from this figure (lower row) we get:

Table 3
Q matrix Mean Entropy (b/pel) JPEG Bit Rate (b/pel) JPEG efficiency
Qlum Qlum 0.8595 0.8709 98.7%
2 Qlum 2 Qlum 0.5551 0.5595 99.21%

Hence we see the high efficiency of the (Run,Size) code at two quite different compression factors. This tends to apply over a wide range of images and compression factors and is an impressive achievement.

There is even very Little efficiency lost if a single code table is used for many images, which can avoid the need to transmit the 16+n 16 n (168 bytes) of code definition in the header of each image. Using the recommended JPEG default luminance tables (Annex K.3.3) the above efficiencies drop to 97.35% and 95.74% respectively.

Content actions

Download module as:

Add module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks