Connexions

You are here: Home » Content » Huffman Coding
Content Actions
Lenses

What is a lens?

Lenses

A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual Connexions member, a community, or a respected organization.

This content is ...
Affiliated with (?)
This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • This module is included inLens: Rice University OpenCourseWare
    By: OpenCourseWare ConsortiumAs a part of collection:"Digital Communication Systems"

    Click the "Rice University OCW" link to see all content affiliated with them.

    Rice University OCW
Tags

(?)

These tags come from the endorsement, affiliation, and other lenses that include this content.

Huffman Coding

Module by: Behnaam Aazhang

Summary: A description of the Huffman source encoding algorithm.

One particular source coding algorithm is the Huffman encoding algorithm. It is a source coding algorithm which approaches, and sometimes achieves, Shannon's bound for source compression. A brief discussion of the algorithm is also given in another module.

Huffman encoding algorithm

  1. Sort source outputs in decreasing order of their probabilities
  2. Merge the two least-probable outputs into a single output whose probability is the sum of the corresponding probabilities.
  3. If the number of remaining outputs is more than 2, then go to step 1.
  4. Arbitrarily assign 0 and 1 as codewords for the two remaining outputs.
  5. If an output is the result of the merger of two outputs in a preceding step, append the current codeword with a 0 and a 1 to obtain the codeword the the preceding outputs and repeat step 5. If no output is preceded by another output in a preceding step, then stop.
Example 1 
XABCD X A B C D with probabilities { 12 1 2 , 14 1 4 , 18 1 8 , 18 1 8 }
Figure7-24.png
Figure 1
Average length=121+142+183+183=148 Average length 1 2 1 1 4 2 1 8 3 1 8 3 14 8 . As you may recall, the entropy of the source was also HX=148 H X 14 8 . In this case, the Huffman code achieves the lower bound of 148bitsoutput 14 8 bits output .
In general, we can define average code length as
¯=x X ¯ pXxx x x X ¯ p X x x (1)
where X ¯ X ¯ is the set of possible values of xx.
It is not very hard to show that
HX¯>HX+1 H X H X 1 (2)
For compressing single source output at a time, Huffman codes provide nearly optimum code lengths.
The drawbacks of Huffman coding
  1. Codes are variable length.
  2. The algorithm requires the knowledge of the probabilities, pXx p X x for all x X ¯ x X ¯ .
Another powerful source coder that does not have the above shortcomings is Lempel and Ziv.

Comments, questions, feedback, criticisms?

Send feedback