Connexions

You are here: Home » Content » Source Coding
Content Actions
Lenses

What is a lens?

Lenses

A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual Connexions member, a community, or a respected organization.

This content is ...
Affiliated with (?)
This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • This module is included inLens: Rice University OpenCourseWare
    By: OpenCourseWare ConsortiumAs a part of collection:"Digital Communication Systems"

    Click the "Rice University OCW" link to see all content affiliated with them.

    Rice University OCW
Tags

(?)

These tags come from the endorsement, affiliation, and other lenses that include this content.

Source Coding

Module by: Behnaam Aazhang

Summary: An introduction to the concept of typical sequences, which lie at the heart of source coding. The idea of typical sequences leads to Shannon's source-coding Theorem.

As mentioned earlier, how much a source can be compressed should be related to its entropy. In 1948, Claude E. Shannon introduced three theorems and developed very rigorous mathematics for digital communications. In one of the three theorems, Shannon relates entropy to the minimum number of bits per second required to represent a source without much loss (or distortion).
Consider a source that is modeled by a discrete-time and discrete-valued random process X 1 X 1 , X 2 X 2 , …, X n X n , … where x i a 1 a 2 a N x i a 1 a 2 a N and define p X i x i = a j = p j p X i x i a j p j for j= 1 , 2 , , N j 1 , 2 , , N , where it is assumed that X 1 X 1 , X 2 X 2 ,… X n X n are mutually independent and identically distributed.
Consider a sequence of length nn
X= X 1 X 2 X n X X 1 X 2 X n (1)
The symbol a 1 a 1 can occur with probability p 1 p 1 . Therefore, in a sequence of length nn, on the average, a 1 a 1 will appear n p 1 n p 1 times with high probabilities if nn is very large.
Therefore,
PX=x= p X 1 x 1 p X 2 x 2 p X n x n P X x p X 1 x 1 p X 2 x 2 p X n x n (2)
PX=x p 1 n p 1 p 2 n p 2 p N n p N =i=1N p i n p i P X x p 1 n p 1 p 2 n p 2 p N n p N i 1 N p i n p i (3)
where p i =P X j = a i p i P X j a i for all jj and for all ii.
A typical sequence X X may look like
X= a 2 a 1 a N a 2 a 5 a 1 a N a 6 X a 2 a 1 a N a 2 a 5 a 1 a N a 6 (4)
where a i a i appears n p i n p i times with large probability. This is referred to as a typical sequence. The probability of X X being a typical sequence is
PX=xi=1N p i n p i =i=1N2log2 p i n p i =i=1N2n p i log2 p i =2ni=1N p i log2 p i =2-nHX P X x i 1 N p i n p i i 1 N 2 2 p i n p i i 1 N 2 n p i 2 p i 2 n i 1 N p i 2 p i 2 n H X (5)
where HX H X is the entropy of the random variables X 1 X 1 , X 2 X 2 ,…, X n X n .
For large nn, almost all the output sequences of length nn of the source are equally probably with probability2-nHX probability 2 n H X . These are typical sequences. The probability of nontypical sequences are negligible. There are Nn N n different sequences of length nn with alphabet of size NN. The probability of typical sequences is almost 1.
k=1# of typical seq.2-nHX=1 k 1 # of typical seq. 2 n H X 1 (6)
Figure7-17.png
Figure 1
Example 1 
Consider a source with alphabet {A,B,C,D} with probabilities { 12 1 2 , 14 1 4 , 18 1 8 , 18 1 8 }. Assume X 1 X 1 , X 2 X 2 ,…, X 8 X 8 is an independent and identically distributed sequence with X i ABCD X i A B C D with the above probabilities.
HX=-12log212-14log214-18log218-18log218=12+24+38+38=4+4+68=148 H X 1 2 2 1 2 1 4 2 1 4 1 8 2 1 8 1 8 2 1 8 1 2 2 4 3 8 3 8 4 4 6 8 14 8 (7)
The number of typical sequences of length 8
28148=214 2 8 14 8 2 14 (8)
The number of nontypical sequences 48-214=216-214=2144-1=3×214 4 8 2 14 2 16 2 14 2 14 4 1 3 2 14
Examples of typical sequences include those with A appearing 812=4 8 1 2 4 times, B appearing 814=2 8 1 4 2 times, etc. {A,D,B,B,A,A,C,A}, {A,A,A,A,C,D,B,B} and much more.
Examples of nontypical sequences of length 8: {D,D,B,C,C,A,B,D}, {C,C,C,C,C,B,C,C} and much more. Indeed, these definitions and arguments are valid when n is very large. The probability of a source output to be in the set of typical sequences is 1 when n n . The probability of a source output to be in the set of nontypical sequences approaches 0 as n n .
The essence of source coding or data compression is that as n n , nontypical sequences never appear as the output of the source. Therefore, one only needs to be able to represent typical sequences as binary codes and ignore nontypical sequences. Since there are only 2nHX 2 n H X typical sequences of length nn, it takes nHX n H X bits to represent them on the average. On the average it takes HX H X bits per source output to represent a simple source that produces independent and identically distributed outputs.
Theorem 1: Shannon's Source-Coding 
A source that produced independent and identically distributed random variables with entropy HH can be encoded with arbitrarily small error probability at any rate RR in bits per source output if RH R H . Conversely, if R<H R H , the error probability will be bounded away from zero, independent of the complexity of coder and decoder.
The source coding theorem proves existence of source coding techniques that achieve rates close to the entropy but does not provide any algorithms or ways to construct such codes.
If the source is not i.i.d. (independent and identically distributed), but it is stationary with memory, then a similar theorem applies with the entropy HX H X replaced with the entropy rate H=limnH X n | X 1 X 2 X n-1 H n H X n | X 1 X 2 X n-1
In the case of a source with memory, the more the source produces outputs the more one knows about the source and the more one can compress.
Example 2 
The English language has 26 letters, with space it becomes an alphabet of size 27. If modeled as a memoryless source (no dependency between letters in a word) then the entropy is HX=4.03 H X 4.03 bits/letter.
If the dependency between letters in a text is captured in a model the entropy rate can be derived to be H=1.3 H 1.3 bits/letter. Note that a non-information theoretic representation of a text may require 5 bits/letter since 25 2 5 is the closest power of 2 to 27. Shannon's results indicate that there may be a compression algorithm with the rate of 1.3 bits/letter.
Although Shannon's results are not constructive, there are a number of source coding algorithms for discrete time discrete valued sources that come close to Shannon's bound. One such algorithm is the Huffman source coding algorithm. Another is the Lempel and Ziv algorithm.
Huffman codes and Lempel and Ziv apply to compression problems where the source produces discrete time and discrete valued outputs. For cases where the source is analog there are powerful compression algorithms that specify all the steps from sampling, quantizations, and binary representation. These are referred to as waveform coders. JPEG, MPEG, vocoders are a few examples for image, video, and voice, respectively.

Comments, questions, feedback, criticisms?

Send feedback