Entropy

Module by: Behnaam Aazhang

Summary: This module presents a quantification of information by the use of entropy. Entropy, or average self-information, measures the uncertainty of a source and hence provides a measure of the information it could reveal.


Information sources take very different forms. Because the information is not known to the destination in advance, it is best modeled as a random process, either discrete-time or continuous-time.

Here are a few examples:

  • A digital data source (e.g., a text) can be modeled as a discrete-time, discrete-valued random process $X_1, X_2, \ldots$, where $X_i \in \{A, B, C, D, E\}$, with particular marginal distributions $p_{X_1}(x), p_{X_2}(x), \ldots$, specific joint distributions $p_{X_1 X_2}, p_{X_2 X_3}, \ldots$, and $p_{X_1 X_2 X_3}, p_{X_2 X_3 X_4}, \ldots$, etc.
  • Video signals can be modeled as a continuous-time random process. The power spectral density is bandlimited to around 5 MHz (the exact value depends on the standard used to raster the image frames).
  • Audio signals can be modeled as a continuous-time random process. It has been demonstrated that the power spectral density of speech signals is bandlimited between 300 Hz and 3400 Hz. For example, the speech signal can be modeled as a Gaussian process with the power spectral density shown in Figure 1 over a small observation period.

Figure 1 (Figure7-5.png)

These analog information signals are bandlimited. Therefore, if sampled faster than the Nyquist rate, they can be reconstructed from their sample values.

Example 1

A speech signal with a bandwidth of 3100 Hz can be sampled at a rate of 6.2 kHz. If the samples are quantized with an 8-level quantizer, then the speech signal can be represented by a binary sequence with a rate of

$$6.2 \times 10^{3} \times \log_2 8 = 18600 \ \frac{\text{bits}}{\text{sample}} \cdot \frac{\text{samples}}{\text{sec}} = 18.6 \ \frac{\text{kbits}}{\text{sec}} \quad (1)$$
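The arithmetic in equation (1) can be checked with a few lines of Python. This is only a sketch: the function name `bit_rate` is an illustrative choice, and it assumes the number of quantizer levels is a power of two, so that each sample maps to a whole number of bits.

```python
import math

def bit_rate(sample_rate_hz: float, levels: int) -> float:
    """Bits per second when every sample is coded with log2(levels) bits.

    Assumes `levels` is a power of two, as in Example 1.
    """
    bits_per_sample = math.log2(levels)
    return sample_rate_hz * bits_per_sample

# Example 1: 6.2 kHz sampling, 8-level quantizer (3 bits per sample).
rate = bit_rate(6.2e3, 8)
print(f"{rate / 1e3:.1f} kbits/sec")
```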

Figure 2 (Figure7-6.png)

The sampled real values can be quantized to create a discrete-time discrete-valued random process. Since any bandlimited analog information signal can be converted to a sequence of discrete random variables, we will continue the discussion only for discrete random variables.

Example 2

The random variable $x$ takes the value 0 with probability 0.9 and the value 1 with probability 0.1. The statement that $x = 1$ carries more information than the statement that $x = 0$. The reason is that $x$ is expected to be 0, so learning that $x = 1$ is more surprising. An intuitive measure of information should therefore be larger when the probability of the event is smaller.

Example 3

The information content in the statement about the temperature and pollution level on July 15th in Chicago should be the sum of the information in the statement that July 15th in Chicago was hot and the information in the statement that it was highly polluted, since pollution and temperature could be independent.

$$I(\text{hot}, \text{high}) = I(\text{hot}) + I(\text{high}) \quad (2)$$

An intuitive and meaningful measure of information should have the following properties:

  1. Self information should decrease with increasing probability.
  2. Self information of two independent events should be their sum.
  3. Self information should be a continuous function of the probability.
The only function satisfying the above conditions is the negative logarithm of the probability, $-\log p$ (unique up to the choice of logarithm base).
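A minimal sketch of how $-\log_2 p$ behaves on Examples 2 and 3. The helper name `self_information` and the probabilities `p_hot` and `p_high` are hypothetical values chosen for illustration; they do not come from the module.

```python
import math

def self_information(p: float) -> float:
    """Self-information -log2(p), in bits, of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)

# Example 2: the rare outcome x = 1 carries more information than x = 0.
i_zero = self_information(0.9)   # roughly 0.15 bits
i_one = self_information(0.1)    # roughly 3.32 bits
print(f"I(x=0) = {i_zero:.3f} bits, I(x=1) = {i_one:.3f} bits")

# Example 3: for independent events the joint probability is the product,
# so the self-information adds. The probabilities below are assumed values.
p_hot, p_high = 0.3, 0.2
i_joint = self_information(p_hot * p_high)
i_sum = self_information(p_hot) + self_information(p_high)
print(math.isclose(i_joint, i_sum))
```

The additivity follows from $-\log(p_1 p_2) = -\log p_1 - \log p_2$, which is property 2 in the list above.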

Definition 1: Entropy
1. The entropy (average self-information) of a discrete random variable $X$ is a function of its probability mass function and is defined as
$$H(X) = -\sum_{i=1}^{N} p_X(x_i) \log p_X(x_i) \quad (3)$$
where $N$ is the number of possible values of $X$ and $p_X(x_i) = \Pr[X = x_i]$. If the logarithm is base 2, then the unit of entropy is bits. Entropy is a measure of the uncertainty in a random variable and a measure of the information it can reveal.
2. A more basic explanation of entropy is provided in another module.

Example 4

If a source produces binary information $\{0, 1\}$ with probabilities $p$ and $1 - p$, the entropy of the source is

$$H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p) \quad (4)$$
If $p = 0$ then $H(X) = 0$, if $p = 1$ then $H(X) = 0$, and if $p = 1/2$ then $H(X) = 1$ bit. The source has its largest entropy if $p = 1/2$, and the source provides no new information if $p = 0$ or $p = 1$.

Figure 3 (Figure7-10.png)
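The binary entropy of equation (4) can be evaluated numerically to confirm the values quoted in Example 4. The sketch below assumes the usual convention $0 \log 0 = 0$ (the limit of $p \log p$ as $p \to 0$); the function name `binary_entropy` is an illustrative choice.

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a binary source with probabilities p and 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # convention: 0 * log2(0) is taken as 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

# Evaluate on a coarse grid and locate the maximum.
grid = [k / 10 for k in range(11)]
values = [binary_entropy(p) for p in grid]
p_best = grid[values.index(max(values))]
print(p_best, max(values))
```

On this grid the maximum is attained at $p = 1/2$, where the entropy is 1 bit, and the function is symmetric about that point.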

Example 5

An analog source is modeled as a continuous-time random process with power spectral density bandlimited to the band between 0 and 4000 Hz. The signal is sampled at the Nyquist rate. The random variables that result from sampling are assumed to be independent. The samples are quantized to 5 levels $\{-2, -1, 0, 1, 2\}$. The probabilities of the samples taking the quantized values are $\{\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \frac{1}{16}\}$, respectively. The entropy of each random variable is

$$\begin{aligned} H(X) &= -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{8}\log_2\tfrac{1}{8} - \tfrac{1}{16}\log_2\tfrac{1}{16} - \tfrac{1}{16}\log_2\tfrac{1}{16} \\ &= \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{8}\log_2 8 + \tfrac{1}{16}\log_2 16 + \tfrac{1}{16}\log_2 16 \\ &= \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{3}{8} + \tfrac{4}{8} = \tfrac{15}{8} \ \frac{\text{bits}}{\text{sample}} \end{aligned} \quad (5)$$
There are 8000 samples per second. Therefore, the source produces $8000 \times \frac{15}{8} = 15000 \ \frac{\text{bits}}{\text{sec}}$ of information.
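As a check on Example 5, the entropy of equation (3) can be computed directly from the five probabilities. This is a sketch; `entropy_bits` is a hypothetical helper name, and `Fraction` is used only to verify exactly that the probabilities sum to one.

```python
import math
from fractions import Fraction

def entropy_bits(pmf) -> float:
    """Entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(float(p) * math.log2(float(p)) for p in pmf if p > 0)

# Probabilities of the quantized values -2, -1, 0, 1, 2 from Example 5.
pmf = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8),
       Fraction(1, 16), Fraction(1, 16)]
assert sum(pmf) == 1              # a valid pmf must sum to one

h = entropy_bits(pmf)             # bits per sample
sample_rate = 2 * 4000            # Nyquist rate for a 0 to 4000 Hz band
info_rate = sample_rate * h       # bits per second

print(f"H(X) = {h:.3f} bits/sample, rate = {info_rate:.0f} bits/sec")
```

The result agrees with the hand calculation: 15/8 = 1.875 bits per sample and 15000 bits per second.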

Definition 2: Joint Entropy
The joint entropy of two discrete random variables ($X$, $Y$) is defined by
$$H(X, Y) = -\sum_{i} \sum_{j} p_{XY}(x_i, y_j) \log p_{XY}(x_i, y_j) \quad (6)$$

The joint entropy for a random vector $\mathbf{X} = (X_1, X_2, \ldots, X_n)^T$ is defined as

$$H(\mathbf{X}) = -\sum_{x_1} \sum_{x_2} \cdots \sum_{x_n} p_{\mathbf{X}}(x_1, x_2, \ldots, x_n) \log p_{\mathbf{X}}(x_1, x_2, \ldots, x_n) \quad (7)$$

Definition 3: Conditional Entropy
The conditional entropy of the random variable $X$ given the random variable $Y$ is defined by
$$H(X | Y) = -\sum_{i} \sum_{j} p_{XY}(x_i, y_j) \log p_{X|Y}(x_i | y_j) \quad (8)$$

It is easy to show that

$$H(\mathbf{X}) = H(X_1) + H(X_2 | X_1) + \cdots + H(X_n | X_1, X_2, \ldots, X_{n-1}) \quad (9)$$
and
$$H(X, Y) = H(Y) + H(X | Y) = H(X) + H(Y | X) \quad (10)$$
If $X_1, X_2, \ldots, X_n$ are mutually independent, it is easy to show that
$$H(\mathbf{X}) = \sum_{i=1}^{n} H(X_i) \quad (11)$$
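Equations (10) and (11) can be checked numerically on a small example. The sketch below is not from the module: the joint pmf `p_xy` and the marginals `p_x` and `p_y` are hypothetical values, and the helper names are illustrative.

```python
import math
from collections import defaultdict

def entropy(probs) -> float:
    """Entropy in bits of an iterable of probabilities, as in equation (3)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(p_xy: dict, axis: int) -> dict:
    """Marginal pmf of the variable at position `axis` of each (x, y) key."""
    m = defaultdict(float)
    for pair, p in p_xy.items():
        m[pair[axis]] += p
    return dict(m)

def conditional_entropy(p_xy: dict, given_axis: int) -> float:
    """Entropy of the other variable given the one at `given_axis`, eq. (8)."""
    p_given = marginal(p_xy, given_axis)
    return -sum(p * math.log2(p / p_given[pair[given_axis]])
                for pair, p in p_xy.items() if p > 0)

# Hypothetical joint pmf of two dependent binary variables (assumed values).
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

h_xy = entropy(p_xy.values())
h_x = entropy(marginal(p_xy, 0).values())
h_y = entropy(marginal(p_xy, 1).values())
h_x_given_y = conditional_entropy(p_xy, given_axis=1)
h_y_given_x = conditional_entropy(p_xy, given_axis=0)

# Equation (10): both decompositions give the joint entropy.
print(math.isclose(h_xy, h_y + h_x_given_y), math.isclose(h_xy, h_x + h_y_given_x))

# Equation (11): for independent variables the joint pmf is the product
# of the marginals, and the entropies add.
p_x = {0: 0.5, 1: 0.5}
p_y = {0: 0.9, 1: 0.1}
p_indep = {(x, y): px * py for x, px in p_x.items() for y, py in p_y.items()}
h_indep = entropy(p_indep.values())
print(math.isclose(h_indep, entropy(p_x.values()) + entropy(p_y.values())))
```

For the dependent pmf, $H(X, Y)$ is strictly smaller than $H(X) + H(Y)$, which is what the conditional terms in equation (10) account for.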

Definition 4: Entropy Rate
The entropy rate of a stationary discrete-time random process is defined by
$$H = \lim_{n \to \infty} H(X_n | X_1, X_2, \ldots, X_{n-1}) \quad (12)$$
The limit exists and is equal to
$$H = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n) \quad (13)$$
The entropy rate measures the uncertainty, and hence the information content, per output symbol of the source.
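To illustrate how the two limits in equations (12) and (13) behave, the sketch below uses a two-state stationary Markov source. The source itself is an assumption made for illustration, not something from the module: the transition matrix `P` and stationary distribution `pi` are hypothetical values. Joint entropies are computed by brute-force enumeration of equation (7), and the conditional entropy is obtained from the chain rule (9).

```python
import math
from itertools import product

# Hypothetical two-state Markov source (assumed values, for illustration).
P = [[0.9, 0.1],
     [0.4, 0.6]]   # P[a][b] = Pr[X_{k+1} = b | X_k = a]
pi = [0.8, 0.2]    # stationary distribution, satisfies pi P = pi

def joint_entropy(n: int) -> float:
    """H(X_1, ..., X_n) in bits, enumerating all 2**n sequences (eq. 7)."""
    h = 0.0
    for seq in product(range(2), repeat=n):
        p = pi[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= P[a][b]
        if p > 0:
            h -= p * math.log2(p)
    return h

def conditional_entropy_last(n: int) -> float:
    """H(X_n | X_1, ..., X_{n-1}) via the chain rule (9), for n >= 2."""
    return joint_entropy(n) - joint_entropy(n - 1)

for n in (2, 4, 8, 12):
    per_symbol = joint_entropy(n) / n        # the quantity inside eq. (13)
    conditional = conditional_entropy_last(n)  # the quantity inside eq. (12)
    print(f"n={n:2d}  H/n = {per_symbol:.4f}  H(X_n | past) = {conditional:.4f}")
```

For a first-order Markov source the conditional entropy does not depend on $n$, while the per-symbol joint entropy decreases toward the same value as $n$ grows, consistent with the two expressions having the same limit.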

Entropy is closely tied to source coding. The extent to which a source can be compressed is related to its entropy. In 1948, Claude E. Shannon introduced a theorem which related the entropy to the number of bits per second required to represent a source without much loss.
