In this and the following modules the basic concepts of information
theory will be introduced. For simplicity we assume that the signals
are time discrete. Time-discrete signals often arise from sampling a
time-continuous signal. The assumption of time-discrete signals is valid
because we will only be looking at bandlimited signals, which, as we know,
can be perfectly reconstructed from their samples.
In treating time-discrete signals and their information content we have
to distinguish between two types of signals:
 Discrete-amplitude signals, whose amplitude levels belong to
a finite set
 Continuous-amplitude signals, whose amplitudes are
taken from the real line
In the first case we can measure the information content in terms
of entropy, while in the second case the entropy
is infinite and we must resort to characterising the source by means of
differential entropy.
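To illustrate why the entropy grows without bound as the set of amplitudes becomes finer, the sketch below assumes the standard Shannon entropy $H = -\sum_k p_k \log_2 p_k$ (in bits) and evaluates it for a source with N equally likely amplitude levels. The function name `entropy_bits` and the choice of a uniform distribution are assumptions made for illustration only.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits of a finite probability distribution."""
    # Zero-probability terms contribute nothing (p * log p -> 0 as p -> 0).
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A source with N equally likely amplitude levels has entropy log2(N) bits,
# so the entropy keeps growing as the amplitude set gets finer.
for levels in (2, 8, 256, 65536):
    uniform = [1.0 / levels] * levels
    print(levels, entropy_bits(uniform))
```

Each doubling of the number of levels adds one bit, which is consistent with the entropy becoming infinite when the amplitudes are taken from the real line.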
The signals treated are mainly of a stochastic nature, i.e. the signal
is unknown to us.
Since the signal is not known to the destination
(because of its stochastic nature), it is best
modeled as a random process, discrete-time or continuous-time.
Examples of information sources that we model as random processes are:
 Digital data sources (e.g. a text) can be modeled as a
random process.

 Video signals can be modeled as a random
process. Such signals are mainly bandlimited to
around 5 MHz (the value depends on the standards used to
raster the frames of the image).

 Audio signals can be modeled as a random
process. Speech is typically between 300 Hz and
3400 Hz, see Figure 1.
Video and speech are analog information signals that are bandlimited. Therefore, if
sampled faster than two times the highest frequency component, they can be reconstructed
from their sample values.
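The reconstruction claim can be checked numerically. The following is a minimal sketch, assuming the standard sinc interpolation formula and a hypothetical 1 kHz test tone sampled at 8 kHz; the names `sinc`, `reconstruct`, and `tone` are illustrative, and the infinite interpolation sum is truncated, so the result is only approximate.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(signal, fs, t, n_terms=2000):
    """Approximate signal(t) from the samples signal(n / fs), |n| <= n_terms."""
    return sum(signal(n / fs) * sinc(fs * t - n)
               for n in range(-n_terms, n_terms + 1))

F_TONE = 1000.0  # Hz, hypothetical test tone
FS = 8000.0      # Hz, more than twice F_TONE

def tone(t):
    """The bandlimited signal being sampled."""
    return math.sin(2.0 * math.pi * F_TONE * t)

# Compare the true value with the value rebuilt from samples at an
# instant that falls between two sample times.
t_between = 0.4 / FS
print(tone(t_between), reconstruct(tone, FS, t_between))
```

The two printed values agree up to the truncation error of the sum, which shrinks as `n_terms` grows.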
A speech signal with a bandwidth of 3100 Hz can be sampled at
the rate of 6.2 kHz. If the samples are quantized with an
8-level quantizer then the speech signal can be represented with
a binary sequence with bit rate
$$6200 \log_2 8 = 18600 \text{ bits/sec} \qquad (1)$$
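The arithmetic in equation (1) can be reproduced with a short sketch. The helper name `bit_rate` is an assumption for illustration; it simply multiplies the sampling rate by the number of bits needed to index one quantizer level.

```python
import math

def bit_rate(sample_rate_hz, quantizer_levels):
    """Bits per second when each sample is coded with log2(levels) bits."""
    return sample_rate_hz * math.log2(quantizer_levels)

# Speech example from the text: 6200 samples/sec, 8-level quantizer.
print(bit_rate(6200, 8))  # 18600.0
```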
The sampled real values can be quantized to create a discrete-time,
discrete-valued random process.
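As an illustration of this quantization step, here is a minimal sketch of a uniform quantizer. The 8 levels match the speech example, while the input range of -1 to 1, the clamping of out-of-range samples, and the names `quantize` and `cell_midpoint` are assumptions made for the example.

```python
def quantize(sample, levels=8, lo=-1.0, hi=1.0):
    """Map a real sample to the index of one of `levels` uniform cells."""
    step = (hi - lo) / levels
    index = int((sample - lo) // step)
    # Clamp so that out-of-range samples fall into the outermost cells.
    return max(0, min(levels - 1, index))

def cell_midpoint(index, levels=8, lo=-1.0, hi=1.0):
    """Reconstruction value: the midpoint of the chosen cell."""
    step = (hi - lo) / levels
    return lo + (index + 0.5) * step

samples = [-1.0, -0.3, 0.0, 0.26, 0.99, 1.5]
indices = [quantize(s) for s in samples]
print(indices)                              # [0, 2, 4, 5, 7, 7]
print([cell_midpoint(i) for i in indices])
```

Each index takes one of 8 values, so it can be written with 3 bits, and the sequence of indices is a discrete-time, discrete-valued process.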
The key observation from the discussion above is
that for a receiver the signals are unknown.
It is exactly this uncertainty that enables the signal
to transmit information. This is the core of information theory:
Information transfer happens when the receiver is
unable to know or predict a message before it is
received.
Here we present some statistics with the intent of
reviewing a few basic concepts and introducing the notation.
Let $X$ be a stochastic variable. Let $X = x_i$ and $X = x_j$
denote two outcomes of $X$.

Dependent outcomes imply:
$$\Pr[X = x_i \cap X = x_j] = \Pr[X = x_i] \Pr[X = x_j \mid x_i] = \Pr[X = x_j] \Pr[X = x_i \mid x_j]$$

Independent outcomes imply:
$$\Pr[X = x_i \cap X = x_j] = \Pr[X = x_i] \Pr[X = x_j]$$

Bayes' rule:
$$\Pr[X = x_j \mid x_i] = \frac{\Pr[X = x_i \mid x_j] \Pr[X = x_j]}{\Pr[X = x_i]}$$
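The three relations can be verified on a small numerical example. The sketch below assumes a hypothetical joint probability table for two binary events, here called A and B rather than the outcomes used in the text, and uses exact fractions so that the equalities hold without rounding error.

```python
from fractions import Fraction

# Hypothetical joint probabilities Pr[A = a and B = b] for two binary events.
joint = {
    (0, 0): Fraction(1, 2), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 4),
}

def pr_a(a):
    """Marginal probability Pr[A = a]."""
    return sum(p for (x, _), p in joint.items() if x == a)

def pr_b(b):
    """Marginal probability Pr[B = b]."""
    return sum(p for (_, y), p in joint.items() if y == b)

def pr_a_given_b(a, b):
    """Conditional probability Pr[A = a | B = b]."""
    return joint[(a, b)] / pr_b(b)

def pr_b_given_a(b, a):
    """Conditional probability Pr[B = b | A = a]."""
    return joint[(a, b)] / pr_a(a)

# Product rule for dependent outcomes, in both orders.
product_rule_holds = (joint[(1, 1)] == pr_a(1) * pr_b_given_a(1, 1)
                      == pr_b(1) * pr_a_given_b(1, 1))

# Bayes' rule: Pr[B | A] = Pr[A | B] Pr[B] / Pr[A].
bayes_value = pr_a_given_b(1, 1) * pr_b(1) / pr_a(1)

# Independence would require the joint to factor into the marginals.
independent = all(joint[(a, b)] == pr_a(a) * pr_b(b)
                  for a in (0, 1) for b in (0, 1))

print(product_rule_holds, bayes_value == pr_b_given_a(1, 1), independent)
```

For this particular table the product rule and Bayes' rule hold, while the independence test fails because the joint probabilities do not factor into the marginals.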
More about basic probability theory and a derivation of Bayes'
rule can be found
here.