It is highly recommended that the information presented in Mutual Information and in Typical Sequences be reviewed before
proceeding with this document. An introductory module on the
theorem is available at Noisy Channel
Theorems .

The capacity of a discrete-memoryless channel is given by

C=max
p
X
x
ℐX;Y
p
X
x
C
p
X
x
p
X
x
X
Y

(1)
where

ℐX;Y
X
Y
is the mutual information between the channel input

XX and the output

YY. If the transmission rate

RR is less than

CC, then for any

ε>0
ε
0
there exists a code with block length

nn large enough whose error
probability is less than

εε. If

R>C
R
C
,
the error probability of any code with any block length is
bounded away from zero.

If we have a binary symmetric channel with cross over
probability 0.1, then the capacity
C≃0.5
C
0.5
bits per transmission. Therefore, it is possible to send 0.4
bits per channel through the channel reliably. This means
that we can take 400 information bits and map them into a code
of length 1000 bits. Then the whole code can be transmitted
over the channels. One hundred of those bits may be detected
incorrectly but the 400 information bits may be decoded
correctly.

Before we consider continuous-time additive white Gaussian
channels, let's concentrate on discrete-time Gaussian channels

Y
i
=
X
i
+
η
i
Y
i
X
i
η
i

(2)
where the

X
i
X
i
's
are information bearing random variables and

η
i
η
i
is a Gaussian random variable with variance

σ
η
2
σ
η
2
.
The input

X
i
X
i
's are constrained to have power less than

PP
1n∑
i
=1n
X
i
2≤P
1
n
i
1
n
X
i
2
P

(3)
Consider an output block of size nn

For large

nn, by the Law of Large
Numbers,

1n∑
i
=1n
η
i
2=1n∑
i
=1n|
y
i
−
x
i
|2≤
σ
η
2
1
n
i
1
n
η
i
2
1
n
i
1
n
y
i
x
i
2
σ
η
2

(5)
This indicates that with large probability as

nn approaches infinity,

YY will be
located in an

nn-dimensional sphere
of radius

n
σ
η
2
n
σ
η
2
centered about

XX
since

|y−x|2≤n
σ
η
2
y
x
2
n
σ
η
2
On the other hand since
X
i
X
i
's are power constrained and
η
i
η
i
and
X
i
X
i
's are independent

1n∑
i
=1n
y
i
2≤P+
σ
η
2
1
n
i
1
n
y
i
2
P
σ
η
2

(6)
|Y|≤n(P+
σ
η
2)
Y
n
P
σ
η
2

(7)
This mean

YY
is in a sphere of radius

n(P+
σ
η
2)
n
P
σ
η
2
centered around the origin.

How many
XX's
can we transmit to have nonoverlapping
YY
spheres in the output domain? The question is how many spheres of
radius
n
σ
η
2
n
σ
η
2
fit in a sphere of radius
n(P+
σ
η
2)
n
P
σ
η
2
.

M=n(
σ
η
2+P)nn
σ
η
2n=1+P
σ
η
2n2
M
n
σ
η
2
P
n
n
σ
η
2
n
1
P
σ
η
2
n
2

(8)
How many bits of information can one send in
nn uses of the channel?

log
2
1+P
σ
η
2n2
2
1
P
σ
η
2
n
2

(9)
The capacity of a discrete-time Gaussian channel
C=12log
2
(1+P
σ
η
2)
C
1
2
2
1
P
σ
η
2
bits per channel use.

When the channel is a continuous-time, bandlimited, additive white
Gaussian with noise power spectral density
N
0
2
N
0
2
and input power constraint PP and
bandwidth WW. The system can be
sampled at the Nyquist rate to provide power per sample
PP and noise power

σ
η
2=∫−WW
N
0
2d
f
=W
N
0
σ
η
2
f
W
W
N
0
2
W
N
0

(10)
The channel capacity

12log
2
(1+P
N
0
W)
1
2
2
1
P
N
0
W
bits per transmission. Since the sampling rate is

2W
2
W
, then

C=2W2log
2
(1+P
N
0
W)
bits/trans. x trans./sec
C
2
W
2
2
1
P
N
0
W
bits/trans. x trans./sec

(11)
C=Wlog
2
(1+P
N
0
W)bitssec
C
W
2
1
P
N
0
W
bits
sec

(12)
The capacity of the voice band of a telephone channel can be
determined using the Gaussian model. The bandwidth is 3000 Hz
and the signal to noise ratio is often 30 dB. Therefore,

C=3000log
2
(1+1000)≃30000bitssec
C
3000
2
1
1000
30000
bits
sec

(13)
One should not expect to design modems faster than 30 Kbs
using this model of telephone channels. It is also
interesting to note that since the signal to noise ratio is
large, we are expecting to transmit 10 bits/second/Hertz
across telephone channels.