Skip to content Skip to navigation

Connexions

You are here: Home » Content » Network Info Theory: Introduction

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

    • External bookmarks
  • E-mail the author

Recently Viewed

This feature requires Javascript to be enabled.

Network Info Theory: Introduction

Module by: Julius Kusuma

Summary: (Blank Abstract)

Introduction

In this summary we discuss two types of multiuser channels: the mult-access channel and broadcast channel. In the former many senders wish to send messages to one receiver, and in the letter one sender wishes to send messages to many receivers. We are specifically interested in a survey of tools that are used.

Information-Theoretic Tools

Here we list the tools that have been develeped over the years and used to prove various bounds

The Slepian-Wolf binning technique

First introduced by Slepian and Wolf in their landmark paper (Slepian:76) to show the capacity of source coding with side information. Also used by Gel'fand-Pinsker (Gelfand:80) to show the capacity of channel coding with side information.

The latter was extended to include the broadcast channel by Marton (Marton:79) and El Gamel (Gamel:81). Possibly also used earlier by Bergmans and Galleger in (Bergmans:73_Broadcast_Degraded, (Bergmans:74_Broadcast_AWGN) and (Gallager:74).

Interestingly also used to show and achievable region for interference channels by Carleial in (Carleial:78Interference). Probably also by Han.

Equivocation in a time series

First used by Shannon for a Fano-like lower bound on the error probability of a causal transmitter side information channel in (Shannon:58CCSI). This is similar to the technique used to bound the inner region of the two-way communication channel also by Shannon in (Shannon:61Twoway).

Images of a set via two channels

Used by Körner and Marton first in (Korner:77Images) and in (Korner:77BC_deg_mess), and finally in (Marton:79) to give a converse for the general discrete memoryless broadcast with one deterministic component channel.

In (Korner:77Images), the authors considered situation in the figure

Figure 1
Multi-source coding
Multi-source coding (figure1.png)

The problem in considering the encoder for Y Y is that it has to both satisfy the constraint of the decoder for Z Z and take advantage of the information from X as well as possible.

Moment bounding

First used by Wyner and Ziv in (Wyner:76) to show the rate-distortion function for source coding with side information. Conversely it was used to show the capacity of channel coding with side information but with power constraint by Pradhan (Sandeep) and is included in (Kusuma:01Thesis) Also used by Bergmans to prove the converse theorem for the AWGN degraded broadcast channel (with power constraint) in (Bergmans:74_Broadcast_AWGN). Also used by Gel'fand and Pinsker in (Gelfand:80) to prove the capacity of channel coding with noncausal side information.

Finally, this was used to give a simpler bound on the achievable region of the general nondegraded discrete memoryless broadcast channels by van der Meulen and El Gamal in (Gamal:81).

Method of types

The textbook of Csiszár and Körner uses the method of types to examine several multilaser information theory scenarios. Note that the method of types is applicable only to discrete systems. It appears that this sheds new light on previously difficult to understand results of Körner, Marton, Csiszár, et.al in various papers.

Multiple Access

In this section we review the basics behind the multiple access channel, including the definition, and the capacity region results, and some examples. This material is drawn directly from Cover and Thomas (Cover:91). Following the development there, we state and prove the results for the case of two senders to one receiver. The general case of dd senders then follows easily.

In addition, in order to begin suggesting some of the difficulties one encounters in the network information theory set-up, we give a simple example (again from Cover:91) that demonstrates that there is no general source--channel separation theorem in network information theory, as there is in the one--to--one communication channel.

Definition and Examples

We give the definition and examples here.

Definition 1: A Discrete Memoryless Multiple Access Channel
DMMAC consists of input alphabets 𝒳 1 𝒳 1 , 𝒳 2 𝒳 2 , and an output alphabet 𝒴 2 𝒴 2 , and a probability transition function of.

Encoding independently from each other, the two senders wish to send as much information as possible to the reciever. They encode using a code. We have the following.

Definition 2: Rate Code
A rate R 1 R 2 R 1 R 2 code for the multiple access channel, is a map from the message sets, 1 =122n R 1 1 1 2 2 n R 1 2 =122n R 2 2 1 2 2 n R 2 to 𝒳 1 n 𝒳 1 n , 𝒳 2 n 𝒳 2 n , respectively, given by encoding functions f 1 f 1 , f 2 f 2 , and a decoding function We call this a 2n R 1 2n R 2 n 2 n R 1 2 n R 2 n code.

We will assume that both senders select messages to send at random from their respective message sets, where the selected message is distributed uniformly at random. Therefore the fidelity criterion which we focus on is the average probability of error: Given this performance measure, the usual definition follow.

Definition 3: Rate Pair
A rate pair R 1 R 2 R 1 R 2 is called achievable if there exists a sequence of 2n R 1 2n R 2 n 2 n R 1 2 n R 2 n codes with as x x . The capacity region of the multiple access channel is the closure of all achievable rate pairs, R 1 R 2 R 1 R 2 .

Example

The Binary Addition Channel

Suppose that the message alphabets are both 01 0 1 , and that the receiver sees . By sending all 1's, say, from sender one, sender two can obtain a rate of 1, and similarily for sender one if sender two is restricted to only one digit. The convex hull of these extreme points defines a triangular region, which we will see is in fact the capacity region.

Example

The Binary Multiplication Channel

Now consider the simple modification of the above channel where instead of binary addition we have binary multiplication. Again the region is the same as above.

Example

The Binary Addition Channel

Now suppose that in example 1 above, the recieved sees the actual sum, rather than the mod two sum of the two messages sent. If the receiver sees 0 or 2, the decoding is unambiguous. A recieved 1, however, could be the result of a 10 1 0 , or a 01 0 1 from the senders. Suppose user 1 sends at full rate. Then, in fact, sender two can still send across some information. Indeed, in this case, from the second sender's perspective, the channel looks like and erasure channel with probability one half, and therefore sender two can transmit at rate R 2 =1-12 R 2 1 1 2 . This follows because we assumed the sent messages are uniformly distributed. Therefore, the second sender's message will be ambiguously received exactly when sender one sends a 1, and sender two a zero, and vice versa. This happens with probability one half. Graphically, this is given in the figure . We now state the theorem that gives the capacity region of the multiple access region. We give the proof in the next section.

Figure 2
Binary Erasure MAC
Binary Erasure MAC (multiaccess.png)

Some Comments on Channel Source Separartion

Here we give and example that shows that it is no longer obvious what we mean by source and channel separation, in the network information theory case. In the single sender and single receiver case, this theorem tells us that we may encode the source as we wish, meaning we may compress it in any optimal way we can, and then channel code optimally, with possibly no knowledge of the source coding used. In particular then, this implies we can transmit a source reliably across a channel with capacity CC, if and only if we can compress it to a source with rate R<CRC. Analogously, we would expect that a source may be transmitted across a network, if the compression region and capacity region overlap.

Consider the example of the binary multiple access channel given above: The first two senders send from the binary alphabet X=01 X 0 1 , and the receiver receives y= x 1 + x 2 y x 1 x 2 taking values from the ternary alphabet 012 0 1 2 . Recall that this has a capacity region given in the figure. Now consider the source that puts equal weight on the three points 000101 0 0 0 1 0 1 . Using Slepian-Wolf encoding, we can encode this source with rate . This is log31.58 3 1.58 . Notice then, that this point is not in the capacity region of the multiple access channel, as there is no point R 1 R 2 R 1 R 2 for which R 1 + R 2 >1.5 R 1 R 2 1.5 . However, it is clear that this channel can be transmitted without error (and in fact without coding) through the given multiple access channel.

The Broadcast Channel

The broadcast channel was first introduced by Cover in his famous work (Cover:72). Our expostion is based on Cover's review paper of the broadcast channel (Cover:98).

Definition 4: A Discrete Memoryless Broadcast Channel
DMBC consist of an input alphabet 𝒳𝒳 and two alphabets 𝒴𝒴 and 𝒵𝒵 and a probability transition function . The broadcast channel is said to be memoryless if:

The encoder wishes to find a codeword X X to be sent through the channel such that the two recievers can receive their messages reliably. The figure shows the block diagram.

Figure 3
The broadcast channel
The broadcast channel (figure3.png)

An interpretation of the encoding process is that there are two encoders for the two sets of messages that must collaborate to jointly encode their messages in the best way possible. Here we make a note of the special case where only one part of the encoder is allowed to adapt to the other part, instead of both parts of the encoder jointly adapting their encoding. As will be formalized in the next section, this is the same as if the codebook of the encoder which is not allowed to adapt is partitioned into bins with exactly one codeword each. This is a corner point of the achievable region that we will derive next, and is exactly channel coding with side information introduced by Gel'fand and Pinsker (Gelfand:80), illustrated in thefigure.

Figure 4
Channel coding with side information
Channel coding with side information (figure4.png)

The Degraded Broadcast Channel

The first popular special case of the broadcast channel is the degraded broadcast channel, introduced by Cover in (Cover:72).

Definition 5: Degraded Broadcast Channel
A broadcast channel is degraded if there exists a distribution such that:

Cover gave two examples of such channels: a binary symmetric broadcast channel, and the Gaussian broadcast channel. By definition, it is always possible to construct a degraded version of these channels. He also gave a construction and provided an achievable rate region, along with a conjecture of the achievable region for the more general degraded broadcast channel.

aside:

Note that due to the construction of the degraded broadcast channel, the stronger user always decodes the weaker user's message first before decoding its own message

Bergmans(Bergmans:73_Broadcast_Degraded) gave the proof of the achievable region of a general degraded broadcast channel. The converse was given by Bergmans(Bergmans:74_Broadcast_AWGN) and Gallager(Gallager:74). The Bergmans proof applies to the Gaussian broadcast channel, i.e. one with power constraint. Gallager's proof on the other hand uses an auxiliary variable U Uin terms of the collection of all the outputs up to the current time. Gallager's proof does not apply to power-constrained cases, and Bergmans' proof does not apply to the general case.

aside:

Note that the idea of using UU and defining it in terms of the collection of all outputs up to the currnet time is also used in the converse proof of the nondegraded broadcast channel with one deterministic component by Körner and Marton in (Marton:79). We will see this in SectionThe Discrete Memoryless Broadcast Channel.

Binary Symmetric Broadcast

Our first example is based on the first case explored in Cover's landmark paper(Cover:72). Suppose the channel to the first, stronger receiver is a noiseless Binary Symmetric Channel (BSC), and the channel to the second, weaker receiver is a BSC with crossover probability p p. Let α α be the fraction of the total power allocated to the stronger user, and α ̂ α ̂ the fraction of the total power allocated to the weaker user. The variable p ̂ p ̂ denotes 1-α 1 α .

In the exposition, Cover constructed 2nC α ̂ p+α p ̂ -ε 2 n C α ̂ p α p ̂ ε cloud centers in the form of codes for a compound channel with a higher crossover probability α ̂ p α ̂ p . With each cloud codewords, he sets aside all codewords of Hamming distance αn α n away from the cloud centers. Note that there are nαn n α n such satellite codewords for each cloud. Note that the limit of this expression as n n approaches 2nHα 2 n H α .

The stronger noiseless receiver will receive information at rate:

R 1 =C α ̂ p+α p ̂ +Hα-ε R 1 C α ̂ p α p ̂ H α ε (1)
Due to the embedding of the satellites, the weaker receiver perceives the cloud center as if if had been sent through an additional, cascaded BSC of parameter α α. Because of our design choice, it can still reliably decode the clouds and receive information at rate:
R 2 =C α ̂ p+α p ̂ -ε R 2 C α ̂ p α p ̂ ε (2)

What we would like to do is to be able to get the same capacity without requiring that the stronger user decode the message for the weaker user. We use a binning argument to show that this is possible. We retstrict ourselves to the same BSC scenario. Since the stronger user is not allowed to know the codebook of the second user, we can only assume that there are 2α p ̂ + α ̂ p 2 α p ̂ α ̂ p codewords of the weaker user, taken randomly from the 2n 2 n all possible binary n n-tuples.

The stronger user constructs its codebook as follows: it builds a main codebook from all possible 2n 2 n binary n n-tuples, and randomly partitions this into 2nHα 2 n H α bins. Each bin contains 2nCα 2 n C α codewords. At the transmitter, the message for the stronger user selects on of the bins. It then looks at the codeword selected by the weaker user and looks inside the selected bin of user 1 to find a codeword that is closest to the codeword of the user 2. It will almost surely find a codeword with Hamming distance of at most Hα H α away. The transmitter than sends this codeword into the channel.

The stronger user will be able to decode this from its noiseless channel, and receives information at rate:

R 1 =C α ̂ p+α p ̂ +Hα-ε R 1 C α ̂ p α p ̂ H α ε (3)
while the second user perceives its codeword as if it had first been subjected to a BSC with parameter α α, and then a BSC with parameter p p. Thus it can reliably decode its message at rate:
R 2 =C α ̂ p+α p ̂ -ε R 2 C α ̂ p α p ̂ ε (4)
We note however, that this example works only for the case where the channel for the first user is noiseless. In general Cover's capacity region is superior to that of Marton's for the degraded broadcast.

Gaussian Broadcast Channel

Let there be two users with noise N 1 N 1 and N 2 N 2 respectively, and let N 1 < N 2 N 1 N 2 . The first user is the better user and the second user is the worse user. We let the worse user pick any codeword into the one selected by the worse user. The better user considers the codeword selected by the worse user as side information of the additional noise the channel.

We now examine the achievable rates in a Guassian broadcast channel. From the Gel'fand and Pinsker framework, we know the following capacities. The capacity of one user is:

C 1 =12log1+ P 1 N 1 C 1 1 2 1 P 1 N 1 (5)
and the other user is:
C 2 =12log1+ P 2 P 1 + N 2 C 2 1 2 1 P 2 P 1 N 2 (6)
where we impose a total power constraint P total = P 1 + P 2 P total P 1 P 2 . This is consistent with the formulation in Cover if we consider only the nonoverlapping part of the message set. Note that the ordering of the stronger and weaker user is critical.

Fig ?? gives the achievable rate in case of using CCSI as an enabling component in the coding. Although the concavity of the achievable rate region when N 1 < N 2 N 1 N 2 is the most interesting motivation in the study of broadcast channels, we will nevertheless examine the case when N 1 = N 2 N 1 N 2 , where the time sharing region is exactly the same as the prescribed limit in the achievable rate. For equal quality channels, adding the two expressions gives:

C total =12log1+ P 1 + P 2 N C total 1 2 1 P 1 P 2 N (7)
Because we are limiting the total power to P total = P 1 + P 1 P total P 1 P 1 , and assign P 1 =α P total P 1 α P total and P 2 =1-α P total P 2 1 α P total respetively, we can see that we have also achieve the best achievable allocation of rates for the case when the noise powers are equal.

Achievable region of the degraded broadcast channel

This result is due to Bergmans (Bergmans:73_Broadcast_Degraded). The method that he used is to consider and artificial channel which we discussed in the above example as shown in the figure

Using typical set arguments, he obtained the achievable region for the degraded broadcast channel.

Figure 5
Artificial channel for degraded broadcast
Artificial channel for degraded broadcast (figure6.png)

Converse theorem for the degraded broadcast channel

This result is due to Bergmans(Bergmans:74_Broadcast_AWGN) for the AWGN channel, and Gallager(Gallager:74) for the more general case without power limitation.

The Deterministic Broadcast Channel

The capacity is given by Gel'fand and Pinsker in (Gelfand:80BC_det).

An important example is the Blackwell channel, where the transition matrix is defined by a multiplication rule.

The Discrete Memoryless Broadcast Channel

Here we do not allow the channel decomposition as for the degraded broadcast channel in the section Degraded Broadcast Channel.

Achievable region of the general broadcast channel

The first result is due to T.M.Cover (Cover:72BC_gen_ach) and E.C. van der Meulen (Meulen:75), who discovered it simultaneously and independently.

It was later refined by Hajek and Pursley (Hajek:79), who together with Cover and van der Meulen started the efforts during the 1976 ISIT. Their contribution is the evaluation of the region. Up to this point the evaluated region is defined by R x R y R xy R x R y R xy where the last term is the rate of the common message meant for both receivers. They show that in evaluating R x R y 0 R x R y 0 then one will get a region that is strictly larger. However, in doing so one has to insist on the independence of UVW U V W which denote the message for receiver 1, receiver 2, and the common message between the two.

A Brief History fo Broadcast

In the process of reading the relevant papers, we have made an attempt to graphically describe the history behind the development of the broadcast channel as shown in the figure

Conclusions

By now it should be clear that there have been several tools developed for the multiuser information theory problem. But the primary question remains of whether the formulation is the correct one.

For the multiaccess channel, the problem has largely been solved. Coding frameworks for multiaccess have been developed by various researchers, and multiaccess on fading channels has been studied by many researchers here at MIT. Unfortunately it appears that nonsynchronicity between the signal transmissions across many senders are discouraging the implementation of the more sophisticated coding techniques.

For the broadcast channel, Cover's formulation sparked a strong interest in the broadcast channel problem, with the degraded and deterministic cases having been solved. However, the nondegraded case is still an open problem. And the set of tools used has transitioned from the early works where it depends on casting the problem as a special case of the degraded broadcast, to using newer but less intuitive tools. Is the nondegraded broadcast channel an important problem to solve? It is worth noting that this is the dual of the symmetric Slepian-Wolf problem, which is also of great interest to information theorists. Also note that because the sender is a single entity, we do not have the problem with nonsynchronicity for the broadcast channel.

Finally, we should remark that many researchers have left information theory due to frustrations arising from the broadcast problem.

Comments, questions, feedback, criticisms?

Send feedback