Questions or comments concerning this laboratory should be directed to Prof. Charles A. Bouman, School of Electrical and Computer Engineering, Purdue University, West Lafayette IN 47907; (765) 494-0340; bouman@ecn.purdue.edu

EE438 - Laboratory 7:

Discrete-Time Random Processes (Week 1)

Revision date: 2008/08/12 17:21:43

Introduction

Many of the phenomena that occur in nature have uncertainty and are best characterized statistically as random processes. For example, the thermal noise in electronic circuits, radar detection, and games of chance can only be modeled and analyzed in terms of statistical averages.

This lab will cover some basic methods of analyzing random processes. "Random Variables" reviews some basic definitions and terminology associated with random variables, observations, and estimation. "Estimating the Cumulative Distribution Function" investigates a common estimate of the cumulative distribution function. "Generating Samples from a Given Distribution" discusses the problem of transforming a random variable so that it has a given distribution, and lastly, "Estimating the Probability Density Function" illustrates how the histogram may be used to estimate the probability density function.

Note that this lab assumes an introductory background in probability theory. Some review is provided, but it is infeasible to develop the theory in detail here. A secondary reference such as (Reference) is strongly encouraged.

Random Variables

The following section contains an abbreviated review of some of the basic definitions associated with random variables. Then we will discuss the concept of an observation of a random event, and introduce the notion of an estimator.

Basic Definitions

A random variable is a function that maps a set of possible outcomes of a random experiment into a set of real numbers. The probability of an event can then be interpreted as the probability that the random variable will take on a value in a corresponding subset of the real line. This allows a fully numerical approach to modeling probabilistic behavior.

A very important function used to characterize a random variable is the cumulative distribution function (CDF), defined as

F_X(x) = P(X \le x), \qquad x \in (-\infty, \infty).
(1)

Here, X is the random variable, and F_X(x) is the probability that X will take on a value in the interval (-∞, x]. It is important to realize that x is simply a dummy variable for the function F_X(x), and is therefore deterministic.

The derivative of the cumulative distribution function, if it exists, is known as the probability density function, denoted f_X(x). By the fundamental theorem of calculus, the probability density has the following property:

\int_{t_0}^{t_1} f_X(x)\,dx = F_X(t_1) - F_X(t_0) = P(t_0 < X \le t_1).
(2)

Since the probability that X lies in the interval (-∞, ∞) equals one, the area under the density function must also equal one.

Expectations are fundamental quantities associated with random variables. The expected value of some function of X, call it g(X), is defined by the following.

E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx \qquad \text{(for } X \text{ continuous)}
E[g(X)] = \sum_{x=-\infty}^{\infty} g(x)\, P(X = x) \qquad \text{(for } X \text{ discrete)}
(3)

Note that the expected value of g(X) is a deterministic number. Note also that, due to the properties of integration, expectation is a linear operator.

The two most common expectations are the mean μ_X and variance σ_X², defined by

\mu_X = E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\, dx
(4)
\sigma_X^2 = E[(X - \mu_X)^2] = \int_{-\infty}^{\infty} (x - \mu_X)^2\, f_X(x)\, dx.
(5)

A very important type of random variable is the Gaussian or normal random variable. A Gaussian random variable has a density function of the following form:

f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X} \exp\!\left( -\frac{1}{2\sigma_X^2} (x - \mu_X)^2 \right).
(6)

Note that a Gaussian random variable is completely characterized by its mean and variance. This is not necessarily the case for other types of distributions. Sometimes, the notation X ~ N(μ, σ²) is used to identify X as being Gaussian with mean μ and variance σ².

Samples of a Random Variable

Suppose some random experiment may be characterized by a random variable X whose distribution is unknown. For example, suppose we are measuring a deterministic quantity v, but our measurement is subject to a random measurement error ε. We can then characterize the observed value, X, as a random variable, X = v + ε.

If the distribution of X does not change over time, we may gain further insight into X by making several independent observations {X_1, X_2, …, X_N}. These observations X_i, also known as samples, will be independent random variables with the same distribution F_X(x). In this situation, the X_i's are referred to as i.i.d., for independent and identically distributed. We also sometimes refer to {X_1, X_2, …, X_N} collectively as a sample, or observation, of size N.

Suppose we want to use our observation {X_1, X_2, …, X_N} to estimate the mean and variance of X. Two estimators which should already be familiar to you are the sample mean and sample variance, defined by

\hat{\mu}_X = \frac{1}{N} \sum_{i=1}^{N} X_i
(7)
\hat{\sigma}_X^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \hat{\mu}_X)^2.
(8)

It is important to realize that these sample estimates are functions of random variables, and are therefore themselves random variables. Thus we can also talk about the statistical properties of the estimators. For example, we can compute the mean and variance of the sample mean μ̂_X.

E[\hat{\mu}_X] = E\!\left[ \frac{1}{N} \sum_{i=1}^{N} X_i \right] = \frac{1}{N} \sum_{i=1}^{N} E[X_i] = \mu_X
(9)
\mathrm{Var}[\hat{\mu}_X] = \mathrm{Var}\!\left[ \frac{1}{N} \sum_{i=1}^{N} X_i \right] = \frac{1}{N^2}\, \mathrm{Var}\!\left[ \sum_{i=1}^{N} X_i \right] = \frac{1}{N^2} \sum_{i=1}^{N} \mathrm{Var}[X_i] = \frac{\sigma_X^2}{N}
(10)

In both (Equation 9) and (Equation 10) we have used the i.i.d. assumption. We can also show that E[σ̂_X²] = σ_X².

An estimate â for some parameter a which has the property E[â] = a is said to be an unbiased estimate. An estimator such that Var[â] → 0 as N → ∞ is said to be consistent. These two properties are highly desirable because they imply that if a large number of samples are used, the estimate will be close to the true parameter.

Suppose X is a Gaussian random variable with mean 0 and variance 1. Use the Matlab function random or randn to generate 1000 samples of X, denoted X_1, X_2, ..., X_1000. See the online help for the random function. Plot them using the Matlab function plot. We will assume our generated samples are i.i.d.

Write Matlab functions to compute the sample mean and sample variance of equations (Equation 7) and (Equation 8) without using the predefined mean and var functions. Use these functions to compute the sample mean and sample variance of the samples you just generated.

INLAB REPORT:

1. Submit the plot of samples of X.
2. Submit the sample mean and the sample variance that you calculated. Why are they not equal to the true mean and true variance?
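As a language-neutral illustration of equations (Equation 7) and (Equation 8), the computation can be sketched in Python/NumPy (the lab itself expects Matlab code, which is not shown here; the function names below are our own):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean(x):
    # Equation 7: (1/N) times the sum of the samples
    return sum(x) / len(x)

def sample_variance(x):
    # Equation 8: normalize by N - 1, using the sample mean
    m = sample_mean(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)

X = rng.standard_normal(1000)   # 1000 i.i.d. N(0, 1) samples
mu_hat = sample_mean(X)
var_hat = sample_variance(X)
```

Because both estimators are computed from random samples, mu_hat and var_hat will be close to, but not exactly, 0 and 1.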

Linear Transformation of a Random Variable

A linear transformation of a random variable X has the following form

Y = aX + b
(11)

where a and b are real numbers, and a ≠ 0. A very important property of linear transformations is that they are distribution-preserving, meaning that Y will be a random variable with a distribution of the same form as X. For example, in (Equation 11), if X is Gaussian then Y will also be Gaussian, but not necessarily with the same mean and variance.

Using the linearity property of expectation, find the mean μ_Y and variance σ_Y² of Y in terms of a, b, μ_X, and σ_X². Show your derivation in detail.

Hint: First find the mean, then substitute the result when finding the variance.
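For reference, the hint can be carried out with the linearity of expectation (this is the standard derivation, stated here for checking your work):

```latex
\mu_Y = E[aX + b] = a\,E[X] + b = a\mu_X + b ,
\qquad
\sigma_Y^2 = E\!\left[(Y - \mu_Y)^2\right]
           = E\!\left[(aX + b - a\mu_X - b)^2\right]
           = a^2\, E\!\left[(X - \mu_X)^2\right]
           = a^2 \sigma_X^2 .
```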

Consider a linear transformation of a Gaussian random variable X with mean 0 and variance 1. Calculate the constants a and b which make the mean and the variance of Y 3 and 9, respectively. Using equation (Equation 6), find the probability density function for Y.

Generate 1000 samples of X, and then calculate 1000 samples of Y by applying the linear transformation in equation (Equation 11), using the a and b that you just determined. Plot the resulting samples of Y, and use your functions to calculate the sample mean and sample variance of the samples of Y.

INLAB REPORT:

1. Submit your derivation of the mean and variance of Y.
2. Submit the transformation you used, and the probability density function for Y.
3. Submit the plot of samples of Y and the Matlab code used to generate Y. Include the calculated sample mean and sample variance for Y.
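A Python/NumPy sketch of the transformation step, using a = 3, b = 3 (one valid choice, since a = -3 also gives variance 9; the exercise asks you to derive this yourself in Matlab):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(1000)   # X ~ N(0, 1)

a, b = 3.0, 3.0                 # mu_Y = a*0 + b = 3, var_Y = a^2 * 1 = 9
Y = a * X + b                   # linear transformation of Equation 11

mu_hat = Y.mean()               # should be near 3
var_hat = Y.var(ddof=1)         # sample variance of Equation 8, near 9
```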

Estimating the Cumulative Distribution Function

Suppose we want to model some phenomenon as a random variable X with distribution F_X(x). How can we assess whether or not this is an accurate model? One method would be to make many observations and estimate the distribution function based on the observed values. If the distribution estimate is “close” to our proposed model F_X(x), we have evidence that our model is a good characterization of the phenomenon. This section will introduce a common estimate of the cumulative distribution function.

Given a set of i.i.d. random variables {X_1, X_2, …, X_N} with CDF F_X(x), the empirical cumulative distribution function F̂_X(x) is defined as the following.

\hat{F}_X(x) = \frac{1}{N} \sum_{i=1}^{N} I\{X_i \le x\}, \qquad
I\{X_i \le x\} = \begin{cases} 1, & \text{if } X_i \le x \\ 0, & \text{otherwise} \end{cases}
(12)

In words, F̂_X(x) is the fraction of the X_i's which are less than or equal to x.

To get insight into the estimate F̂_X(x), let's compute its mean and variance. To do so, it is easiest to first define N_x as the number of X_i's which are less than or equal to x.

N_x = \sum_{i=1}^{N} I\{X_i \le x\} = N\, \hat{F}_X(x)
(13)

Notice that P(X_i ≤ x) = F_X(x), so

P(I\{X_i \le x\} = 1) = F_X(x)
P(I\{X_i \le x\} = 0) = 1 - F_X(x)
(14)

Now we can compute the mean of F̂_X(x) as follows,

E[\hat{F}_X(x)] = \frac{1}{N} E[N_x]
= \frac{1}{N} \sum_{i=1}^{N} E[I\{X_i \le x\}]
= \frac{1}{N} \cdot N\, E[I\{X_i \le x\}]
= 0 \cdot P(I\{X_i \le x\} = 0) + 1 \cdot P(I\{X_i \le x\} = 1)
= F_X(x).
(15)

This shows that F̂_X(x) is an unbiased estimate of F_X(x). By a similar approach, we can show that

\mathrm{Var}[\hat{F}_X(x)] = \frac{1}{N}\, F_X(x)\, (1 - F_X(x)).
(16)

Therefore, the empirical CDF F̂_X(x) is both an unbiased and consistent estimate of the true CDF.

Exercise

Write a function F=empcdf(X,t) to compute the empirical CDF F̂_X(t) from the sample vector X at the points specified in the vector t. (Hint: The expression sum(X<=s) will return the number of elements in the vector X which are less than or equal to s.)

To test your function, generate a sample of Uniform[0,1] random variables using the function X=rand(1,N). Plot two CDF estimates: one using a sample size N=20, and one using N=200. Plot these functions over the range t=[-1:0.001:2], and on each plot superimpose the true distribution for a Uniform[0,1] random variable.

INLAB REPORT: Hand in your empcdf function and the two plots.
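The same empirical CDF of (Equation 12) can be sketched in Python/NumPy for illustration (plotting omitted; the lab expects a Matlab implementation):

```python
import numpy as np

def empcdf(X, t):
    """Empirical CDF of Equation 12: for each point s in t, the
    fraction of samples in X that are less than or equal to s."""
    X = np.asarray(X)
    return np.array([np.sum(X <= s) for s in t]) / X.size

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 200)      # Uniform[0,1] samples, N = 200
t = np.arange(-1.0, 2.0, 0.001)
F = empcdf(X, t)                    # staircase estimate of F_X(t)
```

By construction F is nondecreasing, equals 0 left of the smallest sample, and equals 1 right of the largest.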

Generating Samples from a Given Distribution

It is oftentimes necessary to generate samples from a particular distribution. For example, we might want to run simulations to test how an algorithm performs on noisy inputs. In this section we will address the problem of generating random numbers from a given distribution F_X(x).

Suppose we have a continuous random variable X with distribution F_X(x), and we form the new random variable Y = F_X(X). In other words, Y is a function of X, and the particular function is the CDF of the random variable X.

X → F_X(·) → Y
(17)

How is Y distributed? First notice that F_X(·) is a probability, so that Y can only take values in the interval [0,1].

P(Y \le y) = \begin{cases} 0, & \text{for } y < 0 \\ 1, & \text{for } y > 1 \end{cases}
(18)

Since F_X(x) is a monotonically increasing function of x, the event {Y ≤ y} is equivalent to {X ≤ x} if we define y = F_X(x). This implies that for 0 ≤ y ≤ 1,

F_Y(y) = P(Y \le y)
= P(F_X(X) \le F_X(x))
= P(X \le x) \qquad \text{(monotonicity)}
= F_X(x) = y.
(19)

Therefore, Y is uniformly distributed on the interval [0,1].

Conversely, if F_X(·) is a one-to-one function, we may use the inverse transformation F_X⁻¹(U) to transform a Uniform[0,1] random variable U to a random variable with distribution F_X(·).

U → F_X⁻¹(·) → X
(20)

Note that combining these results allows us to transform any continuous random variable X ~ F_X(x) into any other continuous random variable Z ~ F_Z(z), provided that F_Z(·) is a one-to-one function.

X → F_X(·) → U → F_Z⁻¹(·) → Z
(21)

Exercise

Your task is to use i.i.d. Uniform[0,1] random variables to generate a set of i.i.d. exponentially distributed random variables with CDF

F_X(x) = (1 - e^{-3x})\, u(x).
(22)

Derive the required transformation.

Generate the Uniform[0,1] random variables using the function rand(1,N). Use your empcdf function to plot two CDF estimates for the exponentially distributed random variables: one using a sample size N=20, and one using N=200. Plot these functions over the range x=[-1:0.001:2], and on each plot superimpose the true exponential distribution of equation (Equation 22).

INLAB REPORT:

• Hand in the derivation of the required transformation, and your Matlab code.
• Hand in the two plots.
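As a check on your derivation: inverting u = F_X(x) = 1 - e^{-3x} gives x = -ln(1 - u)/3, so applying that map to Uniform[0,1] samples yields the target exponential distribution. A Python/NumPy sketch (the exercise itself asks for Matlab):

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.uniform(0.0, 1.0, 200)      # i.i.d. Uniform[0,1] samples

# Inverse-CDF transformation for F_X(x) = 1 - exp(-3x):
#   x = -ln(1 - u) / 3
X = -np.log(1.0 - U) / 3.0          # exponential samples, rate 3
```

The resulting samples are nonnegative with mean approximately 1/3, as expected for an exponential with rate 3.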

Estimating the Probability Density Function

The statistical properties of a random variable are completely described by its probability density function (assuming it exists, of course). Therefore, it is oftentimes useful to estimate the PDF, given an observation of a random variable. For example, similar to the empirical CDF, probability density estimates may be used to test a proposed model. They may also be used in non-parametric classification problems, where we need to classify data as belonging to a particular group but without any knowledge of the true underlying class distributions.

Notice that we cannot form a density estimate by simply differentiating the empirical CDF, since this function contains discontinuities at the sample locations X_i. Rather, we need to estimate the probability that a random variable will fall within a particular interval of the real axis. In this section, we will describe a common method known as the histogram.

The Histogram

Our goal is to estimate an arbitrary probability density function, f_X(x), within a finite region of the x-axis. We will do this by partitioning the region into L equally spaced subintervals, or “bins”, and forming an approximation for f_X(x) within each bin. Let our region of support start at the value x_0 and end at x_L. Our L subintervals of this region will be [x_0, x_1], (x_1, x_2], ..., (x_{L-1}, x_L]. To simplify our notation, we will define bin(k) to represent the interval (x_{k-1}, x_k], k = 1, 2, …, L, and define the quantity Δ to be the length of each subinterval.

\mathrm{bin}(k) = (x_{k-1}, x_k], \qquad k = 1, 2, \ldots, L
\Delta = \frac{x_L - x_0}{L}
(23)

We will also define f̃(k) to be the probability that X falls into bin(k).

\tilde{f}(k) = P(X \in \mathrm{bin}(k)) = \int_{x_{k-1}}^{x_k} f_X(x)\, dx \approx f_X(x)\, \Delta \qquad \text{for } x \in \mathrm{bin}(k)
(24)

The approximation in (Equation 24) only holds for an appropriately small bin width Δ.

Next we introduce the concept of a histogram of a collection of i.i.d. random variables {X_1, X_2, …, X_N}. Let us start by defining a function that will indicate whether or not the random variable X_n falls within bin(k).

I_n(k) = \begin{cases} 1, & \text{if } X_n \in \mathrm{bin}(k) \\ 0, & \text{if } X_n \notin \mathrm{bin}(k) \end{cases}
(25)

The histogram of X_n at bin(k), denoted H(k), is simply the number of random variables that fall within bin(k). This can be written as

H(k) = \sum_{n=1}^{N} I_n(k).
(26)

We can show that the normalized histogram, H(k)/N, is an unbiased estimate of the probability of X falling in bin(k). Let us compute the expected value of the normalized histogram.

E\!\left[ \frac{H(k)}{N} \right] = \frac{1}{N} \sum_{n=1}^{N} E[I_n(k)]
= \frac{1}{N} \sum_{n=1}^{N} \big\{ 1 \cdot P(X_n \in \mathrm{bin}(k)) + 0 \cdot P(X_n \notin \mathrm{bin}(k)) \big\}
= \tilde{f}(k)
(27)

The last equality results from the definition of f̃(k), and from the assumption that the X_n's have the same distribution. A similar argument may be used to show that the variance of H(k)/N is given by

\mathrm{Var}\!\left[ \frac{H(k)}{N} \right] = \frac{1}{N}\, \tilde{f}(k)\, (1 - \tilde{f}(k)).
(28)

Therefore, as N grows large, the bin probabilities f̃(k) can be approximated by the normalized histogram H(k)/N.

\tilde{f}(k) \approx \frac{H(k)}{N}
(29)

Using (Equation 24), we may then approximate the density function f_X(x) within bin(k) by

f_X(x) \approx \frac{H(k)}{N \Delta} \qquad \text{for } x \in \mathrm{bin}(k).
(30)

Notice this estimate is a staircase function of x which is constant over each interval bin(k). It can also easily be verified that this density estimate integrates to 1.
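The histogram density estimate of (Equation 30) can be sketched in Python/NumPy (the region of support [-3, 3] is our own illustrative choice; NumPy's histogram bins are closed on the left rather than the right, a negligible difference for continuous data):

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 1000, 20
X = rng.standard_normal(N)          # samples whose density we estimate

x0, xL = -3.0, 3.0                  # region of support (illustrative choice)
edges = np.linspace(x0, xL, L + 1)  # bin edges x_0, x_1, ..., x_L
delta = (xL - x0) / L               # bin width, Equation 23

H, _ = np.histogram(X, bins=edges)  # H(k): count per bin, Equation 26
f_hat = H / (N * delta)             # density estimate, Equation 30
```

Summing f_hat * delta recovers the fraction of samples inside the region of support, so the estimate integrates to (approximately) 1 when the region covers essentially all of the probability mass.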

Exercise

Let U be a uniformly distributed random variable on the interval [0,1] with the following cumulative probability distribution, F_U(u):

F_U(u) = \begin{cases} 0, & \text{if } u < 0 \\ u, & \text{if } 0 \le u \le 1 \\ 1, & \text{if } u > 1 \end{cases}
(31)

We can calculate the cumulative probability distribution for the new random variable X = U^{1/3}.

F_X(x) = P(X \le x) = P(U^{1/3} \le x) = P(U \le x^3) = F_U(u) \big|_{u = x^3}
= \begin{cases} 0, & \text{if } x < 0 \\ x^3, & \text{if } 0 \le x \le 1 \\ 1, & \text{if } x > 1 \end{cases}
(32)

Plot F_X(x) for x ∈ [0,1]. Also, analytically calculate the probability density f_X(x), and plot it for x ∈ [0,1].

Using L=20, x_0=0 and x_L=1, use Matlab to compute f̃(k), the probability of X falling into bin(k). (Hint: Use the fact that f̃(k) = F_X(x_k) - F_X(x_{k-1}).) Plot f̃(k) using stem for k = 1, …, L.

INLAB REPORT:

1. Submit your plots of F_X(x), f_X(x) and f̃(k). Use stem to plot f̃(k), and put all three plots on a single figure using subplot.
2. Show (mathematically) how f_X(x) and f̃(k) are related.

Generate 1000 samples of a random variable U that is uniformly distributed between 0 and 1. (Hint: Use the command rand.) Then form the random vector X by computing X = U^{1/3}.

Use the Matlab function hist to plot a normalized histogram for your samples of X, using 20 bins uniformly spaced on the interval [0,1]. (Hint: Use the Matlab command H=hist(X,(0.5:19.5)/20) to obtain the histogram, and then normalize H.) Use the stem command to plot the normalized histogram H(k)/N and f̃(k) together on the same figure using subplot.

INLAB REPORT:

1. Submit your two stem plots of H(k)/N and f̃(k). How do these plots compare?
2. Discuss the tradeoffs (advantages and disadvantages) between selecting a very large or very small bin width.
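For illustration only (the exercise asks for Matlab's hist and stem), the comparison of the normalized histogram H(k)/N against the bin probabilities f̃(k) for X = U^{1/3} can be sketched in Python/NumPy, using the hint f̃(k) = F_X(x_k) - F_X(x_{k-1}) with F_X(x) = x³:

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 1000, 20
U = rng.uniform(0.0, 1.0, N)
X = U ** (1.0 / 3.0)                        # X = U^(1/3), so F_X(x) = x^3 on [0, 1]

edges = np.linspace(0.0, 1.0, L + 1)        # bin edges x_0, ..., x_L
f_tilde = edges[1:] ** 3 - edges[:-1] ** 3  # f~(k) = F_X(x_k) - F_X(x_{k-1})

H, _ = np.histogram(X, bins=edges)          # histogram counts H(k)
H_norm = H / N                              # normalized histogram H(k)/N
```

With N = 1000, H_norm tracks f_tilde closely in every bin, matching the unbiasedness result of (Equation 27).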
