Consider the following two-model evaluation problem
(van Trees; Prob. 2.2.1).
Prove that the likelihood ratio test reduces to
Find
Now assume that we need a Neyman-Pearson test. Find
The two models describe different equi-variance statistical
models for the observations (van
Trees; Prob. 2.2.11).
Find the likelihood ratio.
Compute the decision regions for various values of the threshold in the likelihood ratio test.
Assuming these two densities are equally likely, find the probability of making an error in distinguishing between them.
A hypothesis testing criterion radically different from
those discussed in earlier sections is minimum
equivocation. In this information theoretic
approach, the two-model testing problem is modeled as a
digital channel, shown in this figure. The channel's inputs,
generically represented by the
The quality of such information theoretic channels is
quantified by the mutual information
Non-Gaussian statistical models sometimes yield surprising
results in comparison to Gaussian ones. Consider the
following hypothesis testing problem where the observations
have a Laplacian probability distribution.
Find the sufficient statistic for the optimal decision rule.
What decision rule guarantees that the miss probability will be less than 0.1?
Developing a Neyman-Pearson decision rule for more than two
models has not been detailed. Assume
Formulate the optimization problem that simultaneously
maximizes
Show that your solution can be expressed as choosing the
largest of the sufficient statistics
Pattern recognition relies heavily on ideas derived from the
principles of statistical model testing. Measurements are
made of a test object and these are compared with those of
"standard" objects to determine which the test object most
closely resembles. Assume that the measurement vector
How is the minimum probability of error choice of object
determined from the observation of
Assuming that only two equally likely objects are possible
(
The expense of making measurements is always a practical consideration. Assuming each measurement costs the same to perform, how would you determine the effectiveness of a measurement vector's component?
Define
Based upon the observation of
One observation of the random variable
Suppose there are two terms in the aforementioned sum. Assuming that the two models are equally likely, find the minimum probability of error decision rule.
Compute the resulting probability of error of your decision rule.
Show that the decision rule found in the previous part applies no matter how many terms are assumed present in the sum.
The observed random variable
Draw the decision regions on the
Compute the probability of error.
Let
The goal is to choose which of the following four models is
true upon the reception of the three-dimensional vector
Assuming equally likely models, find the minimum
Calculate the resulting error probability.
Show that neither the decision rule nor the probability of
error depends on
To gain some appreciation of some of the issues in
implementing a detector, this problem asks you to program
(preferably in Matlab) a simple detector and numerically
compare its performance with theoretical predictions. Let
the observations consist of a signal contained in additive
Gaussian white noise.
What is the theoretical false-alarm probability of the
minimum
Write a Matlab program that estimates the false-alarm
probability. How many simulation trials are needed to
accurately estimate the false-alarm probability? Choose
values for
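A minimal sketch of the simulation this problem asks for, written in Python rather than Matlab. It assumes a known signal in additive white Gaussian noise and equally likely models, so the minimum probability of error detector compares the correlation of the signal with the observations against half the signal energy. The signal values, the noise standard deviation, and all function names are hypothetical choices made for illustration; they are not taken from the problem statement.

```python
import math
import random


def q_function(x):
    """Gaussian tail probability Q(x) = Pr[N(0,1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def theoretical_false_alarm(signal, sigma):
    # Assumption: equal priors, so the detector decides "signal present"
    # when sum(s*x) exceeds ||s||^2 / 2. Under noise only, sum(s*x) is
    # Gaussian with zero mean and variance sigma^2 * ||s||^2.
    energy = sum(s * s for s in signal)
    return q_function(math.sqrt(energy) / (2.0 * sigma))


def simulate_false_alarm(signal, sigma, trials, seed=0):
    """Estimate the false-alarm probability from noise-only trials."""
    rng = random.Random(seed)
    threshold = sum(s * s for s in signal) / 2.0
    false_alarms = 0
    for _ in range(trials):
        statistic = sum(s * rng.gauss(0.0, sigma) for s in signal)
        if statistic > threshold:
            false_alarms += 1
    return false_alarms / trials


def trials_needed(p, relative_error):
    # The estimate is a binomial proportion with standard deviation
    # sqrt(p(1-p)/K); requiring it to be at most relative_error * p
    # gives K >= (1-p) / (p * relative_error^2).
    return math.ceil((1.0 - p) / (p * relative_error ** 2))


if __name__ == "__main__":
    signal = [1.0, 1.0, 1.0, 1.0]  # hypothetical signal values
    sigma = 1.0                    # hypothetical noise standard deviation
    p_theory = theoretical_false_alarm(signal, sigma)
    p_hat = simulate_false_alarm(signal, sigma, trials=20000)
    print(p_theory, p_hat, trials_needed(p_theory, 0.1))
```

The `trials_needed` helper reflects one common accuracy criterion (a target relative standard error); smaller false-alarm probabilities require proportionally more trials under this criterion.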
Calculate the Kullback-Leibler distance between the
following pairs of densities. Use these results to find the
Fisher information for the mean parameter
Jointly Gaussian random vectors having the same covariance matrix but dissimilar mean vectors.
Two Poisson random variables having average rates
Two sequences of statistically independent Laplacian random variables having the same variance but different means.
Plot the Kullback-Leibler distances for the Laplacian case
and for the Gaussian case of statistically independent
random variables. Set the variance equal to
The Kullback-Leibler and Chernoff distances can be related
to the Fisher information matrix. Let
Show that
What is the Chernoff distance between these distributions?
Deduce from the Kullback-Leibler distance for the Gaussian and Poisson cases what the Cramér-Rao bound is for estimating the mean and average rate, respectively.
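As a numerical check on this kind of deduction, the sketch below uses the standard small-perturbation relation between the Kullback-Leibler distance and the Fisher information, D(p(θ+δ) ‖ p(θ)) ≈ δ²F(θ)/2, to estimate F from closed-form distances. It assumes a scalar Gaussian with known variance and a single Poisson observation; the function names and parameter values are illustrative only.

```python
import math


def kl_gaussian(m1, m0, variance):
    """D(p1 || p0) for scalar Gaussians with means m1, m0 and a common variance."""
    return (m1 - m0) ** 2 / (2.0 * variance)


def kl_poisson(lam1, lam0):
    """D(p1 || p0) for Poisson distributions with average rates lam1 and lam0."""
    return lam1 * math.log(lam1 / lam0) + lam0 - lam1


def fisher_from_kl(kl, theta, delta=1e-3):
    # For small delta, D(p_{theta+delta} || p_theta) is approximately
    # delta^2 * F(theta) / 2, so F(theta) is about 2 D / delta^2.
    return 2.0 * kl(theta + delta, theta) / delta ** 2


if __name__ == "__main__":
    variance = 2.0  # hypothetical value
    rate = 4.0      # hypothetical value
    f_gauss = fisher_from_kl(lambda a, b: kl_gaussian(a, b, variance), 0.5)
    f_poisson = fisher_from_kl(kl_poisson, rate)
    # The Cramér-Rao bound for one observation is the reciprocal of F.
    print(1.0 / f_gauss, 1.0 / f_poisson)
```

For these two families the estimated bounds come out close to the variance and to the average rate, respectively, which can be compared against the analytic result.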
Insights into certain detection problems can be gained by examining the Kullback-Leibler distance and the properties of Fisher information. We begin by showing that, among all differentiable distributions having the same variance, the Gaussian distribution has the smallest Fisher information for the mean parameter.
Show that if
Use this property to show that the Fisher information is a convex function of the probability density.
Define
Because of the Fisher information's convexity, a given
distribution
What does this result suggest about the performance probabilities for problems wherein the models differ in mean?
Find the Chernoff distance between the following distributions.
Two Gaussian distributions having the same variance but different means.
Two Poisson distributions having differing parameter values.
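Analytic answers to this problem can be checked numerically. The sketch below assumes the usual definition of the Chernoff distance as the maximum over s in (0, 1) of −log μ(s), where μ(s) is the integral (or sum) of p0^(1−s) p1^s, and locates the maximum by a simple grid search. The grid size, the truncation point of the Poisson sum, and the parameter values are arbitrary choices made for illustration.

```python
import math


def neg_log_mu_gaussian(s, m0, m1, variance):
    # For equal-variance Gaussians the integral of p0^(1-s) * p1^s equals
    # exp(-s (1-s) (m1-m0)^2 / (2 variance)).
    return s * (1.0 - s) * (m1 - m0) ** 2 / (2.0 * variance)


def poisson_pmf(k, lam):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def neg_log_mu_poisson(s, lam0, lam1, kmax=200):
    # Direct numerical sum of p0(k)^(1-s) * p1(k)^s, truncated at kmax
    # (adequate only when both rates are far below kmax).
    mu = sum(poisson_pmf(k, lam0) ** (1.0 - s) * poisson_pmf(k, lam1) ** s
             for k in range(kmax + 1))
    return -math.log(mu)


def chernoff_distance(neg_log_mu, grid=1000):
    """Grid search for the maximum of -log mu(s) over 0 < s < 1."""
    best_value, best_s = 0.0, 0.0
    for i in range(1, grid):
        s = i / grid
        value = neg_log_mu(s)
        if value > best_value:
            best_value, best_s = value, s
    return best_value, best_s


if __name__ == "__main__":
    # Hypothetical parameter values.
    print(chernoff_distance(lambda s: neg_log_mu_gaussian(s, 0.0, 2.0, 1.0)))
    print(chernoff_distance(lambda s: neg_log_mu_poisson(s, 2.0, 5.0)))
```

In the Gaussian case the maximizing s is 1/2 by symmetry; in the Poisson case it generally is not, which the grid search makes visible.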
Let's explore how well Stein's Lemma predicts optimal
detector performance probabilities. Consider the two-model
detection problem wherein
Find an expression for the false-alarm probability
Find the Kullback-Leibler distance corresponding to the false-alarm probability's exponent.
Plot the exact error for values of
We observe a Gaussian random variable
What is an expression for the probability of error for
the minimum
Assume one can perform statistically independent
observations of
What is the sufficient statistic in the previous part? Sketch how the
thresholds for this statistic vary with the number of
trials. Assume that
The optimum reception of binary information can be viewed as
a model testing problem. Here, equally-likely binary data
(a "zero" or a "one") is transmitted through a binary
symmetric channel. The indicated parameters denote the
probabilities of receiving a binary digit given that a
particular digit was sent. Assume that
Assuming a single transmission for each digit, what is the minimum probability of error receiver and what is the resulting probability of error?
One method of improving the probability of error is to
repeat the digit to be transmitted
Assume that we desire the probability of error to be
Construct a sequential algorithm which achieves the required probability of error. Assume that the transmitter will repeat each digit until informed by the receiver that it has determined what digit was sent. What is the expected length of the repetition code in this instance?
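One way to explore the sequential scheme numerically is sketched below. It assumes a binary symmetric channel with crossover probability `eps` below 1/2 and a symmetric sequential test with Wald-style thresholds; the values of `eps` and of the target error probability are hypothetical, since they are not given here. Each received copy moves the log-likelihood ratio by ±log((1−eps)/eps), so the test stops once the number of copies favoring one digit exceeds the number favoring the other by `n`.

```python
import math
import random


def sprt_design(eps, target_pe):
    """Return (n, exact error probability, expected number of repetitions)."""
    step = math.log((1.0 - eps) / eps)                   # log-likelihood change per copy
    threshold = math.log((1.0 - target_pe) / target_pe)  # symmetric Wald threshold
    n = math.ceil(threshold / step)                      # net vote margin needed to stop
    # The margin is a +/-1 random walk absorbed at +n or -n (no overshoot),
    # so the gambler's-ruin formula gives the error probability exactly.
    pe = 1.0 / (1.0 + math.exp(n * step))
    # Wald's identity: E[N] * (1 - 2 eps) = n * (1 - 2 pe).
    mean_length = n * (1.0 - 2.0 * pe) / (1.0 - 2.0 * eps)
    return n, pe, mean_length


def simulate_sprt(eps, n, trials, seed=1):
    """Monte Carlo estimate of the error rate and average repetition length."""
    rng = random.Random(seed)
    errors = 0
    total_length = 0
    for _ in range(trials):
        margin = 0  # (copies received correctly) - (copies flipped by the channel)
        length = 0
        while abs(margin) < n:
            margin += -1 if rng.random() < eps else 1
            length += 1
        errors += margin < 0
        total_length += length
    return errors / trials, total_length / trials


if __name__ == "__main__":
    eps, target = 0.1, 1e-3  # hypothetical channel and requirement
    n, pe, mean_length = sprt_design(eps, target)
    print(n, pe, mean_length, simulate_sprt(eps, n, 20000))
```

Because the margin must be an integer, the achieved error probability is usually somewhat smaller than the target, and the expected length can be compared with the fixed repetition length that meets the same requirement.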
You have accepted the (dangerous) job of determining whether
the radioactivity levels at the Chernobyl reactor are
elevated or not. Because you want to stay as short a time
as possible to render your professional opinion, you decide
to use a sequential-like decision rule. Radioactivity is
governed by Poisson probability laws, which means that the
probability that
Construct a sequential decision rule to determine whether it is safe or not. Assume you have defined false-alarm and miss probabilities according to accepted "professional standards." According to these standards, these probabilities equal each other.
What is the expected time it will take to render a decision?
Sequential tests can be used to advantage in situations
where analytic difficulties obscure the problem. Consider
the case where the observations either contain no signal
(
Find the likelihood ratio for the observations.
Develop a sequential test that would determine whether a signal is present or not.
Find a formula for the test's thresholds in terms of
How does the average number of observations vary with
In some cases it might be wise to not
make a decision when the data do not justify it. Thus, in
addition to declaring that one of two models occurred, we
might declare "no decision" when the data are indecisive.
Assume you observe
Construct a hypothesis testing rule that yields a
probability of no-decision no larger than some specified
value
What is the probability of a correct decision for your rule?
You decide to flip coins with Sleazy Sam. If heads is the
result of a coin flip, you win one dollar; if tails, Sam
wins a dollar. However, Sam's reputation has preceded him.
You suspect that the probability of tails,
You suspect that
Using your decision rule, what is the probability that your determination is incorrect?
One potential flaw with your decision rule is that a
specific value of
When a patient is screened for the presence of a disease in
an organ, a section of tissue is viewed under a microscope
and a count of abnormal cells made. Even under healthy
conditions, a small number of abnormal cells will be
present. Presumably a much larger number will be present if
the organ is diseased. Assume that the number
Assuming that the value of the parameter
Using your method, a patient was said to have a diseased organ. In this case, what is the probability that the organ is diseased?
Assume that
How can the standard sequential test be extended to unknown parameter situations? Formulate the theory and determine the formulas for the thresholds. How would you approach finding the average number of observations?
A common situation in statistical signal processing problems
is that the variance of the observations is unknown (there
is no reason that noise should be nice to us!). Consider
the two Gaussian model testing problem where the models
differ in their means and have a common, but unknown
variance.
Show that the unknown variance enters into the optimum decision only in the threshold term.
In the (happy) situation where the threshold
Consider the following composite hypothesis testing problem
(van Trees; Prob. 2.5.2).
Does a UMP test exist for this problem? If it does, find it.
Construct a generalized likelihood ratio test for this problem. Under what conditions can the requirement on the false-alarm probability be met?
Data are often processed "in the field," with the results
from several systems sent to a central place for final
analysis. Consider a detection system wherein each of
Find the optimal detection strategy for making a final
determination that maximizes the probability of making a
correct decision. Assume that the a
priori probabilities
How does the airplane detection system change when the
a priori probabilities are not known?
Require that the central judgment have a false-alarm
probability no bigger than
Mathematically, a preconception is a model for the "world" that you believe applies over a broad class of circumstances. Clearly, you should be vigilant and continually judge your assumption's correctness.
Let
Based on the sample average, develop a procedure that
tests for each
To judge the efficacy of this test, assume the elements
of the actual sequence have the assumed distribution,
but that they are correlated with correlation
coefficient
Is the test based on the sample average optimal? If so, prove it so; if not, find the optimal one.
Modern management styles tend to want decisions to be made locally (by people at the scene) rather than by "the boss." While this approach might be considered more democratic, we should understand how to make decisions under such organizational constraints and what the performance might be.
Let three "local" systems separately make observations.
Each local system's observations are identically distributed
and statistically independent of the others, and based on
the observations, each system decides which of two models
applies best. The judgments are relayed to the central
manager who must make the final decision. Assume the local
observations consist either of white Gaussian noise or of a
signal having energy
What decision rule should each local system use?
Assuming the observation models are equally likely, how should the central management make its decision so as to minimize the probability of error?
Is this decentralized decision system optimal (i.e., the probability of error for the final decision is minimized)? If so, demonstrate optimality; if not, find the optimal system.
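A rough numerical comparison can help frame the last question. The sketch below makes several assumptions that go beyond the problem statement: each local system observes the same known signal of energy `energy` in white Gaussian noise of standard deviation `sigma`, the noise is independent across systems, each local system uses the equal-prior minimum probability of error threshold, and the central manager takes a majority vote. All names and numerical values are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = Pr[N(0,1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def local_error(energy, sigma):
    # Assumed local rule: correlate with the known signal and compare with
    # half its energy, which gives Pe = Q(sqrt(E) / (2 sigma)).
    return q_function(math.sqrt(energy) / (2.0 * sigma))


def majority_error(p):
    # A majority vote over three independent decisions errs when
    # at least two of the local decisions are wrong.
    return 3.0 * p ** 2 * (1.0 - p) + p ** 3


def centralized_error(energy, sigma, systems=3):
    # Benchmark: a single detector given all the raw observations sees
    # a total signal energy of systems * E.
    return q_function(math.sqrt(systems * energy) / (2.0 * sigma))


if __name__ == "__main__":
    energy, sigma = 4.0, 1.0  # hypothetical values
    p_local = local_error(energy, sigma)
    print(p_local, majority_error(p_local), centralized_error(energy, sigma))
```

Under these assumptions the three printed error probabilities can be compared directly; the argument for or against optimality still has to be made analytically.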