Skip to content Skip to navigation

Connexions

You are here: Home » Content » The Neyman-Pearson Criterion

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

    • External bookmarks
  • E-mail the author

Recently Viewed

This feature requires Javascript to be enabled.

The Neyman-Pearson Criterion

Module by: Clayton Scott

In hypothesis testing, as in all other areas of statistical inference, there are two major schools of thought on designing good tests: Bayesian and frequentist (or classical). Consider the simple binary hypothesis testing problem 0 : x f 0 x 0 : x f 0 x 1 : x f 1 x 1 : x f 1 x In the Bayesian setup, the prior probability π i =Pr i π i i of each hypothesis occurring is assumed known. This approach to hypothesis testing is represented by the minimum Bayes risk criterion and the minimum probability of error criterion.

In some applications, however, it may not be reasonable to assign an a priori probability to a hypothesis. For example, what is the a priori probability of a supernova occurring in any particular region of the sky? What is the prior probability of being attacked by a ballistic missile? In such cases we need a decision rule that does not depend on making assumptions about the a priori probability of each hypothesis. Here the Neyman-Pearson criterion offers an alternative to the Bayesian framework.

The Neyman-Pearson criterion is stated in terms of certain probabilities associated with a particular hypothesis test. The relevant quantities are summarized in Table 1. Depending on the setting, different terminology is used.

Statistics Signal Processing
Probability Name Notation Name Notation
P 0 declare 1 P 0 declare 1 size α α false-alarm probability P F P F
P 1 declare 1 P 1 declare 1 power β β detection probability P D P D

Here P i declare j P i declare j dentoes the probability that we declare hypothesis j j to be in effect when i i is actually in effect. The probabilities P 0 declare 0 P 0 declare 0 and P 1 declare 0 P 1 declare 0 (sometimes called the miss probability), are equal to 1- P F 1 P F and 1- P D 1 P D , respectively. Thus, P F P F and P D P D represent the two degrees of freedom in a binary hypothesis test. Note that P F P F and P D P D do not involve a priori probabilities of the hypotheses.

These two probabilities are related to each other through the decision regions. If R 1 R 1 is the decision region for 1 1 , we have P F = R 1 f 0 xdx P F x R 1 f 0 x P D = R 1 f 1 xdx P D x R 1 f 1 x The densities f i x f i x are nonnegative, so as R 1 R 1 shrinks, both probabilities tend to zero. As R 1 R 1 expands, both tend to one. The ideal case, where P D =1 P D 1 and P F =0 P F 0 , cannot occur unless the distributions do not overlap (i.e., f 0 x f 1 xdx=0 x f 0 x f 1 x 0 ). Therefore, in order to increase P D P D , we must also increase P F P F . This represents the fundamental tradeoff in hypothesis testing and detection theory.

Example 1

Consider the simple binary hypothesis test of a scalar measurement xx: 0 : x01 0 : x 0 1 1 : x11 1 : x 1 1 Suppose we use a threshold test x 0 1 γ x 0 1 γ where γ γ is a free parameter. Then the false alarm and detection probabilities are P F =γ12π-t22dt=Qγ P F t γ 1 2 t 2 2 Q γ P D =γ12π-t-122dt=Qγ-1 P D t γ 1 2 t 1 2 2 Q γ 1 where QQ denotes the Q-function. These quantities are depicted in Figure 1.

Figure 1: False alarm and detection values for a certain threshold.
Subfigure 1.1
Subfigure 1.1 (GaussUncMeanComVarPd.png)
Subfigure 1.2
Subfigure 1.2 (GaussUncMeanComVarPf.png)
Since the QQ-function is monotonicaly decreasing, it is evident that both P D P D and P F P F decay to zero as γγ increases. There is also an explicit relationship P D =QQ-1 P F -1 P D Q Q P F 1 A common means of displaying this relationship is with a receiver operating characteristic (ROC) curve, which is nothing more than a plot of P D P D versus P F P F (Figure 2).
Figure 2: ROC curve for this example.
Figure 2 (ROC.png)

The Neyman-Pearson Lemma: A First Look

The Neyman-Pearson criterion says that we should construct our decision rule to have maximum probability of detection while not allowing the probability of false alarm to exceed a certain value αα. In other words, the optimal detector according to the Neyman-Pearson criterion is the solution to the following constrainted optimization problem:

Neyman-Pearson Criterion

max{ P D } , such that P F α P D , such that P F α (1)

The maximization is over all decision rules (equivalently, over all decision regions R 0 R 0 , R 1 R 1 ). Using different terminology, the Neyman-Pearson criterion selects the most powerful test of size (not exceeding) αα.

Fortunately, the above optimization problem has an explicit solution. This is given by the celebrated Neyman-Pearson lemma, which we now state. To ease the exposition, our initial statement of this result only applies to continuous random variables, and places a technical condition on the densities. A more general statement is given later in the module.

theorem 1: Neyman-Pearson Lemma: initial statement

Consider the test 0 : x f 0 x 0 : x f 0 x 1 : x f 1 x 1 : x f 1 x where f i x f i x is a density. Define Λx= f 1 x f 0 x Λ x f 1 x f 0 x , and assume that Λx Λ x satisfies the condition that for each γ γ , Λx Λ x takes on the value γγ with probability zero under hypothesis 0 0 . The solution to the optimization problem in Equation 1 is given by Λx= f 1 x f 0 x 0 1 η Λ x f 1 x f 0 x 0 1 η where ηη is such that P F =x,Λx>η f 0 xdx=α P F x x Λ x η f 0 x α If α=0 α 0 , then η= η . The optimal test is unique up to a set of probability zero under 0 0 and 1 1 .

The optimal decision rule is called the likelihood ratio test. Λx Λ x is the likelihood ratio, and ηη is a threshold. Observe that neither the likelihood ratio nor the threshold depends on the a priori probabilities Pr i i . they depend only on the conditional densities f i f i and the size constraint αα. The threshold can often be solved for as a function of αα, as the next example shows.

Example 2

Continuing with Example 1, suppose we wish to design a Neyman-Pearson decision rule with size constraint αα. We have

Λx=12π-x-12212π-x22=x-12 Λ x 1 2 x 1 2 2 1 2 x 2 2 x 1 2 (2)
By taking the natural logarithm of both sides of the LRT and rarranging terms, the decision rule is not changed, and we obtain x 0 1 lnη+12γ x 0 1 η 1 2 γ Thus, the optimal rule is in fact a thresholding rule like we considered in Example 1. The false-alarm probability was seen to be P F =Qγ P F Q γ Thus, we may express the value of γγ required by the Neyman-Pearson lemma in terms of αα: γ=Q-1α γ Q α

Sufficient Statistics and Monotonic Transformations

For hypothesis testing involving multiple or vector-valued data, direct evaluation of the size ( P F P F ) and power ( P D P D ) of a Neyman-Pearson decision rule would require integration over multi-dimensional, and potentially complicated decision regions. In many cases, however, this can be avoided by simplifying the LRT to a test of the form t 0 1 γ t 0 1 γ where the test statistic t=Tx t T x is a sufficient statistic for the data. Such a simplified form is arrived at by modifying both sides of the LRT with montonically increasing transformations, and by algebraic simplifications. Since the modifications do not change the decision rule, we may calculate P F P F and P D P D in terms of the sufficient statistic. For example, the false-alarm probability may be written

P F =Pr declare 1 =t,t>γ f 0 tdt P F declare 1 t t t γ f 0 t (3)
where f 0 t f 0 t denotes the density of t t under 0 0 . Since t t is typically of lower dimension than xx, evaluation of P F P F and P D P D can be greatly simplified. The key is being able to reduce the LRT to a threshold test involving a sufficient statistic for which we know the distribution.

Example 3

Common Variances, Uncommon Means

Let's design a Neyman-Pearson decision rule of size αα for the problem 0 : x0σ2I 0 : x 0 σ 2 I 1 : xμ1σ2I 1 : x μ 1 σ 2 I where μ>0 μ 0 , σ2>0 σ 2 0 are known, 0=00T 0 0 0 , 1=11T 1 1 1 are NN-dimensional vectors, and II is the N N×NN identity matrix. The likelihood ratio is

Λx=n=1N12πσ2- x n -μ22σ2n=1N12πσ2- x n 22σ2=-n=1N x n -μ22σ2-n=1N x n 22σ2=12σ2n=1N2 x n μ-μ2=1σ2-Nμ22+μn=1N x n Λ x n 1 N 1 2 σ 2 x n μ 2 2 σ 2 n 1 N 1 2 σ 2 x n 2 2 σ 2 n 1 N x n μ 2 2 σ 2 n 1 N x n 2 2 σ 2 1 2 σ 2 n 1 N 2 x n μ μ 2 1 σ 2 N μ 2 2 μ n 1 N x n (4)
To simplify the test further we may apply the natural logarithm and rearrange terms to obtain tn=1N x n 0 1 σ2μlnη+Nμ2γ t n 1 N x n 0 1 σ 2 μ η N μ 2 γ
Note:
We have used the assumption μ>0 μ 0 . If μ<0 μ 0 , then division by μμ is not a monotonically increasing operation, and the inequalities would be reversed.
The test statistic tt is sufficient for the unknown mean. To set the threshold γγ, we write the false-alarm probability (size) as P F =Prt>γ=t,t>γ f 0 tdt P F t γ t t t γ f 0 t To evaluate P F P F , we need to know the density of tt under 0 0 . Fortunately, tt is the sum of normal variates, so it is again normally distributed. In particular, we have t=Ax t A x , where A=1T A 1 , so tA0Aσ2IAT=0Nσ2 t A 0 A σ 2 I A 0 N σ 2 under 0 0 . Therefore, we may write P F P F in terms of the Q-function as P F =QγNσ P F Q γ N σ The threshold is thus determined by γ=NσQ-1α γ N σ Q α Under 1 1 , we have tA1Aσ2IAT=NμNσ2 t A 1 A σ 2 I A N μ N σ 2 and so the detection probability (power) is P D =Prt>γ=Qγ-NμNσ P D t γ Q γ N μ N σ Writing P D P D as a function of P F P F , the ROC curve is given by P D =QQ-1 P F -Nμσ P D Q Q P F N μ σ The quantity Nμσ N μ σ is called the signal-to-noise ratio. As its name suggests, a larger SNR corresponds to improved performance of the Neyman-Pearson decision rule.
remark:
In the context of signal processing, the foregoing problem may be viewed as the problem of detecting a constant (DC) signal in additive white Gaussian noise: 0 : x n = w n , n = 1 , , N 0 : x n w n , n = 1 , , N 1 : x n =A+ w n , n = 1 , , N 1 : x n A w n , n = 1 , , N where AA is a known, fixed amplitude, and w n 0σ2 w n 0 σ 2 . Here AA corresponds to the mean μμ in the example.

The Neyman-Pearson Lemma: General Case

In our initial statement of the Neyman-Pearson Lemma, we assumed that for all ηη, the set {x|Λx=η} x Λ x η x had probability zero under 0 0 . This eliminated many important problems from consideration, including tests of discrete data. In this section we remove this restriction.

It is helpful to introduce a more general way of writing decision rules. Let φφ be a function of the data xx with φx01 φ x 0 1 . φφ defines the decision rule "declare 1 1 with probability φx φ x ." In other words, upon observing xx