Skip to content Skip to navigation

Connexions

You are here: Home » Content » Sequential Hypothesis Testing

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

    • External bookmarks
  • E-mail the author

Recently Viewed

This feature requires Javascript to be enabled.

Sequential Hypothesis Testing

Module by: Don Johnson

In many circumstances, the observations to be used in evaluating models arrive sequentially rather than all at once. For example, passive sonar systems may well "listen" over a period of time to an array's output while the array is steered in a particular direction. The decision rules we have derived implicitly assume the entire block of data - the array output observed over a long period of time - is available. You might wonder whether a hypothesis test could be developed that takes the sequential arrival of data into account, making decisions as the data arrive, with the possibility of determining early in the data collection procedure the validity of one model, while maintaining the same performance specifications. Answering this question leads to the formulation of sequential hypothesis testing (Poor: 136-156 , Wald). Not only do sequential tests exist, they can provide performance superior to that of block tests in certain cases.

To make decisions as the data become available, we must generalize the decision-making process. Assume as before that the observed data comprise an observation vector rr of length LL. The decision rule (in the two-model case) now consists of determining which model is valid or that more data are required. Thus, the range of values of rr is partitioned into three regions 0 0 , 1 1 , and ? ? . Making the latter decision implies that the data gathered to that point is insufficient to meet the performance requirements. More data must be obtained to achieve the required performance and the test re-applied once these additional data become available. Thus, a variable number of observations are required to make a decision. An issue in this kind of procedure is the number of observations required to satisfy the performance criteria: for a common set of performance specifications, does this procedure result in a decision rule requiring, on the average, fewer observations than does a fixed-length block test?

Sequential Likelihood Ratio Test

In a manner similar to the Neyman-Pearson criterion, we specify the false-alarm probability P F P F ; in addition, we need to specify the detection probability P D P D . These constraints over-specify the model evaluation problem where the number of observations is fixed: enforcing one constraint forces violation of the other. In contrast, both may be specified as the sequential test as we shall see.

Assuming a likelihood ratio test, two thresholds are required to define the three decision regions. Λ L r< η 0 say 0 η 0 < Λ L r< η 1 say "need more data" η 1 < Λ L r say 1 Λ L r η 0 say 0 η 0 Λ L r η 1 say "need more data" η 1 Λ L r say 1 where Λ L r Λ L r is the usual likelihood ratio where the dimension LL of the vector rr is explicitly denoted. The threshold values η 0 η 0 and η 1 η 1 are found from the constraints, which are expressed as P F = 1 pr| 0 rdr=α P F r 1 p r 0 r α and P D = 1 pr| 1 rdr=β P D r 1 p r 1 r β Here, αα and ββ are design constants that you choose according to the application. Note that the probabilities P F P F , P D P D are associated not with what happens on a given trial, but what the sequential test yields in terms of performance when a decision is made. Thus, P M =1- P D P M 1 P D although the probability of correctly saying 1 1 on a given trial does not equal one minus the probability of incorrectly saying 0 0 is true: The "need more data" region must be accounted for on an individual trial but not when considering the sequential test's performance when it terminates.

Rather explicitly attempting to relate thresholds to performance probabilities, we obtain simpler results by using bounds and approximations. Note that the expression for P D P D may be written as P D = 1 pr| 1 rpr| 0 rpr| 0 rdr= 1 Λ L rpr| 0 rdr P D r 1 p r 1 r p r 0 r p r 0 r r 1 Λ L r p r 0 r In the decision region 1 1 , Λ L r η 1 Λ L r η 1 ; thus, a lower bound on the detection probability can be established by substituting this inequality into the integral. P D η 1 1 pr| 0 rdr P D η 1 r 1 p r 0 r The integral is the false-alarm probability P F P F of the test when it terminates. In this way, we find that P D P F η 1 P D P F η 1 . Using similar arguments on the miss probability, we obtain a similar bound on the threshold η 0 η 0 . These inequalities are summarized as η 0 1- P D 1- P F η 0 1 P D 1 P F and η 1 P D P F η 1 P D P F These bounds, which relate the thresholds in the sequential likelihood ratio test with the false-alarm and detection probabilities, are general, applying even when sequential tests are not being used. In the usual likelihood ratio test, there is a single threshold ηη; these bounds apply to it as well, implying that in a likelihood ratio test the error probabilities will always satisfy

P D P F 1- P D 1- P F P D P F 1 P D 1 P F (1)
This relationship can be manipulated to show P D P F P D P F , indicating that the likelihood ratio test is, at the very least, a reasonable decision rule and that the ROC curves have the right general form.

Only with difficulty can we solve the inequality constraints on the sequential test's thresholds in the general case. Surprisingly, by approximating the inequality constraints by equality constraints we can obtain a result having pragmatically important properties. As an approximation, we thus turn to solving for η 0 η 0 and η 1 η 1 under the conditions η 0 =1-β1-α η 0 1 β 1 α and η 1 =βα η 1 β α In this way, the threshold values are explicitly specified in terms of the desired performance probabilities. We use the criterion values for the false-alarm and detection probabilities because when we use these equalities, the test's resulting performance probabilities P F P F and P D P D usually do not satisfy the design criteria. For example, equating η 1 η 1 to a value potentially larger than its desired value might result in a smaller detection probability and a larger false-alarm rate. We will want to understand how much actual performance departs from what we want.

The relationships derived above between the performance levels and the thresholds apply no matter how the thresholds are chosen. 1-β1-α1- P D 1- P F 1 β 1 α 1 P D 1 P F βα P D P F β α P D P F From these inequalities, two important results follow: P F αβ P F α β 1- P D 1-β1-α 1 P D 1 β 1 α and P F +1- P D α+1-β P F 1 P D α 1 β The first result follows directly from the threshold bounds. To derive the second result, we must work a little harder. Multiplying the first inequality by 1-α1- P F 1 α 1 P F yields 1-β1- P F 1- P D 1-α 1 β 1 P F 1 P D 1 α . Considering the reciprocal of the second inequality and multiplying it by β P D β P D yields P D αβ P F P D α β P F . Adding the two inequalities yields the second result.

The first set of inequalities suggest that the false-alarm and miss (which equals 1- P D 1 P D ) probabilities will increase only slightly from their specified values: the denominators on the right sides are very close to unity in the interesting cases (e.g., small error probabilities like 0.01). The second inequality suggests that the sum of the false-alarm and miss probabilities obtained in practice will be less than the sum of the specified error probabilities. Taking these results together, one of two situations will occur when we approximate the inequality criterion by equality: either the false alarm probability will decrease and the detection probability increase (a most pleasing but unlikely circumstance) or one of the error probabilities will increase while the other decreases. The false-alarm and miss probabilities cannot both increase. Furthermore, whichever one increases, the first inequalities suggest that the incremental change will be small. Our ad hoc approximation to the thresholds does indeed yield a level of performance close to that specified

Usually, the likelihood is manipulated to derive a sufficient statistic. The resulting sequential decision rule is ϒ L r< γ 0 L say 0 γ 0 L< ϒ L r< γ 1 L say "need more data" γ 1 L< ϒ L r say 1 ϒ L r γ 0 L say 0 γ 0 L ϒ L r γ 1 L say "need more data" γ 1 L ϒ L r say 1 Note that the thresholds γ 0 L γ 0 L and γ 1 L γ 1 L , derived from the thresholds η 0 η 0 and η 1 η 1 , usually depend on the number of observations used in the decision rule.

Example 1

Let rr be a Gaussian random vector as in our previous examples with statistically independent components. 0 : r0σ2I 0 : r 0 σ 2 I 1 : rmσ2I 1 : r m σ 2 I The mean vector mm is assumed for simplicity to consist of equal positive values: M=mmT M m m , m>0 m 0 . Using the previous derivations, our sequential test becomes l=0L-1 r l <σ2mlog η 0 +Lm2 say 0 σ2mlog η 0 +Lm2<l=0L-1 r l <σ2mlog η 1 +Lm2 say "need more data" σ2mlog η 1 +Lm2<l=0L-1 r l say 1 l 0 L 1 r l σ 2 m η 0 L m 2 say 0 σ 2 m η 0 L m 2 l 0 L 1 r l σ 2 m η 1 L m 2 say "need more data" σ 2 m η 1 L m 2 l 0 L 1 r l say 1 Starting with L=1 L 1 , we gather the data and compute the sum. The sufficient statistic will lie in the middle range between the two thresholds until one of them is exceeded as shown in Figure 1.

Figure 1: Example of the sequential likelihood ratio test. The sufficient statistic wanders between the two thresholds in a sequential decision rule until one of them is crossed by the statistic. The number of observations used to obtain a decision is L0 L0 .
Figure 1 (seq.png)
The model evaluation procedure then terminates and the chosen model announced. Note how the thresholds depend on the amount of data available (as expressed by LL). This variation typifies the sequential hypothesis tests.

Average Number of Required Observations

The awake reader might wonder whether that the sequential likelihood ratio test just derived has the disturbing property that it may never terminate: can the likelihood ratio wander between the two thresholds forever? Fortunately, the sequential likelihood ratio test has been shown to terminate with probability one (Wald). Confident of eventual termination, we need to explore how many observations are required to meet performance specifications. The number of observations is variable, depending on the observed data and the stringency of the specifications. The average number of observations required can be determined in the interesting case when the observations are statistically independent.

Assuming that the observations are statistically independent and identically distributed, the likelihood ratio is equal to the product of the likelihood ratios evaluated at each observation. Considering ln Λ L 0 r Λ L 0 r , the logarithm of the likelihood ratio when a decision is made on observation L 0 L 0 , we have ln Λ L 0 r=l=0 L 0 -1lnΛ r l Λ L 0 r l 0 L 0 1 Λ r l where Λ r l Λ r l is the likelihood ratio corresponding to the l th l th observation. We seek an expression for E L 0 L 0 , the expected value of the number of observations required to make the decision. To derive this quantity, we evaluate the expected value of the likelihood ratio when the decision is made. This value will usually vary with which model is actually valid; we must consider both models separately. Using the laws of conditional expectation (see Joint Distributions), we find that the expected value of Λ L 0 r Λ L 0 r , assuming that model 1 1 was true, is given by Eln Λ L 0 r| 1 =EEln Λ L 0 r| 1 L 0 1 Λ L 0 r 1 L 0 Λ L 0 r The outer expected value is evaluated with respect to the probability distribution of L 0 L 0 ; the inner expected value is average value of the log-likelihood assuming that L 0 L 0 observations were required to choose model 1 1 . In the latter case, the log-likelihood is the sum of L 0 L 0 component log-likelihood ratios Eln Λ L 0 r| 1 L 0 = L 0 ElnΛ r l | 1 1 L 0 Λ L 0 r L 0 1 Λ r l Noting that the expected value on the right is a constant with respect to the outer expected value, we find that Eln Λ L 0 r| 1 =E L 0 | 1 ElnΛ r l | 1 1 Λ L 0 r 1 L 0 1 Λ r l The average number of observations required to make a decision, correct or incorrect, assuming that 1 1 is true is thus expressed by E L 0 | 1 =Eln Λ L 0 r| 1 ElnΛ r l | 1 1 L 0 1 Λ L 0 r 1 Λ r l Assuming that the other model was true, we have the complementary result E L 0 | 0 =Eln Λ L 0 r| 0 ElnΛ r l | 0 0 L 0 0 Λ L 0 r 0 Λ r l

The numerator is difficult to calculate exactly but easily approximated; assuming that the likelihood ratio equals its threshold value when the decision is made, Eln Λ L 0 r| 0 P F ln η 1 +1- P F ln η 0 = P F ln P D P F +1- P F ln1- P D 1- P F 0 Λ L 0 r P F η 1 1 P F η 0 P F P D P F 1 P F 1 P D 1 P F Eln Λ L 0 r| 1 P D ln η 1 +1- P D ln η 0 = P D ln P D P F +1- P D ln1- P D 1- P F 1 Λ L 0 r P D η 1 1 P D η 0 P D P D P F 1 P D 1 P D 1 P F Note these expressions are not problem dependent; they depend only on the specified probabilities. The denominator cannot be approximated in a similar way with such generality; it must be evaluated for each problem.

Example 2

In the Gaussian example we have been exploring, the log-likelihood of each component observation r l r l is given by lnΛ r l =m r l σ2-m22σ2 Λ r l m r l σ 2 m 2 2 σ 2 The conditional expected values required to evaluate the expression for the average number of required observations are ElnΛ r l | 0 =-m22σ2 0 Λ r l m 2 2 σ 2 ElnΛ r l | 1 =m22σ2 1 Λ r l m 2 2 σ 2 For simplicity, let's assume that the false-alarm and detection probabilities are symmetric (i.e. P F =1- P D P F 1 P D ). The expressions for the average number of observations are equal for each model and we have E L 0 | 0 =E L 0 | 1 =f P F σ2m2 0 L 0 1