Connexions

Non-Random Parameters

Module by: Don Johnson

In those cases where a probability density for the parameters cannot be assigned, the model evaluation problem can be solved in several ways; the methods used depend on the form of the likelihood ratio and the way in which the parameter(s) enter the problem. In the Gaussian problem we have discussed so often, the threshold $\eta$ used in the likelihood ratio test may be unity. In this case, examining the required computations reveals that implementing the test does not require knowledge of the variance of the observations (see this problem). Thus, if the common variance of the underlying Gaussian distributions is not known, this lack of knowledge has no effect on the optimum decision rule. This happy situation - knowledge of the value of a parameter is not required by the optimum decision rule - occurs rarely, but should be checked before using more complicated procedures.
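The claim about a unity threshold can be checked numerically. The following is a minimal sketch, assuming (as in the examples that follow) that the observations have mean 0 under $\mathcal{M}_0$ and a known mean $m$ under $\mathcal{M}_1$, with common variance $\sigma^2$. The function names and the observation values are illustrative, not taken from the text.

```python
import math

def log_likelihood_ratio(r, m, sigma):
    """ln Lambda(r) for M0: mean 0, M1: mean m, common variance sigma^2."""
    L = len(r)
    return (m * sum(r) - L * m * m / 2.0) / sigma**2

def decide(r, m, sigma, eta=1.0):
    """Return 1 if the likelihood ratio test chooses M1, else 0."""
    return 1 if log_likelihood_ratio(r, m, sigma) > math.log(eta) else 0

r = [0.9, -0.2, 1.4, 0.3]    # made-up observations for illustration
m = 1.0                      # assumed known mean under M1
sigmas = (0.1, 1.0, 10.0)    # candidate values of the unknown standard deviation

# eta = 1: ln(eta) = 0, so only the sign of the log-likelihood ratio matters,
# and dividing by sigma^2 cannot change that sign.
decisions_eta1 = [decide(r, m, s, eta=1.0) for s in sigmas]

# eta = 2: the comparison now depends on sigma, so the decision can change.
decisions_eta2 = [decide(r, m, s, eta=2.0) for s in sigmas]
```

With a unity threshold the three decisions agree regardless of the assumed standard deviation; with any other threshold they need not.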

A second fortuitous situation occurs when the sufficient statistic as well as its probability density under one of the models do not depend on the unknown parameter(s). Although the sufficient statistic's threshold $\gamma$ expressed in terms of the likelihood ratio's threshold $\eta$ depends on the unknown parameters, $\gamma$ may be computed as a single value using the Neyman-Pearson criterion if the computation of the false-alarm probability does not involve the unknown parameters.

Example 1

Continuing the example of the previous section, let's consider the situation where the value of the mean of each observation under model $\mathcal{M}_1$ is not known. The sufficient statistic is the sum of the observations (that quantity doesn't depend on $m$) and the distribution of the observation vector under model $\mathcal{M}_0$ does not depend on $m$ (allowing computation of the false-alarm probability). However, a subtlety emerges; in the derivation of the sufficient statistic, we had to divide by the value of the mean. The critical step occurs once the logarithm of the likelihood ratio is manipulated to obtain

$$m \sum_{l=0}^{L-1} r_l \underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}} \sigma^2 \ln \eta + \frac{L m^2}{2}$$

Recall that only positively monotonic transformations can be applied; if a negatively monotonic operation is applied to this inequality (such as multiplying both sides by -1), the inequality reverses. If the sign of $m$ is known, it can be taken into account explicitly and a sufficient statistic results. If, however, the sign is not known, the above expression cannot be manipulated further and the left side constitutes the sufficient statistic for this problem. The sufficient statistic then depends on the unknown parameter and we cannot develop a decision rule in this case. If the sign is known, we can proceed. Assuming the sign of $m$ is positive, the sufficient statistic is the sum of the observations and the threshold $\gamma$ is found by

$$\gamma = \sqrt{L}\, \sigma\, Q^{-1}(P_F)$$

Note that if the variance $\sigma^2$ instead of the mean were unknown, we could not compute the threshold. The difficulty lies not with the sufficient statistic (it doesn't depend on the variance), but with the false-alarm probability, as the expression indicates. Another approach is required to deal with the unknown-variance problem.
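As a rough illustration of the Neyman-Pearson threshold $\gamma = \sqrt{L}\,\sigma\,Q^{-1}(P_F)$ for a positive mean, the sketch below uses Python's `statistics.NormalDist` to evaluate $Q(\cdot)$ and its inverse. The helper names and the numerical values of $L$, $\sigma$ and $P_F$ are assumptions chosen for illustration. The detection-probability function is included only to show that $m$ affects performance but not the threshold.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # zero-mean, unit-variance Gaussian

def Q(x):
    """Probability that a standard Gaussian exceeds x."""
    return 1.0 - _STD.cdf(x)

def Q_inv(p):
    """Inverse of Q: the x for which Q(x) = p."""
    return _STD.inv_cdf(1.0 - p)

def np_threshold(L, sigma, p_f):
    """Threshold on the sum of L observations; needs sigma but not m."""
    return math.sqrt(L) * sigma * Q_inv(p_f)

def false_alarm_prob(gamma, L, sigma):
    """Under M0 the sum is Gaussian with mean 0 and variance L*sigma^2."""
    return Q(gamma / (math.sqrt(L) * sigma))

def detection_prob(gamma, L, sigma, m):
    """Under M1 the sum is Gaussian with mean L*m and variance L*sigma^2."""
    return Q((gamma - L * m) / (math.sqrt(L) * sigma))

L, sigma, p_f = 10, 2.0, 0.01          # illustrative values
gamma = np_threshold(L, sigma, p_f)    # computed without knowing m
```

Note that `np_threshold` takes `sigma` as an argument: if the variance were the unknown quantity, this function could not be evaluated, which is the difficulty the example points out.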

When this situation occurs (the sufficient statistic and the false-alarm probability can be computed without needing the parameter in question), we have established what is known as a uniformly most powerful test (or UMP test) (Cramér; p.529-531), (van Trees; p.89ff). If a UMP test does not exist, which can only be demonstrated by explicitly finding the sufficient statistic and evaluating its probability distribution, then the composite hypothesis testing problem cannot be solved without some value for the parameter being used.

This seemingly impossible situation - we need the value of the parameter that is assumed unknown - can be approached by noting that some data is available for "guessing" the value of the parameter. If a reasonable guess could be obtained, it could then be used in the model evaluation procedures developed in this chapter. The data available for estimating unknown parameters are precisely the data used in the decision rule. Procedures intended to yield "good" guesses of the value of a parameter are said to be parameter estimates. Estimation procedures are the topic of the next chapter; there we will explore a variety of estimation techniques and develop measures of estimate quality. For the moment, these issues are secondary; even if we knew the size of the estimation error, for example, the more pertinent issue is how the imprecise parameter value affects the performance probabilities. We can compute these probabilities without explicitly determining the estimate's error characteristics.

One parameter estimation procedure that fits nicely into the composite hypothesis testing problem is the maximum likelihood estimate (see footnote 1). Letting $\mathbf{r}$ denote the vector of observables and $\boldsymbol{\theta}$ a vector of parameters, the maximum likelihood estimate of $\boldsymbol{\theta}$, $\hat{\boldsymbol{\theta}}_{ML}$, is that value of $\boldsymbol{\theta}$ that maximizes the conditional density $p_{\mathbf{r}|\boldsymbol{\theta}}(\mathbf{r})$ of the observations given the parameter values. To use $\hat{\boldsymbol{\theta}}_{ML}$ in our decision rule, we estimate the parameter vector separately for each model, use the estimated value in the conditional density of the observations, and compute the likelihood ratio. This procedure is termed the generalized likelihood ratio test for the unknown parameter problem in hypothesis testing (Lehmann; p.16), (van Trees; p.92ff).

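$$\Lambda(\mathbf{r}) = \frac{\max_{\boldsymbol{\theta}} p_{\mathbf{r}|\mathcal{M}_1,\boldsymbol{\theta}}(\mathbf{r})}{\max_{\boldsymbol{\theta}} p_{\mathbf{r}|\mathcal{M}_0,\boldsymbol{\theta}}(\mathbf{r})} \tag{1}$$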
Note that we do not find that value of the parameter that (necessarily) maximizes the likelihood ratio. Rather, we estimate the parameter value most consistent with the observed data in the context of each assumed model (hypothesis) of data generation. In this way, the estimate conforms with each potential model rather than being determined by some amalgam of supposedly mutually exclusive models.
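The separate maximization under each model can be sketched in a few lines. This is only an illustration: the text does not prescribe how the maximization is carried out, and the grid search, the function names, the Gaussian model and the observation values below are all assumptions made for the example.

```python
import math

def gaussian_loglik(r, mean, sigma):
    """Log of the joint Gaussian density of independent observations r."""
    L = len(r)
    return (-0.5 * L * math.log(2.0 * math.pi * sigma**2)
            - sum((x - mean) ** 2 for x in r) / (2.0 * sigma**2))

def log_glr(r, loglik1, candidates1, loglik0, candidates0):
    """Logarithm of the generalized likelihood ratio.

    Each model's log-likelihood is maximized over its own candidate
    parameter values; the ratio itself is never maximized directly.
    """
    best1 = max(loglik1(r, theta) for theta in candidates1)
    best0 = max(loglik0(r, theta) for theta in candidates0)
    return best1 - best0

sigma = 1.0                                   # assumed known
r = [0.9, -0.2, 1.4, 0.3]                     # made-up observations
grid = [i / 100.0 for i in range(-300, 301)]  # candidate means under M1

value = log_glr(
    r,
    lambda obs, m: gaussian_loglik(obs, m, sigma), grid,     # M1: unknown mean
    lambda obs, _: gaussian_loglik(obs, 0.0, sigma), [None]  # M0: no free parameter
)
```

Because model 0 has no free parameter here, its "maximization" runs over a single placeholder candidate; with unknowns under both models, each would get its own candidate set.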

Example 2

Returning to our Gaussian example, assume that the variance $\sigma^2$ is known but that the mean under $\mathcal{M}_1$ is unknown.

$$\mathcal{M}_0: \mathbf{r} \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I})$$

$$\mathcal{M}_1: \mathbf{r} \sim \mathcal{N}(\mathbf{m}, \sigma^2 \mathbf{I}), \quad \mathbf{m} = (m, \dots, m)^T, \quad m = ?$$

The unknown quantity occurs only in the exponent of the conditional density under $\mathcal{M}_1$; to maximize this density, we need only to maximize the exponent. Thus, we consider the derivative of the exponent with respect to $m$.

$$\left. \frac{\partial}{\partial m} \left( -\frac{1}{2\sigma^2} \sum_{l=0}^{L-1} (r_l - m)^2 \right) \right|_{m = \hat{m}_{ML}} = 0 \quad \Longrightarrow \quad \sum_{l=0}^{L-1} (r_l - \hat{m}_{ML}) = 0$$

The solution of this equation is the average value of the observations

$$\hat{m}_{ML} = \frac{1}{L} \sum_{l=0}^{L-1} r_l$$

To derive the decision rule, we substitute this estimate in the conditional density for $\mathcal{M}_1$. The critical term, the exponent of this density, is manipulated to obtain

$$-\frac{1}{2\sigma^2} \sum_{l=0}^{L-1} \left( r_l - \frac{1}{L} \sum_{k=0}^{L-1} r_k \right)^2 = -\frac{1}{2\sigma^2} \left( \sum_{l=0}^{L-1} r_l^2 - \frac{1}{L} \left( \sum_{l=0}^{L-1} r_l \right)^2 \right)$$

Noting that the first term in this exponent is identical to the exponent of the denominator in the likelihood ratio, the generalized likelihood ratio becomes

$$\Lambda(\mathbf{r}) = \exp\left\{ \frac{1}{2 L \sigma^2} \left( \sum_{l=0}^{L-1} r_l \right)^2 \right\}$$

The sufficient statistic thus becomes the square (or equivalently the magnitude) of the summed observations. Compare this result with that obtained in Example 1. There, a UMP test existed if we knew the sign of $m$ and the sufficient statistic was the sum of the observations. Here, where we employed the generalized likelihood ratio test, we made no such assumptions about $m$; this generality accounts for the difference in sufficient statistic. Which test do you think would lead to a greater detection probability for a given false-alarm probability?
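One way to explore the closing question is to evaluate both detection probabilities numerically. The sketch below is an illustration under stated assumptions: the detection-probability expressions are not given in the text, but follow from the sum of the observations being Gaussian with mean $Lm$ and variance $L\sigma^2$ under $\mathcal{M}_1$; the function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()

def Q(x):
    """Probability that a standard Gaussian exceeds x."""
    return 1.0 - _STD.cdf(x)

def Q_inv(p):
    """Inverse of Q."""
    return _STD.inv_cdf(1.0 - p)

def glrt_statistic(r):
    """Sufficient statistic of the generalized test: squared sum."""
    return sum(r) ** 2

def pd_one_sided(m, L, sigma, p_f):
    """Test 'sum > gamma', designed assuming m is positive (Example 1)."""
    d = math.sqrt(L) * m / sigma
    return Q(Q_inv(p_f) - d)

def pd_glrt(m, L, sigma, p_f):
    """Test '|sum| > gamma': the false-alarm probability is split
    between the two tails, so each tail gets p_f / 2."""
    d = math.sqrt(L) * m / sigma
    z = Q_inv(p_f / 2.0)
    return Q(z - d) + Q(z + d)

L, sigma, p_f = 10, 1.0, 0.01      # illustrative values
pd_ump_pos = pd_one_sided(0.5, L, sigma, p_f)
pd_glrt_pos = pd_glrt(0.5, L, sigma, p_f)
pd_ump_neg = pd_one_sided(-0.5, L, sigma, p_f)   # sign assumption violated
pd_glrt_neg = pd_glrt(-0.5, L, sigma, p_f)
```

For these values the one-sided test detects more often when its sign assumption holds, while the generalized test performs identically for either sign of the mean and is far better when the assumed sign is wrong.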

Once the generalized likelihood ratio is determined, we need to determine the threshold. If the a priori probabilities $\pi_0$ and $\pi_1$ are known, the evaluation of the threshold proceeds in the usual way. If they are not known, all of the conditional densities must not depend on the unknown parameters lest the performance probabilities also depend upon them. In most cases, the original model evaluation problem is posed in such a way that one of the models does not depend on the unknown parameter; a criterion on the performance probability related to that model can then be established via the Neyman-Pearson procedure. If this is not the case, the threshold cannot be computed and must be set experimentally: we force one of the models to be true and modify the threshold on the sufficient statistic until the desired level of performance is reached. Despite this non-mathematical approach, the overall performance of the model evaluation procedure will be optimum because of the results surrounding the Neyman-Pearson criterion.
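The experimental procedure can be mimicked with a simulation. In the sketch below, "forcing a model to be true" is imitated by generating pseudo-random data under $\mathcal{M}_0$; in practice the data would come from measurements taken under known conditions. The function names, the squared-sum statistic of Example 2, and all numerical values are assumptions for illustration.

```python
import random
from statistics import NormalDist

def empirical_threshold(statistic, simulate_m0, p_f, n_trials, rng):
    """Pick the threshold that the statistic exceeds in roughly a
    fraction p_f of trials generated with model 0 forced to be true."""
    values = sorted(statistic(simulate_m0(rng)) for _ in range(n_trials))
    n_exceed = int(round(p_f * n_trials))   # trials allowed above threshold
    return values[n_trials - n_exceed - 1]

L, sigma, p_f = 10, 1.0, 0.05               # illustrative values
rng = random.Random(0)                      # fixed seed for repeatability

gamma_emp = empirical_threshold(
    lambda r: sum(r) ** 2,                               # squared-sum statistic
    lambda g: [g.gauss(0.0, sigma) for _ in range(L)],   # data under M0
    p_f, 20000, rng,
)

# For this Gaussian case the threshold is also available in closed form,
# which gives a sanity check on the experimental value.
gamma_theory = L * sigma**2 * NormalDist().inv_cdf(1.0 - p_f / 2.0) ** 2
```

The closed-form comparison is available here only because model 0 is fully specified; the experimental route is meant for cases where no such expression can be evaluated.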

Footnotes

1. The maximum likelihood estimation procedure and its characteristics are fully described in this section.

References

1. H. Cramér. (1946). Mathematical Methods of Statistics. Princeton, NJ: Princeton University Press.
2. H.L. van Trees. (1968). Detection, Estimation, and Modulation Theory, Part I. New York: John Wiley and Sons.
3. E.L. Lehmann. (1986). Testing Statistical Hypotheses. (second edition). New York: John Wiley and Sons.
