Partial Knowledge of Probability Distributions

Module by: Don Johnson

In previous chapters, we assumed we knew the mathematical form of the probability distribution for the observations under each model; some of these distributions' parameters were not known, and we developed decision rules to deal with this uncertainty. A more difficult problem occurs when the mathematical form is not known precisely. For example, the data may be approximately Gaussian, containing slight departures from the ideal. More radically, so little may be known about an accurate model for the data that we are only willing to assume that they are distributed symmetrically about some value. We develop model evaluation algorithms in this section that tackle both kinds of problems. However, be forewarned that solutions to such general models come at a price: the more specific the model that accurately describes a given problem, the more the signal processing algorithms can be tailored to fit it, and the better the resulting performance. If our specific model is in error, however, our neatly tailored algorithms can lead us drastically astray. Thus, the best approach is to relax those aspects of the model which seem doubtful and to develop algorithms that will cope well with worst-case situations should they arise ("And they usually do," echoes every person experienced in the vagaries of data). These considerations lead us to consider nonparametric variations in the probability densities compatible with our assessment of model accuracy and to derive decision rules that minimize the impact of the worst-case situation.

Worst-Case Probability Distributions

In model evaluation problems, there are "optimally" hard problems, those where the models are the most difficult to distinguish. The impossible problem is to distinguish models that are identical. In this situation, the conditional densities of the observed data are equal and the likelihood ratio is constant for all possible values of the observations. It is obvious that identical models are indistinguishable; this elaboration suggests that, in terms of the likelihood ratio, hard problems are those in which the likelihood ratio is constant. Thus, "hard problems" are those in which the conditional probability densities have a constant ratio over wide ranges of observed data values.

The most relevant model evaluation problem for us is the discrimination between two models that differ only in the means of statistically independent observations: the conditional densities of each observation are related as $p_{r_l \mid \mathcal{M}_1}(r_l) = p_{r_l \mid \mathcal{M}_0}(r_l - m)$. Densities that would make this model evaluation problem hard would satisfy the functional equation

$$p(x - m) = C(m)\, p(x) \quad \text{for all } x, m \text{ with } x \ge m,$$

where $C(m)$ is a quantity depending on the mean $m$, but not on the variable $x$ (see footnote 1). For the probability densities satisfying this equation, any observed datum having a value greater than $m$ cannot be used to distinguish the two models. If one considers only those zero-mean densities $p(\cdot)$ which are symmetric about the origin, then by symmetry the likelihood ratio would also be constant for $x \le 0$. Hypotheses having these densities could only be distinguished when the observations lay in the interval $(0, m)$; such model evaluation problems are hard!
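As a rough numerical illustration of what the functional equation demands, the sketch below measures how much the ratio $p(x-m)/p(x)$ varies over $x \ge m$ for a unit-variance Gaussian density. The Gaussian is chosen here only as a familiar contrast case, and the function names (`gaussian_pdf`, `shift_ratio`, `ratio_spread`) and the test grid are assumptions made for this example, not part of the original module.

```python
import math

def gaussian_pdf(x, sigma=1.0):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / math.sqrt(2.0 * math.pi * sigma * sigma)

def shift_ratio(pdf, x, m):
    """Ratio p(x - m) / p(x); the functional equation asks this to equal C(m)."""
    return pdf(x - m) / pdf(x)

def ratio_spread(pdf, m, xs):
    """Difference between the largest and smallest ratio over the points x >= m."""
    ratios = [shift_ratio(pdf, x, m) for x in xs if x >= m]
    return max(ratios) - min(ratios)

m = 1.0                                  # assumed mean shift for the illustration
xs = [m + 0.5 * k for k in range(9)]     # grid from m to m + 4
gaussian_spread = ratio_spread(gaussian_pdf, m, xs)
print("spread of p(x-m)/p(x) for the Gaussian:", gaussian_spread)
```

A spread far from zero indicates that the ratio depends on $x$, so the Gaussian does not satisfy the functional equation: large observations remain informative about the mean.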

From the functional equation, we see that the quantity $C(m)$ must be inversely proportional to $p(m)$ (substitute $x = m$ into the equation). Incorporating this fact into our functional equation, we find that the only solution is the exponential function:

$$p(z - m) = C(m)\, p(z) \implies p(z) \propto e^{-z}, \quad z \ge 0.$$

If we insist that the density satisfying the functional equation be symmetric, the solution is the so-called Laplacian (or double-exponential) density:

$$p_z(z) = \frac{1}{\sqrt{2\sigma^2}} \exp\left\{ -\frac{|z|}{\sqrt{\sigma^2/2}} \right\}.$$

When this density serves as the underlying density for our hard model-testing problem, the likelihood ratio has the form (Huber, 1965; Huber, 1981; Poor, 1988, pp. 175-187)

$$\ln \Lambda(r_l) = \begin{cases} -\dfrac{m}{\sqrt{\sigma^2/2}} & \text{if } r_l < 0 \\[2ex] \dfrac{2 r_l - m}{\sqrt{\sigma^2/2}} & \text{if } 0 < r_l < m \\[2ex] \dfrac{m}{\sqrt{\sigma^2/2}} & \text{if } m < r_l \end{cases}$$

Indeed, the likelihood ratio is constant over much of the range of values of $r_l$, implying that the two models are very similar over those ranges. This worst-case result will appear repeatedly as we embark on searching for the model evaluation rules that minimize the effect of modeling errors on performance.
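The piecewise form of the log-likelihood ratio can be checked numerically. The following minimal sketch, assuming a positive mean $m$ and a variance parameter $\sigma^2$, computes the log-likelihood ratio directly from the Laplacian density and compares it with a clipped linear function of the observation. The function names (`laplacian_pdf`, `log_likelihood_ratio`, `clipped_form`) and the numerical values are illustrative assumptions, not taken from the cited references.

```python
import math

def laplacian_pdf(z, sigma2=1.0):
    """Laplacian density with variance sigma2: exp(-|z| / b) / sqrt(2 * sigma2), b = sqrt(sigma2 / 2)."""
    b = math.sqrt(sigma2 / 2.0)
    return math.exp(-abs(z) / b) / math.sqrt(2.0 * sigma2)

def log_likelihood_ratio(r, m, sigma2=1.0):
    """ln of p(r - m) / p(r): the mean-m model against the zero-mean model."""
    return math.log(laplacian_pdf(r - m, sigma2)) - math.log(laplacian_pdf(r, sigma2))

def clipped_form(r, m, sigma2=1.0):
    """Piecewise expression: (2r - m) limited to the range [-m, m], then scaled by 1/b (assumes m > 0)."""
    b = math.sqrt(sigma2 / 2.0)
    return min(max(2.0 * r - m, -m), m) / b

m, sigma2 = 1.0, 2.0   # assumed values for the illustration
for r in (-3.0, -1.0, 0.25, 0.5, 0.75, 2.0, 5.0):
    print(r, log_likelihood_ratio(r, m, sigma2), clipped_form(r, m, sigma2))
```

Observations below zero all yield the same value, as do observations above $m$; only data falling between $0$ and $m$ change the log-likelihood ratio.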

Footnotes

  1. The uniform density does not satisfy this equation, as the domain of the function $p(\cdot)$ is assumed to be infinite.

References

  1. P.J. Huber. (1965). A robust version of the probability ratio test. Ann. Math. Stat., 36, 1753-1758.
  2. P.J. Huber. (1981). Robust Statistics. New York: John Wiley and Sons.
  3. H.V. Poor. (1988). An Introduction to Signal Detection and Estimation. New York: Springer-Verlag.
