Connexions

Model Consistency Testing

Module by: Don Johnson

In many situations, we seek to check the consistency of observations with some preconceived model. Alternative models are usually difficult to describe parametrically, since inconsistency may be beyond our modeling capabilities. We need a test that either accepts the observations as consistent with a model or rejects the model without pronouncing a more favored alternative. Assuming we know (or presume to know) the probability distribution of the observations under $\mathcal{M}_0$, the models are

  • $\mathcal{M}_0$: $\mathbf{r} \sim p_{\mathbf{r}|\mathcal{M}_0}(\mathbf{r})$
  • $\mathcal{M}_1$: $\mathbf{r} \not\sim p_{\mathbf{r}|\mathcal{M}_0}(\mathbf{r})$
Null hypothesis testing seeks to determine whether the observations are consistent with this description. The best procedure for consistency testing amounts to determining whether the observations lie in a highly probable region as defined by the null probability distribution. However, requiring that a region contain a given probability less than unity does not single out one region: many regions contain the same probability. We must restrict the size of the region so that it best represents those observations maximally consistent with the model while satisfying a performance criterion. Letting $P_F$ be a false-alarm probability established by us, we define the decision region $\Re_0$ to satisfy

$$\Pr[\mathbf{r} \in \Re_0 \mid \mathcal{M}_0] = \int_{\Re_0} p_{\mathbf{r}|\mathcal{M}_0}(\mathbf{r})\, d\mathbf{r} = 1 - P_F$$

and

$$\min_{\Re_0} \int_{\Re_0} d\mathbf{r}$$

Usually, this region is located about the mean, but it may not be symmetrically centered if the probability density is skewed. Our null hypothesis test for model consistency becomes

  • $\mathbf{r} \in \Re_0$: "say observations are consistent"
  • $\mathbf{r} \notin \Re_0$: "say observations are not consistent"
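As a rough illustration of the probability and minimum-size conditions on the decision region, the sketch below approximates such a region for a one-dimensional density on a uniform grid. It admits grid cells in order of decreasing density until they hold probability $1 - P_F$. This greedy construction, the helper names `smallest_region` and `is_consistent`, the grid settings, and the exponential density used as a skewed example are all assumptions of the sketch rather than part of the module.

```python
import math

def smallest_region(pdf, midpoints, step, p_false_alarm):
    """Mark the grid cells forming an approximately smallest region that
    holds probability 1 - P_F under `pdf`.

    Cells are admitted in order of decreasing density; each cell carries
    probability pdf(midpoint) * step (midpoint rule).
    """
    density = [pdf(x) for x in midpoints]
    order = sorted(range(len(midpoints)), key=lambda i: density[i], reverse=True)
    in_region = [False] * len(midpoints)
    mass = 0.0
    for i in order:
        if mass >= 1.0 - p_false_alarm:
            break
        in_region[i] = True
        mass += density[i] * step
    return in_region

def is_consistent(r, midpoints, step, in_region):
    """Null hypothesis test: does the observation r fall inside the region?"""
    left_edge = midpoints[0] - step / 2.0
    index = math.floor((r - left_edge) / step)
    return 0 <= index < len(midpoints) and in_region[index]

# Hypothetical skewed example: unit-rate exponential density on [0, 20).
step = 0.001
midpoints = [(i + 0.5) * step for i in range(20000)]
region = smallest_region(lambda x: math.exp(-x), midpoints, step, 0.05)

# Right edge of the region; for this density it can be compared with -ln(0.05).
upper_edge = max(x for x, keep in zip(midpoints, region) if keep) + step / 2.0
print(upper_edge)
print(is_consistent(1.0, midpoints, step, region))
print(is_consistent(5.0, midpoints, step, region))
```

For this density the region starts at the origin rather than straddling the mean, which illustrates how skewness moves the decision region away from a symmetric placement.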

Example 1

Consider the problem of determining whether the sequence $r_l$, $l = 1, \dots, L$, is white and Gaussian with zero mean and unit variance. Stated this way, the alternative model is not provided: is this model correct or not? We could estimate the probability density function of the observations and test the estimate for consistency. Here we take the null-hypothesis testing approach of converting this problem into a one-dimensional one by considering the statistic $r = \sum_l r_l^2$, which has a $\chi_L^2$ distribution under the model. Because this probability distribution is unimodal, the decision region can be safely assumed to be an interval $[r', r'']$ (see footnote 1).

In this case, we can find an analytic solution to the problem of determining the decision region. Letting $R = r'' - r'$ denote the width of the interval, we seek the solution of the constrained optimization problem

$$\min_{r'} R \quad \text{subject to} \quad P_r(r' + R) - P_r(r') = 1 - P_F$$

where $P_r(\cdot)$ is the probability distribution function of the statistic and $p_r(\cdot)$ its density. We convert the constrained problem into an unconstrained one using Lagrange multipliers.

$$\min_{r'} \left\{ R + \lambda \left[ P_r(r' + R) - P_r(r') - (1 - P_F) \right] \right\}$$

Setting the derivative of this quantity with respect to $r'$ to zero gives $\lambda \left[ p_r(r' + R) - p_r(r') \right] = 0$. The multiplier cannot be zero, because the derivative with respect to $R$ requires $1 + \lambda\, p_r(r' + R) = 0$. Hence

$$p_r(r' + R) = p_r(r')$$

To minimize the interval's width, the probability density function's values at the interval's endpoints must be equal. Finding endpoints that also satisfy the constraint amounts to searching over pairs of points having equal density values, for increasing values of $R$, until the interval contains the required probability. For $L = 100$ and $P_F = 0.05$, the optimal decision region for the $\chi_L^2$ distribution is $[78.82, 128.5]$. Figure 1 demonstrates ten testing trials for observations that fit the model and for observations that don't.
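The endpoint search can be sketched numerically as follows. Rather than stepping through increasing values of $R$, this version bisects on the lower endpoint and, for each candidate, finds the upper endpoint with equal density; that substitution, the function names, the iteration counts, and the series used for the chi-square distribution function are assumptions of the sketch. Its output can be compared with the interval quoted in the example.

```python
import math

def chi2_pdf(x, dof):
    """Chi-square density with `dof` degrees of freedom."""
    if x <= 0.0:
        return 0.0
    half = dof / 2.0
    return math.exp((half - 1.0) * math.log(x) - x / 2.0
                    - half * math.log(2.0) - math.lgamma(half))

def chi2_cdf(x, dof):
    """Chi-square distribution function, computed from the power series of
    the regularized lower incomplete gamma function P(dof/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def matching_upper(lower, dof, mode):
    """Find the point above the mode whose density equals that at `lower`."""
    target = chi2_pdf(lower, dof)
    lo, hi = mode, mode + 1.0
    while chi2_pdf(hi, dof) > target:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):                   # bisection on the decreasing branch
        mid = 0.5 * (lo + hi)
        if chi2_pdf(mid, dof) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def min_width_interval(dof, p_false_alarm):
    """Interval with equal endpoint densities holding probability 1 - P_F."""
    if dof <= 2:
        raise ValueError("sketch assumes dof > 2 so the mode is interior")
    mode = dof - 2.0
    lo, hi = 0.0, mode
    for _ in range(100):
        lower = 0.5 * (lo + hi)
        upper = matching_upper(lower, dof, mode)
        coverage = chi2_cdf(upper, dof) - chi2_cdf(lower, dof)
        if coverage > 1.0 - p_false_alarm:
            lo = lower                     # too much probability: shrink
        else:
            hi = lower                     # too little probability: widen
    lower = 0.5 * (lo + hi)
    return lower, matching_upper(lower, dof, mode)

lower, upper = min_width_interval(100, 0.05)
print(lower, upper, chi2_cdf(upper, 100) - chi2_cdf(lower, 100))
```

Bisection works here because, for a unimodal density, raising the lower endpoint toward the mode shrinks the equal-density interval and therefore its probability monotonically.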

Figure 1: Ten trials of testing a 100-element sequence for consistency with a white, Gaussian model - $r_l \sim \mathcal{N}(0, 1)$ - for three situations. In the first (shown by the circles), the observations do conform to the model. In the second (boxes), the observations are zero-mean Gaussian but with variance two. Finally, the third example (crosses) has white observations with a density closely resembling the Gaussian: a hyperbolic secant density having zero mean and unit variance. The sum of squared observations for each example is shown with the optimal $\chi_{100}^2$ interval displayed. Note how dramatically the test statistic departs from the decision interval when parameters disagree.
Figure 1 (consistency.jpg)
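An experiment of the kind shown in Figure 1 can be imitated with a short simulation. The sketch below draws 100-element sequences for the three situations, forms the sum of squares, and checks it against a decision interval. The interval is copied from the value quoted in Example 1; the random seed, the function names, and the inverse-distribution-function sampler for the hyperbolic secant density $p(x) = \frac{1}{2}\operatorname{sech}(\pi x / 2)$ are assumptions of the sketch, so its output will not reproduce the figure's particular trials.

```python
import math
import random

def draw_gaussian(rng, length, variance=1.0):
    """White, zero-mean Gaussian sequence with the given variance."""
    sigma = math.sqrt(variance)
    return [rng.gauss(0.0, sigma) for _ in range(length)]

def draw_sech(rng, length):
    """White, zero-mean, unit-variance hyperbolic secant sequence, obtained
    by inverting F(x) = (2 / pi) * atan(exp(pi * x / 2))."""
    samples = []
    for _ in range(length):
        u = rng.random()
        while u == 0.0:                    # avoid log(tan(0))
            u = rng.random()
        samples.append((2.0 / math.pi) * math.log(math.tan(math.pi * u / 2.0)))
    return samples

def run_trials(draw, interval, trials, rng, length=100):
    """Return (statistic, consistent?) for each trial of the test."""
    lower, upper = interval
    results = []
    for _ in range(trials):
        statistic = sum(x * x for x in draw(rng, length))
        results.append((statistic, lower <= statistic <= upper))
    return results

rng = random.Random(0)                     # arbitrary seed
interval = (78.82, 128.5)                  # as quoted in Example 1
situations = [
    ("Gaussian, variance one", draw_gaussian),
    ("Gaussian, variance two", lambda g, n: draw_gaussian(g, n, variance=2.0)),
    ("hyperbolic secant", draw_sech),
]
for name, draw in situations:
    for statistic, consistent in run_trials(draw, interval, 10, rng):
        verdict = "consistent" if consistent else "not consistent"
        print(f"{name}: {statistic:7.2f} -> {verdict}")
```

Because the hyperbolic secant density matches the model's mean and variance, its sum of squares has the same expected value as under the model, which is why this alternative is harder to detect than a change in variance.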

Footnotes

  1. This one-dimensional result for the consistency test may extend to the multi-dimensional case in the obvious way.
