The Fisher-Neyman Factorization Theorem

Module by: Clayton Scott, Robert Nowak

Determining a sufficient statistic directly from the definition can be a tedious process. The following result can simplify this process by allowing one to spot a sufficient statistic directly from the functional form of the density or mass function.

Theorem 1: Fisher-Neyman Factorization Theorem

Let $f_\theta(x)$ be the density or mass function for the random vector $x$, parametrized by the vector $\theta$. The statistic $t = T(x)$ is sufficient for $\theta$ if and only if there exist functions $a(x)$ (not depending on $\theta$) and $b_\theta(t)$ such that $f_\theta(x) = a(x)\, b_\theta(t)$ for all possible values of $x$.

In an earlier example we computed a sufficient statistic for a binary communication source (independent Bernoulli trials) from the definition. Using the above result, this task becomes substantially easier.

Example 1

Bernoulli Trials Revisited

Suppose $x_n \sim \mathrm{Bernoulli}(\theta)$ are IID, $n = 1, \dots, N$. Denote $x = (x_1, \dots, x_N)^T$. Then

$$f_\theta(x) = \prod_{n=1}^{N} \theta^{x_n} (1-\theta)^{1-x_n} = \theta^{k} (1-\theta)^{N-k} = a(x)\, b_\theta(k) \tag{1}$$
where $k = \sum_{n=1}^{N} x_n$, $a(x) = 1$, and $b_\theta(k) = \theta^{k} (1-\theta)^{N-k}$. By the Fisher-Neyman factorization theorem, $k$ is sufficient for $\theta$.
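The factorization in Example 1 can be checked numerically. The following is a minimal Python sketch, not part of the original module; the function names `bernoulli_joint_pmf` and `b` are illustrative choices. It enumerates every binary sequence of length $N = 4$ and confirms that the joint mass function depends on the data only through the count $k$.

```python
import math
from itertools import product

def bernoulli_joint_pmf(x, theta):
    """Joint pmf of IID Bernoulli(theta) samples, as a product over samples."""
    return math.prod(theta ** xn * (1 - theta) ** (1 - xn) for xn in x)

def b(theta, k, N):
    """Factor b_theta(k) = theta^k (1 - theta)^(N - k); here a(x) = 1."""
    return theta ** k * (1 - theta) ** (N - k)

N = 4  # illustrative sample size
for theta in (0.2, 0.5, 0.9):
    for x in product((0, 1), repeat=N):
        k = sum(x)  # the candidate sufficient statistic
        # The joint pmf should equal a(x) * b_theta(k) with a(x) = 1.
        assert math.isclose(bernoulli_joint_pmf(x, theta), b(theta, k, N))
```

Any two sequences with the same number of ones therefore have the same probability for every $\theta$, which is what sufficiency of $k$ suggests.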

The next example illustrates the application of the theorem to a continuous random variable.

Example 2

Normal Data with Unknown Mean

Consider a normally distributed random sample $x_1, \dots, x_N \sim \mathcal{N}(\theta, 1)$, IID, where $\theta$ is unknown. The joint pdf of $x = (x_1, \dots, x_N)^T$ is
$$f_\theta(x) = \prod_{n=1}^{N} f_\theta(x_n) = \left(\frac{1}{2\pi}\right)^{N/2} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \theta)^2\right)$$
We would like to rewrite $f_\theta(x)$ in the form $a(x)\, b_\theta(t)$, where $\dim t < N$. At this point we require a trick, one that is commonly used when manipulating normal densities and worth remembering. Define $\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n$, the sample mean. Then

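$$f_\theta(x) = \left(\frac{1}{2\pi}\right)^{N/2} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \bar{x} + \bar{x} - \theta)^2\right) = \left(\frac{1}{2\pi}\right)^{N/2} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} \left[(x_n - \bar{x})^2 + 2 (x_n - \bar{x})(\bar{x} - \theta) + (\bar{x} - \theta)^2\right]\right) \tag{2}$$
Now observe
$$\sum_{n=1}^{N} (x_n - \bar{x})(\bar{x} - \theta) = (\bar{x} - \theta) \sum_{n=1}^{N} (x_n - \bar{x}) = (\bar{x} - \theta)(N\bar{x} - N\bar{x}) = 0 \tag{3}$$
so the middle term vanishes. We are left with
$$f_\theta(x) = \left(\frac{1}{2\pi}\right)^{N/2} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \bar{x})^2\right) \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (\bar{x} - \theta)^2\right)$$
where $a(x) = \left(\frac{1}{2\pi}\right)^{N/2} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \bar{x})^2\right)$, $b_\theta(t) = \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (\bar{x} - \theta)^2\right) = \exp\left(-\frac{N}{2} (t - \theta)^2\right)$, and $t = \bar{x}$. Thus, the sample mean is a one-dimensional sufficient statistic for the mean.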
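The decomposition in Example 2 can also be verified numerically. The sketch below is an illustration rather than part of the original module; the function names, the random seed, and the sample size are arbitrary assumptions. It compares the joint normal density with the product of the two factors, taking $t$ to be the sample mean.

```python
import math
import random

def joint_pdf(x, theta):
    """Joint pdf of IID N(theta, 1) samples."""
    N = len(x)
    return (2 * math.pi) ** (-N / 2) * math.exp(-0.5 * sum((xn - theta) ** 2 for xn in x))

def a(x):
    """Data-only factor: depends on x but not on theta."""
    N = len(x)
    xbar = sum(x) / N
    return (2 * math.pi) ** (-N / 2) * math.exp(-0.5 * sum((xn - xbar) ** 2 for xn in x))

def b(theta, t, N):
    """Parameter factor: depends on x only through t, the sample mean."""
    return math.exp(-0.5 * N * (t - theta) ** 2)

rng = random.Random(0)  # fixed seed so the check is repeatable
x = [rng.gauss(1.0, 1.0) for _ in range(5)]
t = sum(x) / len(x)
for theta in (-1.0, 0.0, 2.5):
    # Both sides should agree up to floating-point rounding.
    assert math.isclose(joint_pdf(x, theta), a(x) * b(theta, t, len(x)), rel_tol=1e-9)
```

The check passes for every $\theta$ tried because the cross term in the expanded square sums to zero, exactly as in the derivation.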

Proof of Theorem

First, suppose $t = T(x)$ is sufficient for $\theta$. By definition, the conditional density or mass function $f_\theta(x \mid T(x) = t)$ does not depend on $\theta$. Let $f_\theta(x, t)$ denote the joint density or mass function for $(X, T(X))$. Since $t = T(x)$ is determined by $x$, observe that $f_\theta(x) = f_\theta(x, t)$. Then

$$f_\theta(x) = f_\theta(x, t) = f_\theta(x \mid t)\, f_\theta(t) = a(x)\, b_\theta(t) \tag{4}$$
where $a(x) = f_\theta(x \mid t)$, which does not depend on $\theta$ by sufficiency, and $b_\theta(t) = f_\theta(t)$. We prove the reverse implication for the discrete case only. The continuous case follows a similar argument, but requires a bit more technical work (Scharf, p. 82; Kay, p. 127).

Suppose the probability mass function for $x$ can be written $f_\theta(x) = a(x)\, b_\theta(t)$, where $t = T(x)$. The probability mass function for $t$ is obtained by summing $f_\theta(x, t)$ over all $x$ such that $T(x) = t$:

$$f_\theta(t) = \sum_{x' : T(x') = t} f_\theta(x', t) = \sum_{x' : T(x') = t} f_\theta(x') = \sum_{x' : T(x') = t} a(x')\, b_\theta(t) \tag{5}$$
Therefore, the conditional mass function of $x$, given $t$, is
$$f_\theta(x \mid t) = \frac{f_\theta(x, t)}{f_\theta(t)} = \frac{f_\theta(x)}{f_\theta(t)} = \frac{a(x)}{\sum_{x' : T(x') = t} a(x')} \tag{6}$$
This last expression does not depend on $\theta$, so $t$ is a sufficient statistic for $\theta$. This completes the proof.

Remark:

From the proof, the Fisher-Neyman factorization gives us a formula for the conditional probability of $x$ given $t$. In the discrete case we have
$$f(x \mid t) = \frac{a(x)}{\sum_{x' : T(x') = t} a(x')}$$
An analogous formula holds for continuous random variables (Scharf, p. 82).
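As an illustration of the remark, the following Python sketch applies the discrete formula to the Bernoulli model of Example 1, where $a(x) = 1$ and $T(x)$ is the number of ones. The helper names `conditional_pmf` and `direct_conditional` are hypothetical and chosen for this sketch only. Under these assumptions the conditional mass function is uniform over the sequences sharing the same count, and it agrees with the ratio $f_\theta(x) / f_\theta(t)$ computed directly for several values of $\theta$.

```python
import math
from itertools import product

def conditional_pmf(x, T, a, support):
    """f(x | t) = a(x) / sum of a(x') over all x' in the support with T(x') = T(x)."""
    t = T(x)
    denom = sum(a(xp) for xp in support if T(xp) == t)
    return a(x) / denom

def direct_conditional(x, theta, support):
    """f_theta(x) / f_theta(t) for IID Bernoulli(theta), computed without the factorization."""
    pmf = lambda y: theta ** sum(y) * (1 - theta) ** (len(y) - sum(y))
    f_t = sum(pmf(y) for y in support if sum(y) == sum(x))
    return pmf(x) / f_t

N = 4  # illustrative sample size
support = list(product((0, 1), repeat=N))
x = (1, 0, 1, 0)

p = conditional_pmf(x, sum, lambda y: 1.0, support)
# Uniform over the C(N, k) sequences that contain k ones.
assert math.isclose(p, 1 / math.comb(N, sum(x)))
# The direct ratio gives the same value for every theta tried.
for theta in (0.1, 0.5, 0.8):
    assert math.isclose(direct_conditional(x, theta, support), p)
```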

Further Examples

The following exercises provide additional examples where the Fisher-Neyman factorization may be used to identify sufficient statistics.

Exercise 1

Uniform Measurements

Suppose $x_1, \dots, x_N$ are independent and uniformly distributed on the interval $[\theta_1, \theta_2]$. Find a sufficient statistic for $\theta = (\theta_1, \theta_2)^T$.

Hint:
Express the likelihood $f_\theta(x)$ in terms of indicator functions.

Exercise 2

Poisson

Suppose $x_1, \dots, x_N$ are independent measurements of a Poisson random variable with intensity parameter $\theta$:
$$f_\theta(x) = \frac{e^{-\theta}\, \theta^{x}}{x!}, \qquad x = 0, 1, 2, \dots$$

2.a)

Find a sufficient statistic $t$ for $\theta$.

2.b)

What is the conditional probability mass function of $x$, given $t$, where $x = (x_1, \dots, x_N)^T$?

Exercise 3

Normal with Unknown Mean and Variance

Consider $x_1, \dots, x_N \sim \mathcal{N}(\mu, \sigma^2)$, IID, where $\theta_1 = \mu$ and $\theta_2 = \sigma^2$ are both unknown. Find a sufficient statistic for $\theta = (\theta_1, \theta_2)^T$.

Hint:
Use the same trick as in Example 2.

References

  1. L. Scharf. (1991). Statistical Signal Processing. Addison-Wesley.
  2. Kay. (1993). Estimation Theory. Prentice Hall.
