The Fisher-Neyman Factorization Theorem

Module by: Clayton Scott, Robert Nowak

Determining a sufficient statistic directly from the definition can be a tedious process. The following result can simplify this process by allowing one to spot a sufficient statistic directly from the functional form of the density or mass function.

Theorem 1: Fisher-Neyman Factorization Theorem

Let $f_\theta(x)$ be the density or mass function for the random vector $x$, parametrized by the vector $\theta$. The statistic $t = T(x)$ is sufficient for $\theta$ if and only if there exist functions $a(x)$ (not depending on $\theta$) and $b_\theta(t)$ such that $f_\theta(x) = a(x)\, b_\theta(t)$ for all possible values of $x$.

In an earlier example we computed a sufficient statistic for a binary communication source (independent Bernoulli trials) from the definition. Using the above result, this task becomes substantially easier.

Example 1

Bernoulli Trials Revisited

Suppose $x_n \sim \mathrm{Bernoulli}(\theta)$ are IID, $n = 1, \dots, N$. Denote $x = (x_1, \dots, x_N)^T$. Then

$$f_\theta(x) = \prod_{n=1}^{N} \theta^{x_n} (1-\theta)^{1-x_n} = \theta^{k} (1-\theta)^{N-k} = a(x)\, b_\theta(k) \tag{1}$$
where $k = \sum_{n=1}^{N} x_n$, $a(x) = 1$, and $b_\theta(k) = \theta^{k} (1-\theta)^{N-k}$. By the Fisher-Neyman factorization theorem, $k$ is sufficient for $\theta$.
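
The factorization in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original module: the function names `bernoulli_likelihood`, `b_theta`, and `factorization_holds` are hypothetical and chosen only for illustration. It enumerates every binary sequence of length $N$ and compares the term-by-term likelihood with $\theta^{k}(1-\theta)^{N-k}$.

```python
from itertools import product
from math import isclose


def bernoulli_likelihood(x, theta):
    """Joint pmf of IID Bernoulli(theta) samples, multiplied term by term."""
    p = 1.0
    for xn in x:
        p *= theta if xn == 1 else (1.0 - theta)
    return p


def b_theta(k, n_trials, theta):
    """Factor b_theta(k) = theta^k * (1 - theta)^(N - k)."""
    return theta**k * (1.0 - theta) ** (n_trials - k)


def factorization_holds(n_trials, theta):
    """Check f_theta(x) == a(x) * b_theta(k), with a(x) = 1, for every binary x."""
    return all(
        isclose(bernoulli_likelihood(x, theta), b_theta(sum(x), n_trials, theta))
        for x in product((0, 1), repeat=n_trials)
    )


# Illustrative values (assumptions, not from the original text).
print(factorization_holds(5, 0.3))
```

Because the likelihood depends on $x$ only through the count of ones, two sequences with the same $k$, such as `(1, 0, 1)` and `(0, 1, 1)`, receive the same likelihood for every $\theta$.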

The next example illustrates the application of the theorem to a continuous random variable.

Example 2

Normal Data with Unknown Mean

Consider a normally distributed random sample $x_1, \dots, x_N \sim \mathcal{N}(\theta, 1)$, IID, where $\theta$ is unknown. The joint pdf of $x = (x_1, \dots, x_N)^T$ is

$$f_\theta(x) = \prod_{n=1}^{N} f_\theta(x_n) = \frac{1}{(2\pi)^{N/2}} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \theta)^2\right)$$

We would like to rewrite $f_\theta(x)$ in the form $a(x)\, b_\theta(t)$, where $\dim t < N$. At this point we require a trick, one that is commonly used when manipulating normal densities and worth remembering. Define $\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n$, the sample mean. Then

$$f_\theta(x) = \frac{1}{(2\pi)^{N/2}} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \bar{x} + \bar{x} - \theta)^2\right) = \frac{1}{(2\pi)^{N/2}} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} \left[(x_n - \bar{x})^2 + 2 (x_n - \bar{x})(\bar{x} - \theta) + (\bar{x} - \theta)^2\right]\right) \tag{2}$$
Now observe
$$\sum_{n=1}^{N} (x_n - \bar{x})(\bar{x} - \theta) = (\bar{x} - \theta) \sum_{n=1}^{N} (x_n - \bar{x}) = (\bar{x} - \theta)(N\bar{x} - N\bar{x}) = 0 \tag{3}$$
so the middle term vanishes. We are left with

$$f_\theta(x) = \frac{1}{(2\pi)^{N/2}} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \bar{x})^2\right) \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (\bar{x} - \theta)^2\right)$$

where $a(x) = \frac{1}{(2\pi)^{N/2}} \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (x_n - \bar{x})^2\right)$, $b_\theta(t) = \exp\left(-\frac{1}{2} \sum_{n=1}^{N} (\bar{x} - \theta)^2\right) = \exp\left(-\frac{N}{2} (t - \theta)^2\right)$, and $t = \bar{x}$. Thus, the sample mean is a one-dimensional sufficient statistic for the mean.
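
The decomposition in Example 2 can also be verified numerically. The sketch below is an illustration added under stated assumptions: the function names `joint_pdf`, `a`, and `b` and the sample values are hypothetical, and the factor depending on $\theta$ is written as $\exp\left(-\frac{N}{2}(t-\theta)^2\right)$, using the fact that a sum of $N$ identical terms $(\bar{x}-\theta)^2$ equals $N(\bar{x}-\theta)^2$.

```python
import math


def joint_pdf(x, theta):
    """Joint pdf of IID N(theta, 1) samples, evaluated directly."""
    n = len(x)
    sq = sum((xn - theta) ** 2 for xn in x)
    return (2 * math.pi) ** (-n / 2) * math.exp(-0.5 * sq)


def a(x):
    """Factor that depends on the data only (no theta)."""
    n = len(x)
    xbar = sum(x) / n
    spread = sum((xn - xbar) ** 2 for xn in x)
    return (2 * math.pi) ** (-n / 2) * math.exp(-0.5 * spread)


def b(t, n, theta):
    """Factor that depends on the data only through the sample mean t."""
    return math.exp(-0.5 * n * (t - theta) ** 2)


# Illustrative sample and parameter values (assumptions for this sketch).
x = [0.3, -1.2, 2.5, 0.7]
t = sum(x) / len(x)
for theta in (-1.0, 0.0, 0.5, 2.0):
    print(theta, math.isclose(joint_pdf(x, theta), a(x) * b(t, len(x), theta)))
```

Up to floating-point rounding, the direct evaluation and the product `a(x) * b(t, N, theta)` should agree for every choice of `theta`.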

Proof of Theorem

First, suppose $t = T(x)$ is sufficient for $\theta$. By definition, $f_{\theta \mid T(x) = t}(x)$ is independent of $\theta$. Let $f_\theta(x, t)$ denote the joint density or mass function for $(X, T(X))$. Observe $f_\theta(x) = f_\theta(x, t)$. Then

$$f_\theta(x) = f_\theta(x, t) = f_{\theta \mid t}(x)\, f_\theta(t) = a(x)\, b_\theta(t) \tag{4}$$
where $a(x) = f_{\theta \mid t}(x)$, which does not depend on $\theta$ by sufficiency, and $b_\theta(t) = f_\theta(t)$.

We prove the reverse implication for the discrete case only. The continuous case follows a similar argument, but requires a bit more technical work (Scharf, p. 82; Kay, p. 127).

Suppose the probability mass function for $x$ can be written $f_\theta(x) = a(x)\, b_\theta(t)$, where $t = T(x)$. The probability mass function for $t$ is obtained by summing $f_\theta(x', t)$ over all $x'$ such that $T(x') = t$:

$$f_\theta(t) = \sum_{x' : T(x') = t} f_\theta(x', t) = \sum_{x' : T(x') = t} f_\theta(x') = \sum_{x' : T(x') = t} a(x')\, b_\theta(t) \tag{5}$$
Therefore, the conditional mass function of $x$, given $t$, is
$$f_{\theta \mid t}(x) = \frac{f_\theta(x, t)}{f_\theta(t)} = \frac{f_\theta(x)}{f_\theta(t)} = \frac{a(x)\, b_\theta(t)}{b_\theta(t) \sum_{x' : T(x') = t} a(x')} = \frac{a(x)}{\sum_{x' : T(x') = t} a(x')} \tag{6}$$
This last expression does not depend on $\theta$, so $t$ is a sufficient statistic for $\theta$. This completes the proof.

Remark:

From the proof, the Fisher-Neyman factorization gives us a formula for the conditional probability of $x$ given $t$. In the discrete case we have

$$f(x \mid t) = \frac{a(x)}{\sum_{x' : T(x') = t} a(x')}$$

An analogous formula holds for continuous random variables (Scharf, p. 82).
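
The discrete formula in the remark translates directly into a short computation. The following is a minimal sketch under stated assumptions: the helper `conditional_pmf` and its arguments are hypothetical names introduced here, and the support of $x$ is assumed to be a finite collection that can be enumerated. It is applied to the Bernoulli setting of Example 1, where $a(x) = 1$ and $T(x)$ is the number of ones.

```python
from fractions import Fraction
from itertools import product


def conditional_pmf(a, T, support, t):
    """Return {x: a(x) / sum of a(x') over all x' with T(x') == t}."""
    level_set = [x for x in support if T(x) == t]
    total = sum(a(x) for x in level_set)
    if total == 0:
        raise ValueError("no point of the support with positive a(x) has T(x) == t")
    # Exact rational arithmetic keeps the result free of rounding error.
    return {x: Fraction(a(x)) / total for x in level_set}


# Bernoulli case with N = 4 trials and k = 2 ones (illustrative values).
support = list(product((0, 1), repeat=4))
pmf = conditional_pmf(lambda x: 1, sum, support, 2)
for x, p in pmf.items():
    print(x, p)
```

With $a(x) = 1$, every sequence having the given number of ones receives the same conditional probability, and no value of $\theta$ enters the computation.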

Further Examples

The following exercises provide additional examples where the Fisher-Neyman factorization may be used to identify sufficient statistics.

Exercise 1: Uniform Measurements

Suppose $x_1, \dots, x_N$ are independent and uniformly distributed on the interval $[\theta_1, \theta_2]$. Find a sufficient statistic for $\theta = (\theta_1, \theta_2)^T$.

Hint:

Express the likelihood $f_\theta(x)$ in terms of indicator functions.

Exercise 2: Poisson

Suppose $x_1, \dots, x_N$ are independent measurements of a Poisson random variable with intensity parameter $\theta$: $f_\theta(x) = \frac{e^{-\theta} \theta^{x}}{x!}$, $x = 0, 1, 2, \dots$

2.a)

Find a sufficient statistic $t$ for $\theta$.

2.b)

What is the conditional probability mass function of $x$, given $t$, where $x = (x_1, \dots, x_N)^T$?

Exercise 3: Normal with Unknown Mean and Variance

Consider $x_1, \dots, x_N \sim \mathcal{N}(\mu, \sigma^2)$, IID, where $\theta_1 = \mu$ and $\theta_2 = \sigma^2$ are both unknown. Find a sufficient statistic for $\theta = (\theta_1, \theta_2)^T$.

Hint:

Use the same trick as in Example 2.

References

  1. L. Scharf. (1991). Statistical Signal Processing. Addison-Wesley.
  2. S. Kay. (1993). Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall.
