Collection: Signal and Information Processing for Sonar. Course by Laurence Riddle.

# Sufficient Statistics

Module by: Clayton Scott, Robert Nowak. E-mail the authors

## Introduction

Sufficient statistics arise in nearly every aspect of statistical inference. It is important to understand them before progressing to areas such as hypothesis testing and parameter estimation.

Suppose we observe an $N$-dimensional random vector $X$, characterized by the density or mass function $f_\theta(x)$, where $\theta$ is a $p$-dimensional vector of parameters to be estimated. The functional form of $f$ is assumed known. The parameter $\theta$ completely determines the distribution of $X$. Conversely, a measurement $x$ of $X$ provides information about $\theta$ through the probability law $f_\theta(x)$.

### Example 1

Suppose $X = (X_1, X_2)^T$, where the $X_i \sim \mathcal{N}(\theta, 1)$ are IID. Here $\theta$ is a scalar parameter specifying the mean. The distribution of $X$ is determined by $\theta$ through the density
$$f_\theta(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x_1 - \theta)^2}{2}} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x_2 - \theta)^2}{2}}$$
On the other hand, if we observe $x = (100, 102)^T$, then we may safely conclude that $\theta = 0$ is highly unlikely.
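
The intuition can be checked numerically. The sketch below (the function name is ours) evaluates the joint density for the measurement $x = (100, 102)^T$ at two candidate means; the likelihood at $\theta = 101$ is many orders of magnitude larger than at $\theta = 0$, which underflows to zero in double precision:

```python
import math

def gaussian_joint_density(x, theta):
    """Joint density of IID N(theta, 1) observations, as in Example 1."""
    return math.prod(
        math.exp(-(xi - theta) ** 2 / 2) / math.sqrt(2 * math.pi) for xi in x
    )

x = (100.0, 102.0)
# The likelihood at theta = 101 dwarfs that at theta = 0,
# so theta = 0 is highly implausible given this measurement.
print(gaussian_joint_density(x, 101.0))  # ~ 0.0585
print(gaussian_joint_density(x, 0.0))    # underflows to 0.0 in double precision
```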

The $N$-dimensional observation $X$ carries information about the $p$-dimensional parameter vector $\theta$. If $p < N$, one may ask the following question: Can we compress $x$ into a low-dimensional statistic without any loss of information? Does there exist some function $t = T(x)$, where the dimension of $t$ is $M < N$, such that $t$ carries all the useful information about $\theta$?

If so, for the purpose of studying $\theta$ we could discard the raw measurements $x$ and retain only the low-dimensional statistic $t$. We call $t$ a sufficient statistic. The following definition captures this notion precisely:

Definition 1:
Let $X_1, \ldots, X_N$ be a random sample, governed by the density or probability mass function $f_\theta(x)$. The statistic $T(x)$ is sufficient for $\theta$ if the conditional distribution of $x$, given $T(x) = t$, is independent of $\theta$. Equivalently, the functional form of $f_{\theta|t}(x)$ does not involve $\theta$.
How should we interpret this definition? Here are some possibilities:

1. Let $f_\theta(x, t)$ denote the joint density or probability mass function of $(X, T(X))$. If $T(X)$ is a sufficient statistic for $\theta$, then

$$f_\theta(x) = f_\theta(x, T(x)) = f_{\theta|t}(x)\, f_\theta(t) = f(x|t)\, f_\theta(t)$$
(1)
Therefore, the parametrization of the probability law for the measurement $x$ is manifested in the parametrization of the probability law for the statistic $T(x)$.

2. Given $t = T(x)$, full knowledge of the measurement $x$ brings no additional information about $\theta$. Thus, we may discard $x$ and retain only the compressed statistic $t$.

3. Any inference strategy based on $f_\theta(x)$ may be replaced by a strategy based on $f_\theta(t)$.
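
Interpretation 3 can be illustrated with Example 1. For IID $\mathcal{N}(\theta, 1)$ data the sample mean is a sufficient statistic (a standard fact, provable with the factorization theorem discussed later in this module), and the maximum-likelihood estimate computed from the raw sample agrees with the one computed from the compressed statistic alone:

```python
import statistics

# For IID N(theta, 1) data, the sample mean is a sufficient statistic.
# The MLE computed from the full sample equals the MLE computed from
# the compressed statistic alone, so no inferential power is lost.
x = [100.0, 102.0]
t = statistics.fmean(x)              # compress: keep only the statistic
theta_hat_from_x = sum(x) / len(x)   # MLE from the raw measurements
theta_hat_from_t = t                 # MLE from the statistic alone
assert theta_hat_from_x == theta_hat_from_t == 101.0
```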

### Example 2

#### Binary Information Source

(Scharf, p. 78) Suppose a binary information source emits a sequence of binary (0 or 1) valued, independent variables $x_1, \ldots, x_N$. Each binary symbol may be viewed as a realization of a Bernoulli trial: $x_n \sim \mathrm{Bernoulli}(\theta)$, IID. The parameter $\theta \in [0, 1]$ is to be estimated.

The probability mass function for the random sample $x = (x_1, \ldots, x_N)^T$ is

$$f_\theta(x) = \prod_{n=1}^{N} f_\theta(x_n) = \prod_{n=1}^{N} \theta^{x_n} (1-\theta)^{1 - x_n} = \theta^k (1-\theta)^{N-k}$$
(2)
where $k = \sum_{n=1}^{N} x_n$ is the number of 1's in the sample.

We will show that $k$ is a sufficient statistic for $\theta$. This will entail showing that the conditional probability mass function $f_{\theta|k}(x)$ does not depend on $\theta$.

The distribution of the number of ones in $N$ independent Bernoulli trials is binomial:
$$f_\theta(k) = \binom{N}{k} \theta^k (1-\theta)^{N-k}$$
Next, consider the joint distribution of $(x, k)$. Since $k$ is a deterministic function of $x$, we have $f_\theta(x, k) = f_\theta(x)$. Thus, the conditional probability may be written

$$f_{\theta|k}(x) = \frac{f_\theta(x, k)}{f_\theta(k)} = \frac{f_\theta(x)}{f_\theta(k)} = \frac{\theta^k (1-\theta)^{N-k}}{\binom{N}{k} \theta^k (1-\theta)^{N-k}} = \frac{1}{\binom{N}{k}}$$
(3)
This shows that $k$ is indeed a sufficient statistic for $\theta$. The $N$ values $x_1, \ldots, x_N$ can be replaced by the quantity $k$ without losing information about $\theta$.
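
Equation (3) can be verified by brute force for a small sample. The sketch below (function name is ours) enumerates every binary sequence of length $N = 4$ with $k = 2$ ones and checks that the conditional probability of each is $1/\binom{N}{k}$ for several values of $\theta$:

```python
from itertools import product
from math import comb

def bernoulli_pmf(x, theta):
    """f_theta(x) = theta^k (1-theta)^(N-k) for a binary tuple x."""
    k = sum(x)
    return theta ** k * (1 - theta) ** (len(x) - k)

N, k = 4, 2
for theta in (0.2, 0.5, 0.9):
    # f_theta(k) is the total mass of all sequences with sum(x) == k
    total = sum(bernoulli_pmf(x, theta)
                for x in product((0, 1), repeat=N) if sum(x) == k)
    for x in product((0, 1), repeat=N):
        if sum(x) == k:
            cond = bernoulli_pmf(x, theta) / total
            # Always 1/C(N, k), with no dependence on theta.
            assert abs(cond - 1 / comb(N, k)) < 1e-12
```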

### Exercise 1

In the previous example, suppose we wish to store in memory the information we possess about $\theta$. Compare the savings, in terms of bits, we gain by storing the sufficient statistic $k$ instead of the full sample $x_1, \ldots, x_N$.
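
As a hint, the raw sample costs one bit per symbol, while $k$ takes only $N + 1$ possible values. A quick sketch of the comparison (function names are ours):

```python
from math import ceil, log2

def bits_full_sample(N):
    """Storing N raw binary symbols costs N bits."""
    return N

def bits_sufficient_statistic(N):
    """k takes values in {0, ..., N}, so it needs ceil(log2(N+1)) bits."""
    return ceil(log2(N + 1))

N = 1000
print(bits_full_sample(N), bits_sufficient_statistic(N))  # 1000 vs 10
```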

## Determining Sufficient Statistics

In the example above, we had to guess the sufficient statistic, and work out the conditional probability by hand. In general, this will be a tedious way to go about finding sufficient statistics. Fortunately, spotting sufficient statistics can be made easier by the Fisher-Neyman Factorization Theorem.
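
As a preview, the theorem states that $T(x)$ is sufficient for $\theta$ if and only if the density factors as $f_\theta(x) = g_\theta(T(x))\, h(x)$. For the Bernoulli example above, equation (2) already has this form by inspection:

```latex
f_\theta(x)
  \;=\; \underbrace{\theta^{k}(1-\theta)^{N-k}}_{g_\theta(T(x)),\;\; T(x) = k}
  \;\cdot\; \underbrace{1}_{h(x)}
```

so the factorization identifies $k$ as sufficient without computing any conditional distribution.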

## Uses of Sufficient Statistics

Sufficient statistics have many uses in statistical inference problems. In hypothesis testing, the Likelihood Ratio Test can often be reduced to a sufficient statistic of the data. In parameter estimation, the Minimum Variance Unbiased Estimator of a parameter θθ can be characterized by sufficient statistics and the Rao-Blackwell Theorem.

## Minimality and Completeness

Minimal sufficient statistics are, roughly speaking, sufficient statistics that cannot be compressed any more without losing information about the unknown parameter. Completeness is a technical characterization of sufficient statistics that allows one to prove minimality. These topics are covered in detail in this module.

Further examples of sufficient statistics may be found in the module on the Fisher-Neyman Factorization Theorem.

## References

1. L. Scharf. (1991). Statistical Signal Processing. Addison-Wesley.
