Skip to content Skip to navigation

Connexions

You are here: Home » Content » Cramer-Rao Bound

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

    • External bookmarks
  • E-mail the author

Recently Viewed

This feature requires Javascript to be enabled.

Cramer-Rao Bound

Module by: Don Johnson

The mean-squared error for any estimate of a nonrandom parameter has a lower bound, the Cramér-Rao Bound (Cramér (1946) pp. 474-477), which defines the ultimate accuracy of any estimation procedure. This lower bound, as shown later, is intimately related to the maximum likelihood estimator.

We seek a "bound" on the mean-squared error matrix MM defined to be M=E θ ̂-θ θ ̂-θT=EεεT M θ θ θ θ ε ε A matrix is "lower bounded" by a second matrix if the difference between the two is a non-negative definite matrix. Define the column matrix xx to be x= θ ̂-θ-bθθlnpr|θr x θ θ b θ θ p r θ r where bθ b θ denotes the column matrix of estimator biases. To derive the Cramér-Rao bound, evaluate ExxT x x . ExxT=M-bbTI+θbI+θbTF x x M b b I θ b I θ b F where θb θ b represents the matrix of partial derivatives of the bias θ j b i θ j b i and the matrix FF is the Fisher information matrix

F=Eθlnpr|θrθlnpr|θrT F θ p r θ r θ p r θ r (1)
Note that this matrix can alternatively be expressed as F=-E θ θ T lnpr|θr F θ θ T p r θ r The notation θ θ T θ θ T means the matrix of all second partials of the quantity it operates on (the gradient of the gradient). The matrix is known as the Hessian. Demonstrating the equivalence of these two forms for the Fisher information is quite easy. Because pr|θrdr=1 r p r θ r 1 for all choices of the parameter vector, the gradient of the expression equals zero. Furthermore, θlnpr|θr=θpr|θrpr|θr θ p r θ r θ p r θ r p r θ r . Combining these results yields θlnpr|θrpr|θrdr=0 r θ p r θ r p r θ r 0 Evaluating the gradient for this quantity (using the chain rule) also yields zero. θ θ T lnpr|θrpr|θr+θlnpr|θrθlnpr|θrTpr|θrdr=0 r θ θ T p r θ r p r θ r θ p r θ r θ p r θ r p r θ r 0 or Eθlnpr|θrθlnpr|θrT=-E θ θ T lnpr|θr θ p r θ r θ p r θ r θ θ T p r θ r Calculating the expected value for the Hessian for is somewhat easier than finding the expected value of the outer product of the gradient with itself. In the scalar case, we have Eθlnpr|θr2=-E2θ2lnpr|θr θ p r θ r 2 θ 2 2 p r θ r

Returning to the derivation, the matrix ExxT x x is non-negative definite because it is a correlation matrix. Thus, for any column matrix, αα, the quadratic form αTExxTα α x x α is non-negative. Choose a form for αα that simplifies the quadratic form. A convenient choice is α=β-F-1I+θbTβ α β F I θ b β where ββ is an arbitrary column matrix. The quadratic form becomes in this case αTExxTα=βTM-bbT-I+θbF-1I+θbTβ α x x α β M b b I θ b F I θ b β As this quadratic form must be non-negative, the matrix expression enclosed in brackets must be non-negative definite. We thus obtain the well-known Cramér-Rao bound on the mean-square error matrix.

EεεTbθbθT+I+θbF-1I+θbT ε ε b θ b θ I θ b F I θ b (2)

This form for the Cramér-Rao Bound does not mean that each term in the matrix of squared errors is greater than the corresponding term in the bounding matrix. As stated earlier, this expression means that the difference between these matrices is non-negative definite. For a matrix to be non-negative definite, each term on the main diagonal must be non-negative. The elements of the main diagonal of EεεT ε ε are the squared errors of the estimate of the individual parameters. Thus, for each parameter, the mean-squared estimation error can be no smaller than Eθ̂i- θ i 2 b i 2θ+I+θbF-1I+θbTii θ i θ i 2 b i θ 2 I θ b F I θ b i i

This bound simplifies greatly if the estimator is unbiased ( b=0 b 0 ). In this case, the Cramér-Rao bound becomes Eθ̂i- θ i 2F-1ii θ i θ i 2 F i i Thus, the mean-squared error for each parameter in a multiple-parameter, unbiased-estimator problem can be no smaller than the corresponding diagonal term in the inverse for the Fisher information matrix. In such problems, the estimate's error characteristics of any parameter become intertwined with the other parameters in a complicated way. Any estimator satisfying the Cramér-Rao bound with equality is said to be efficient.

Example 1

Let's evaluate the Cramér-Rao bound for the example we have been discussing: the estimation of the mean and variance of a length LL sequence of statistically independent Gaussian random variables. Let the estimate of the mean θ 1 θ 1 be the sample average θ̂1=rlL θ 1 r l L ; as shown in the last example, this estimate is unbiased. Let the estimate of the variance θ 2 θ 2 be the unbiased estimate θ̂2=rl-θ̂12L-1 θ 2 r l θ 1 2 L 1 . Each term in the Fisher information matrix FF is given by the expected value of the paired products of derivatives of the logarithm of the likelihood function. Fij=E θ i lnpr|θr θ j lnpr|θr F i j θ i p r θ r θ j p r θ r The logarithm of the likelihood function is lnpr|θr=-L2ln2π θ 2 -12 θ 2 l=0L-1rl- θ 1 2 p r θ r L 2 2 θ 2 1 2 θ 2 l 0 L 1 r l θ 1 2 its partial derivatives are

θ 1 lnpr|θr=1 θ 2 l=0L-1rl- θ 1 θ 1 p r θ r 1 θ 2 l 0 L 1 r l θ 1 (3)
θ 2 lnpr|θr=-L2 θ 2 +12 θ 2 2l=0L-1rl- θ 1 2 θ 2 p r θ r L 2 θ 2 1 2 θ 2 2 l 0 L 1 r l θ 1 2 (4)
and its second partials are 2 θ 1 2lnpr|θr=-L θ 2 θ 1 2 2 p r θ r L θ 2 2 θ 1 θ 2 lnpr|θr=-1 θ 2 2l=0L-1rl- θ 1 θ 1 θ 2 2 p r θ r 1 θ 2 2 l 0 L 1 r l θ 1 2 θ 2 θ 1 lnpr|θr=-1 θ 2 2l=0L-1rl- θ 1 θ 2 θ 1 2 p r θ r 1 θ 2 2 l 0 L 1 r l θ 1 2 θ 2 2lnpr|θr=L2 θ 2 2-1 θ 2 3l=0L-1rl- θ 1 2 θ 2 2 2 p r θ r L 2 θ 2 2 1 θ 2 3 l 0 L 1 r l θ 1 2 The Fisher information matrix has the surprisingly simple form F=L θ 2 00L2 θ 2 2 F L θ 2 0 0 L 2 θ 2 2 its inverse is also a diagonal matrix with the elements on the main diagonal equalling the reciprocal of those in the original matrix. Because of the zero-values off-diagonal entries in the Fisher information matrix, the errors between the corresponding estimates are not inter-dependent. In this problem, the mean-square estimation error can be no smaller than Eθ̂1- θ 1 2 θ 2 L θ 1 θ 1 2 θ 2 L Eθ̂2- θ 2 22 θ 2 2L θ 2 θ 2 2 2 θ 2 2 L

Note that nowhere in the preceding example did the form of the estimator enter into the computation of the bound. The only quantity used in the computation of the Cramér-Rao bound is the logarithm of the likelihood function, which is a consequence of the problem statement, not how it is solved. Only in the case of unbiased estimators is the bound independent of the estimators used.1 Because of this property, the Cramér-Rao bound is frequently used to assess the performance limits that can be obtained with an unbiased estimator in a particular problem. When bias is present, the exact form of the estimator's bias explicitly enters the computation of the bound. All too frequently, the unbiased form is used in situations where the existence of an unbiased estimator can be questioned. As we shall see, one such problem is time delay estimation, presumably of some importance to the reader. This misapplication of the unbiased Cramér-Rao arises from desperation: the estimator is so complicated and nonlinear that computing the bias is nearly impossible. As shown in this problem, biased estimators can yield mean-squared error smaller as well as larger than the unbiased version of the Cramér-Rao bound. Consequently, desperation can yield misinterpretation when a general result is misapplied.

In the single-parameter estimation problem, the Cramér-Rao bound incorporating bias has the well-known form 2

Eε2b2+1+ddθb2Eθpr|θr2 ε 2 b 2 1 θ b 2 θ p r θ r 2 (5)
Note that the sign of the bias's derivative determines whether this bound is larger or potentially smaller than the unbiased version, which is obtained by setting the bias term to zero.

Efficiency

An interesting question arises: when, if ever, is the bound satisfied with equality? Recalling the details of the derivation of the bound, equality results when the quantity EαTxxTα α x x α equals zero. As this quantity is the expected value of the square of αTx α x , it can only equal zero if αTx=0 α x 0 . Substituting in the form of the column matrices αα and xx, equality in the Cramér-Rao bound results whenever

θlnpr|θr=I+θbT-1F θ ̂r-θ-b θ p r θ r I θ b F θ r θ b (6)
This complicated expression means that only if estimation problems (as expressed by the a priori density have the form of the right side of this equation can the mean-squared error equal the Cramér-Rao bound. In particular, the gradient of the log likelihood function can only depend on the observations through the estimator. In all other problems, the Cramér-Rao bound is a lower bound but not a tight one no estimator can have error characteristics that equal it. In such cases, we have limited insight into ultimate limitations on estimation error size with the Cramér-Rao bound. However, consider the case where the estimator is unbiased ( b=0 b 0 ). In addition, note the maximum likelihood estimate occurs when the gradient of the logarithm of the likelihood function equals zero: θlnpr|θr=0 θ p r θ r 0 when θ=θ̂ML θ θ ML . In this case, the condition for equality in the Cramér-Rao bound becomes F θ ̂-θ̂ML=0 F θ θ ML 0 As the Fisher information matrix is positive-definite, we conclude that if the estimator equals the maximum likelihood estimator, equality in the Cramér-Rao bound can be satisfied with equality, only the maximum likelihood estimate will achieve it. To use estimation theoretic terminology, if an efficient estimate exists, it is the maximum likelihood estimate. This result stresses the importance of maximum likelihood estimates, despite the seemingly ad hoc manner by which they are defined.

Example 2

Consider the Gaussian example being examined so frequently in this section. The components of the gradient of the logarithm of the likelihood function were given earlier by Equation 3 and Equation 4. These expressions can be rearranged to reveal

θ 1 lnpr|θr θ 2 lnpr|θr=L θ 2 1Llrl- θ 1 -L</