Linear Estimators

Module by: Don Johnson


We derived the minimum mean-squared error estimator in the previous section with no constraint on the form of the estimator. Depending on the problem, the resulting estimator could be a linear function of the observations (which is always the case in Gaussian problems) or a nonlinear one. Deriving this estimator is often difficult, which limits its application. We consider here a variation of MMSE estimation: we constrain the estimator to be linear while minimizing the mean-squared estimation error. Such linear estimators may not be optimum; the conditional expected value may be nonlinear, and it always has the smallest mean-squared error. Despite this occasional performance deficit, linear estimators have well-understood properties, they interact well with other signal processing algorithms because of linearity, and they can always be derived, no matter what the problem.

Let the parameter estimate $\hat{\theta}(\mathbf{r})$ be expressed as $\mathcal{L}(\mathbf{r})$, where $\mathcal{L}(\cdot)$ is a linear operator: $\mathcal{L}(a_1 \mathbf{r}_1 + a_2 \mathbf{r}_2) = a_1 \mathcal{L}(\mathbf{r}_1) + a_2 \mathcal{L}(\mathbf{r}_2)$, where $a_1$, $a_2$ are scalars. Although all estimators of this form are obviously linear, the term linear estimator denotes that member of this family that minimizes the mean-squared error.

$$\arg\min_{\mathcal{L}(\mathbf{r})} E[\epsilon^T \epsilon] = \hat{\theta}_{\mathrm{LIN}}(\mathbf{r}) \qquad (1)$$

Because of the transformation's linearity, the theory of linear vector spaces can be fruitfully used to derive the estimator and to specify its properties. One result of that theoretical framework is the well-known Orthogonality Principle (Papoulis, pp. 407-414): The linear estimator is that particular linear transformation that yields an estimation error orthogonal to all linear transformations of the data. The orthogonality of the error to all linear transformations is termed the universality constraint. This principle provides us not only with a formal definition of the linear estimator but also with the mechanism to derive it. To demonstrate this intriguing result, let $\langle \cdot, \cdot \rangle$ denote the abstract inner product between two vectors and $\|\cdot\|$ the associated norm.

$$\|x\|^2 = \langle x, x \rangle \qquad (2)$$
For example, if $x$ and $y$ are each matrices having only one column (see footnote 1), their inner product might be defined as $\langle x, y \rangle = x^T y$. Thus, the linear estimator as defined by the Orthogonality Principle must satisfy
$$E\left[\langle \hat{\theta}_{\mathrm{LIN}}(\mathbf{r}) - \theta, \mathcal{L}(\mathbf{r}) \rangle\right] = 0 \quad \text{for all linear transformations } \mathcal{L}(\cdot) \qquad (3)$$
To see that this principle produces the MMSE linear estimator, we express the mean-squared estimation error $E[\epsilon^T \epsilon] = E[\|\epsilon\|^2]$ for any choice of linear estimator $\hat{\theta}$ as
$$E[\|\hat{\theta} - \theta\|^2] = E[\|(\hat{\theta}_{\mathrm{LIN}} - \theta) - (\hat{\theta}_{\mathrm{LIN}} - \hat{\theta})\|^2] = E[\|\hat{\theta}_{\mathrm{LIN}} - \theta\|^2] + E[\|\hat{\theta}_{\mathrm{LIN}} - \hat{\theta}\|^2] - 2E[\langle \hat{\theta}_{\mathrm{LIN}} - \theta, \hat{\theta}_{\mathrm{LIN}} - \hat{\theta} \rangle] \qquad (4)$$
As $\hat{\theta}_{\mathrm{LIN}} - \hat{\theta}$ is the difference of two linear transformations, it too is linear and is orthogonal to the estimation error resulting from $\hat{\theta}_{\mathrm{LIN}}$. As a result, the last term is zero and the mean-squared estimation error is the sum of two squared norms, each of which is, of course, nonnegative. Only the second norm varies with estimator choice; we minimize the mean-squared estimation error by choosing the estimator $\hat{\theta}$ to be the estimator $\hat{\theta}_{\mathrm{LIN}}$, which sets the second term to zero.
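The decomposition above can be checked in the simplest possible setting. The following is a minimal sketch, not part of the original derivation: it assumes a scalar model r = theta + n with zero-mean, statistically independent theta and n, restricts attention to linear estimators of the form a*r, and works with exact second moments rather than data. The variable names and the numerical variances are illustrative assumptions.

```python
# Second moments of an assumed scalar model r = theta + n with
# zero-mean, independent theta and n (illustrative values).
var_theta = 2.0
var_n = 0.5

e_rr = var_theta + var_n      # E[r^2]
e_theta_r = var_theta         # E[theta * r]


def mse(a):
    """Mean-squared error E[(a*r - theta)^2] of the linear estimator a*r."""
    return a * a * e_rr - 2.0 * a * e_theta_r + var_theta


def error_data_correlation(a, b=1.0):
    """E[(a*r - theta) * (b*r)]: error of a*r against the linear transformation b*r."""
    return b * (a * e_rr - e_theta_r)


# Orthogonality Principle: choose a so that the error is orthogonal to the data.
a_lin = e_theta_r / e_rr

# Any other linear estimator a*r pays the extra term E[(a_lin*r - a*r)^2].
for a in (0.0, 0.3, 1.0, 1.5):
    extra = (a_lin - a) ** 2 * e_rr
    print(f"a = {a:4.2f}  mse = {mse(a):.4f}  mse_lin + extra = {mse(a_lin) + extra:.4f}")
```

In this sketch the two printed columns agree for every choice of a, which is the statement that the cross term vanishes and that only the second squared norm depends on the estimator chosen.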

The estimation error for the minimum mean-squared linear estimator can be calculated to some degree without knowledge of the form of the estimator. The mean-squared estimation error is given by

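$$E[\|\hat{\theta}_{\mathrm{LIN}} - \theta\|^2] = E[\langle \hat{\theta}_{\mathrm{LIN}} - \theta, \hat{\theta}_{\mathrm{LIN}} - \theta \rangle] = E[\langle \hat{\theta}_{\mathrm{LIN}} - \theta, \hat{\theta}_{\mathrm{LIN}} \rangle] + E[\langle \hat{\theta}_{\mathrm{LIN}} - \theta, -\theta \rangle] \qquad (5)$$
The first term is zero because of the Orthogonality Principle. Rewriting the second term yields a general expression for the MMSE linear estimator's mean-squared error.
$$E[\|\epsilon\|^2] = E[\|\theta\|^2] - E[\langle \hat{\theta}_{\mathrm{LIN}}, \theta \rangle] \qquad (6)$$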
This error is the difference of two terms. The first, the mean-squared value of the parameter, represents the largest value that the estimation error can be for any reasonable estimator. That error can be obtained by the estimator that ignores the data and has a value of zero. The second term reduces this maximum error and represents the degree to which the estimate and the parameter agree on the average.
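As a rough numerical illustration of this error expression, the sketch below replaces expected values with sample averages. Everything specific in it is an assumption made for illustration: the scalar model r = theta + n, the uniform density for theta, the Gaussian noise, the sample size, and the variable names. The gain `a_lin` is chosen so that the error is orthogonal to the observation in the sample-average sense.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed scalar model (illustrative): theta uniform on [-1, 1], Gaussian noise.
num_samples = 50_000
theta = [random.uniform(-1.0, 1.0) for _ in range(num_samples)]
r = [t + random.gauss(0.0, 0.5) for t in theta]


def mean(values):
    """Sample average, standing in for the expected value."""
    return sum(values) / len(values)


e_theta_theta = mean([t * t for t in theta])
e_theta_r = mean([t * x for t, x in zip(theta, r)])
e_rr = mean([x * x for x in r])

# Linear estimator a*r with a chosen to make the error orthogonal to r.
a_lin = e_theta_r / e_rr
estimate = [a_lin * x for x in r]

# Left side of the error expression: the mean-squared error itself.
mse_direct = mean([(est - t) ** 2 for est, t in zip(estimate, theta)])
# Right side: E[theta^2] - E[<estimate, theta>].
mse_formula = e_theta_theta - mean([est * t for est, t in zip(estimate, theta)])

print(f"direct : {mse_direct:.6f}")
print(f"formula: {mse_formula:.6f}")
print(f"error of the zero estimate, E[theta^2]: {e_theta_theta:.6f}")
```

The two computed errors should agree to within floating-point rounding, and both should fall below the mean-squared value of the parameter, which is the error of the estimator that ignores the data.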

Note that the definition of the minimum mean-squared error linear estimator makes no explicit assumptions about the parameter estimation problem being solved. This property makes this kind of estimator attractive in many applications where neither the a priori density of the parameter vector nor the density of the observations is known precisely. Linear transformations, however, are homogeneous: A zero-valued input yields a zero output. Thus, the linear estimator is especially pertinent to those problems where the expected value of the parameter is zero. If the expected value is nonzero, the linear estimator would not necessarily yield the best result (see this problem).
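The effect of a nonzero expected value can be sketched with exact second moments. The comparison below is an illustration under stated assumptions, not a result taken from this module: it assumes a scalar model r = theta + n with zero-mean noise independent of theta, and it compares the best estimator of the form a*r with the best estimator of the form a*r + b (an affine estimator, introduced here only as a point of comparison). The function names and variances are hypothetical.

```python
# Assumed scalar model r = theta + n, with E[theta] = mean_theta possibly nonzero.
var_theta = 1.0
var_n = 1.0


def linear_mse(mean_theta):
    """MSE of the best estimator of the form a*r (no offset allowed)."""
    e_theta_sq = var_theta + mean_theta ** 2   # E[theta^2]
    e_theta_r = e_theta_sq                     # E[theta*r]; noise is independent, zero mean
    e_rr = e_theta_sq + var_n                  # E[r^2]
    return e_theta_sq - e_theta_r ** 2 / e_rr


def affine_mse():
    """MSE of the best estimator of the form a*r + b; it does not depend on the mean."""
    return var_theta - var_theta ** 2 / (var_theta + var_n)


for mean_theta in (0.0, 1.0, 3.0):
    print(f"mean = {mean_theta}: linear {linear_mse(mean_theta):.4f}, affine {affine_mse():.4f}")
```

Under these assumptions the two errors coincide when the parameter's expected value is zero, and the strictly linear estimator falls behind as the expected value grows.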

Example 1

Express the first example in vector notation so that the observation vector is written as $\mathbf{r} = \mathbf{A}\theta + \mathbf{n}$, where the matrix $\mathbf{A}$ has the form $\mathbf{A} = [1, \ldots, 1]^T$. The expected value of the parameter is zero. The linear estimator has the form $\hat{\theta}_{\mathrm{LIN}} = \mathbf{L}\mathbf{r}$, where $\mathbf{L}$ is a $1 \times L$ matrix. The Orthogonality Principle states that the linear estimator satisfies
$$E[(\mathbf{L}\mathbf{r} - \theta)^T \mathbf{M}\mathbf{r}] = 0 \quad \text{for all } 1 \times L \text{ matrices } \mathbf{M}$$
To use the Orthogonality Principle to derive an equation implicitly specifying the linear estimator, the "for all linear transformations" phrase must be interpreted. Usually the quantity specifying the linear transformation must be removed from the constraining inner product by imposing a very stringent but equivalent condition. In this example, this phrase becomes one about matrices. The elements of the matrix $\mathbf{M}$ can be such that each element of the observation vector multiplies each element of the estimation error. Thus, in this problem the Orthogonality Principle means that the expected value of the matrix consisting of all pairwise products of these elements must be zero.
$$E[(\mathbf{L}\mathbf{r} - \theta)\mathbf{r}^T] = \mathbf{0}$$
Thus, two terms must equal each other: $E[\mathbf{L}\mathbf{r}\mathbf{r}^T] = E[\theta\mathbf{r}^T]$. The second term equals $E[\theta^2]\mathbf{A}^T$, as the additive noise and the parameter are assumed to be statistically independent quantities. The quantity $E[\mathbf{r}\mathbf{r}^T]$ in the first term is the correlation matrix of the observations, which is given by $\mathbf{A}\mathbf{A}^T E[\theta^2] + \mathbf{K}_n$. Here, $\mathbf{K}_n$ is the noise covariance matrix, and $E[\theta^2] = \sigma_\theta^2$ is the parameter's variance. The quantity $\mathbf{A}\mathbf{A}^T$ is an $L \times L$ matrix with each element equaling 1. The noise vector has independent components; the covariance matrix thus equals $\sigma_n^2 \mathbf{I}$.
The equation that $\mathbf{L}$ must satisfy is therefore given by
$$\begin{bmatrix} L_1 & \cdots & L_L \end{bmatrix} \begin{bmatrix} \sigma_n^2 + \sigma_\theta^2 & \sigma_\theta^2 & \cdots & \sigma_\theta^2 \\ \sigma_\theta^2 & \sigma_n^2 + \sigma_\theta^2 & \cdots & \sigma_\theta^2 \\ \vdots & & \ddots & \vdots \\ \sigma_\theta^2 & \cdots & \sigma_\theta^2 & \sigma_n^2 + \sigma_\theta^2 \end{bmatrix} = \begin{bmatrix} \sigma_\theta^2 & \cdots & \sigma_\theta^2 \end{bmatrix}$$
The components of $\mathbf{L}$ are equal and are given by $L_i = \frac{\sigma_\theta^2}{\sigma_n^2 + L\sigma_\theta^2}$. Thus, the minimum mean-squared error linear estimator has the form
$$\hat{\theta}_{\mathrm{LIN}}(\mathbf{r}) = \frac{\sigma_\theta^2}{\sigma_\theta^2 + \sigma_n^2 / L} \cdot \frac{1}{L} \sum_{l=1}^{L} r_l$$

Note that this result equals the minimum mean-squared error estimate derived earlier under the condition that $E[\theta] = 0$. Mean-squared error, linear estimators, and Gaussian problems are intimately related to each other. The linear minimum mean-squared error solution to a problem is optimal if the underlying distributions are Gaussian.
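The closed-form solution of Example 1 can be verified numerically. The sketch below is a check under assumed values, not an alternative derivation: `num_obs` plays the role of L, the variances are arbitrary illustrative numbers, and the observation list `r` is made up. It confirms that the equal-component row vector satisfies the matrix equation and that the weighted sum and the scaled sample average are the same estimate.

```python
# Assumed illustrative values; num_obs plays the role of L in the example.
num_obs = 4
var_theta = 2.0
var_n = 0.5

# Correlation matrix of the observations: var_theta * 1 1^T + var_n * I.
corr = [[var_theta + (var_n if i == j else 0.0) for j in range(num_obs)]
        for i in range(num_obs)]

# Closed-form solution: every component equals var_theta / (var_n + L * var_theta).
gain = var_theta / (var_n + num_obs * var_theta)
weights = [gain] * num_obs

# Row vector times matrix: each entry should equal var_theta.
lhs = [sum(weights[i] * corr[i][j] for i in range(num_obs)) for j in range(num_obs)]
print(lhs)

# Equivalent form: a scaling factor applied to the sample average.
shrink = var_theta / (var_theta + var_n / num_obs)


def estimate(r):
    """Linear MMSE estimate of theta from the observation list r."""
    return shrink * sum(r) / len(r)


r = [1.2, 0.7, 1.1, 0.9]  # made-up observations
print(estimate(r), sum(w * x for w, x in zip(weights, r)))
```

The scaling factor is always less than one, so the estimate is the sample average pulled toward zero, the parameter's assumed expected value; the pull weakens as the number of observations grows or the noise variance shrinks.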

Footnotes

  1. There is a confusion as to what a vector is. "Matrices having one column" are colloquially termed vectors, as are field quantities such as electric and magnetic fields. "Vectors" and their associated inner products are taken to be much more general mathematical objects than these. Hence the prose in this section is rather contorted.

References

  1. A. Papoulis. (1984). Probability, Random Variables, and Stochastic Processes. (second edition). New York: McGraw-Hill.
