The maximum likelihood estimation method

Module by: Christopher Curran.

Summary: This module offers a brief and generally intuitive introduction to maximum likelihood estimation methods. It is intended as a guide for advanced undergraduates.

The Maximum Likelihood Method

Introduction

The maximum likelihood (ML) method is an alternative to ordinary least squares (OLS) and offers a more general approach to the problem of finding estimators of unknown population parameters. In these notes we present an intuitive introduction to the ML technique. We begin our discussion with a description of continuous random variables.

Continuous random variables

Assume that $x$ is a continuous random variable over the interval $-\infty < x < \infty$. Because of the assumption of continuity we need some special definitions.

Probability density function. Any function $f(x)$ that has the following characteristics is a probability density function (pdf): (1) $f(x) \ge 0$ and (2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$. The probability that $x$ has a value between $a$ and $b$ is given by $\Pr(a \le x \le b) = \int_{a}^{b} f(x)\,dx$. Here are two examples of the pdfs of continuous random variables.

Example 1: Uniform distribution

Let $f(x) = \frac{1}{\alpha}$ for $0 \le x \le \alpha$ and 0 elsewhere, where $\alpha > 0$. A graph of the pdf for this distribution is shown in Figure 1.

Figure 1: Probability density function of a uniform distribution. The probability that x falls between a and b is given by the shaded area.

It is easy to see from the graph that $f(x) = \frac{1}{\alpha} > 0$ on the interval $0 \le x \le \alpha$ and that the total area under the pdf is $\int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{\alpha} \frac{1}{\alpha}\,dx = 1$. Moreover, as shown in Figure 1, the area under the pdf curve between $a$ and $b$ (with $0 \le a \le b \le \alpha$) is equal to the probability that $x$ lies between $a$ and $b$; that is, $\Pr(a \le x \le b) = \int_{a}^{b} \left(\frac{1}{\alpha}\right) dx = \left.\frac{x}{\alpha}\right|_{a}^{b} = \frac{b-a}{\alpha}$.
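The area interpretation can be checked numerically. The following Python sketch approximates the integrals with a midpoint Riemann sum and compares the result with $(b-a)/\alpha$. The values $\alpha = 4$, $a = 1$, $b = 3$ and the function names are illustrative assumptions, not part of the module.

```python
def uniform_pdf(x, alpha):
    """pdf of the uniform distribution on [0, alpha]; zero elsewhere."""
    return 1.0 / alpha if 0.0 <= x <= alpha else 0.0


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width


alpha, a, b = 4.0, 1.0, 3.0  # illustrative values only

# Total area under the pdf, and the area between a and b.
total_area = integrate(lambda x: uniform_pdf(x, alpha), 0.0, alpha)
prob_a_b = integrate(lambda x: uniform_pdf(x, alpha), a, b)

print(total_area)                  # close to 1
print(prob_a_b, (b - a) / alpha)   # the two numbers should agree closely
```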

The calculation of the mean and variance of this distribution is relatively simple. The population mean is given by $\mu_x = E(x) = \int_{0}^{\alpha} x f(x)\,dx$, or $\mu_x = \int_{0}^{\alpha} x \left(\frac{1}{\alpha}\right) dx = \left.\frac{x^2}{2\alpha}\right|_{0}^{\alpha} = \frac{\alpha}{2}$.

The population variance¹ is given by $V(x) = E\left[(x - \mu_x)^2\right]$. Thus, $V(x) = \int_{0}^{\alpha} \left(x - \frac{\alpha}{2}\right)^2 \left(\frac{1}{\alpha}\right) dx = \int_{0}^{\alpha} \left(x^2 - \alpha x + \frac{\alpha^2}{4}\right) \left(\frac{1}{\alpha}\right) dx$, or $V(x) = \left.\left(\frac{x^3}{3\alpha} - \frac{x^2}{2} + \frac{\alpha}{4}x\right)\right|_{0}^{\alpha} = \frac{\alpha^2}{3} - \frac{\alpha^2}{2} + \frac{\alpha^2}{4} = \frac{\alpha^2}{12}$.
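As a rough check on these two results, the sketch below evaluates the defining integrals numerically for one illustrative value of $\alpha$ (here 6, an assumption chosen only for the example) and compares them with $\alpha/2$ and $\alpha^2/12$.

```python
def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width


alpha = 6.0  # illustrative value only


def density(x):
    """Uniform pdf on [0, alpha]; only evaluated inside that interval here."""
    return 1.0 / alpha


# E(x) = integral of x f(x) dx over [0, alpha]
mean = integrate(lambda x: x * density(x), 0.0, alpha)

# V(x) = integral of (x - mean)^2 f(x) dx over [0, alpha]
variance = integrate(lambda x: (x - mean) ** 2 * density(x), 0.0, alpha)

print(mean, alpha / 2)            # numerical value vs. alpha / 2
print(variance, alpha ** 2 / 12)  # numerical value vs. alpha^2 / 12
```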

Because of the simple mathematical form of the uniform pdf, the calculations in Example 1 are relatively straightforward. The calculations for random variables whose pdf has a more complicated form are generally more difficult (when they are algebraically possible at all), but the basic methodology remains the same. Example 2 considers the case of a more complicated pdf.

Example 2: The Normal distribution.

A random variable $x$ with a mean of $\mu$ and a variance of $\sigma^2$ that has a normal distribution, written $x \sim N(\mu, \sigma^2)$, has the pdf $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. A typical graph of this pdf is given in Figure 2. The area under the curve between $x = a$ and $x = b$ is equal to the probability that $x$ falls between $a$ and $b$.

Figure 2: Probability density function of a Normal distribution. The probability that x falls between a and b is given by the shaded area.
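Unlike the uniform case, the area under the normal pdf has no elementary closed form, but it is easy to compute numerically. The sketch below integrates the pdf between $a$ and $b$ with a midpoint rule and compares the result with the value obtained from the error function in Python's standard library. The parameter values ($\mu = 10$, $\sigma = 2$, $a = 8$, $b = 12$) and the function names are assumptions made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """pdf of N(mu, sigma^2) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Pr(X <= x) for X ~ N(mu, sigma^2), written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width


mu, sigma = 10.0, 2.0  # illustrative parameters
a, b = 8.0, 12.0       # one standard deviation either side of the mean

area = integrate(lambda x: normal_pdf(x, mu, sigma), a, b)
exact = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(area, exact)  # both are roughly 0.68
```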

Joint distributions of samples and the ML method.

Most of the statistical work that economists do involves a sample of observations. It is usual to assume that the members of the sample are drawn independently of each other. The implication of this assumption is that the pdf of the joint distribution is equal to the product of the pdfs of the individual observations; i.e.,

$$f(x_1, x_2, \ldots, x_n) = f(x_1)\, f(x_2) \cdots f(x_n). \tag{1}$$

The pdf of the joint distribution shown in (1) is known as the likelihood function. If the sample were not independently drawn, the pdf of the joint distribution could not be written in such a simple form because the covariances among the members of the sample would not be equal to zero. The logarithm of this function (referred to as the log of the likelihood function) is given by the sum $L(x_1, x_2, \ldots, x_n) = \ln f(x_1) + \ln f(x_2) + \cdots + \ln f(x_n) = \sum_{i=1}^{n} \ln f(x_i)$. The maximum likelihood method involves choosing as estimators of the unknown parameters of the distribution the values that maximize the likelihood function. However, because the logarithm is a monotonically increasing function², maximizing the log of the likelihood function is equivalent to maximizing the likelihood function. The following example of this procedure illustrates how to derive ML estimators.
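Before the worked example, a small numerical sketch may help show why working with the log is harmless. It computes the likelihood as a product of normal pdfs and the log of the likelihood as a sum of logs, then confirms that both rank a handful of candidate parameter values the same way. The sample values and the candidate $(\mu, \sigma)$ pairs are made up for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """pdf of N(mu, sigma^2) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood(sample, mu, sigma):
    """Joint pdf of an independent sample: the product in equation (1)."""
    return math.prod(normal_pdf(x, mu, sigma) for x in sample)


def log_likelihood(sample, mu, sigma):
    """Log of the likelihood: the sum of the logs of the individual pdfs."""
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in sample)


sample = [4.1, 5.3, 4.8, 6.0, 5.5]  # hypothetical observations
candidates = [(4.0, 1.0), (5.0, 1.0), (5.5, 0.8), (6.0, 1.0)]  # hypothetical (mu, sigma) pairs

# Because ln is monotonically increasing, both criteria pick the same candidate.
best_by_likelihood = max(candidates, key=lambda p: likelihood(sample, *p))
best_by_log = max(candidates, key=lambda p: log_likelihood(sample, *p))

print(best_by_likelihood == best_by_log)
```

In practice the sum of logs is also preferred for numerical reasons: a product of many small pdf values can underflow to zero in floating-point arithmetic, while the sum of their logs does not.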

Example 3: The ML estimator of the population mean and population variance.

Assume that $x \sim N(\mu, \sigma^2)$. Consider a sample of size $n$ drawn independently from this distribution. The likelihood function is the product of the pdfs of the individual observations, or:

$$f(x_i) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} \;\Rightarrow\; f(x_1, x_2, \ldots, x_n) = \frac{1}{\sigma^n (2\pi)^{n/2}}\, e^{-\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}}. \tag{2}$$

Thus, the log of the likelihood function of this sample is $L(x_1, x_2, \ldots, x_n) = -\frac{n \ln 2\pi}{2} - n \ln \sigma - \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}$. In the ML method we want to find the estimators of the mean and standard deviation, $\hat{\mu}$ and $\hat{\sigma}$, that maximize the log of the likelihood function. Substituting the parameter estimates into the log of the likelihood function gives our problem as:

$$\max_{\hat{\mu},\,\hat{\sigma}} L(x_1, x_2, \ldots, x_n) = \max_{\hat{\mu},\,\hat{\sigma}} \left[ -\frac{n \ln 2\pi}{2} - n \ln \hat{\sigma} - \frac{\sum (x_i - \hat{\mu})^2}{2\hat{\sigma}^2} \right]. \tag{3}$$

Setting the derivatives of the log of the likelihood function with respect to $\hat{\mu}$ and $\hat{\sigma}$ equal to 0 gives:

$$\frac{\partial L(x_1, x_2, \ldots, x_n)}{\partial \hat{\mu}} = \frac{\sum (x_i - \hat{\mu})}{\hat{\sigma}^2} = 0 \quad \text{and} \tag{4}$$
$$\frac{\partial L(x_1, x_2, \ldots, x_n)}{\partial \hat{\sigma}} = -\frac{n}{\hat{\sigma}} + \frac{\sum (x_i - \hat{\mu})^2}{\hat{\sigma}^3} = 0. \tag{5}$$

Solving these two equations simultaneously gives:

$$\hat{\mu} = \frac{\sum_{i=1}^{n} x_i}{n} = \bar{x} \quad \text{and} \quad \hat{\sigma}^2 = \frac{\sum (x_i - \hat{\mu})^2}{n}. \tag{6}$$
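The closed-form estimators in (6) are easy to compute directly. The sketch below does so for a small made-up sample and then checks that the log of the likelihood function is lower at a few nearby parameter values, as it should be at a maximum. The data and function names are assumptions for illustration only.

```python
import math


def normal_log_likelihood(sample, mu, sigma):
    """Log of the likelihood function for an independent N(mu, sigma^2) sample."""
    n = len(sample)
    squared_deviations = sum((x - mu) ** 2 for x in sample)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - n * math.log(sigma)
            - squared_deviations / (2.0 * sigma ** 2))


def ml_estimates(sample):
    """ML estimators from equation (6): the sample mean and the variance with divisor n."""
    n = len(sample)
    mu_hat = sum(sample) / n
    var_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, var_hat


sample = [4.1, 5.3, 4.8, 6.0, 5.5]  # hypothetical observations
mu_hat, var_hat = ml_estimates(sample)
sigma_hat = math.sqrt(var_hat)

# The log-likelihood at the ML estimates should exceed its value at nearby points.
peak = normal_log_likelihood(sample, mu_hat, sigma_hat)
nearby = [
    normal_log_likelihood(sample, mu_hat + d_mu, sigma_hat + d_sigma)
    for d_mu in (-0.1, 0.0, 0.1)
    for d_sigma in (-0.1, 0.0, 0.1)
    if (d_mu, d_sigma) != (0.0, 0.0)
]

print(mu_hat, var_hat)
print(all(value < peak for value in nearby))
```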

Notice that the estimator of the population mean is equal to the sample mean, the same result you found in your introductory statistics course. However, the unbiased estimator of the population variance used in that course is $s^2 = \frac{\sum (x_i - \hat{\mu})^2}{n-1}$. Because the ML estimator divides by $n$ rather than $n-1$, it underestimates the population variance on average.

Thus, one of the common "problems" with ML estimators is that they are quite often biased estimators of a population parameter. On the other hand, under very general conditions ML estimators are consistent, are asymptotically efficient, and have an asymptotically normal distribution (these are desirable large-sample characteristics of potential estimators and are discussed in advanced statistics courses).³
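The size of the bias can be seen exactly in a small example. The sketch below is an illustration under stated assumptions: instead of a normal population it uses a tiny discrete population (the faces of a fair die), so that every possible sample of size $n = 3$ drawn with replacement can be enumerated and the expected value of each estimator computed exactly. The downward bias of the divisor-$n$ estimator by the factor $(n-1)/n$ does not depend on normality.

```python
from fractions import Fraction
from itertools import product


def ml_variance(sample):
    """Variance estimator with divisor n (the ML form)."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / n


def unbiased_variance(sample):
    """Variance estimator with divisor n - 1."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


population = [1, 2, 3, 4, 5, 6]  # hypothetical population: a fair die
n = 3                            # sample size

# Population variance, computed exactly.
pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

# All equally likely independent samples of size n.
samples = list(product(population, repeat=n))
expected_ml = sum(ml_variance(s) for s in samples) / len(samples)
expected_unbiased = sum(unbiased_variance(s) for s in samples) / len(samples)

print(pop_var, expected_ml, expected_unbiased)
```

Exact fractions are used so that the comparison is not blurred by rounding: the expected value of the divisor-$n$ estimator equals $(n-1)/n$ times the population variance, while the divisor-$(n-1)$ estimator matches the population variance.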

Application of the ML method to regressions

The discussion above illustrates the basics of the ML method: you form the log of the likelihood function and then find the values of the parameter estimates that maximize this function. In most cases the maximization will not yield answers in closed form; that is, you cannot find a neat algebraic formula as we did for the population mean. However, you can use computer programs to search for the values of the parameter estimates that maximize this function. Thus, for most advanced regression models you will treat the ML method as a “black box” and not concern yourself with the estimation details. Nevertheless, I present one more example of the ML technique.
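To make the idea of a computer search concrete before that example, here is a minimal sketch of one simple search technique, golden-section search, applied to the normal log of the likelihood function from Example 3. This is an illustrative choice, not the method any particular statistics package uses, and the data and function names are made up. Because closed-form answers exist in this case, they serve as a check on the search.

```python
import math


def normal_log_likelihood(sample, mu, sigma):
    """Log of the likelihood function for an independent N(mu, sigma^2) sample."""
    n = len(sample)
    squared_deviations = sum((x - mu) ** 2 for x in sample)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - n * math.log(sigma)
            - squared_deviations / (2.0 * sigma ** 2))


def golden_section_max(f, lo, hi, tol=1e-8):
    """Approximate the maximizer of a single-peaked function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) > f(d):
            hi = d  # the peak lies in [lo, d]
        else:
            lo = c  # the peak lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0


sample = [4.1, 5.3, 4.8, 6.0, 5.5]  # hypothetical observations

# Step 1: search over mu. The maximizing mu does not depend on sigma,
# so sigma is held at an arbitrary value of 1.
mu_numeric = golden_section_max(
    lambda mu: normal_log_likelihood(sample, mu, 1.0), min(sample), max(sample))

# Step 2: search over sigma with mu fixed at the value just found.
sigma_numeric = golden_section_max(
    lambda s: normal_log_likelihood(sample, mu_numeric, s), 0.01, 10.0)

# Closed-form ML answers for comparison.
mu_closed = sum(sample) / len(sample)
sigma_closed = math.sqrt(sum((x - mu_closed) ** 2 for x in sample) / len(sample))

print(mu_numeric, mu_closed)
print(sigma_numeric, sigma_closed)
```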

Example 4: The ML estimators for a simple regression.

Assume that we want to estimate the population parameters for the regression model $y_i = \beta x_i + \varepsilon_i$, where we assume that

  1. $\varepsilon_i \sim N(0, \sigma^2)$,
  2. $E(\varepsilon_i \varepsilon_j) = 0$ for $i \ne j$,
  3. $y_i = Y_i - \bar{Y}$ and $x_i = X_i - \bar{X}$ (this assumption allows us to ignore the estimation of the intercept term), and
  4. $x_i$ is a non-stochastic variable.

The assumption of a normally distributed error term implies that $\varepsilon_i = y_i - \beta x_i \sim N(0, \sigma^2)$. Thus, the pdf of the error term is $f(\varepsilon_i) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y_i - \beta x_i)^2}{2\sigma^2}}$ and the likelihood function⁴ is:

$$\prod_{i=1}^{n} f(\varepsilon_i) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y_i - \beta x_i)^2}{2\sigma^2}} = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \prod_{i=1}^{n} e^{-\frac{(y_i - \beta x_i)^2}{2\sigma^2}} \tag{7}$$

and the log of the likelihood function is $L(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n) = -n \ln\sqrt{2\pi} - n \ln\tilde{\sigma} - \frac{\sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)^2}{2\tilde{\sigma}^2}.$
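The step from (7) to the log of the likelihood function uses only the rules $\ln(ab) = \ln a + \ln b$ and $\ln e^{z} = z$. A sketch of the intermediate lines, written in the notation of (7):

```latex
\begin{aligned}
\ln \prod_{i=1}^{n} f(\varepsilon_i)
  &= \sum_{i=1}^{n} \ln\!\left[ \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y_i - \beta x_i)^2}{2\sigma^2}} \right] \\
  &= \sum_{i=1}^{n} \left[ -\ln\sqrt{2\pi} - \ln\sigma - \frac{(y_i - \beta x_i)^2}{2\sigma^2} \right] \\
  &= -n \ln\sqrt{2\pi} - n \ln\sigma - \frac{\sum_{i=1}^{n} (y_i - \beta x_i)^2}{2\sigma^2}.
\end{aligned}
```

Only the last two terms depend on the unknown parameters, so the constant $-n\ln\sqrt{2\pi}$ plays no role in locating the maximum.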

We find the estimators $\tilde{\beta}$ and $\tilde{\sigma}$ in the same manner as we did for the sample mean and variance. Differentiating the log of the likelihood function and setting these first derivatives equal to 0 gives the following two first-order conditions:

\[
\frac{\partial L(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)}{\partial \tilde{\beta}} = \frac{2 \sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)\, x_i}{2\tilde{\sigma}^2} = 0
\tag{8}
\]

and

\[
\frac{\partial L(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)}{\partial \tilde{\sigma}} = -\frac{n}{\tilde{\sigma}} + \frac{\sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)^2}{\tilde{\sigma}^3} = 0.
\tag{9}
\]
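The algebra between the two first-order conditions and the estimators is short. As a sketch, writing $\tilde{\beta}$ and $\tilde{\sigma}$ for the estimators: because $\tilde{\sigma} > 0$, condition (8) can hold only if its numerator is zero, and multiplying condition (9) by $\tilde{\sigma}^3$ clears the fractions.

```latex
\begin{aligned}
\sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)\, x_i = 0
  &\;\Longrightarrow\; \sum_{i=1}^{n} y_i x_i = \tilde{\beta} \sum_{i=1}^{n} x_i^2, \\
-n\tilde{\sigma}^2 + \sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)^2 = 0
  &\;\Longrightarrow\; n\tilde{\sigma}^2 = \sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)^2 .
\end{aligned}
```

Dividing the first line by $\sum x_i^2$ (assumed to be nonzero) and the second by $n$ isolates each estimator.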

Thus, the ML estimators are:

\[
\tilde{\beta} = \frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2} \quad \text{and} \quad \tilde{\sigma}^2 = \frac{\sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)^2}{n}.
\]
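These two formulas are easy to evaluate directly. The following is a minimal Python sketch, not part of the original module: the function name `ml_estimates` and the four data points are made up for illustration, and the model is assumed to be the no-intercept regression used in this section.

```python
def ml_estimates(x, y):
    """Closed-form ML estimates for y_i = beta * x_i + e_i with e_i ~ N(0, sigma^2).

    Returns (beta, sigma2), where sigma2 is the ML estimate of the variance.
    """
    n = len(x)
    sum_xx = sum(xi * xi for xi in x)               # sum of x_i^2
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # sum of y_i * x_i
    beta = sum_xy / sum_xx
    ssr = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))  # sum of squared residuals
    sigma2 = ssr / n                                # ML estimator divides by n
    return beta, sigma2


# Hypothetical data, invented only to exercise the formulas.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

beta, sigma2 = ml_estimates(x, y)
print(beta, sigma2)
```

For these invented numbers the estimates are roughly 2.033 for the slope and 0.492 for the variance. The residuals are also orthogonal to `x`, which is exactly what first-order condition (8) requires.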

Notice that in this simple case the ML estimator of $\beta$ is the same as the OLS estimator of $\beta$. Also, notice that the ML estimator of $\sigma^2$ is biased: the (unbiased) OLS estimator of $\sigma^2$ divides the sum of squared residuals by the degrees of freedom rather than by $n$, which for this one-coefficient model gives $s^2 = \frac{\sum_{i=1}^{n} (y_i - \tilde{\beta} x_i)^2}{n-1}$ (the divisor becomes $n-2$ when an intercept is estimated as well).
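The size of this bias can be checked by simulation. The sketch below is an illustration under assumed values that do not come from the module ($n = 5$, $\beta = 2$, $\sigma = 2$, and fixed regressor values 1 through 5). It takes the unbiased divisor to be $n - k$, where $k$ is the number of estimated coefficients, so $k = 1$ for the no-intercept model $y_i = \beta x_i + \varepsilon_i$; a model with an intercept would use $k = 2$.

```python
import random


def fit(x, y):
    """Return (beta, ssr) for the no-intercept model y_i = beta * x_i + e_i."""
    beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    ssr = sum((b - beta * a) ** 2 for a, b in zip(x, y))
    return beta, ssr


def simulate(n=5, beta=2.0, sigma=2.0, reps=20000, seed=0, k=1):
    """Average the two variance estimators over many simulated samples.

    All default values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    x = [float(i) for i in range(1, n + 1)]   # fixed regressor values
    ml_total = 0.0
    s2_total = 0.0
    for _ in range(reps):
        y = [beta * xi + rng.gauss(0.0, sigma) for xi in x]
        _, ssr = fit(x, y)
        ml_total += ssr / n          # ML estimator: divide by n
        s2_total += ssr / (n - k)    # degrees-of-freedom correction
    return ml_total / reps, s2_total / reps


ml_mean, s2_mean = simulate()
print("average ML estimate of sigma^2:       ", ml_mean)
print("average corrected estimate of sigma^2:", s2_mean)
```

With a true variance of 4, the average of the ML estimates should settle near $4(n-1)/n = 3.2$, while the corrected estimator should average close to 4. The gap shrinks as $n$ grows, which is why the bias matters mainly in small samples.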

You can use the examples in this module as the basis of your understanding of the ML method. When you see that the ML method is used in a computer program, you can be fairly certain that the program uses one of the many optimizing subroutines to find the maximum of the log of the likelihood function. You can consult the program's help files to see what underlying distribution is used to set up the log of the likelihood function. A related concept worth exploring is the likelihood ratio test (see the module by Don Johnson entitled The Likelihood Ratio Test for an introduction to this key statistical test).
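To make the idea of an optimizing subroutine concrete, here is a small Python sketch that maximizes the log of the likelihood function numerically instead of using the closed-form estimators. It is only a stand-in for what statistical packages do: the golden-section search, the coordinate-by-coordinate strategy, the search brackets, the starting values, and the data are all assumptions made for this illustration, and real programs generally rely on more sophisticated routines.

```python
import math


def log_likelihood(beta, sigma, x, y):
    """Log-likelihood of y_i = beta * x_i + e_i with e_i ~ N(0, sigma^2)."""
    n = len(x)
    ssr = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    return (-n * math.log(math.sqrt(2.0 * math.pi))
            - n * math.log(sigma)
            - ssr / (2.0 * sigma ** 2))


def golden_max(f, lo, hi, tol=1e-8):
    """Maximize a single-peaked function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d   # the peak lies in [a, d]
        else:
            a = c   # the peak lies in [c, b]
    return (a + b) / 2.0


def maximize(x, y, sweeps=3):
    """Alternate one-dimensional searches over beta and sigma."""
    beta, sigma = 0.0, 1.0   # arbitrary starting values
    for _ in range(sweeps):
        # Brackets are assumptions that suit the example data below.
        beta = golden_max(lambda b: log_likelihood(b, sigma, x, y), -10.0, 10.0)
        sigma = golden_max(lambda s: log_likelihood(beta, s, x, y), 1e-3, 10.0)
    return beta, sigma


# Hypothetical data, invented for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

beta_hat, sigma_hat = maximize(x, y)
print(beta_hat, sigma_hat ** 2)
```

The numerical answers should agree with the closed-form ML estimators to several decimal places. The one-dimensional searches work here because the log-likelihood has a single peak in $\beta$ for any fixed $\sigma$, and a single peak in $\sigma$ for any fixed $\beta$.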

Exercises

Exercise 1

Consider the following functions. For each of them, (1) prove that the function is a pdf, (2) calculate the mean and variance of the distribution, and (3) find the maximum likelihood estimator of the parameter $\theta$. Sketch a graph of each of the distributions for a representative value of $\theta$.

  1. $f(x;\theta) = (\theta + 1)\, x^{\theta}$ where $0 \le x \le 1$ and $\theta > 0$.
  2. $f(x;\theta) = \theta e^{-\theta x}$ where $0 \le x < \infty$ and $\theta > 0$.

Footnotes

  1. Quite often, as in the exercises at the end of this module, it is easier to calculate the variance of a distribution using the alternative formula for the variance: $\sigma_x^2 = V(x) = E(x - \mu)^2 = E(x^2) - \mu^2$, where $E(x^2) = \int x^2 f(x)\,dx$.
  2. The function $g(y)$ is monotonically increasing for $y$ if $g'(y) > 0$. Because $\frac{d}{dx} \ln x = \frac{1}{x} > 0$ for $x > 0$, the logarithm function is monotonically increasing for positive values of $x$.
  3. Intuitively, what these concepts mean is that as the sample size increases the estimator becomes more precise (the variance becomes smaller and any bias disappears) and the distribution of the estimator approaches the normal distribution. The formal definitions of these terms involve advanced statistical concepts that are reported here only in the interest of completeness. An estimator $\hat{\theta}$ of the parameter $\theta$ is consistent if and only if $\operatorname{plim} \hat{\theta} = \theta$. This estimator has an asymptotically normal distribution if $\hat{\theta} \overset{a}{\rightarrow} N\left(\theta, \{\mathbf{I}(\theta)\}^{-1}\right)$. An unbiased estimator is more efficient than another unbiased estimator if it has a smaller variance than the alternative estimator. The mean square error (MSE) is defined to be $MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = V(\hat{\theta}) + \left(\mathrm{Bias}[\hat{\theta}]\right)^2$. An estimator is asymptotically efficient if its mean square error tends to zero as the sample size increases, that is, if $\lim_{n \to \infty} MSE(\hat{\theta}) = 0$. See any advanced statistics text or Statistical terminology for further information on these concepts.
  4. The symbol $\prod_{i=1}^{n} x_i$ is equivalent to the product $x_1 x_2 \cdots x_n$.
