<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="m11234">
  <name>Elementary Hypothesis Testing</name>
  <metadata>
  <md:version>1.6</md:version>
  <md:created>2003/05/21</md:created>
  <md:revised>2003/08/20 17:40:03.427 GMT-5</md:revised>
  <md:authorlist>
    <md:author id="dhj">
      <md:firstname>Don</md:firstname>
      
      <md:surname>Johnson</md:surname>
      <md:email>dhj@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="dhj">
      <md:firstname>Don</md:firstname>
      
      <md:surname>Johnson</md:surname>
      <md:email>dhj@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="erkrause">
      <md:firstname>Eileen</md:firstname>
      
      <md:surname>Krause</md:surname>
      <md:email>erkrause@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="kclarks">
      <md:firstname>Kyle</md:firstname>
      
      <md:surname>Clarkson</md:surname>
      <md:email>kclarks@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="lizzardg">
      <md:firstname>Elizabeth</md:firstname>
      
      <md:surname>Gregory</md:surname>
      <md:email>lizzardg@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="kevinduh">
      <md:firstname>Kevin</md:firstname>
      
      <md:surname>Duh</md:surname>
      <md:email>kevinduh@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="mariyah">
      <md:firstname>Mariyah</md:firstname>
      
      <md:surname>Poonawala</md:surname>
      <md:email>mariyah@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="mjeanes">
      <md:firstname>Matthew</md:firstname>
      
      <md:surname>Jeanes</md:surname>
      <md:email>mjeanes@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="jsilv">
      <md:firstname>Jeffrey</md:firstname>
      
      <md:surname>Silverman</md:surname>
      <md:email>jsilv@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist>
    <md:keyword>Bayes</md:keyword>
    <md:keyword>decision criterion</md:keyword>
    <md:keyword>Bayes' decision criterion</md:keyword>
    <md:keyword>decision region</md:keyword>
    <md:keyword>Bayes' cost</md:keyword>
    <md:keyword>cost</md:keyword>
    <md:keyword>likelihood ratio</md:keyword>
    <md:keyword>sufficient statistic</md:keyword>
    <md:keyword>a priori probability</md:keyword>
  </md:keywordlist>

  <md:abstract/>
</metadata>


  <content>
    <para id="intro">
      In statistics, hypothesis testing is some times known as
      decision theory or simply testing.  The key result around which
      all decision theory revolves is the likelihood ratio test.
    </para>
    <section id="likelihood">
      <name>The Likelihood Ratio Test</name>
      <para id="binary">
	In a binary hypothesis testing problem, four possible outcomes
	can result.  Model 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> did in fact represent the best model for the data
	and the decision rule said it was (a correct decision) or said
	it wasn't (an erroneous decision).  The other two outcomes
	arise when model 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math> was in fact true with either a correct or incorrect
	decision made.  The decision process operates by segmenting
	the range of observation values into two disjoint
	<term>decision regions</term> 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> and
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math>.  All values of 
	<m:math>
	  <m:ci type="vector">r</m:ci>
	</m:math> fall into either 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> or
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math>.  If a given 
	<m:math>
	  <m:ci type="vector">r</m:ci>
	</m:math> lies in 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math>, for example, we will announce our decision <quote>"model
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> was true"</quote>; if in 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math>, model 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math> would be proclaimed.  To derive a rational method of
	deciding which model best describes the observations, we need
	a criterion to assess the quality of the decision process.
	Optimizing this criterion will specify the decision regions.
      </para>
      <para id="decision">
	The <term>Bayes' decision criterion</term> seeks to minimize a
	cost function associated with making a decision.  Let
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>C</m:mi>
	      <m:mrow>
		<m:mi>i</m:mi>
		<m:mi>j</m:mi>
	      </m:mrow>
	    </m:msub></m:ci>
	</m:math> be the cost of mistaking model
	<m:math>
	  <m:ci>j</m:ci> 
	</m:math> for model
	<m:math>
	  <m:ci>i</m:ci> 
	</m:math>
	(<m:math>
	  <m:apply>
	    <m:neq/>
	    <m:ci>i</m:ci>
	    <m:ci>j</m:ci>
	  </m:apply>
	</m:math>) and 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>C</m:mi>
	      <m:mrow>
		<m:mi>i</m:mi>
		<m:mi>i</m:mi>
	      </m:mrow>
	    </m:msub></m:ci>
	</m:math> the presumably smaller cost of correctly choosing
	model 
	<m:math>
	  <m:ci>i</m:ci> 
	</m:math>: 
	<m:math>
	  <m:apply>
	    <m:gt/>
	    <m:ci><m:msub>
		<m:mi>C</m:mi>
		<m:mrow>
		  <m:mi>i</m:mi>
		  <m:mi>j</m:mi>
		</m:mrow>
	      </m:msub></m:ci>
	    <m:ci><m:msub>
		<m:mi>C</m:mi>
		<m:mrow>
		  <m:mi>i</m:mi>
		  <m:mi>i</m:mi>
		</m:mrow>
	      </m:msub></m:ci>
	  </m:apply>
	</m:math>, 
	<m:math>
	  <m:apply>
	    <m:neq/>
	    <m:ci>i</m:ci>
	    <m:ci>j</m:ci>
	  </m:apply>
	</m:math>.  Let 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>π</m:mi>
	      <m:mi>i</m:mi>
	    </m:msub></m:ci>
	</m:math> be the <foreign>a priori</foreign> probability of model
	<m:math>
	  <m:ci>i</m:ci> 
	</m:math>.  The so-called <term>Bayes' cost</term>
	<m:math>
	  <m:apply>
	    <m:mean/>
	    <m:ci>C</m:ci>
	  </m:apply>
	</m:math> is the average cost of making a decision. 
	<equation id="eq1">
	  <m:math>
	    <m:apply>
	      <m:eq/>
	      <m:apply>
		<m:mean/>
		<m:ci>C</m:ci>
	      </m:apply>
	      <m:apply>
		<m:sum/>
		<m:bvar>
		  <m:ci>i</m:ci>
		</m:bvar>
		<m:bvar>
		  <m:ci>j</m:ci>
		</m:bvar>
		<m:domainofapplication>
		  <m:set>
		    <m:ci>i</m:ci>
		    <m:ci>j</m:ci>
		  </m:set>
		</m:domainofapplication>
		<m:apply>
		  <m:times/>
		  <m:ci><m:msub>
		      <m:mi>C</m:mi>
		      <m:mrow>
			<m:mi>i</m:mi>
			<m:mi>j</m:mi>
		      </m:mrow>
		    </m:msub></m:ci>
		  <m:ci><m:msub>
		      <m:mi>π</m:mi>
		      <m:mi>j</m:mi>
		    </m:msub></m:ci>
		  <m:apply>
		    <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#probability"/>
		    <m:mrow>
		      <m:mtext>say  </m:mtext> 
		      <m:msub>
			<m:mi>ℳ</m:mi>
			<m:mi>i</m:mi>
		      </m:msub> 
		      <m:mtext>  when  </m:mtext>
		      <m:msub>
			<m:mi>H</m:mi>
			<m:mi>j</m:mi>
		      </m:msub> 
		      <m:mtext>  true</m:mtext>
		    </m:mrow>
		  </m:apply>
		</m:apply>
	      </m:apply>
	      <m:apply>
		<m:sum/>
		<m:bvar>
		  <m:ci>i</m:ci>
		</m:bvar>
		<m:bvar>
		  <m:ci>j</m:ci>
		</m:bvar>
		<m:domainofapplication>
		  <m:set>
		    <m:ci>i</m:ci>
		    <m:ci>j</m:ci>
		  </m:set>
		</m:domainofapplication>
		<m:apply>
		  <m:times/>
		  <m:ci><m:msub>
		      <m:mi>C</m:mi>
		      <m:mrow>
			<m:mi>i</m:mi>
			<m:mi>j</m:mi>
		      </m:mrow>
		    </m:msub></m:ci>
		  <m:ci><m:msub>
		      <m:mi>π</m:mi>
		      <m:mi>j</m:mi>
		    </m:msub></m:ci>
		  <m:apply>
		    <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#probability"/>
		    <m:condition>
		      <m:mrow>
			<m:msub>
			  <m:mi>H</m:mi>
			  <m:mi>j</m:mi>
			</m:msub> 
			<m:mtext>  true</m:mtext>
		      </m:mrow>
		    </m:condition>
		    <m:mrow>
		      <m:mtext>say  </m:mtext>
		      <m:msub>
			<m:mi>ℳ</m:mi>
			<m:mi>i</m:mi>
		      </m:msub> 
		    </m:mrow>
		  </m:apply>
		</m:apply>
	      </m:apply>
	    </m:apply>
	  </m:math>
	</equation>
	The Bayes' cost can be expressed as
	<equation id="eq2">
	  <m:math>
	    <m:apply>
	      <m:eq/>
	      <m:apply>
		<m:mean/>
		<m:ci>C</m:ci>
	      </m:apply>
	      <m:apply>
		<m:sum/>
		<m:bvar>
		  <m:ci>i</m:ci>
		</m:bvar>
		<m:bvar>
		  <m:ci>j</m:ci>
		</m:bvar>
		<m:domainofapplication>
		  <m:set>
		    <m:ci>i</m:ci>
		    <m:ci>j</m:ci>
		  </m:set>
		</m:domainofapplication>
		<m:apply>
		  <m:times/>
		  <m:ci><m:msub>
		      <m:mi>C</m:mi>
		      <m:mrow>
			<m:mi>i</m:mi>
			<m:mi>j</m:mi>
		      </m:mrow>
		    </m:msub></m:ci>
		  <m:ci><m:msub>
		      <m:mi>π</m:mi>
		      <m:mi>j</m:mi>
		    </m:msub></m:ci>
		  <m:apply>
		    <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#probability"/>
		    <m:condition>
		      <m:mrow>
			<m:msub>
			  <m:mi>ℳ</m:mi>
			  <m:mn>0</m:mn>
			</m:msub>
			<m:mtext>  true</m:mtext>
		      </m:mrow>
		    </m:condition>
		    <m:apply>
		      <m:in/>
		      <m:ci type="vector">r</m:ci>
		      <m:ci><m:msub>
			  <m:mi>ℜ</m:mi>
			  <m:mi>i</m:mi>
			</m:msub></m:ci>
		    </m:apply>
		  </m:apply>
		</m:apply>
	      </m:apply>
	      <m:apply>
		<m:sum/>
		<m:bvar>
		  <m:ci>i</m:ci>
		</m:bvar>
		<m:bvar>
		  <m:ci>j</m:ci>
		</m:bvar>
		<m:domainofapplication>
		  <m:set>
		    <m:ci>i</m:ci>
		    <m:ci>j</m:ci>
		  </m:set>
		</m:domainofapplication>
		<m:apply>
		  <m:times/>
		  <m:ci><m:msub>
		      <m:mi>C</m:mi>
		      <m:mrow>
			<m:mi>i</m:mi>
			<m:mi>j</m:mi>
		      </m:mrow>
		    </m:msub></m:ci>
		  <m:ci><m:msub>
		      <m:mi>π</m:mi>
		      <m:mi>j</m:mi>
		    </m:msub></m:ci>
		  <m:apply>
		    <m:int/>
		    <m:bvar>
		      <m:ci type="vector">r</m:ci>
		    </m:bvar>
		    <m:domainofapplication>
		      <m:ci><m:msub>
			  <m:mi>ℜ</m:mi>
			  <m:mi>i</m:mi>
			</m:msub></m:ci>
		    </m:domainofapplication>
		    <!--pdf-->
		    <m:apply>
		      <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
		      <m:bvar>
			<m:ci type="vector">r</m:ci>
		      </m:bvar>
		      <m:condition>
			<m:ci><m:msub>
			    <m:mi>H</m:mi>
			    <m:mi>j</m:mi>
			  </m:msub></m:ci>
		      </m:condition>
		      <m:ci type="vector">r</m:ci>
		    </m:apply>
		  </m:apply>
		</m:apply>
	      </m:apply>
	      <m:apply>
		<m:plus/>
		<m:apply>
		  <m:int/>
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:domainofapplication>
		    <m:ci><m:msub>
			<m:mi>ℜ</m:mi>
			<m:mn>0</m:mn>
		      </m:msub></m:ci>
		  </m:domainofapplication>
		  <m:apply>
		    <m:plus/>
		    <m:apply>
		      <m:times/>
		      <m:ci><m:msub>
			  <m:mi>C</m:mi>
			  <m:mrow>
			    <m:mn>0</m:mn>
			    <m:mn>0</m:mn>
			  </m:mrow>
			</m:msub></m:ci>
		      <m:ci><m:msub>
			  <m:mi>π</m:mi>
			  <m:mn>0</m:mn>
			</m:msub></m:ci>
		      <!--pdf-->
		      <m:apply>
			<m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
			<m:bvar>
			  <m:ci type="vector">r</m:ci>
			</m:bvar>
			<m:condition>
			  <m:ci><m:msub>
			      <m:mi>ℳ</m:mi>
			      <m:mn>0</m:mn>
			    </m:msub></m:ci>
			</m:condition>
			<m:ci type="vector">r</m:ci>
		      </m:apply>
		    </m:apply>
		    <m:apply>
		      <m:times/>
		      <m:ci><m:msub>
			  <m:mi>C</m:mi>
			  <m:mrow>
			    <m:mn>0</m:mn>
			    <m:mn>1</m:mn>
			  </m:mrow>
			</m:msub></m:ci>
		      <m:ci><m:msub>
			  <m:mi>π</m:mi>
			  <m:mn>1</m:mn>
			</m:msub></m:ci>
		      <!--pdf-->
		      <m:apply>
			<m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
			<m:bvar>
			  <m:ci type="vector">r</m:ci>
			</m:bvar>
			<m:condition>
			  <m:ci><m:msub>
			      <m:mi>ℳ</m:mi>
			      <m:mn>1</m:mn>
			    </m:msub></m:ci>
			</m:condition>
			<m:ci type="vector">r</m:ci>
		      </m:apply>
		    </m:apply>
		  </m:apply>
		</m:apply>
		<m:apply>
		  <m:int/>
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:domainofapplication>
		    <m:ci><m:msub>
			<m:mi>ℜ</m:mi>
			<m:mn>1</m:mn>
		      </m:msub></m:ci>
		  </m:domainofapplication>
		  <m:apply>
		    <m:plus/>
		    <m:apply>
		      <m:times/>
		      <m:ci><m:msub>
			  <m:mi>C</m:mi>
			  <m:mrow>
			    <m:mn>1</m:mn>
			    <m:mn>0</m:mn>
			  </m:mrow>
			</m:msub></m:ci>
		      <m:ci><m:msub>
			  <m:mi>π</m:mi>
			  <m:mn>0</m:mn>
			</m:msub></m:ci>
		      <!--pdf-->
		      <m:apply>
			<m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
			<m:bvar>
			  <m:ci type="vector">r</m:ci>
			</m:bvar>
			<m:condition>
			  <m:ci><m:msub>
			      <m:mi>ℳ</m:mi>
			      <m:mn>0</m:mn>
			    </m:msub></m:ci>
			</m:condition>
			<m:ci type="vector">r</m:ci>
		      </m:apply>
		    </m:apply>
		    <m:apply>
		      <m:times/>
		      <m:ci><m:msub>
			  <m:mi>C</m:mi>
			  <m:mrow>
			    <m:mn>1</m:mn>
			    <m:mn>1</m:mn>
			  </m:mrow>
			</m:msub></m:ci>
		      <m:ci><m:msub>
			  <m:mi>π</m:mi>
			  <m:mn>1</m:mn>
			</m:msub></m:ci>
		      <!--pdf-->
		      <m:apply>
			<m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
			<m:bvar>
			  <m:ci type="vector">r</m:ci>
			</m:bvar>
			<m:condition>
			  <m:ci><m:msub>
			      <m:mi>ℳ</m:mi>
			      <m:mn>1</m:mn>
			    </m:msub></m:ci>
			</m:condition>
			<m:ci type="vector">r</m:ci>
		      </m:apply>
		    </m:apply>
		  </m:apply>
		</m:apply>
	      </m:apply>
	    </m:apply>
	  </m:math>
	</equation>
	<m:math>
	  <m:apply>
	    <!--pdf-->
	    <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
	    <m:bvar>
	      <m:ci type="vector">r</m:ci>
	    </m:bvar>
	    <m:condition>
	      <m:ci><m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mi>i</m:mi>
		</m:msub></m:ci>
	    </m:condition>
	    <m:ci type="vector">r</m:ci>
	  </m:apply>
	</m:math> is the conditional probability density function of
	the observed data 
	<m:math>
	  <m:ci type="vector">r</m:ci> 
	</m:math> given that model
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mi>i</m:mi>
	    </m:msub></m:ci>
	</m:math> was true.  To minimize this expression with respect
	to the decision regions 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> and 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math>, ponder which integral would yield the smallest
	value if its integration domain included a specific
	observation vector.  This selection process defines the
	decision regions; for example, we choose 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> for those values of 
	<m:math>
	  <m:ci type="vector">r</m:ci> 
	</m:math> which yield a smaller value for the first integral.
	<m:math display="block">
	  <m:apply>
	    <m:lt/>
	    <m:apply>
	      <m:plus/>
	      <m:apply>
		<m:times/>
		<m:ci><m:msub>
		    <m:mi>π</m:mi>
		    <m:mn>0</m:mn>
		  </m:msub></m:ci>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>0</m:mn>
		      <m:mn>0</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
		<m:apply>
		  <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
		  <!--pdf-->
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:condition>
		    <m:ci><m:msub>
			<m:mi>ℳ</m:mi>
			<m:mn>0</m:mn>
		      </m:msub></m:ci>
		  </m:condition>
		  <m:ci type="vector">r</m:ci>
		</m:apply>
	      </m:apply>
	      <m:apply>
		<m:times/>
		<m:ci><m:msub>
		    <m:mi>π</m:mi>
		    <m:mn>1</m:mn>
		  </m:msub></m:ci>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>0</m:mn>
		      <m:mn>1</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
		<m:apply>
		  <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
		  <!--Probability Density Function-->
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:condition>
		    <m:ci><m:msub>
			<m:mi>ℳ</m:mi>
			<m:mn>1</m:mn>
		      </m:msub></m:ci>
		  </m:condition>
		  <m:ci type="vector">r</m:ci>
		</m:apply>
	      </m:apply>
	    </m:apply>
	    <m:apply>
	      <m:plus/>
	      <m:apply>
		<m:times/>
		<m:ci><m:msub>
		    <m:mi>π</m:mi>
		    <m:mn>0</m:mn>
		  </m:msub></m:ci>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>1</m:mn>
		      <m:mn>0</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
		<m:apply>
		  <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
		  <!--Probability Density Function-->
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:condition>
		    <m:ci><m:msub>
			<m:mi>ℳ</m:mi>
			<m:mn>0</m:mn>
		      </m:msub></m:ci>
		  </m:condition>
		  <m:ci type="vector">r</m:ci>
		</m:apply>
	      </m:apply>
	      <m:apply>
		<m:times/>
		<m:ci><m:msub>
		    <m:mi>π</m:mi>
		    <m:mn>1</m:mn>
		  </m:msub></m:ci>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>1</m:mn>
		      <m:mn>1</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
		<m:apply>
		  <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
		  <!--Probability Density Function-->
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:condition>
		    <m:ci><m:msub>
			<m:mi>ℳ</m:mi>
			<m:mn>1</m:mn>
		      </m:msub></m:ci>
		  </m:condition>
		  <m:ci type="vector">r</m:ci>
		</m:apply>
	      </m:apply>
	    </m:apply>
	  </m:apply>
	</m:math>
	We choose 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math> when the inequality is reversed.  This expression is
	easily manipulated to obtain the decision rule known as the
	<term>likelihood ratio test</term>.  
	<equation id="ratio">
	  <m:math>
	    <m:apply>
	      <m:times/>
	      <m:apply>
		<m:divide/>
		<m:apply>
		  <!--pdf-->
		  <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:condition>
		    <m:ci><m:msub>
			<m:mi>ℳ</m:mi>
			<m:mn>1</m:mn>
		      </m:msub></m:ci>
		  </m:condition>
		  <m:ci type="vector">r</m:ci>
		</m:apply>
		<m:apply>
		  <!--pdf-->
		  <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
		  <m:bvar>
		    <m:ci type="vector">r</m:ci>
		  </m:bvar>
		  <m:condition>
		    <m:ci><m:msub>
			<m:mi>ℳ</m:mi>
			<m:mn>0</m:mn>
		      </m:msub></m:ci>
		  </m:condition>
		  <m:ci type="vector">r</m:ci>
		</m:apply>
	      </m:apply>
	      <m:munderover>
		<m:mo>≷</m:mo>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>0</m:mn>
		</m:msub>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>1</m:mn>
		</m:msub>
	      </m:munderover>
	      <m:apply>
		<m:divide/>
		<m:apply>
		  <m:times/>
		  <m:ci><m:msub>
		      <m:mi>π</m:mi>
		      <m:mn>0</m:mn>
		    </m:msub></m:ci>
		  <m:apply>
		    <m:minus/>
		    <m:ci><m:msub>
			<m:mi>C</m:mi>
			<m:mrow>
			  <m:mn>1</m:mn>
			  <m:mn>0</m:mn>
			</m:mrow>
		      </m:msub></m:ci>
		    <m:ci><m:msub>
			<m:mi>C</m:mi>
			<m:mrow>
			  <m:mn>0</m:mn>
			  <m:mn>0</m:mn>
			</m:mrow>
		      </m:msub></m:ci>
		  </m:apply>
		</m:apply>
		<m:apply>
		  <m:times/>
		  <m:ci><m:msub>
		      <m:mi>π</m:mi>
		      <m:mn>1</m:mn>
		    </m:msub></m:ci>
		  <m:apply>
		    <m:minus/>
		    <m:ci><m:msub>
			<m:mi>C</m:mi>
			<m:mrow>
			  <m:mn>0</m:mn>
			  <m:mn>1</m:mn>
			</m:mrow>
		      </m:msub></m:ci>
		    <m:ci><m:msub>
			<m:mi>C</m:mi>
			<m:mrow>
			  <m:mn>1</m:mn>
			  <m:mn>1</m:mn>
			</m:mrow>
		      </m:msub></m:ci>
		  </m:apply>
		</m:apply>
	      </m:apply>
	    </m:apply>
	  </m:math>
	</equation> The comparison relation means selecting model
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math> if the left-hand ratio exceeds the value on the right;
	otherwise, 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℳ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> is selected.  Thus, the <term>likelihood
	ratio</term> 
	<m:math>
	  <m:apply>
	    <m:divide/>
	    <m:apply>
	      <!--pdf-->
	      <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
	      <m:bvar>
		<m:ci type="vector">r</m:ci>
	      </m:bvar>
	      <m:condition>
		<m:ci><m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>1</m:mn>
		  </m:msub></m:ci>
	      </m:condition>
	      <m:ci type="vector">r</m:ci>
	    </m:apply>
	    <m:apply>
	      <!--pdf-->
	      <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
	      <m:bvar>
		<m:ci type="vector">r</m:ci>
	      </m:bvar>
	      <m:condition>
		<m:ci><m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>0</m:mn>
		  </m:msub></m:ci>
	      </m:condition>
	      <m:ci type="vector">r</m:ci>
	    </m:apply>
	  </m:apply>
	</m:math> symbolically represented by 
	<m:math>
	  <m:apply>
	    <m:ci type="fn">Λ</m:ci>
	    <m:ci type="vector">r</m:ci>
	  </m:apply>
	</m:math>, is computed from the observed value of
	<m:math>
	  <m:ci type="vector">r</m:ci> 
	</m:math> and then compared with a <term>threshold</term>
	<m:math>
	  <m:ci>η</m:ci> 
	</m:math> equaling
	<m:math>
	  <m:apply>
	    <m:divide/>
	    <m:apply>
	      <m:times/>
	      <m:ci><m:msub>
		  <m:mi>π</m:mi>
		  <m:mn>0</m:mn>
		</m:msub></m:ci>
	      <m:apply>
		<m:minus/>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>1</m:mn>
		      <m:mn>0</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>0</m:mn>
		      <m:mn>0</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
	      </m:apply>
	    </m:apply>
	    <m:apply>
	      <m:times/>
	      <m:ci><m:msub>
		  <m:mi>π</m:mi>
		  <m:mn>1</m:mn>
		</m:msub></m:ci>
	      <m:apply>
		<m:minus/>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>0</m:mn>
		      <m:mn>1</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
		<m:ci><m:msub>
		    <m:mi>C</m:mi>
		    <m:mrow>
		      <m:mn>1</m:mn>
		      <m:mn>1</m:mn>
		    </m:mrow>
		  </m:msub></m:ci>
	      </m:apply>
	    </m:apply>
	  </m:apply>
	</m:math>.  Thus, when two models are hypothesized, the
	likelihood ratio test can be succinctly expressed as the
	comparison of the likelihood ratio with a threshold.
	<equation id="threshold">
	  <m:math>
	    <m:apply>
	      <m:times/>
	      <m:apply>
		<m:ci type="fn">Λ</m:ci>
		<m:ci type="vector">r</m:ci>
	      </m:apply>
	      <m:munderover>
		<m:mo>≷</m:mo>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>0</m:mn>
		</m:msub>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>1</m:mn>
		</m:msub>
	      </m:munderover>
	      <m:ci>η</m:ci>
	    </m:apply>
	  </m:math>
	</equation>
      </para>
      <para id="dataops">
	The data processing operations are captured entirely by the
	likelihood ratio 
	<m:math>
	  <m:apply>
	    <m:divide/>
	    <m:apply>
	      <!--pdf-->
	      <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
	      <m:bvar>
		<m:ci type="vector">r</m:ci>
	      </m:bvar>
	      <m:condition>
		<m:ci><m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>1</m:mn>
		  </m:msub></m:ci>
	      </m:condition>
	      <m:ci type="vector">r</m:ci>
	    </m:apply>
	    <m:apply>
	      <!--pdf-->
	      <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
	      <m:bvar>
		<m:ci type="vector">r</m:ci>
	      </m:bvar>
	      <m:condition>
		<m:ci><m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>0</m:mn>
		  </m:msub></m:ci>
	      </m:condition>
	      <m:ci type="vector">r</m:ci>
	    </m:apply>
	  </m:apply>
	</m:math>.  Furthermore, note that only the value of the
	likelihood ratio <emphasis>relative</emphasis> to the
	threshold matters; to simplify the computation of the
	likelihood ratio, we can perform <emphasis>any</emphasis>
	positively monotonic operations simultaneously on the
	likelihood ratio and the threshold without affecting the
	comparison.  We can multiply the ratio by a positive constant,
	add any constant, or apply a monotonically increasing function
	which simplifies the expressions.  We single one such
	function, the logarithm, because it simplifies likelihood
	ratios that commonly occur in signal processing
	applications. Known as the log-likelihood, we explicitly
	express the likelihood ratio test with it as  
	<equation id="loglike">
	  <m:math>
	    <m:apply>
	      <m:times/>
	      <m:apply>
		<m:ln/>
		<m:apply>
		  <m:ci type="fn">Λ</m:ci>
		  <m:ci type="vector">r</m:ci>
		</m:apply>
	      </m:apply>
	      <m:munderover>
		<m:mo>≷</m:mo>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>0</m:mn>
		</m:msub>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>1</m:mn>
		</m:msub>
	      </m:munderover>
	      <m:apply>
		<m:ln/>
		<m:ci>η</m:ci>
	      </m:apply>
	    </m:apply>
	  </m:math>
	</equation>
	Useful simplifying transformations are problem-dependent; by
	laying bare that aspect of the observations essential to the
	model testing problem, we reveal the <term>sufficient
	statistic</term> 
	<m:math>
	  <m:apply>
	    <m:ci type="fn">ϒ</m:ci>
	    <m:ci type="vector">r</m:ci>
	  </m:apply>
	</m:math>: the scalar quantity which best summarizes the data
	(<cite src="#Lehmann">Lehmann, pp. 18-22</cite>).  The
	likelihood ratio test is best expressed in terms of the
	sufficient statistic.
	<equation id="sufficientstat">
	  <m:math>
	    <m:apply>
	      <m:times/>
	      <m:apply>
		<m:ci type="fn">ϒ</m:ci>
		<m:ci type="vector">r</m:ci>
	      </m:apply>
	      <m:munderover>
		<m:mo>≷</m:mo>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>0</m:mn>
		</m:msub>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>1</m:mn>
		</m:msub>
	      </m:munderover>
	      <m:ci>γ</m:ci>
	    </m:apply>
	  </m:math>
	</equation> We will denote the threshold value by
	<m:math>
	  <m:ci>γ</m:ci> 
	</m:math> when the sufficient statistic is used or by
	<m:math>
	  <m:ci>η</m:ci> 
	</m:math> when the likelihood ratio appears prior to its
	reduction to a sufficient statistic.
      </para>
      <para id="different">
	As we shall see, if we use a different criterion other than
	the Bayes' criterion, the decision rule often involves the
	likelihood ratio.  The likelihood ratio is comprised of the
	quantities 
	<m:math>
	  <m:apply>
	    <!--pdf-->
	    <m:csymbol definitionURL="http://cnx.rice.edu/cd/cnxmath.ocd#pdf">p</m:csymbol>
	    <m:bvar>
	      <m:ci type="vector">r</m:ci>
	    </m:bvar>
	    <m:condition>
	      <m:ci><m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mi>i</m:mi>
		</m:msub></m:ci>
	    </m:condition>
	    <m:ci type="vector">r</m:ci>
	  </m:apply>
	</m:math>, termed the <term>likelihood function</term>, which
	is also important in estimation theory.  It is this
	conditional density that portrays the probabilistic model
	describing data generation.  The likelihood function
	completely characterizes the kind of "world" assumed by each
	model; for each model, we must specify the likelihood function
	so that we can solve the hypothesis testing problem.
      </para>
      <para id="complication">
	A complication, which arises in some cases, is that the
	sufficient statistic may not be monotonic.  If monotonic, the
	decision regions 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>0</m:mn>
	    </m:msub></m:ci>
	</m:math> and 
	<m:math>
	  <m:ci><m:msub>
	      <m:mi>ℜ</m:mi>
	      <m:mn>1</m:mn>
	    </m:msub></m:ci>
	</m:math> are simply connected (all portions of a region can
	be reached without crossing into the other region).  If not,
	the regions are not simply connected and decision region
	islands are created (see <cnxn document="m11271" target="problem2">this problem</cnxn>).  Such regions usually
	complicate calculations of decision performance.  Monotonic or
	not, the decision rule proceeds as described: the sufficient
	statistic is computed for each observation vector and compared
	to a threshold.
      </para>
      <example id="studying">
	<para id="grade">
	  An instructor in a course in detection theory wants to
	  determine if a particular student studied for his last test.
	  The observed quantity is the student's grade, which we
	  denote by 
	  <m:math>
	    <m:ci type="vector">r</m:ci>
	  </m:math>.  Failure may not indicate studiousness:
	  conscientious students may fail the test.  Define the models
	  as
	  <list id="list1">
	    <item>
	      <m:math>
		<m:ci><m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>0</m:mn>
		  </m:msub></m:ci>
	      </m:math>:   did not study</item>
	    <item>
	      <m:math>
		<m:ci><m:msub> 
		    <m:mi>ℳ</m:mi> 
		    <m:mn>1</m:mn>
		  </m:msub></m:ci> 
	      </m:math>:   did study</item>
	  </list> The conditional densities of the grade are shown in
	  <cnxn target="gradefig"/>.

	  <figure id="gradefig">
	    <media type="image/jpg" src="grade.jpg"/>
	    <caption>Conditional densities for the grade distributions
	    assuming that a student did not study 
	      (<m:math>
		<m:ci><m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>0</m:mn>
		  </m:msub></m:ci>
	      </m:math>) or did
	      (<m:math>
		<m:apply>
		  <m:ci><m:msub>
		      <m:mi>ℳ</m:mi>
		      <m:mn>1</m:mn>
		    </m:msub></m:ci>
		</m:apply>
	      </m:math>) are shown in the top row.  The lower portion
	      depicts the likelihood ratio formed from these densities.
	    </caption>
	  </figure>

	  Based on knowledge of student behavior, the instructor
	  assigns <foreign>a priori</foreign> probabilities of 
	  <m:math>
	    <m:apply>
	      <m:eq/>
	      <m:ci><m:msub>
		  <m:mi>π</m:mi>
		  <m:mn>0</m:mn>
		</m:msub></m:ci>
	      <m:cn type="rational">1<m:sep/>4</m:cn>
	    </m:apply>
	  </m:math> and 
	  <m:math>
	    <m:apply>
	      <m:eq/>
	      <m:ci><m:msub>
		  <m:mi>π</m:mi>
		  <m:mn>1</m:mn>
		</m:msub></m:ci>
	      <m:cn type="rational">3<m:sep/>4</m:cn>
	    </m:apply>
	  </m:math>.  The costs 
	  <m:math>
	    <m:ci><m:msub>
		<m:mi>C</m:mi>
		<m:mrow>
		  <m:mi>i</m:mi>
		  <m:mi>j</m:mi>
		</m:mrow>
	      </m:msub></m:ci>
	  </m:math> are chosen to reflect the instructor's sensitivity
	  to student feelings: 
	  <m:math>
	    <m:apply>
	      <m:eq/>
	      <m:ci><m:msub>
		  <m:mi>C</m:mi>
		  <m:mrow>
		    <m:mn>0</m:mn>
		    <m:mn>1</m:mn>
		  </m:mrow>
		</m:msub></m:ci>
	      <m:cn>1</m:cn>
	      <m:ci><m:msub>
		  <m:mi>C</m:mi>
		  <m:mrow>
		    <m:mn>1</m:mn>
		    <m:mn>0</m:mn>
		  </m:mrow>
		</m:msub></m:ci>
	    </m:apply>
	  </m:math> (an erroneous decision either way is given the
	  same cost) and 
	  <m:math>
	    <m:apply>
	      <m:eq/>
	      <m:ci><m:msub>
		  <m:mi>C</m:mi>
		  <m:mrow>
		    <m:mn>0</m:mn>
		    <m:mn>0</m:mn>
		  </m:mrow>
		</m:msub></m:ci>
	      <m:cn>0</m:cn>
	      <m:ci><m:msub>
		  <m:mi>C</m:mi>
		  <m:mrow>
		    <m:mn>1</m:mn>
		    <m:mn>1</m:mn>
		  </m:mrow>
		</m:msub></m:ci>
	    </m:apply>
	  </m:math>.  The likelihood ratio is plotted in <cnxn target="gradefig"/> and the threshold value
	  <m:math>
	    <m:ci>η</m:ci> 
	  </m:math>, which is computed from the <foreign>a
	  priori</foreign> probabilities and the costs to be
	  <m:math>
	    <m:cn type="rational">1<m:sep/>3</m:cn> 
	  </m:math>, is indicated.  The calculations of this
	  comparison can be simplified in an obvious way.
	  <m:math display="block">
	    <m:apply>
	      <m:times/>
	      <m:apply>
		<m:divide/>
		<m:ci>r</m:ci>
		<m:cn>50</m:cn>
	      </m:apply>
	      <m:munderover>
		<m:mo>≷</m:mo>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>0</m:mn>
		</m:msub>
		<m:msub>
		  <m:mi>ℳ</m:mi>
		  <m:mn>1</m:mn>
		</m:msub>
	      </m:munderover>
	      <m:cn type="rational">1<m:sep/>3</m:cn>
	    </m:apply>
	  </m:math> or
	  <m:math display="block">
	    <m:apply>
	      <m:eq/>
	      <m:apply>
		<m:times/>
		<m:ci>r</m:ci>
		<m:munderover>
		  <m:mo>≷</m:mo>
		  <m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>0</m:mn>
		  </m:msub>
		  <m:msub>
		    <m:mi>ℳ</m:mi>
		    <m:mn>1</m:mn>
		  </m:msub>
		</m:munderover>
		<m:cn type="rational">50<m:sep/>3</m:cn>
	      </m:apply>
	      <m:cn>16.7</m:cn>
	    </m:apply>
	  </m:math> The multiplication by the factor of 50 is a simple
	  illustration of the reduction of the likelihood ratio to a
	  sufficient statistic.  Based on the assigned costs and
	  <foreign>a priori</foreign> probabilities, the optimum
	  decision rule says the instructor must assume that the
	  student did not study if the student's grade is less than
	  16.7; if greater, the student is assumed to have studied
	  despite receiving an abysmally low grade such as 20.  Note
	  that as the densities given by each model overlap entirely:
	  the possibility of making the wrong interpretation
	  <emphasis>always</emphasis> haunts the instructor.  However,
	  no other procedure will be better!
	</para>
      </example>
    </section>
  </content>

  <bib:file>
    <bib:entry id="Lehmann">
      <bib:book>
	<bib:author>E.L. Lehmann</bib:author>
	<bib:title>Testing Statistical Hypotheses</bib:title>
	<bib:publisher>John Wiley and Sons</bib:publisher>
	<bib:year>1986</bib:year>
	<bib:address>New York</bib:address>
	<bib:edition>second edition</bib:edition>
      </bib:book>
    </bib:entry>
  </bib:file>
</document>
