<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="m11210">

  <name xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Estimating Variance Simulation</name>

  <metadata xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
  <md:version xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">1.3</md:version>
  <md:created xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">2003/05/29</md:created>
  <md:revised xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">2003/07/18 10:25:14.744 GMT-5</md:revised>
  <md:authorlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <md:author xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="dmlane">
      <md:firstname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">David</md:firstname>
      
      <md:surname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Lane</md:surname>
      <md:email xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">lane@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <md:maintainer xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="dmlane">
      <md:firstname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">David</md:firstname>
      
      <md:surname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Lane</md:surname>
      <md:email xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">lane@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="mjeanes">
      <md:firstname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Matthew</md:firstname>
      
      <md:surname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Jeanes</md:surname>
      <md:email xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">mjeanes@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  

  <md:abstract xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/"/>
</metadata>

  <content xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="beginby">
      Begin by answering the questions, even if you have to guess. The
      first time you answer the questions you will not be told whether
      you are correct or not.
    </para>

    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="onceyou">
      Once you have answered all the questions, answer them again
      using the simulation to help you. This time you will get
      feedback about each individual answer.
    </para>

    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="showsim">
      <cnxn xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" target="genins">Show Simulation</cnxn>
    </para>

    <media xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" type="application/x-java-applet" src="questionbase.questionBase.class">
      <param name="ARCHIVE" value="questionbase.jar"/>
      <param name="width" value="480"/>
      <param name="height" value="600"/>
      <param name="XML" value="variance_est_sim.xml"/>
      <param name="Background" value="16777164"/>
      <param name="FontSize" value="14"/>
      <param name="EndInfo" value="Please use the simulation to help you discover and understand the answers to the questions."/>
    </media>
    
  <section xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="genins">
    <name xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">General Instructions</name>
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="thissim">
      This simulation samples from the population of 50 numbers shown
      here. You can see that there are 10 instances of each of the
      values 1, 2, 3, 4, and 5. The mean of the population is
      therefore 3. The variance is the average squared deviation from
      the mean of 3. You can compute that this is exactly 2.
    </para>
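    <para id="popcheck">
      The two population values quoted above can be checked with a
      short Python sketch. The variable names are illustrative
      assumptions and are not part of the simulation itself.
    </para>
    <code type="block"><![CDATA[
```python
from statistics import mean, pvariance

# Population described above: 10 copies of each of the values 1 through 5.
population = [value for value in range(1, 6) for _ in range(10)]

pop_mean = mean(population)
# pvariance divides the sum of squared deviations by N (the population formula).
pop_variance = pvariance(population)

print(len(population), pop_mean, pop_variance)
```
]]></code>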

    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="whenyou">
      When you click on the button "Draw four numbers" four scores are
      sampled (with replacement) from the population. The four numbers
      are shown in red, as is the mean of the four numbers. The
      variance is then computed in two ways. The upper formula
      computes the mean of the squared deviations of the four sampled
      numbers from the sample mean. The lower formula computes the
      mean of the squared deviations of the four sampled numbers from
      the population mean of 3.00 (occasionally, the sample and
      population means will be equal, in which case the two formulas
      give the same result). The computed variances are placed in the
      fields to the right of the formulas. The mean of the values in a
      field is shown at the bottom of the field. When there is only
      one value in the field, the mean will, of course, equal that
      value.
    </para>
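    <para id="drawsketch">
      A single click of the button can be imitated with the sketch
      below. It is only an approximation of what the applet does: the
      function names <code>variance_about</code> and
      <code>draw_and_compute</code> are hypothetical, and the seed is
      arbitrary.
    </para>
    <code type="block"><![CDATA[
```python
import random

POPULATION = [value for value in range(1, 6) for _ in range(10)]
POPULATION_MEAN = 3.0

def variance_about(sample, center):
    """Mean of the squared deviations of the sample values from `center`."""
    return sum((x - center) ** 2 for x in sample) / len(sample)

def draw_and_compute(rng):
    """Draw four numbers with replacement and apply both formulas."""
    sample = rng.choices(POPULATION, k=4)            # sampling with replacement
    sample_mean = sum(sample) / len(sample)
    upper = variance_about(sample, sample_mean)      # upper formula: sample mean
    lower = variance_about(sample, POPULATION_MEAN)  # lower formula: population mean
    return sample, sample_mean, upper, lower

rng = random.Random(0)  # arbitrary seed so the run can be repeated
print(draw_and_compute(rng))
```
]]></code>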

    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="ifyou">
      If you click the "Draw four numbers" button again, another four
      numbers will be sampled. The mean and the two variances will be
      computed as before. Each field to the right of the formulas
      will now hold two variances, and the bottom of each field will
      show the mean of those variances.
    </para>

    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="thepop">
      The population variance is exactly 2. Use this fact to assess
      the relative value of the two formulas for variance. See which
      one, on average, approaches 2 and which one gives lower
      estimates. Explore whether either formula is always more
      accurate, or whether sometimes one is more accurate and at other
      times the other is. If the variance based on the sample mean
      had been computed by dividing by N - 1 = 3 instead of by N = 4,
      the result would be 4/3 times as large. Does multiplying the
      variance by 4/3 lead to better estimates?
    </para>
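    <para id="exactcheck">
      Because the five values are equally likely and the four draws
      are made with replacement, there are only 625 equally likely
      ordered samples, so the long-run average of each formula can be
      computed exactly rather than estimated. The sketch below does
      this with exact fractions. The names are assumptions chosen for
      illustration.
    </para>
    <code type="block"><![CDATA[
```python
from fractions import Fraction
from itertools import product

VALUES = [1, 2, 3, 4, 5]  # each value is equally likely in the population
POPULATION_MEAN = 3
N = 4

def mean_squared_deviation(sample, center):
    """Mean of the squared deviations from `center`, as an exact fraction."""
    return sum((Fraction(x) - center) ** 2 for x in sample) / len(sample)

# All 5**4 = 625 equally likely ordered samples of size four.
samples = list(product(VALUES, repeat=N))

expected_upper = sum(
    mean_squared_deviation(s, Fraction(sum(s), N)) for s in samples
) / len(samples)
expected_lower = sum(
    mean_squared_deviation(s, POPULATION_MEAN) for s in samples
) / len(samples)

print(expected_upper)                       # 3/2
print(expected_lower)                       # 2
print(expected_upper * Fraction(N, N - 1))  # 2
```
]]></code>
    <para id="exactnote">
      Under these assumptions the upper formula averages 1.5, the
      lower formula averages 2, and multiplying the upper formula by
      4/3 brings its average to 2.
    </para>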
    </section>

    <section xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="stepbystep">
      <name xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Step by Step Instructions</name>
      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="link">
	<cnxn xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" target="beginby">Show Questions</cnxn>
      </para>
      
      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="clickthe">
	Click the "Draw 4 numbers" button. Four numbers will be
	selected from the population. They will be shown in red in the
	population. They will also be shown in red below the "Draw 4
	numbers" button. The mean of the 4 numbers is also
	presented. The population mean is 3.0. See how the sample mean
	compares to the population mean.
      </para>

      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="twoformulas">
	Two formulas for the variance are shown. In the first, the
	average squared deviation of the four numbers from the sample
	mean is computed. In the second, the average squared deviation
	from the population mean of 3 is computed. You should notice
	that the first formula will always produce a smaller value
	than the second formula unless the sample mean is the same as
	the population mean. In that case, the two computations lead
	to the same result.
      </para>
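      <para id="whysmaller">
	The reason the first formula can never exceed the second
	follows from a standard identity, written here in LaTeX with
	x-bar for the sample mean and mu for the population mean (the
	symbols are our notation, not the applet's):
      </para>
      <code type="block"><![CDATA[
```latex
\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2
  \;=\; \frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2 \;+\; (\bar{x}-\mu)^2
```
]]></code>
      <para id="whysmallernote">
	The extra term is a square, so it is zero only when the sample
	mean equals the population mean and positive otherwise.
      </para>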
      
      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="noticethe">
	Notice the text fields to the right of the formulas. They are
	used to store the results of the simulation. The values of the
	variances are stored, and the mean of all the values is
	displayed at the bottom. After only one sample, the mean
	equals the single value.
      </para>
      
      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="clickagain">
	Click the "Draw 4 numbers" button again. Another sample will be
	taken and the computations will be done as before. Each text
	field will have two variances in it. Look to see which formula
	is giving the more accurate estimates of the population
	variance of 2.0.
      </para>
      
      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="withonly">
	With only two samples, it is hard to be sure which formula is
	more accurate. Continue sampling until you have taken about 20
	samples. For each sample, note which formula gives you an
	answer closer to 2.0. You will probably find that formula 2
	usually, but not always, comes closer.
      </para>

      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="lookatthe">
	Look at the means for the two formulas. The mean for the upper
	formula will be lower than the mean for the lower
	formula. Look to see which is closer to the population
	variance of 2.0. You should find that the mean of the values
	for the upper formula is too low, probably somewhere around
	1.50. The mean for the lower formula should be closer to
	2.0. If formula 1 had divided by N - 1 (which is 3) rather than
	N (which is 4), its values would have been
	larger. Specifically, they would have been 4/3 times
	larger. Multiply the mean from formula 1 by 4/3 and see if it
	comes closer to the population variance of 2.
      </para>
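      <para id="manysamples">
	The comparison of the two running means can also be tried
	outside the applet. The sketch below repeats the draw many
	times and reports the mean for each formula together with the
	formula 1 mean multiplied by 4/3. The function name
	<code>simulate</code>, the number of samples, and the seed are
	all assumptions made for illustration.
      </para>
      <code type="block"><![CDATA[
```python
import random

POPULATION = [value for value in range(1, 6) for _ in range(10)]
POPULATION_MEAN = 3.0
N = 4

def simulate(num_samples, seed=None):
    """Return the mean of formula 1 and of formula 2 over many samples."""
    rng = random.Random(seed)
    upper_values, lower_values = [], []
    for _ in range(num_samples):
        sample = rng.choices(POPULATION, k=N)  # four draws with replacement
        sample_mean = sum(sample) / N
        # Formula 1: deviations from the sample mean, divided by N.
        upper_values.append(sum((x - sample_mean) ** 2 for x in sample) / N)
        # Formula 2: deviations from the population mean, divided by N.
        lower_values.append(sum((x - POPULATION_MEAN) ** 2 for x in sample) / N)
    return sum(upper_values) / num_samples, sum(lower_values) / num_samples

mean_upper, mean_lower = simulate(20000, seed=1)
corrected = mean_upper * N / (N - 1)  # same as dividing by N - 1 instead of N
print(round(mean_upper, 2), round(mean_lower, 2), round(corrected, 2))
```
]]></code>
      <para id="manysamplesnote">
	With only 20 samples, as in the steps above, the means can
	still wander noticeably. A larger number of samples is used
	here so that the pattern is easier to see.
      </para>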

      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="thismakes">
	This makes sense because you would expect to be better able to
	estimate the variance if you knew the population mean (as you
	do in formula 2) than if you had to estimate it (as you do in
	formula 1).
      </para>

      <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="thecritical">
	The critical point is that when you have to estimate the
	population mean, you get values that are, on average, too
	low. This does not mean that every value will be too low. Look
	through the variances based on formula 1. Even though the mean
	is lower than 2.0, you will find that some of the values are
	above 2.0. This means that even though this formula tends to
	give you values that are too low, there are instances when it
	gives you values that are too high.
      </para>
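      <para id="howoften">
	How often formula 1 lands below, exactly on, or above 2 can be
	counted directly, since there are only 625 equally likely
	ordered samples of four draws. The sketch below is one way to
	do the count. The name <code>formula_1</code> is hypothetical.
      </para>
      <code type="block"><![CDATA[
```python
from fractions import Fraction
from itertools import product

VALUES = [1, 2, 3, 4, 5]  # equally likely values in the population
N = 4
POPULATION_VARIANCE = 2

def formula_1(sample):
    """Mean squared deviation from the sample mean, as an exact fraction."""
    sample_mean = Fraction(sum(sample), len(sample))
    return sum((x - sample_mean) ** 2 for x in sample) / len(sample)

results = [formula_1(s) for s in product(VALUES, repeat=N)]

too_low = sum(1 for v in results if v < POPULATION_VARIANCE)
too_high = sum(1 for v in results if v > POPULATION_VARIANCE)
exact = len(results) - too_low - too_high

# Counts out of 625 equally likely samples.
print(too_low, exact, too_high)
```
]]></code>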

      
      <media xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" type="application/x-java-applet" src="varstd.Applet1.class">
	<param name="archive" value="varstd.jar"/>
	<param name="width" value="580"/>
	<param name="height" value="460"/>
	<param name="name" value="Median"/>
      </media>
    </section>
		
    
  <section xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="summary">
    <name xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Summary</name>
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="summ">
      The average squared difference from the sample mean will, on
      average, underestimate the population variance. In some samples
      it will overestimate it, but most of the time it will
      underestimate it. If the formula is modified so that the sum of
      squared deviations is divided by N - 1 rather than by N, then
      the tendency to underestimate the population variance is
      eliminated.
    </para>
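    <para id="whynminus1">
      The size of the underestimate can be worked out for
      independent draws, such as the sampling with replacement used
      in this simulation. In the LaTeX sketch below, sigma squared
      is the population variance, mu the population mean, and x-bar
      the sample mean. These symbols are our notation and do not
      appear in the applet.
    </para>
    <code type="block"><![CDATA[
```latex
\begin{align*}
E\!\left[\frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2\right]
  &= E\!\left[\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2\right]
     - E\!\left[(\bar{x}-\mu)^2\right] \\
  &= \sigma^2 - \frac{\sigma^2}{N} = \frac{N-1}{N}\,\sigma^2 , \\
E\!\left[\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2\right]
  &= \frac{N}{N-1}\cdot\frac{N-1}{N}\,\sigma^2 = \sigma^2 .
\end{align*}
```
]]></code>
    <para id="whynminus1note">
      With N = 4 and a population variance of 2, the first line gives
      an expected value of 1.5, which matches the value of about 1.50
      that the simulation tends to show for formula 1.
    </para>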
  </section>
  </content>
  
</document>
