<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="m11206">

  <name>Sample Size Simulation</name>

  <metadata>
  <md:version>1.4</md:version>
  <md:created>2003/05/29</md:created>
  <md:revised>2003/07/14 16:36:16.875 GMT-5</md:revised>
  <md:authorlist>
    <md:author id="dmlane">
      <md:firstname>David</md:firstname>
      
      <md:surname>Lane</md:surname>
      <md:email>lane@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="dmlane">
      <md:firstname>David</md:firstname>
      
      <md:surname>Lane</md:surname>
      <md:email>lane@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="mjeanes">
      <md:firstname>Matthew</md:firstname>
      
      <md:surname>Jeanes</md:surname>
      <md:email>mjeanes@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  

  <md:abstract/>
</metadata>

  <content>
    <section id="genins">
      <name>General Instructions</name>

      <para id="thissim">
	This simulation demonstrates the effect of sample size on the
	sampling distribution.
      </para>

      <para id="depicted">
	Depicted on the top graph is the population distribution. By
	default it is a uniform distribution (all values are equally
	likely). The sampling distributions for two different sample
	sizes are shown in the lower two graphs. The default sample
	sizes are 2 and 10. By default, the statistic computed is the
	mean, although you can choose the median instead.
      </para>

      <para id="forboth">
	For both the population distribution and the sampling
	distribution, the mean and the standard deviation are depicted
	graphically on the frequency distribution itself. The
	blue-colored vertical bar below the X-axis indicates the mean
	value. The red line starts from this mean value and extends
	one standard deviation in length in both directions. The
	values of both the mean and the standard deviation are also
	given to the left of the graph. Notice that each numeric
	value is shown in the same color as its graphical marker.
      </para>

      <para id="inthis">
	In this simulation, you specify two sample sizes (the defaults
	are set at N = 2 and N = 10), and then draw samples until the
	sampling distributions stabilize. Compare the mean and
	standard deviation of the two sampling distributions. Repeat
	the process a couple of times and watch the results. Do you
	observe a general rule regarding the
	effect of sample size on the mean and the standard deviation
	of the sampling distribution?
      </para>
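
      <para id="simsketch">
	The procedure above can also be sketched in code. The
	following is a minimal sketch, not the applet's own
	implementation. It assumes the uniform population consists of
	the integers 0 through 32 (which gives the population mean of
	16 used later in this module), and the function name
	sampling_distribution is made up for illustration.
      </para>

      <code type="block" id="simsketchcode"><![CDATA[
```python
import random
import statistics

# Assumption: the population is uniform over the integers 0..32 (mean 16).
POPULATION = list(range(0, 33))


def sampling_distribution(sample_size, num_samples, rng):
    """Return a list of sample means, one per simulated sample."""
    return [
        # rng.choices draws with replacement, so every draw is independent.
        statistics.fmean(rng.choices(POPULATION, k=sample_size))
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (2, 10):
    means = sampling_distribution(n, 10_000, rng)
    print(f"N = {n:2d}: mean = {statistics.fmean(means):.2f}, "
          f"sd = {statistics.pstdev(means):.2f}")
```
]]></code>

      <para id="simsketchnote">
	Running the sketch with different seeds plays the same role
	as repeating the process in the simulation: compare the two
	printed means and the two printed standard deviations.
      </para>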

      <para id="youmay">
	You may also test the effect of sample size with a normal
	population or with a different sample statistic (the median).
      </para>

      <para id="whenyou">
	When you have discovered the rule, go back and answer the
	questions again.
      </para>
    </section>

    <section id="stepbystep">
      <name>Step by Step Instructions</name>
      <para id="link">
	<cnxn document="m11205">Show Questions</cnxn>
      </para>

      <list id="list1" type="enumerated">
        <item>
	With the default setting (uniform population, sample
	statistic set to mean, sample sizes set at 2 and 10,
	respectively), click the button "5 Samples" a couple of
	times. Notice how the sample means accumulate in the bottom
	two graphs. Then click the button "10,000 Samples" multiple
	times until the two sampling distributions stabilize (don't
	change much in shape with the addition of new
	samples). Compare the two sampling distributions (mean and
	standard deviation). How do the means compare?  (Don't pay
	attention to very small differences since they can occur by
	sampling error.) How do the standard deviations compare?
      </item>
      <item>
	Find the vertical bar at the right-hand end of the X-axis
	on the middle graph (N = 2). Click and drag the bar to the
	position x = 10. When you release the mouse, the area falling
	to the left of the bar is displayed on top of the graph. (The
	location of the bar is rounded to the nearest integer.)
	Determine the probability of getting a sample mean less than
	10 for a sample of size 2 and for a sample of size 10. What is
	the probability of a sample mean being greater than 22 for
	each sample size? (Hint: the probability of being less than 22
	is shown. You will have to subtract it from 1.0.)
      </item>
      <item>
	Set the sample sizes to 10 and 25, respectively. Sample
	50,000 times for each sample size. For each of the resulting
	distributions, calculate the probability of a sample mean
	falling within an interval that encloses the population mean
	of 16. For example, use the interval between x = 12 and x = 20. To
	find the probability of a sample mean being in the interval,
	find the proportion of means below the low end of the interval
	(12) and subtract this from the proportion of means below the
	high end of the interval (20). Which sample mean is more likely
	to be close to the population mean, the one with the smaller
	sample size or the one with the larger sample size?
      </item>
      <item>
	Set the population to "Normal" and set the sample size to
	2, 5, 10, 15, and 25 in turn. Sample 50,000 times in each
	case. Write down the mean and standard deviation associated
	with each sample size on a piece of paper. Answer the
	following questions: Does sample size significantly affect the
	mean of the distribution of sample means? Does sample size
	significantly affect the standard deviation of the
	distribution of sample means?
      </item>
      <item>
	Select "Median" as the sample statistic and repeat the
	above steps. Does sample size affect the sampling distribution
	of the median (mean and standard deviation)?
      </item>
     </list>
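
      <para id="probsketch">
	The probability calculations in the steps above (the
	proportion below a value, one minus that proportion, and the
	difference of two proportions for an interval) can be sketched
	as follows. This is an illustrative sketch under assumptions,
	not the applet's code: the uniform population is assumed to be
	the integers 0 through 32, and the helper names
	proportion_below and interval_probability are hypothetical.
      </para>

      <code type="block" id="probsketchcode"><![CDATA[
```python
import random
import statistics

# Assumption: uniform population over the integers 0..32 (mean 16).
VALUES = list(range(0, 33))


def proportion_below(means, x):
    """Proportion of simulated sample means that fall below x."""
    return sum(m < x for m in means) / len(means)


def interval_probability(means, low, high):
    """Proportion below `high` minus proportion below `low`."""
    return proportion_below(means, high) - proportion_below(means, low)


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 25):
    means = [statistics.fmean(rng.choices(VALUES, k=n)) for _ in range(50_000)]
    p_interval = interval_probability(means, 12, 20)
    # Subtracting from 1.0 gives the proportion at or above 22.
    p_upper = 1.0 - proportion_below(means, 22)
    print(f"N = {n:2d}: P(12 to 20) = {p_interval:.3f}, "
          f"P(22 or more) = {p_upper:.3f}")
```
]]></code>

      <para id="probsketchnote">
	Because the applet rounds the bar location to the nearest
	integer, its displayed areas may differ slightly from the
	proportions this sketch prints.
      </para>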

      <media type="application/x-java-applet" src="sampdistv2.sampDist.class">
	<param name="archive" value="sampdistv2.jar"/>
	<param name="width" value="600"/>
	<param name="height" value="465"/>
	<param name="name" value="sampdist"/>
      </media>

    </section>
    <section id="summary">
      <name>Summary</name>
      <para id="summ">
	The mean and standard deviation of the distribution of sample
	means are systematically related to those of the
	population. The mean of the sampling distribution of the mean
	is the population mean. For symmetric populations such as the
	uniform and normal distributions used here, the mean of the
	sampling distribution of the median is the population median.
	As sample size increases, the standard deviation of the
	sampling distribution of the mean (also called the standard
	error of the mean) decreases. The same is true for the
	standard error of the median.
      </para>
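
      <para id="sesketch">
	For independent draws from a population with standard
	deviation sigma, the standard error of the mean equals sigma
	divided by the square root of the sample size N. The sketch
	below compares that value with simulated standard errors of
	the mean and the median. It is only an illustration: the
	normal population's mean (16) and standard deviation (5) are
	assumed values, not settings taken from the applet, and the
	function name standard_error is hypothetical.
      </para>

      <code type="block" id="sesketchcode"><![CDATA[
```python
import math
import random
import statistics

# Assumptions (not taken from the applet): a normal population with
# mean 16 and standard deviation 5.
MU, SIGMA = 16.0, 5.0


def standard_error(statistic, sample_size, num_samples, rng):
    """Standard deviation of a statistic across many simulated samples."""
    results = [
        statistic([rng.gauss(MU, SIGMA) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]
    return statistics.pstdev(results)


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (2, 5, 10, 15, 25):
    se_mean = standard_error(statistics.fmean, n, 5_000, rng)
    se_median = standard_error(statistics.median, n, 5_000, rng)
    theory = SIGMA / math.sqrt(n)  # standard error of the mean
    print(f"N = {n:2d}: sigma/sqrt(N) = {theory:.2f}, "
          f"simulated SE(mean) = {se_mean:.2f}, "
          f"simulated SE(median) = {se_median:.2f}")
```
]]></code>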
    </section>   
  </content>
  
</document>
