<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="id7033113">
  <name>Collaborative Statistics: Glossary</name>
<metadata>
  <md:version>1.8</md:version>
  <md:created>2008/04/25 09:21:28 GMT-5</md:created>
  <md:revised>2008/07/24 15:54:18.201 GMT-5</md:revised>
  <md:authorlist>
      <md:author id="billowsky">
      <md:firstname>Barbara</md:firstname>
      
      <md:surname>Illowsky</md:surname>
      <md:email>illowskybarbara@deanza.edu</md:email>
    </md:author>
      <md:author id="sdean">
      <md:firstname>Susan</md:firstname>
      
      <md:surname>Dean</md:surname>
      <md:email>deansusan@deanza.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="cnxorg">
      <md:firstname/>
      
      <md:surname>Connexions</md:surname>
      <md:email>cnx@cnx.org</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist>
    <md:keyword>definitions</md:keyword>
    <md:keyword>glossary</md:keyword>
    <md:keyword>statistics</md:keyword>
    <md:keyword>terms</md:keyword>
  </md:keywordlist>

  <md:abstract>This module contains a number of glossary terms related to elementary statistics.  This module represents the combined glossary information for the Collaborative Statistics textbook/module (col10522).</md:abstract>
</metadata>

  
  <content>
    <para id="nodata"/>
    
   </content>
<glossary>
  <definition id="additionrule">
    <term>Addition Rule</term>
    <meaning>
     For any events 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>A </m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{A} {}</m:annotation></m:semantics><m:mspace/></m:math> and
 <m:math><m:mspace/><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>B </m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{B} {}</m:annotation></m:semantics></m:math> in the sample space <m:math>
        <m:semantics>
          <m:mrow>
            <m:mstyle fontsize="12pt">
              <m:mrow>
                <m:mrow>
                  <m:mi>P</m:mi>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>A</m:mi>
                  <m:mstyle fontweight="bold">
                    <m:mrow>
                      <m:mspace/><m:mtext> or </m:mtext><m:mspace/>
                    </m:mrow>
                  </m:mstyle>
                  <m:mi>B</m:mi>
                  <m:mrow>
                    <m:mo stretchy="false">)</m:mo>
                    <m:mo stretchy="false">=</m:mo>
                    <m:mi>P</m:mi>
                  </m:mrow>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>A</m:mi>
                  <m:mrow>
                    <m:mo stretchy="false">)</m:mo>
                    <m:mo stretchy="false">+</m:mo>
                    <m:mi>P</m:mi>
                  </m:mrow>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>B</m:mi>
                  <m:mrow>
                    <m:mo stretchy="false">)</m:mo>
                    <m:mo stretchy="false">−</m:mo>
                    <m:mi>P</m:mi>
                  </m:mrow>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>A</m:mi>
                  <m:mstyle fontweight="bold">
                    <m:mrow>
                      <m:mspace/><m:mtext> and </m:mtext><m:mspace/>
                    </m:mrow>
                  </m:mstyle>
                  <m:mi>B</m:mi>
                  <m:mo stretchy="false">)</m:mo>
                </m:mrow>
              </m:mrow>
            </m:mstyle>
            <m:mrow/>
          </m:mrow>
          <m:annotation encoding="StarMath 5.0"> size 12{P \( A bold "or"B \) =P \( A \) +P \( B \) -P \( A bold "and"B \) } {}</m:annotation>
        </m:semantics>
      </m:math>.
    </meaning>
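    <example id="glexaddrule"><para id="glexaddrulepara">As an illustration with hypothetical values (chosen only to show the arithmetic, not taken from the textbook), suppose <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mn>0.5</m:mn></m:math>, <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>B</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mn>0.3</m:mn></m:math>, and <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mtext> and </m:mtext><m:mi>B</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mn>0.1</m:mn></m:math>. The rule then gives <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mtext> or </m:mtext><m:mi>B</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mn>0.5</m:mn><m:mo>+</m:mo><m:mn>0.3</m:mn><m:mo>−</m:mo><m:mn>0.1</m:mn><m:mo>=</m:mo><m:mn>0.7</m:mn></m:math>. The subtraction removes the outcomes common to both events, which would otherwise be counted twice.</para></example>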
  </definition>

  <definition id="anova">
    <term>Analysis of Variance</term>
    <meaning>
      Also referred to as ANOVA.  A method of testing whether or not the means of three or more populations are equal. The method is applicable if: 
<list id="gllist1" type="bulleted">
<item>All populations of interest are normally distributed.</item>
<item>The populations have equal standard deviations.</item>
<item>Samples (not necessarily of the same size) are randomly and independently selected from each population.</item>
</list>The test statistic for analysis of variance is the F-ratio.
    </meaning>
  </definition>

  <definition id="and">
    <term>AND</term>
    <meaning>
     Logical operation over the subsets of a set. In statistics, if 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>A </m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{A} {}</m:annotation></m:semantics></m:math> and 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>B</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{B} {}</m:annotation></m:semantics></m:math> are any two events (subsets of the sample space), then the event “
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>A</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{A} {}</m:annotation></m:semantics></m:math> and
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>B</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{B} {}</m:annotation></m:semantics></m:math>” consists of all possible outcomes that are common to both  
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>A</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{A} {}</m:annotation></m:semantics></m:math> and
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>B</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{B} {}</m:annotation></m:semantics></m:math>.
    </meaning>
  </definition>

  <definition id="arithmean">
    <term>Arithmetic Mean</term>
    <meaning>
      The sum of the values divided by the number of values. The notation for the mean of a sample is <m:math><m:apply>
  <m:conjugate/>
  <m:ci>x</m:ci>
</m:apply></m:math>. The notation for the mean of a population is <m:math><m:ci>μ</m:ci></m:math>. 
    </meaning>
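    <example id="glexarithmean"><para id="glexarithmeanpara">As a small illustration with made-up data, consider the sample 2, 4, 9. The sum of the values is 15 and there are 3 values, so the sample mean is <m:math><m:mover accent="true"><m:mi>x</m:mi><m:mo stretchy="false">ˉ</m:mo></m:mover><m:mo>=</m:mo><m:mfrac><m:mrow><m:mn>2</m:mn><m:mo>+</m:mo><m:mn>4</m:mn><m:mo>+</m:mo><m:mn>9</m:mn></m:mrow><m:mn>3</m:mn></m:mfrac><m:mo>=</m:mo><m:mfrac><m:mn>15</m:mn><m:mn>3</m:mn></m:mfrac><m:mo>=</m:mo><m:mn>5</m:mn></m:math>.</para></example>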
  </definition>

  <definition id="average">
    <term>Average</term>
    <meaning>
      A number that describes the central tendency of the data. There are a number of specialized averages, including the arithmetic mean, weighted mean, median, mode, and geometric mean.
    </meaning>
  </definition>


  <definition id="bayestheorem">
    <term>Bayes' Theorem</term>
    <meaning>
     (Developed by Reverend Bayes in the 1700s.) A rule designed to find the probability of one event, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>A</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{A} {}</m:annotation></m:semantics></m:math>, occurring, given that one of a finite set of other events, 
<m:math><m:mo>{</m:mo><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:msub><m:mi>B</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mi>i</m:mi></m:mrow></m:mstyle></m:msub><m:mi>,</m:mi><m:mrow><m:mi>i</m:mi><m:mo stretchy="false">=</m:mo><m:mn>1,2,</m:mn></m:mrow><m:mtext>.</m:mtext><m:mtext>.</m:mtext><m:mtext>.</m:mtext><m:mi>,</m:mi><m:mi>l</m:mi></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{B rSub { size 8{i} } ,i=1,2, "."  "."  "." ,l} {}</m:annotation></m:semantics><m:mo>}</m:mo></m:math>, has occurred. 
    </meaning>
  </definition>


  <definition id="bernoullitr">
    <term>Bernoulli Trials</term>
    <meaning>
      An experiment with the following characteristics: <list type="bulleted" id="gloslst1">
<item>There are only two possible outcomes, called “success” and “failure”, for each trial.</item>
<item>
The probability <emphasis><m:math><m:mi>p</m:mi></m:math></emphasis> of success and the probability <emphasis><m:math><m:mi>q</m:mi> <m:mo>=</m:mo> <m:mn>1</m:mn><m:mo>-</m:mo><m:mi>p</m:mi></m:math> </emphasis>of failure are the same for every trial.
</item></list></meaning>
  </definition>


  <definition id="bias">
    <term>Bias</term>
    <meaning>
      A possible consequence if certain members of the population are denied the chance to be selected for the sample.
    </meaning>
  </definition>


  <definition id="bidist">
    <term>Binomial Distribution</term>
    <meaning>
      A discrete random variable (RV) that arises from Bernoulli trials with the following additional requirements. There is a fixed number, n, of independent trials. “Independent” means that the result of any trial (for example, trial 1) in no way affects the results of the following trials, and all trials are conducted under the same conditions. Under these circumstances the binomial RV 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math> is defined as the number of successes in n trials. The notation is: 

<emphasis><m:math><m:mi>X</m:mi></m:math>~<m:math> <m:mi>B</m:mi>
  <m:mo>(</m:mo>
  <m:mi>n</m:mi>
  <m:mo>,</m:mo>
  <m:mi>p</m:mi>
  <m:mo>)</m:mo></m:math></emphasis>; the domain is <m:math><m:mo>{</m:mo><m:mn>0</m:mn><m:mo>,</m:mo><m:mn>1</m:mn><m:mo>,</m:mo><m:mn>2</m:mn>
<m:mo>,</m:mo><m:mo>.</m:mo><m:mo>.</m:mo><m:mo>.</m:mo><m:mo>,</m:mo><m:mi>n</m:mi><m:mo>}</m:mo></m:math>;
 the mean is <m:math><m:apply>
  <m:eq/>
  <m:ci>μ</m:ci>
  <m:ci>np</m:ci>
</m:apply>
</m:math>, and the variance is <m:math>

   <m:msup>
    <m:mi>σ</m:mi>
    <m:mn>2</m:mn>
  </m:msup>
  <m:mo>=</m:mo>
  <m:mi>npq</m:mi></m:math>. The probability of exactly <m:math><m:mi>x</m:mi></m:math> successes in <m:math><m:mi>n</m:mi></m:math> trials is <m:math>
  <m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mi>X</m:mi>
  <m:mo>=</m:mo>
  <m:mi>x</m:mi>
  <m:mo>)</m:mo>
  <m:mo>=</m:mo>
  <m:mfenced>
    <m:mfrac linethickness="0">
      <m:mi>n</m:mi>
      <m:mi>x</m:mi>
    </m:mfrac>
  </m:mfenced>
  <m:msup>
    <m:mi>p</m:mi>
    <m:mi>x</m:mi>
  </m:msup>
  <m:msup>
    <m:mi>q</m:mi>
    <m:mrow>
      <m:mi>n</m:mi>
      <m:mo>−</m:mo>
      <m:mi>x</m:mi>
    </m:mrow>
  </m:msup>
</m:math>.
    </meaning>
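    <example id="glexbidist"><para id="glexbidistpara">As a sketch with hypothetical values, take <m:math><m:mi>n</m:mi><m:mo>=</m:mo><m:mn>3</m:mn></m:math> trials with <m:math><m:mi>p</m:mi><m:mo>=</m:mo><m:mn>0.5</m:mn></m:math>, so that <m:math><m:mi>q</m:mi><m:mo>=</m:mo><m:mn>0.5</m:mn></m:math> (for instance, three tosses of a fair coin with “heads” counted as success). The probability of exactly two successes is <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>X</m:mi><m:mo>=</m:mo><m:mn>2</m:mn><m:mo>)</m:mo><m:mo>=</m:mo><m:mfenced><m:mfrac linethickness="0"><m:mn>3</m:mn><m:mn>2</m:mn></m:mfrac></m:mfenced><m:msup><m:mn>0.5</m:mn><m:mn>2</m:mn></m:msup><m:msup><m:mn>0.5</m:mn><m:mn>1</m:mn></m:msup><m:mo>=</m:mo><m:mn>3</m:mn><m:mo>×</m:mo><m:mn>0.25</m:mn><m:mo>×</m:mo><m:mn>0.5</m:mn><m:mo>=</m:mo><m:mn>0.375</m:mn></m:math>, and the mean number of successes is <m:math><m:mi>n</m:mi><m:mi>p</m:mi><m:mo>=</m:mo><m:mn>3</m:mn><m:mo>×</m:mo><m:mn>0.5</m:mn><m:mo>=</m:mo><m:mn>1.5</m:mn></m:math>.</para></example>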
  </definition>


  <definition id="centlimit">
    <term>Central Limit Theorem</term>
    <meaning>
     Given a random variable (RV) with known mean <m:math><m:mi>μ</m:mi></m:math> and known variance <m:math><m:msup><m:mi>σ</m:mi><m:mn>2</m:mn></m:msup></m:math>, suppose we draw samples of size <m:math><m:mi>n</m:mi></m:math> and are interested in two new RVs: the sample mean, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mover accent="true"><m:mi>X</m:mi><m:mo stretchy="false">ˉ</m:mo></m:mover></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ { bar  {X}}} {}</m:annotation></m:semantics></m:math>, and the sample sum, <m:math><m:mi>Σ</m:mi></m:math> 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math>. If the size <m:math><m:mi>n</m:mi></m:math> of the sample is sufficiently large, then 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mover accent="true"><m:mi>X</m:mi><m:mo stretchy="false">ˉ</m:mo></m:mover></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ { bar  {X}}} {}</m:annotation></m:semantics></m:math>∼ 

<m:math>
 <m:mi>N</m:mi>
  <m:mfenced>
    <m:mi>μ</m:mi>
    <m:mfrac>
      <m:msup>
        <m:mi>σ</m:mi>
        <m:mn>2</m:mn>
      </m:msup>
      <m:mi>n</m:mi>
    </m:mfrac>
  </m:mfenced>
</m:math>
 and 
<m:math> <m:mi>Σ</m:mi><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math> ∼  
<m:math><m:mi>N</m:mi>
  <m:mo>(</m:mo>
    <m:mi>nμ</m:mi>
    <m:mo>,</m:mo>
    <m:mi>n</m:mi>
    
    <m:msup>
      <m:mi>σ</m:mi>
      <m:mn>2</m:mn>
    </m:msup>
  <m:mo>)</m:mo></m:math>. In words, if the size n of the sample is sufficiently large, then the distribution of the sample means and the distribution of the sample sums will approximate a normal distribution regardless of the shape of the population. Moreover, the mean of the distribution of the sample means will equal the population mean, and the mean of the distribution of the sample sums will equal n times the population mean. The standard deviation of the distribution of the sample means, 
<m:math> <m:mfrac>
    <m:mi>σ</m:mi>
    <m:msqrt>
      <m:mi>n</m:mi>
    </m:msqrt>
  </m:mfrac></m:math>, is called the standard error of the mean.
    </meaning>
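    <example id="glexcentlimit"><para id="glexcentlimitpara">As an illustration with hypothetical values, suppose the population has mean <m:math><m:mi>μ</m:mi><m:mo>=</m:mo><m:mn>50</m:mn></m:math> and variance <m:math><m:msup><m:mi>σ</m:mi><m:mn>2</m:mn></m:msup><m:mo>=</m:mo><m:mn>144</m:mn></m:math> (so <m:math><m:mi>σ</m:mi><m:mo>=</m:mo><m:mn>12</m:mn></m:math>), and that samples of size <m:math><m:mi>n</m:mi><m:mo>=</m:mo><m:mn>36</m:mn></m:math> are drawn, assuming 36 is large enough for the approximation to hold. The sample mean is then approximately normal with mean 50 and variance <m:math><m:mfrac><m:mn>144</m:mn><m:mn>36</m:mn></m:mfrac><m:mo>=</m:mo><m:mn>4</m:mn></m:math>, so the standard error of the mean is <m:math><m:mfrac><m:mn>12</m:mn><m:msqrt><m:mn>36</m:mn></m:msqrt></m:mfrac><m:mo>=</m:mo><m:mn>2</m:mn></m:math>. The sample sum is approximately normal with mean <m:math><m:mn>36</m:mn><m:mo>×</m:mo><m:mn>50</m:mn><m:mo>=</m:mo><m:mn>1800</m:mn></m:math> and variance <m:math><m:mn>36</m:mn><m:mo>×</m:mo><m:mn>144</m:mn><m:mo>=</m:mo><m:mn>5184</m:mn></m:math>.</para></example>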
  </definition>


  <definition id="charts">
    <term>Charts</term>
    <meaning>
     Special graphical formats used to visualize a frequency distribution. They include, but are not limited to: <emphasis>histograms, frequency polygons, cumulative frequency polygons, box plots, stemplots, bar charts, Venn and tree diagrams, and pie charts</emphasis>. Some of them are described in Descriptive Statistics, together with explanations of which kind of chart best fits a given situation.
    </meaning>
  </definition>


  <definition id="classmark">
    <term>Class Mark</term>
    <meaning>
     Midpoint of the class.
    </meaning>
  </definition>



  <definition id="chisqdist">
    <term>Chi-square Distribution</term>
    <meaning>
     A distribution with the following characteristics: <list id="chisqlst" type="bulleted"> <item>The random variable (RV) is continuous and takes only nonnegative values (in fact, it is a sum of squares of <m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>k</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{k} {}</m:annotation></m:semantics></m:math> independent standard normal random variables).</item><item>There is a "family" of Chi-squared distributions. Each member of the family is completely defined by the number of degrees of freedom, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>k</m:mi><m:mo stretchy="false">−</m:mo><m:mn>1</m:mn></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{k - 1} {}</m:annotation></m:semantics></m:math>, where <m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>k</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{k} {}</m:annotation></m:semantics></m:math> is the number of categories (not the sample size). </item><item>The pdf is positively skewed; however, as 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>k</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{k} {}</m:annotation></m:semantics></m:math> increases (<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>k</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{k} {}</m:annotation></m:semantics></m:math>&gt;90), the distribution begins to approximate the normal distribution.</item></list>

The notation is: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math>∼
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:msup><m:mi>χ</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:msub><m:mn>2</m:mn><m:mstyle fontsize="8pt"><m:mrow><m:mstyle fontstyle="italic"><m:mrow><m:mtext>df</m:mtext></m:mrow></m:mstyle></m:mrow></m:mstyle></m:msub></m:mrow></m:mstyle></m:msup></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{c rSup { size 8{2}   rSub { size 8{ ital "df"} } } } {}</m:annotation></m:semantics></m:math>; the mean 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>μ</m:mi><m:mtext> = df</m:mtext></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{μ" = df"} {}</m:annotation></m:semantics></m:math>; the variance 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:msup><m:mi>σ</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mn>2</m:mn></m:mrow></m:mstyle></m:msup><m:mtext> = 2df</m:mtext></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{s rSup { size 8{2} } " = 2df"} {}</m:annotation></m:semantics></m:math>. The Chi-squared distribution is used to evaluate the test statistic in the <emphasis>Goodness-of-fit Test</emphasis> (to determine if a population follows a specified distribution), the <emphasis>Test of Independence</emphasis> (to determine if two factors are or are not related), and the <emphasis>Test for Single Variance.</emphasis>
    </meaning>
  </definition>



  <definition id="class">
    <term>Class</term>
    <meaning>
     The interval into which the data are grouped. It is convenient to group outcomes into classes when working with a large mass of data, particularly when the data is continuous, because grouped data is easier to visualize. For example, every bar in a histogram corresponds to one class, and the midpoint of the interval can be chosen as a representative of all outcomes in the class. The midpoint of the class is often called a <emphasis>class mark</emphasis>.
    </meaning>
  </definition>



  <definition id="clustersamp">
    <term>Cluster Sampling</term>
    <meaning>
      A procedure used if the population is dispersed over a wide geographic area. The area is divided in some way into smaller units (counties, precincts, blocks, etc.) called primary units. Then a few primary units are chosen, and a random sample is selected from each unit. 
    </meaning>
  </definition>



  <definition id="coeffcorr">
    <term>Coefficient of Correlation</term>
    <meaning>
A measure developed by Karl Pearson (early 1900s) that gives the strength of association between the independent variable and the dependent variable. The formula is:
    <equation id="id5499555">
      <m:math>
        <m:semantics>
          <m:mrow>
            <m:mstyle fontsize="12pt">
              <m:mrow>
                <m:mrow>
                  <m:mrow>
                    <m:mi>r</m:mi>
                    <m:mo stretchy="false">=</m:mo>
                    <m:mfrac>
                      <m:mrow>
                        <m:mi>n</m:mi>
                        <m:mrow>
                          <m:mrow>
                            <m:mo stretchy="false">∑</m:mo>
                            <m:mstyle fontstyle="italic">
                              <m:mrow>
                                <m:mtext>XY</m:mtext>
                              </m:mrow>
                            </m:mstyle>
                          </m:mrow>
                          <m:mo stretchy="false">−</m:mo>
                          <m:mo stretchy="false">(</m:mo>
                        </m:mrow>
                        <m:mrow>
                          <m:mo stretchy="false">∑</m:mo>
                          <m:mrow>
                            <m:mi>X</m:mi>
                            <m:mo stretchy="false">)</m:mo>
                            <m:mo stretchy="false">(</m:mo>
                            <m:mrow>
                              <m:mo stretchy="false">∑</m:mo>
                              <m:mrow>
                                <m:mi>Y</m:mi>
                                <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                            </m:mrow>
                          </m:mrow>
                        </m:mrow>
                      </m:mrow>
                      <m:msqrt>
                        <m:mrow>
                          <m:mo stretchy="false">[</m:mo>
                          <m:mi>n</m:mi>
                          <m:mrow>
                            <m:mo stretchy="false">∑</m:mo>
                            <m:mrow>
                              <m:mrow>
                                <m:msup>
                                  <m:mi>X</m:mi>
                                  <m:mstyle fontsize="8pt">
                                    <m:mrow>
                                      <m:mn>2</m:mn>
                                    </m:mrow>
                                  </m:mstyle>
                                </m:msup>
                                <m:mo stretchy="false">−</m:mo>
                                <m:mo stretchy="false">(</m:mo>
                              </m:mrow>
                              <m:mrow>
                                <m:mo stretchy="false">∑</m:mo>
                                <m:mrow>
                                  <m:mi>X</m:mi>
                                  <m:msup>
                                    <m:mo stretchy="false">)</m:mo>
                                    <m:mstyle fontsize="8pt">
                                      <m:mrow>
                                        <m:mn>2</m:mn>
                                      </m:mrow>
                                    </m:mstyle>
                                  </m:msup>
                                  <m:mo stretchy="false">]</m:mo>
                                  <m:mo stretchy="false">[</m:mo>
                                  <m:mi>n</m:mi>
                                  <m:mrow>
                                    <m:mo stretchy="false">∑</m:mo>
                                    <m:mrow>
                                      <m:mrow>
                                        <m:msup>
                                          <m:mi>Y</m:mi>
                                          <m:mstyle fontsize="8pt">
                                            <m:mrow>
                                              <m:mn>2</m:mn>
                                            </m:mrow>
                                          </m:mstyle>
                                        </m:msup>
                                        <m:mo stretchy="false">−</m:mo>
                                        <m:mo stretchy="false">(</m:mo>
                                      </m:mrow>
                                      <m:mrow>
                                        <m:mo stretchy="false">∑</m:mo>
                                        <m:mrow>
                                          <m:mi>Y</m:mi>
                                          <m:msup>
                                            <m:mo stretchy="false">)</m:mo>
                                            <m:mstyle fontsize="8pt">
                                              <m:mrow>
                                                <m:mn>2</m:mn>
                                              </m:mrow>
                                            </m:mstyle>
                                          </m:msup>
                                          <m:mo stretchy="false">]</m:mo>
                                        </m:mrow>
                                      </m:mrow>
                                    </m:mrow>
                                  </m:mrow>
                                </m:mrow>
                              </m:mrow>
                            </m:mrow>
                          </m:mrow>
                        </m:mrow>
                      </m:msqrt>
                    </m:mfrac>
                  </m:mrow>
                  <m:mi>,</m:mi>
                </m:mrow>
              </m:mrow>
            </m:mstyle>
            <m:mrow/>
          </m:mrow>
          <m:annotation encoding="StarMath 5.0"> size 12{r= {  {n Sum { ital "XY"}  -  \(  Sum {X \)  \(  Sum {Y \) } } }  over  { sqrt { \[ n Sum {X rSup { size 8{2} }  -  \(  Sum {X \)  rSup { size 8{2} }  \]  \[ n Sum {Y rSup { size 8{2} }  -  \(  Sum {Y \)  rSup { size 8{2} }  \] } } } } } } } ,} {}</m:annotation>
        </m:semantics>
      </m:math>
    </equation>
    where n is the number of data points. 
    The coefficient cannot be greater than 1 or less than −1. The closer the coefficient is to 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mo stretchy="false">±</m:mo><m:mn>1</m:mn></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ +- 1} {}</m:annotation></m:semantics></m:math>, the stronger the evidence of a significant linear relationship between 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math> and 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>Y</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{Y} {}</m:annotation></m:semantics></m:math>.
    </meaning>
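    <example id="glexcoeffcorr"><para id="glexcoeffcorrpara">As a worked sketch using three made-up data points, (1, 2), (2, 4), and (3, 5), the sums needed by the formula are <m:math><m:mi>n</m:mi><m:mo>=</m:mo><m:mn>3</m:mn></m:math>, <m:math><m:mo>∑</m:mo><m:mi>X</m:mi><m:mo>=</m:mo><m:mn>6</m:mn></m:math>, <m:math><m:mo>∑</m:mo><m:mi>Y</m:mi><m:mo>=</m:mo><m:mn>11</m:mn></m:math>, <m:math><m:mo>∑</m:mo><m:mi>X</m:mi><m:mi>Y</m:mi><m:mo>=</m:mo><m:mn>25</m:mn></m:math>, <m:math><m:mo>∑</m:mo><m:msup><m:mi>X</m:mi><m:mn>2</m:mn></m:msup><m:mo>=</m:mo><m:mn>14</m:mn></m:math>, and <m:math><m:mo>∑</m:mo><m:msup><m:mi>Y</m:mi><m:mn>2</m:mn></m:msup><m:mo>=</m:mo><m:mn>45</m:mn></m:math>. Substituting gives <m:math><m:mi>r</m:mi><m:mo>=</m:mo><m:mfrac><m:mrow><m:mn>3</m:mn><m:mo>×</m:mo><m:mn>25</m:mn><m:mo>−</m:mo><m:mn>6</m:mn><m:mo>×</m:mo><m:mn>11</m:mn></m:mrow><m:msqrt><m:mrow><m:mo>[</m:mo><m:mn>3</m:mn><m:mo>×</m:mo><m:mn>14</m:mn><m:mo>−</m:mo><m:msup><m:mn>6</m:mn><m:mn>2</m:mn></m:msup><m:mo>]</m:mo><m:mo>[</m:mo><m:mn>3</m:mn><m:mo>×</m:mo><m:mn>45</m:mn><m:mo>−</m:mo><m:msup><m:mn>11</m:mn><m:mn>2</m:mn></m:msup><m:mo>]</m:mo></m:mrow></m:msqrt></m:mfrac><m:mo>=</m:mo><m:mfrac><m:mn>9</m:mn><m:msqrt><m:mn>84</m:mn></m:msqrt></m:mfrac><m:mo>≈</m:mo><m:mn>0.98</m:mn></m:math>, a value close to 1, which is consistent with the nearly linear, increasing pattern of these points.</para></example>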
  </definition>



  <definition id="the_cdf">
    <term>Cumulative Distribution Function (CDF)</term>
    <meaning>
      Given a quantitative random variable (RV) [that is, given (
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math>, PDF) for a discrete RV and (
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math>, pdf) for a continuous RV], we consider, for all <m:math><m:mi>x</m:mi></m:math> in the domain of <m:math><m:mi>X</m:mi></m:math>, the events {set of all outcomes that are less than or equal to 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>x</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{x} {}</m:annotation></m:semantics></m:math>}. The function 
<m:math> <m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mrow>
    <m:mi>X</m:mi>
    <m:mo>≤</m:mo>
    <m:mrow>
      <m:mi>x</m:mi>
      <m:mo>)</m:mo>
    </m:mrow>
  </m:mrow></m:math> is called the cumulative distribution function.
    </meaning>
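    <example id="glexcdf"><para id="glexcdfpara">As a simple illustration, assume <m:math><m:mi>X</m:mi></m:math> is the face value shown by a fair six-sided die, so each value from 1 to 6 has probability <m:math><m:mfrac><m:mn>1</m:mn><m:mn>6</m:mn></m:mfrac></m:math>. Then <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>X</m:mi><m:mo>≤</m:mo><m:mn>2</m:mn><m:mo>)</m:mo><m:mo>=</m:mo><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>X</m:mi><m:mo>=</m:mo><m:mn>1</m:mn><m:mo>)</m:mo><m:mo>+</m:mo><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>X</m:mi><m:mo>=</m:mo><m:mn>2</m:mn><m:mo>)</m:mo><m:mo>=</m:mo><m:mfrac><m:mn>1</m:mn><m:mn>6</m:mn></m:mfrac><m:mo>+</m:mo><m:mfrac><m:mn>1</m:mn><m:mn>6</m:mn></m:mfrac><m:mo>=</m:mo><m:mfrac><m:mn>1</m:mn><m:mn>3</m:mn></m:mfrac></m:math>.</para></example>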
  </definition>



  <definition id="cumrelfreq">
    <term>Cumulative Relative Frequency</term>
    <meaning>
      The concept applies to an ordered set of observations from smallest to largest, or vice versa. Cumulative relative frequency is the sum of relative frequencies for all values that are less than or equal to the given value.
    </meaning>
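    <example id="glexcumrelfreq"><para id="glexcumrelfreqpara">As an illustration with made-up data, suppose the values 1, 2, and 3 occur with relative frequencies 0.2, 0.5, and 0.3. Ordered from smallest to largest, their cumulative relative frequencies are 0.2, then <m:math><m:mn>0.2</m:mn><m:mo>+</m:mo><m:mn>0.5</m:mn><m:mo>=</m:mo><m:mn>0.7</m:mn></m:math>, then <m:math><m:mn>0.7</m:mn><m:mo>+</m:mo><m:mn>0.3</m:mn><m:mo>=</m:mo><m:mn>1.0</m:mn></m:math>. The last cumulative relative frequency is always 1 because it accounts for every observation.</para></example>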
  </definition>




  <definition id="compevent">
    <term>Complement Event</term>
    <meaning>
     The event consisting of all outcomes that are not in the given event. 
    </meaning>
  </definition>




  <definition id="condprob">
    <term>Conditional Probability</term>
    <meaning>
    The likelihood that an event will occur given that another event has already occurred.
    </meaning>
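    <example id="glexcondprob"><para id="glexcondprobpara">As a sketch, assume a fair six-sided die is rolled, let <m:math><m:mi>A</m:mi></m:math> be the event “the roll is 2”, and let <m:math><m:mi>B</m:mi></m:math> be the event “the roll is even”. Using the usual formula for conditional probability (not stated in this glossary entry), <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mo>|</m:mo><m:mi>B</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mfrac><m:mrow><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mtext> and </m:mtext><m:mi>B</m:mi><m:mo>)</m:mo></m:mrow><m:mrow><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>B</m:mi><m:mo>)</m:mo></m:mrow></m:mfrac><m:mo>=</m:mo><m:mfrac><m:mrow><m:mn>1</m:mn><m:mo>/</m:mo><m:mn>6</m:mn></m:mrow><m:mrow><m:mn>1</m:mn><m:mo>/</m:mo><m:mn>2</m:mn></m:mrow></m:mfrac><m:mo>=</m:mo><m:mfrac><m:mn>1</m:mn><m:mn>3</m:mn></m:mfrac></m:math>. Knowing that the roll is even raises the probability of a 2 from <m:math><m:mfrac><m:mn>1</m:mn><m:mn>6</m:mn></m:mfrac></m:math> to <m:math><m:mfrac><m:mn>1</m:mn><m:mn>3</m:mn></m:mfrac></m:math>.</para></example>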
  </definition>




  <definition id="coninter">
    <term>Confidence Interval</term>
    <meaning>
  An interval estimate for an unknown population parameter. It depends on: 
<list type="bulleted" id="confint1">
<item>The desired confidence level.</item> <item>What is known about the distribution (for example, a known variance).</item><item>The information gathered from the sample.</item></list>
    </meaning>
  </definition>




  <definition id="conflevel">
    <term>Confidence Level</term>
    <meaning>
The probability, expressed as a percent, that the confidence interval contains the true population parameter. For example, if CL=90%, then in about 90 out of 100 samples the interval estimate will enclose the true population parameter.
    </meaning>
  </definition>




  <definition id="contintable">
    <term>Contingency Table</term>
    <meaning>
 A method of displaying a frequency distribution in the case of dependent (contingent) variables; the table provides an easy way to calculate conditional probabilities.
    </meaning>
  </definition>


<definition id="continrv">
    <term>Continuous RV</term>
    <meaning>
     A random variable (RV) with continuous domain.
    </meaning>
<example id="contrvex"> <para id="contrvpara">The height of trees in the forest is a continuous RV.</para></example>
  </definition>


  <definition id="Corranal">
    <term>Correlation Analysis</term>
    <meaning>
      A group of statistical procedures used to measure the strength of the relationship between two variables.
    </meaning>
  </definition>


  <definition id="countprinc">
    <term>Counting Principle</term>
    <meaning>
      If there are 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>m</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{m} {}</m:annotation></m:semantics></m:math> ways of doing one thing and 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>n</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{n} {}</m:annotation></m:semantics></m:math> ways of doing another, then there are 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>m</m:mi><m:mo stretchy="false">×</m:mo><m:mi>n</m:mi></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{m times n} {}</m:annotation></m:semantics></m:math> ways of doing both. 
  </meaning>
<example id="cntprn1"><para id="cntprn2">A cafe offers 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>m</m:mi><m:mo stretchy="false">=</m:mo><m:mn>5</m:mn></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{m=5} {}</m:annotation></m:semantics></m:math> kinds of coffee and 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>n</m:mi><m:mo stretchy="false">=</m:mo><m:mn>7</m:mn></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{n=7} {}</m:annotation></m:semantics></m:math> kinds of cake. There are 35 ways to serve coffee with cake.</para></example>

  
  </definition>




  <definition id="critval">
    <term>Critical Value</term>
    <meaning>
      The dividing point between the region where the null hypothesis is not rejected and the region where it is rejected. For a one-tailed test, there is only one critical value; for a two-tailed test, there are two critical values, one in each tail, which for a symmetric distribution such as the normal have the same absolute value and opposite signs.
    </meaning>
  </definition>



  <definition id="data">
    <term>Data</term>
    <meaning>
      A set of observations (a set of possible outcomes). Most data can be put into two groups: <emphasis>qualitative</emphasis> (hair color, ethnic group, and many other <emphasis>attributes</emphasis> of a population) and <emphasis>quantitative</emphasis> (distance traveled to college, number of children in a family, etc.). Quantitative data can in turn be separated into two subgroups: <emphasis>discrete</emphasis> and <emphasis>continuous</emphasis>. Roughly speaking, data is discrete if it is the result of counting (the number of students of a given ethnic group in a class, the number of books on a shelf, etc.), and data is continuous if it is the result of measuring (distance traveled, weight of luggage, etc.).
    </meaning>
  </definition>




  <definition id="degrefree">
    <term>Degrees of Freedom (df)</term>
    <meaning>
The number of objects in a sample that are free to vary.
    </meaning>
  </definition>




  <definition id="depsample">
    <term>Dependent Samples</term>
    <meaning>
   Samples chosen from several populations in such a way that they are not independent of each other. Paired samples are dependent because the same individual or item is a member of both samples. 
   </meaning>

<example id="depsample1"><para id="depsamp2">If the test scores of 13 individuals were recorded before a new teaching method was introduced, and then after using the new method, the two paired samples would be considered dependent.
 </para></example>

  </definition>




  <definition id="descstats">
    <term>Descriptive Statistics</term>
    <meaning>
     Methods used to describe the important characteristics of a data set; for example, charts, frequency distributions, measures of central tendency, and measures of spread and skewness.
    </meaning>
  </definition>



  <definition id="discrrv">
    <term>Discrete RV</term>
    <meaning>
 A random variable (RV) that can assume only a countable set of values. 
  </meaning>

<example id="discrv1"><para id="discrv2">
<list type="bulleted" id="disccrv3">
<item>Face values of a six-sided die 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mrow/><m:mo stretchy="false">=</m:mo><m:mrow><m:mo stretchy="false">{</m:mo><m:mn>1,2,3,4,5,6</m:mn><m:mo stretchy="false">}</m:mo></m:mrow></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ {}= lbrace 1,2,3,4,5,6 rbrace } {}</m:annotation></m:semantics></m:math>.</item><item>The number of accidents on HW280 during the Thanksgiving holidays.</item>
</list>

</para></example>

  </definition>




  <definition id="domain">
    <term>Domain</term>
    <meaning>
     The set of possible values of the (independent) variable. The domain is a very important part of the definition of a function. For example, the equation 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>y</m:mi><m:mo stretchy="false">=</m:mo><m:msup><m:mi>x</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mn>2</m:mn></m:mrow></m:mstyle></m:msup></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{y=x rSup { size 8{2} } } {}</m:annotation></m:semantics></m:math> defines a one-to-one function if the domain is the set of nonnegative real numbers, and a function that is not one-to-one if the domain is the set of all real numbers. </meaning>
<example id="dom1"><para id="dom2">

<list type="bulleted" id="dom3"><item>We are interested in the longevity of human life in years; the domain is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mo stretchy="false">{</m:mo><m:mrow><m:mn>0,1,2,3</m:mn><m:mtext>.</m:mtext><m:mtext>.</m:mtext><m:mtext>.</m:mtext><m:mi>,</m:mi><m:mtext>120</m:mtext></m:mrow><m:mo stretchy="false">}</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ lbrace 0,1,2,3 "."  "."  "." ,"120" rbrace } {}</m:annotation></m:semantics></m:math>. </item><item> We are interested in the suit of a card dealt from a regular deck; the domain is <m:math> <m:mrow>
    <m:mo>{</m:mo>
<m:mo>♥</m:mo>
    <m:mo>;</m:mo>
    <m:mo>♦</m:mo>
    <m:mo>;</m:mo>
    <m:mo>♣</m:mo>
    <m:mo>;</m:mo>
    <m:mo>♠</m:mo>
    <m:mo>}</m:mo>
  </m:mrow></m:math>. </item></list>
</para>
</example>
   
  </definition>




  <definition id="eqlikly">
    <term>Equally Likely</term>
    <meaning>
    Each outcome of an experiment has the same probability.
    </meaning>
  </definition>




  <definition id="ebmbound">
    <term>Error Bound for a Population Mean (EBM)</term>
    <meaning>
      The margin of error. Depends on the confidence level, sample size, and known or estimated population standard deviation.
    </meaning>
  </definition>

 <definition id="ebpbound">
    <term>Error Bound for a Proportion (EBP)</term>
    <meaning>
      The margin of error. Depends on the confidence level, sample size, and the estimated (from the sample) proportion of success.
    </meaning>
  </definition>

 <definition id="event">
    <term>Event</term>
    <meaning>
     A subset of the set of all outcomes of an experiment. The set of all outcomes of an experiment is called the <emphasis>sample space</emphasis> and is denoted, as a rule, by <emphasis>S</emphasis>. An event is an arbitrary subset of <emphasis>S</emphasis>: it can contain one outcome, two outcomes, no outcomes (the empty subset), or all of them (the sample space itself). The standard notation for events is capital letters such as A, B, C, etc. 
    </meaning>
  </definition>


 <definition id="exhaustve">
    <term>Exhaustive</term>
    <meaning>
     Every outcome must fall into at least one class (category), so that no outcome is left unclassified.
    </meaning>
  </definition>


 <definition id="expectedv">
    <term>Expected Value</term>
    <meaning>
     The long-run arithmetic average expected when an experiment is repeated many times. Also called the mean. Notations: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>E</m:mi><m:mo stretchy="false">(</m:mo><m:mi>X</m:mi><m:mo stretchy="false">)</m:mo><m:mi>,</m:mi><m:mi>μ</m:mi></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{E \( X \) ,μ} {}</m:annotation></m:semantics></m:math>. For a discrete random variable (RV) with probability distribution function 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mi>x</m:mi><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mi>P</m:mi></m:mrow><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">=</m:mo><m:mi>x</m:mi></m:mrow><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \( x \) =P \( X=x \) } {}</m:annotation></m:semantics></m:math>, the definition can also be written in the form 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>E</m:mi><m:mo stretchy="false">(</m:mo><m:mi>X</m:mi><m:mrow><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mi>μ</m:mi></m:mrow><m:mo stretchy="false">=</m:mo><m:mrow><m:mo stretchy="false">∑</m:mo><m:mrow><m:mstyle fontstyle="italic"><m:mrow><m:mtext>xP</m:mtext></m:mrow></m:mstyle><m:mo stretchy="false">(</m:mo><m:mi>x</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mrow></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{E \( X \) =μ= Sum { ital "xP" \( x \) } } {}</m:annotation></m:semantics></m:math>.
    </meaning>
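<example id="expectedv1"><para id="expectedv2">As an illustration of the sum of x times P(x), the sketch below computes the expected value of a fair six-sided die. The die is an assumed example distribution, not one taken from the definition; exact fractions are used so that no rounding occurs.</para>
<code type="block">
```python
from fractions import Fraction

# Assumed distribution: a fair six-sided die, P(x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E(X) = sum of x * P(x) over all values x.
expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)  # 7/2
```
</code></example>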
  </definition>


 <definition id="expdist">
    <term>Exponential Distribution</term>
    <meaning>
     A continuous random variable (RV) that arises when we are interested in the intervals of time between random events, for example, the length of time between emergency arrivals at a hospital. Notation: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>X</m:mi><m:mtext>~</m:mtext><m:mstyle fontstyle="italic"><m:mrow><m:mtext>Exp</m:mtext></m:mrow></m:mstyle><m:mo stretchy="false">(</m:mo><m:mi>m</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X "~"  ital "Exp" \( m \) } {}</m:annotation></m:semantics></m:math>; the mean is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>μ</m:mi><m:mo stretchy="false">=</m:mo><m:mfrac><m:mn>1</m:mn><m:mi>m</m:mi></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{μ= {  {1}  over  {m} } } {}</m:annotation></m:semantics></m:math>, and the variance is 
<m:math> <m:msup>
    <m:mi>σ</m:mi>
    <m:mn>2</m:mn>
  </m:msup>
  <m:mo>=</m:mo>
  <m:mfrac>
    <m:mn>1</m:mn>
    <m:msup>
      <m:mi>m</m:mi>
      <m:mn>2</m:mn>
    </m:msup>
  </m:mfrac></m:math>, the probability density function is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>f</m:mi><m:mo stretchy="false">(</m:mo><m:mi>x</m:mi><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mstyle fontstyle="italic"><m:mrow><m:msup><m:mtext>me</m:mtext><m:mstyle fontsize="8pt"><m:mrow><m:mrow><m:mo stretchy="false">−</m:mo><m:mstyle fontstyle="italic"><m:mrow><m:mtext>mx</m:mtext></m:mrow></m:mstyle></m:mrow></m:mrow></m:mstyle></m:msup></m:mrow></m:mstyle></m:mrow><m:mi>,</m:mi><m:mtext/></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{f \( x \) = ital "me" rSup { size 8{- ital "mx"} } ,"  "} {}</m:annotation></m:semantics></m:math>  <m:math><m:mrow>
    <m:mi>x</m:mi>
    <m:mo>≥</m:mo>
    <m:mn>0</m:mn>
  </m:mrow></m:math>, and the cumulative distribution function is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">≤</m:mo><m:mi>x</m:mi></m:mrow><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mrow><m:mn>1</m:mn><m:mo stretchy="false">−</m:mo><m:msup><m:mi>e</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mrow><m:mo stretchy="false">−</m:mo><m:mstyle fontstyle="italic"><m:mrow><m:mtext>mx</m:mtext></m:mrow></m:mstyle></m:mrow></m:mrow></m:mstyle></m:msup></m:mrow></m:mrow></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \( X &lt;= x \) =1-e rSup { size 8{- ital "mx"} } } {}</m:annotation></m:semantics></m:math>.
    </meaning>
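<example id="expdist1"><para id="expdist2">A minimal sketch of the formulas above, assuming a decay rate of m = 0.25 chosen purely for illustration. The function name <code>exp_cdf</code> is hypothetical. It evaluates the mean, the variance, and the cumulative distribution at the mean.</para>
<code type="block">
```python
import math

def exp_cdf(x, m):
    """P(X at most x) for X ~ Exp(m), valid for x of at least 0."""
    return 1.0 - math.exp(-m * x)

m = 0.25                 # assumed decay rate, so the mean is 1/m = 4
mean = 1.0 / m
variance = 1.0 / m ** 2
print(mean, variance)    # 4.0 16.0
print(exp_cdf(mean, m))  # 1 - e^(-1), about 0.632
```
</code></example>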
  </definition>


 <definition id="experiment">
    <term>Experiment</term>
    <meaning>
  A planned activity carried out under controlled conditions.
    </meaning>
  </definition>


 <definition id="fDistribution">
    <term>F Distribution</term>
    <meaning>
      Developed by Sir Ronald Fisher. A distribution with the following characteristics:

<list type="bulleted" id="fdistlist">
<item>The random variable (RV) is a ratio (called the F-ratio) of two sums of weighted squares, so it is continuous and takes only nonnegative values. </item><item>The pdf is positively skewed; it approaches the x-axis but never touches it.</item><item>There is a "family" of F distributions.</item></list> Every member of the family is completely defined by two parameters: the number of degrees of freedom in the numerator of the F-ratio and the number of degrees of freedom in the denominator of the F-ratio. Used to evaluate the test statistic in tests of two population variances and in ANOVA problems.
    </meaning>
  </definition>


 <definition id="freqdist">
    <term>Frequency Distribution</term>
    <meaning>
    A grouping of data into mutually exclusive classes showing the number of outcomes in each class.
    </meaning>
  </definition>


 <definition id="freq">
    <term>Frequency</term>
    <meaning>
   The number of times a value occurs in the data set.
    </meaning>
  </definition>

 <definition id="geodist">
    <term>Geometric Distribution</term>
    <meaning>
    A discrete random variable (RV) that arises from Bernoulli trials with one additional requirement: the trials are repeated until the first success. Under these circumstances the geometric variable <m:math><m:mi>X</m:mi></m:math> is defined as the number of trials up to and including the first success. The notation is: <emphasis><m:math><m:mi>X</m:mi></m:math>∼<m:math> <m:mi>G</m:mi>
  <m:mo>(</m:mo>
  <m:mi>p</m:mi>
  <m:mo>)</m:mo></m:math></emphasis>; the domain is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mo stretchy="false">{</m:mo><m:mrow><m:mn>1,2,3,</m:mn><m:mtext>.</m:mtext><m:mtext>.</m:mtext><m:mtext>.</m:mtext></m:mrow><m:mo stretchy="false">}</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ lbrace 1,2,3, "."  "."  "."  rbrace } {}</m:annotation></m:semantics></m:math>; the mean is 
<m:math><m:mi>μ</m:mi>
  <m:mo>=</m:mo>
  <m:mfrac>
    <m:mn>1</m:mn>
    <m:mi>p</m:mi>
  </m:mfrac></m:math>, and the variance is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mrow><m:msup><m:mi>σ</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mn>2</m:mn></m:mrow></m:mstyle></m:msup><m:mo stretchy="false">=</m:mo><m:mrow><m:mfrac><m:mn>1</m:mn><m:mi>p</m:mi></m:mfrac><m:mo stretchy="false">⋅</m:mo><m:mo stretchy="false">(</m:mo></m:mrow></m:mrow><m:mrow><m:mfrac><m:mn>1</m:mn><m:mi>p</m:mi></m:mfrac><m:mo stretchy="false">−</m:mo><m:mn>1</m:mn></m:mrow><m:mo stretchy="false">)</m:mo><m:mtext>.</m:mtext></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{σ rSup { size 8{2} } = {  {1}  over  {p} }  cdot  \(  {  {1}  over  {p} } -1 \)  "." } {}</m:annotation></m:semantics></m:math> The probability that the first success occurs on trial x, that is, after exactly x − 1 failures, is given by the formula: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">=</m:mo><m:mi>x</m:mi></m:mrow><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mi>p</m:mi></m:mrow><m:mo stretchy="false">(</m:mo><m:mrow><m:mn>1</m:mn><m:mo stretchy="false">−</m:mo><m:mi>p</m:mi></m:mrow><m:msup><m:mo stretchy="false">)</m:mo><m:mstyle fontsize="8pt"><m:mrow><m:mrow><m:mi>x</m:mi><m:mo stretchy="false">−</m:mo><m:mn>1</m:mn></m:mrow></m:mrow></m:mstyle></m:msup></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \( X=x \) =p \( 1 - p \)  rSup { size 8{x - 1} } } {}</m:annotation></m:semantics></m:math>.
    </meaning>
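<example id="geodist1"><para id="geodist2">The mean and variance formulas can be checked numerically. The sketch below assumes a success probability of p = 0.2 (an illustrative value) and truncates the infinite sums at 1000 terms; the function name <code>geometric_pmf</code> is hypothetical. The results should be close to 1/p = 5 for the mean and (1/p)(1/p − 1) = 20 for the variance.</para>
<code type="block">
```python
def geometric_pmf(x, p):
    """P(X = x): first success on trial x, with success probability p."""
    return p * (1 - p) ** (x - 1)

p = 0.2  # assumed success probability
# Truncate the infinite sums at 1000 terms; the remaining tail is negligible.
xs = range(1, 1001)
total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)
print(round(total, 6), round(mean, 6), round(variance, 6))
```
</code></example>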
  </definition>

<definition id="geomean">
    <term>Geometric Mean</term>
    <meaning>
    The nth root of the product of the n values.
    </meaning>
  </definition>

 <definition id="hpygeoprob">
    <term>Hypergeometric Probability</term>
    <meaning>
   A discrete random variable (RV) with the following characteristics: 
<list type="bulleted" id="hyp1">
<item>There is a fixed number of trials. </item><item> The probability of success is not the same from trial to trial, so the trials are not Bernoulli trials.</item>
</list>
 The typical example is sampling from a mixture of two groups of items when we are interested in only one of the groups. <m:math><m:mi>X</m:mi></m:math> is defined as the number of successes out of the total number chosen. The notation is: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>X</m:mi><m:mtext>~</m:mtext><m:mi>H</m:mi><m:mo stretchy="false">(</m:mo><m:mi>r</m:mi><m:mi>,</m:mi><m:mi>b</m:mi><m:mi>,</m:mi><m:mi>n</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X "~" H \( r,b,n \)} {}</m:annotation></m:semantics></m:math>, where <m:math><m:mi>r</m:mi></m:math> = number of items in the group of interest, <m:math><m:mi>b</m:mi></m:math> = number of items in the group not of interest, and <m:math><m:mi>n</m:mi></m:math> = number of items chosen. 
    </meaning>
  </definition>


 <definition id="hypotest">
    <term>Hypothesis Testing</term>
    <meaning>
   A procedure, based on sample evidence, to determine whether the stated hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.
    </meaning>
  </definition>


 <definition id="hypothesis">
    <term>Hypothesis</term>
    <meaning>
   A statement about the value of a population parameter. In the case of two hypotheses, the statement assumed to be true is called the null hypothesis (notation 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:msub><m:mi>H</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mn>0</m:mn></m:mrow></m:mstyle></m:msub></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{H rSub { size 8{0} } } {}</m:annotation></m:semantics></m:math>) and the contradictory statement is called the alternate hypothesis (notation 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:msub><m:mi>H</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mi>a</m:mi></m:mrow></m:mstyle></m:msub></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{H rSub { size 8{a} } } {}</m:annotation></m:semantics></m:math>).
    </meaning>
  </definition>


 <definition id="indevents">
    <term>Independent Events</term>
    <meaning>
   The occurrence of one event has no effect on the probability of the occurrence of any other event. Events A and B are independent if one of the following is true: (1) <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mo>∣</m:mo><m:mi>B</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mo>)</m:mo></m:math>; (2) <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>B</m:mi><m:mo>∣</m:mo><m:mi>A</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>B</m:mi><m:mo>)</m:mo></m:math>; (3) <m:math><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mtext> and </m:mtext><m:mi>B</m:mi><m:mo>)</m:mo><m:mo>=</m:mo><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>A</m:mi><m:mo>)</m:mo><m:mi>P</m:mi><m:mo>(</m:mo><m:mi>B</m:mi><m:mo>)</m:mo></m:math>.
    </meaning>
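<example id="indevents1"><para id="indevents2">The conditions can be verified by direct counting on a small sample space. The sketch below assumes two fair dice and two illustrative events chosen for this example (A: the first die is even; B: the second die shows a six); the helper <code>prob</code> is a hypothetical name. It checks the conditional-probability condition and the product condition.</para>
<code type="block">
```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] % 2 == 0}   # first die shows an even number
B = {s for s in S if s[1] == 6}       # second die shows a six

p_a_given_b = prob(A.intersection(B)) / prob(B)
print(p_a_given_b == prob(A))                        # True
print(prob(A.intersection(B)) == prob(A) * prob(B))  # True
```
</code></example>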
  </definition>


 <definition id="indsamp">
    <term>Independent Samples</term>
    <meaning>
   Samples that are not related in any way.
    </meaning>
  </definition>


 <definition id="infrstats">
    <term>Inferential Statistics </term>
    <meaning>
   Also called statistical inference or inductive statistics. This facet of statistics deals with estimating a population parameter based on a sample statistic. For example, if 4 out of the 100 calculators sampled are defective, we might infer that 4 percent of the production is defective.
    </meaning>
  </definition>


 <definition id="iqr">
    <term>Interquartile Range (IQR)</term>
    <meaning>
   The distance between the third quartile and the first quartile. 
    </meaning>
  </definition>


 <definition id="intest">
    <term>Interval Estimate</term>
    <meaning>
   An interval, based on sample information, within which a population parameter probably lies.
    </meaning>
  </definition>


 <definition id="signtest">
    <term>Level of Significance of the Test </term>
    <meaning>
   Also often referred to as the <emphasis>preconceived α</emphasis> or the <emphasis>probability of a Type I error</emphasis>. The probability of rejecting the null hypothesis when the null hypothesis is, in fact, true.
    </meaning>
  </definition>


 <definition id="linregress">
    <term>Linear Regression Equation </term>
    <meaning>
   A linear equation in the form <m:math>  <m:mover>
<m:mi>y</m:mi>
<m:mo>^</m:mo>
</m:mover>
  <m:mo>=</m:mo>
  <m:mi>a</m:mi>
  <m:mo>+</m:mo>
  <m:mi>b</m:mi><m:mi>X</m:mi></m:math> that defines the relationship between two variables. It is used to predict the dependent variable <m:math><m:mi>Y</m:mi></m:math> based on a selected value of the independent variable <m:math><m:mi>X</m:mi></m:math>.
    </meaning>
  </definition>


 <definition id="mean">
    <term>Mean</term>
    <meaning>
   A number that measures central tendency (the average); short for arithmetic mean. By definition, the mean for a sample (usually denoted by 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mover accent="true"><m:mi>X</m:mi><m:mo stretchy="false">ˉ</m:mo></m:mover></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ { bar  {X}}} {}</m:annotation></m:semantics></m:math>) is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mover accent="true"><m:mi>X</m:mi><m:mo stretchy="false">ˉ</m:mo></m:mover><m:mo stretchy="false">=</m:mo><m:mfrac><m:mtext>Sum of all values in the sample</m:mtext><m:mtext>Number of values in the sample</m:mtext></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ { bar  {X}}= {  {"Sum of all values in the sample"}  over  {"Number of values in the sample"} } } {}</m:annotation></m:semantics></m:math>, and the mean for a population (usually denoted by
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>μ</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{m} {}</m:annotation></m:semantics></m:math>) is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>μ</m:mi><m:mo stretchy="false">=</m:mo><m:mfrac><m:mtext>Sum of all values in the population</m:mtext><m:mtext>Number of values in the population</m:mtext></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{m= {  {"Sum of all values in the population"}  over  {"Number of values in the population"} } } {}</m:annotation></m:semantics></m:math>.
    </meaning>
  </definition>


 <definition id="median">
    <term>Median</term>
    <meaning>
   A number that separates ordered data into halves: half the values are the same number or smaller than the median and half the values are the same number or larger than the median. The median may or may not be part of the data.
    </meaning>
  </definition>


 <definition id="mode">
    <term>Mode</term>
    <meaning>
   The value that appears most frequently in a set of data.
    </meaning>
  </definition>

 <definition id="multirule">
    <term>Multiplication Rule</term>
    <meaning>
   For any events A and B in the sample space,
      <m:math>
        <m:semantics>
          <m:mrow>
            <m:mstyle fontsize="12pt">
              <m:mrow>
                <m:mrow>
                  <m:mi>P</m:mi>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>A</m:mi>
                  <m:mstyle fontweight="bold">
                    <m:mrow>
                      <m:mtext>and</m:mtext>
                    </m:mrow>
                  </m:mstyle>
                  <m:mi>B</m:mi>
                  <m:mrow>
                    <m:mo stretchy="false">)</m:mo>
                    <m:mo stretchy="false">=</m:mo>
                    <m:mi>P</m:mi>
                  </m:mrow>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>A</m:mi>
                  <m:mo stretchy="false">∣</m:mo>
                  <m:mi>B</m:mi>
                  <m:mrow>
                    <m:mo stretchy="false">)</m:mo>
                    <m:mo stretchy="false">⋅</m:mo>
                    <m:mi>P</m:mi>
                  </m:mrow>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>B</m:mi>
                  <m:mrow>
                    <m:mo stretchy="false">)</m:mo>
                    <m:mo stretchy="false">=</m:mo>
                    <m:mi>P</m:mi>
                  </m:mrow>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>B</m:mi>
                  <m:mo stretchy="false">∣</m:mo>
                  <m:mi>A</m:mi>
                  <m:mrow>
                    <m:mo stretchy="false">)</m:mo>
                    <m:mo stretchy="false">⋅</m:mo>
                    <m:mi>P</m:mi>
                  </m:mrow>
                  <m:mo stretchy="false">(</m:mo>
                  <m:mi>A</m:mi>
                  <m:mo stretchy="false">)</m:mo>
                  <m:mtext>.</m:mtext>
                </m:mrow>
              </m:mrow>
            </m:mstyle>
            <m:mrow/>
          </m:mrow>
          <m:annotation encoding="StarMath 5.0"> size 12{P \( A bold "and"B \) =P \( A \lline B \)  cdot P \( B \) =P \( B \lline A \)  cdot P \( A \)  "." } {}</m:annotation>
        </m:semantics>
      </m:math>
    </meaning>
  </definition>


<definition id="mutex">
    <term>Mutually Exclusive</term>
    <meaning>
   An observation cannot fall into more than one class (category). Being in one category prevents being in a mutually exclusive category.
    </meaning>
  </definition>

 

<definition id="normdist">
    <term>Normal Distribution</term>
    <meaning>
   A continuous random variable (RV) with 
<m:math>
<m:mtext>pdf</m:mtext><m:mo>=</m:mo><m:mfrac>
   <m:mn>1</m:mn>
   <m:mrow>
     <m:mi>σ</m:mi>
     <m:msqrt>
       <m:mrow><m:mn>2</m:mn><m:mi>π</m:mi></m:mrow>
     </m:msqrt>
   </m:mrow>
</m:mfrac>
<m:msup>
  <m:mi>e</m:mi>
  <m:mrow>
   <m:mo>−</m:mo>
   <m:mfrac>
     <m:msup>
        <m:mrow>
           <m:mo>(</m:mo><m:mi>x</m:mi><m:mo>−</m:mo><m:mi>μ</m:mi><m:mo>)</m:mo>
        </m:mrow>
        <m:mn>2</m:mn>
     </m:msup>
     <m:mrow>
        <m:mn>2</m:mn>
        <m:msup>
           <m:mi>σ</m:mi>
           <m:mn>2</m:mn>
        </m:msup>
     </m:mrow>
   </m:mfrac>
  </m:mrow>
</m:msup>
</m:math>, where <m:math><m:mi>μ</m:mi></m:math> is the mean of the distribution and <m:math><m:mi>σ</m:mi></m:math> is its standard deviation. Notation: <m:math><m:mi>X</m:mi></m:math> ~ <m:math> <m:mi>N</m:mi>
  <m:mfenced>
    <m:mi>μ</m:mi>
    <m:msup>
      <m:mi>σ</m:mi>
      <m:mn>2</m:mn>
    </m:msup>
  </m:mfenced></m:math>. If <m:math><m:mi>μ</m:mi><m:mo>=</m:mo><m:mn>0</m:mn></m:math> and <m:math><m:mi>σ</m:mi><m:mo>=</m:mo><m:mn>1</m:mn></m:math>, the RV is said to have the <emphasis>standard normal distribution</emphasis>, and its values are called <emphasis>z-scores</emphasis>.
    </meaning>
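<example id="normdist1"><para id="normdist2">A minimal sketch of the density, with the minus sign applied to the whole squared term in the exponent. The function name <code>normal_pdf</code> and the integration grid are assumptions made for illustration. The code evaluates the peak of the standard normal density and approximates the total area under the curve with a simple Riemann sum, which should be close to 1.</para>
<code type="block">
```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal RV with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Peak of the standard normal density, 1 / sqrt(2 * pi), about 0.3989.
peak = normal_pdf(0.0, 0.0, 1.0)

# Riemann sum over [-8, 8] with step 0.001 (assumed grid).
step = 0.001
area = sum(normal_pdf(-8 + i * step, 0.0, 1.0) * step for i in range(16001))
print(round(peak, 4), round(area, 6))
```
</code></example>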
  </definition>

 

<definition id="onetailtest">
    <term>One-Tailed Test</term>
    <meaning>
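   Used when the alternate hypothesis states a direction, such as <m:math><m:msub><m:mi>H</m:mi><m:mi>a</m:mi></m:msub><m:mo>:</m:mo><m:mi>μ</m:mi><m:mo>&gt;</m:mo><m:mn>40</m:mn></m:math>. The rejection region is in only one tail (in this example, the right tail).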
    </meaning>
  </definition>

 

<definition id="or">
    <term>OR</term>
    <meaning>
   Logical operation over the subsets of a set. In statistics, if <m:math><m:mi>A</m:mi></m:math> and <m:math><m:mi>B</m:mi></m:math> are any two events (subsets in the sample space), then the event “<m:math><m:mi>A</m:mi></m:math> <emphasis>or</emphasis> <m:math><m:mi>B</m:mi></m:math>” consists of all outcomes that are in <m:math><m:mi>A</m:mi></m:math>, or in <m:math><m:mi>B</m:mi></m:math>, or in both <m:math><m:mi>A</m:mi></m:math> and <m:math><m:mi>B</m:mi></m:math>.
    </meaning>
  </definition>

 

<definition id="outcome">
    <term>Outcome (observation)</term>
    <meaning>
   A particular result of an experiment.
    </meaning>
  </definition>

 

<definition id="outlier">
    <term>Outlier</term>
    <meaning>
   An observation that does not fit the rest of the data.
    </meaning>
  </definition>

 

<definition id="parameter">
    <term>Parameter</term>
    <meaning>
   A numerical characteristic of the population. 
    </meaning>
<example id="param1"><para id="param2">The mean price to rent a 1-bedroom apartment in California.</para></example>

  </definition>

 
<definition id="pdf">
    <term>pdf</term>
    <meaning>
   see <term src="#pdffn"> Probability Density Function</term>
    </meaning>
  </definition>

 
<definition id="pdf2">
    <term>PDF</term>
    <meaning>
   see <term src="#pdfelab"> Probability Distribution Function</term>
    </meaning>
  </definition>

<definition id="percentile">
    <term>Percentile</term>
    <meaning>
 A number that divides ordered data into hundredths; each pair of consecutive percentiles encloses 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mfrac><m:mn>1</m:mn><m:mtext>100</m:mtext></m:mfrac></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ {  {1}  over  {"100"} } } {}</m:annotation></m:semantics></m:math> of the data. </meaning>

<example id="prtil1"><para id="prtil2">
Let a data set contain 200 ordered observations starting with 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mo stretchy="false">{</m:mo><m:mrow><m:mn>2</m:mn><m:mtext>.</m:mtext><m:mn>3,2</m:mn><m:mtext>.</m:mtext><m:mn>7,2</m:mn><m:mtext>.</m:mtext><m:mn>8,2</m:mn><m:mtext>.</m:mtext><m:mn>9,2</m:mn><m:mtext>.</m:mtext><m:mn>9,3</m:mn><m:mtext>.</m:mtext><m:mn>0</m:mn><m:mtext>.</m:mtext><m:mtext>.</m:mtext><m:mtext>.</m:mtext></m:mrow><m:mo stretchy="false">}</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ lbrace 2 "." 3,2 "." 7,2 "." 8,2 "." 9,2 "." 9,3 "." 0 "."  "."  "."  rbrace } {}</m:annotation></m:semantics></m:math>. Then the first percentile is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mrow><m:mfrac><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mtext>.</m:mtext><m:mrow><m:mn>7</m:mn><m:mo stretchy="false">+</m:mo><m:mn>2</m:mn></m:mrow><m:mtext>.</m:mtext><m:mn>8</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mn>2</m:mn></m:mfrac><m:mo stretchy="false">=</m:mo><m:mn>2</m:mn></m:mrow><m:mtext>.</m:mtext><m:mtext>75</m:mtext></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ {  { \( 2 "." 7+2 "." 8 \) }  over  {2} } =2 "." "75"} {}</m:annotation></m:semantics></m:math>, because 1% of the data is to the left of this point on the number line and 99% of the data is on its right. The second percentile is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mrow><m:mfrac><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mtext>.</m:mtext><m:mrow><m:mn>9</m:mn><m:mo stretchy="false">+</m:mo><m:mn>2</m:mn></m:mrow><m:mtext>.</m:mtext><m:mn>9</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mn>2</m:mn></m:mfrac><m:mo stretchy="false">=</m:mo><m:mn>2</m:mn></m:mrow><m:mtext>.</m:mtext><m:mn>9</m:mn></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ {  { \( 2 "." 9+2 "." 9 \) }  over  {2} } =2 "." 9} {}</m:annotation></m:semantics></m:math>, separating the lowest 2% of the data from the rest. Percentiles may or may not be part of the data. (In this example, the first percentile is not in the data, but the second percentile is.) The median of the data is the second quartile and is also the 50th percentile. The first and third quartiles are the 25th and 75th percentiles, respectively.
    </para></example>


  </definition>

 
<definition id="pointest">
    <term>Point Estimate</term>
    <meaning>
 A single number computed from a sample and used to estimate a population parameter. 
    </meaning>
  </definition>

 
<definition id="poisson">
    <term>Poisson Distribution</term>
    <meaning>
 A discrete random variable (RV) is the number of times a certain event will occur in a specific period of time, or in specific area, or any other units of measurement. The characteristics of the variable are: the probability that an event occurs in a given unit is the same for all units and doesn’t depend on the number of event that occurs in the other units. The distribution is completely defined by the mean number μ of event in the unit interval of measurement. The notation is: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>X</m:mi><m:mtext>~</m:mtext><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mi>μ</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X "~" P \( μ \) } {}</m:annotation></m:semantics></m:math>; the domain is whole numbers, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mo stretchy="false">{</m:mo><m:mrow><m:mn>0,1,2,</m:mn><m:mtext>.</m:mtext><m:mtext>.</m:mtext><m:mtext>.</m:mtext></m:mrow><m:mo stretchy="false">}</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{ lbrace 0,1,2, "."  "."  "."  rbrace } {}</m:annotation></m:semantics></m:math>; the mean is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>μ</m:mi><m:mo stretchy="false">=</m:mo><m:mstyle fontstyle="italic"><m:mrow><m:mtext>np</m:mtext></m:mrow></m:mstyle></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{μ= ital "np"} {}</m:annotation></m:semantics></m:math>, and the variance is 
<m:math>    <m:msup>
    <m:mi>σ</m:mi>
    <m:mn>2</m:mn>
  </m:msup>
  <m:mo>=</m:mo>
  <m:mi>μ</m:mi> </m:math>; the probability of exactly <m:math><m:mi>x</m:mi></m:math> occurrences in the unit interval is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">=</m:mo><m:mi>x</m:mi></m:mrow><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:msup><m:mi>e</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mrow><m:mo stretchy="false">−</m:mo><m:mi>μ</m:mi></m:mrow></m:mrow></m:mstyle></m:msup></m:mrow><m:mfrac><m:msup><m:mi>μ</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mi>x</m:mi></m:mrow></m:mstyle></m:msup><m:mrow><m:mi>x</m:mi><m:mi>!</m:mi></m:mrow></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \( X=x \) =e rSup { size 8{ - μ} }  {  {μ rSup { size 8{x} } }  over  {x!} } } {}</m:annotation></m:semantics></m:math>. The Poisson distribution is often used to approximate the binomial distribution when <m:math><m:mi>n</m:mi></m:math> is “large” and <m:math><m:mi>p</m:mi></m:math> is “small” (a general rule is that <m:math><m:mi>n</m:mi></m:math> should be equal to or greater than 20 and <m:math><m:mi>p</m:mi></m:math> equal to or less than 0.05).
    </meaning>
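<example id="poissonsketch"><para id="poissonsketch1">
The formula above can be checked numerically. The following is a minimal Python sketch, not part of the original glossary: the function names <code>poisson_pmf</code> and <code>binomial_pmf</code> and the values n = 100 and p = 0.03 are hypothetical choices that satisfy the rule of thumb. It prints the binomial probabilities next to their Poisson approximations with μ = np.
</para>
<code type="block">
```python
import math

def poisson_pmf(x, mu):
    """P(X = x) for X ~ P(mu): e^(-mu) * mu^x / x!"""
    return math.exp(-mu) * mu ** x / math.factorial(x)

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical values satisfying the rule of thumb
# (n at least 20, p at most 0.05).
n, p = 100, 0.03
mu = n * p  # mean of the approximating Poisson distribution

# Print x, the exact binomial probability, and the Poisson approximation.
for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, mu), 4))
```
</code>
<para id="poissonsketch2">
With these assumed values the two columns should agree to roughly two decimal places; the agreement generally improves as n grows and p shrinks with np held fixed.
</para></example>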
  </definition>

 
<definition id="population">
    <term>Population</term>
    <meaning>
 The collection, or set, of all individuals, objects, or measurements whose properties are being studied.
    </meaning>
  </definition>

 
<definition id="prealpha">
    <term>Preconceived <m:math><m:ci>α</m:ci></m:math></term>
    <meaning>
 The probability of rejecting the null hypothesis when the null hypothesis is true. This probability is also often referred to as the probability of a Type I error or the level of significance of the test.
    </meaning>
  </definition>

 
<definition id="pdffn">
    <term>Probability Density Function (pdf)</term>
    <meaning>
 A mathematical description of a continuous random variable (RV). For a continuous RV, the probability of any specific value in the domain is zero: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">=</m:mo><m:mi>x</m:mi></m:mrow><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mn>0</m:mn></m:mrow></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \( X=x \) =0} {}</m:annotation></m:semantics></m:math>. Therefore, a probability distribution function (PDF) gives no information about the behavior of a continuous RV. Instead, the description of a continuous RV is given by the probability density function (pdf). By definition, a pdf is any nonnegative function f(x) over the real numbers such that the area bounded above by f(x), below by the x-axis, and on the right by the vertical line 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">=</m:mo><m:mi>x</m:mi></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X=x} {}</m:annotation></m:semantics></m:math> equals
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">≤</m:mo><m:mi>x</m:mi></m:mrow><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \( X &lt;= x \) } {}</m:annotation></m:semantics></m:math>.
    </meaning>
  </definition>



 <definition id="pdfelab">
    <term>Probability Distribution Function (PDF)</term>
    <meaning>
   A mathematical description of a discrete random variable (RV), given either in the form of an equation (formula) or in the form of a table listing all the possible outcomes of an experiment and the probability associated with each outcome. 
</meaning>

<example id="pdfer1"><para id="pdfer2">
A biased coin with probability 0.7 of heads is tossed 5 times. We are interested in the number of heads (that is, the RV <m:math><m:mi>X</m:mi></m:math>  = the number of heads). <m:math><m:mi>X</m:mi></m:math>  is a binomial RV, so <m:math><m:mi>X</m:mi></m:math> ∼<m:math><m:mi>B</m:mi>
  <m:mfenced>
    <m:mn>5</m:mn>
    <m:mrow>
      <m:mo>.</m:mo>
      <m:mn>7</m:mn>
    </m:mrow>
  </m:mfenced></m:math>, and its PDF can be given by the formula 
<m:math>   <m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mi>X</m:mi>
  <m:mo>=</m:mo>
  <m:mi>x</m:mi>
  <m:mo>)</m:mo>
  <m:mo>=</m:mo></m:math> <m:math>  <m:mfenced>
    <m:mtable>
      <m:mtr>
        <m:mtd>
          <m:mn>5</m:mn>
        </m:mtd>
      </m:mtr>
      <m:mtr>
        <m:mtd>
          <m:mi>x</m:mi>
        </m:mtd>
      </m:mtr>
    </m:mtable>
  </m:mfenced>
  <m:msup>
    <m:mrow>
      <m:mo>.</m:mo>
      <m:mn>7</m:mn>
    </m:mrow>
    <m:mi>x</m:mi>
  </m:msup>
  <m:msup>
    <m:mrow>
      <m:mo>.</m:mo>
      <m:mn>3</m:mn>
    </m:mrow>
    <m:mrow>
      <m:mn>5</m:mn>
      <m:mo>−</m:mo>
      <m:mi>x</m:mi>
    </m:mrow>
  </m:msup></m:math>, or in the form of a table.
<table id="id4500004" frame="none">
      <tgroup cols="2" colsep="1" rowsep="0">
        <colspec colnum="1" colname="c1"/>
        <colspec colnum="2" colname="c2"/>
<thead>        
  <row rowsep="1">
            <entry><m:math><m:mi>x</m:mi></m:math> </entry>
            <entry><m:math>   <m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mi>X</m:mi>
  <m:mo>=</m:mo>
  <m:mi>x</m:mi>
  <m:mo>)</m:mo>
</m:math></entry>
          </row>
</thead>        
<tbody>

          <row>
            <entry>0</entry>
            <entry>0.0024</entry>
          </row>
          <row>
            <entry>1</entry>
            <entry>0.0284</entry>
          </row>
          <row>
            <entry>2</entry>
            <entry>0.1323</entry>
          </row>
          <row>
            <entry>3</entry>
            <entry>0.3087</entry>
          </row>
          <row>
            <entry>4</entry>
            <entry>0.3602</entry>
          </row>
          <row>
            <entry>5</entry>
            <entry>0.1681</entry>
          </row>
        </tbody>
      </tgroup>
    </table> 
    </para></example>
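<example id="pdfsketch"><para id="pdfsketch1">
The table above can be reproduced directly from the formula. The following Python sketch is an illustration added here, not part of the original example; the names <code>binomial_pmf</code> and <code>table</code> are hypothetical. It evaluates the binomial formula for each value of x from 0 to 5 and prints the probabilities to four decimal places.
</para>
<code type="block">
```python
import math

# Five tosses, probability of heads 0.7 (values taken from the example above).
n, p = 5, 0.7

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# The PDF in table form: one entry per possible number of heads.
table = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in table.items():
    print(x, format(prob, ".4f"))
```
</code>
<para id="pdfsketch2">
Because the table lists every possible outcome, its probabilities should sum to 1, which is a quick check on any PDF given in table form.
</para></example>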
  </definition>

<definition id="probdistr">
    <term>Probability Distribution</term>
    <meaning>
The common name for <emphasis>Probability Density Function (pdf)</emphasis> and <emphasis>Probability Distribution Function (PDF)</emphasis>.
    </meaning>
  </definition>

<definition id="prob">
    <term>Probability</term>
    <meaning>
A number between 0 and 1, inclusive, that gives the likelihood that a specific event will occur. More precisely, the foundations of probability are given by the following three axioms (A. N. Kolmogorov, 1930s): Let <m:math><m:mi>S</m:mi></m:math>  denote the sample space, and let <m:math><m:mi>A</m:mi></m:math>  and <m:math><m:mi>B</m:mi></m:math>  be any two events in <m:math><m:mi>S</m:mi></m:math>. Then: (1). 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mrow><m:mn>0</m:mn><m:mo stretchy="false">≤</m:mo><m:mi>P</m:mi></m:mrow><m:mo stretchy="false">(</m:mo><m:mi>A</m:mi><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">≤</m:mo><m:mn>1</m:mn></m:mrow><m:mi>;</m:mi></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{0 &lt;= P \( A \)  &lt;= 1;} {}</m:annotation></m:semantics></m:math> (2). If <m:math><m:mi>A</m:mi></m:math>  and <m:math><m:mi>B</m:mi></m:math>  are any two mutually exclusive events, then <m:math>  <m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mi>A</m:mi>
  <m:mi>or</m:mi>
  <m:mi>B</m:mi>
  <m:mo>)</m:mo>
  <m:mo>=</m:mo>
  <m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mi>A</m:mi>
  <m:mo>)</m:mo>
  <m:mo>+</m:mo>
  <m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mi>B</m:mi>
  <m:mo>)</m:mo>
  <m:mo>;</m:mo></m:math> (3). <m:math><m:mi>P</m:mi>
  <m:mo>(</m:mo>
  <m:mi>S</m:mi>
  <m:mo>)</m:mo>
  <m:mo>=</m:mo>
  <m:mn>1</m:mn></m:math> .
    </meaning>
  </definition>

<definition id="proportion">
    <term>Proportion</term>
    <meaning>
Given a binomial random variable (RV), <m:math><m:mi>X</m:mi></m:math> ∼<m:math>  <m:mi>B</m:mi>
  <m:mfenced>
    <m:mi>n</m:mi>
    <m:mi>p</m:mi>
  </m:mfenced></m:math>, consider the ratio of the number <m:math><m:mi>X</m:mi></m:math> of successes in <m:math><m:mi>n</m:mi></m:math> Bernoulli trials to the number <m:math><m:mi>n</m:mi></m:math> of trials, <m:math>  <m:mi>P</m:mi>
  <m:mo>'</m:mo>
  <m:mo>=</m:mo>
  <m:mfrac>
    <m:mi>X</m:mi>
    <m:mi>n</m:mi>
  </m:mfrac></m:math>. This new RV is called a proportion, and if the number of trials, <m:math><m:mi>n</m:mi></m:math>, is large enough, <m:math><m:mi>P'</m:mi></m:math> ∼<m:math> <m:mi>N</m:mi>
  <m:mfenced>
    <m:mi>p</m:mi>
    <m:mfrac>
      <m:mi>pq</m:mi>
      <m:mi>n</m:mi>
    </m:mfrac>
  </m:mfenced></m:math>.
    </meaning>
  </definition>

<definition id="pvalue">
    <term>p-value</term>
    <meaning>
The probability that an outcome at least as extreme as the one observed would happen purely by chance, assuming the null hypothesis is true. The smaller the p-value, the stronger the evidence against the null hypothesis.
    </meaning>
  </definition>

<definition id="qual">
    <term>Qualitative Data</term>
    <meaning>
see <term src="#data">Data</term>.
    </meaning>
  </definition>

<definition id="quant">
   <term>Quantitative Data</term>
    <meaning>
 see <term src="#data">Data</term>.
    </meaning>
  </definition>

<definition id="quartiles">
    <term>Quartiles</term>
    <meaning>
The numbers that separate the data into quarters. Quartiles may or may not be part of the data. The second quartile is the median of the data.
    </meaning>
  </definition>

<definition id="range">
    <term>Range</term>
    <meaning>
Difference between the highest and lowest values: Range = Highest value – Lowest value.
    </meaning>
  </definition>

<definition id="relfreq">
    <term>Relative Frequency</term>
    <meaning>
The ratio of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes.
    </meaning>
  </definition>

<definition id="randvar">
    <term>Random Variable (RV)</term>
    <meaning>
see <term src="#variable">Variable</term>.
    </meaning>
  </definition>


<definition id="samplesp">
    <term>Sample Space</term>
    <meaning>
The set of all possible outcomes of an experiment.
    </meaning>
  </definition>


<definition id="sample">
    <term>Sample</term>
    <meaning>
A portion of the population under study. A sample is representative if it characterizes the population being studied.
    </meaning>
  </definition>


<definition id="samperr">
    <term>Sample Error</term>
    <meaning>
The difference between a sample statistic and the corresponding population parameter that can be attributed to sampling (to chance).
    </meaning>
  </definition>

<definition id="sampling">
    <term>Sampling</term>
    <meaning>
A procedure for gathering information about an entire population. The more popular procedures are systematic sampling, simple random sampling, stratified sampling, and cluster sampling.
    </meaning>
  </definition>

<definition id="scattdia">
    <term>Scatter Diagram</term>
    <meaning>
A chart that visually depicts the relationship between two variables.
    </meaning>
  </definition>

<definition id="simrandsamp">
    <term>Simple Random Sampling</term>
    <meaning>
A sampling scheme in which every member of the population has the same chance of being selected.
    </meaning>
  </definition>

<definition id="spaddrule">
    <term>Special Rule for Addition</term>
    <meaning>
For this rule to apply the events must be mutually exclusive: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mstyle fontstyle="italic"><m:mrow><m:mtext>A or B</m:mtext></m:mrow></m:mstyle><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mi>P</m:mi></m:mrow><m:mo stretchy="false">(</m:mo><m:mi>A</m:mi><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">+</m:mo><m:mi>P</m:mi></m:mrow><m:mo stretchy="false">(</m:mo><m:mi>B</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \(  ital "A or B" \) =P \( A \) +P \( B \) } {}</m:annotation></m:semantics></m:math>.
    </meaning>
  </definition>

<definition id="spmultrule">
    <term>Special Rule of Multiplication</term>
    <meaning>
For this rule to apply the events must be independent:
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mstyle fontstyle="italic"><m:mrow><m:mtext>A and B</m:mtext></m:mrow></m:mstyle><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mi>P</m:mi></m:mrow><m:mo stretchy="false">(</m:mo><m:mi>A</m:mi><m:mo stretchy="false">)</m:mo><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mi>B</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \(  ital "A and B" \) =P \( A \) P \( B \) } {}</m:annotation></m:semantics></m:math>.
    </meaning>
  </definition>

<definition id="stddev">
    <term>Standard Deviation</term>
    <meaning>
A number that is equal to the square root of the variance and measures how far data values are from their mean. Notation: s for the sample standard deviation and <m:math><m:ci>σ</m:ci></m:math> for the population standard deviation.
    </meaning>
  </definition>

<definition id="stdmean">
    <term>Standard Error of the Mean</term>
    <meaning>
The standard deviation of the distribution of the sample means, 
<m:math>  <m:mfrac>
    <m:mi>σ</m:mi>
    <m:msqrt>
      <m:mi>n</m:mi>
    </m:msqrt>
  </m:mfrac></m:math>.
    </meaning>
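<example id="stdmeansketch"><para id="stdmeansketch1">
As a brief illustration of the formula above, the following Python sketch computes the standard error for several sample sizes. The function name <code>standard_error</code> and the values σ = 12 and n = 9, 36, 144 are hypothetical and not taken from the text.
</para>
<code type="block">
```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation and sample sizes.
sigma = 12.0
for n in (9, 36, 144):
    print(n, standard_error(sigma, n))
```
</code>
<para id="stdmeansketch2">
Each fourfold increase in the sample size halves the standard error, because n enters the formula under a square root.
</para></example>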
  </definition>

<definition id="nrmdist">
    <term>Standard Normal Distribution</term>
    <meaning>
A continuous random variable (RV) 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>X</m:mi><m:mtext>~</m:mtext><m:mi>N</m:mi><m:mo stretchy="false">(</m:mo><m:mn>0,1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X "~" N \( 0,1 \) } {}</m:annotation></m:semantics></m:math>. When X follows the standard normal distribution, it is often denoted as 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>Z</m:mi><m:mtext>~</m:mtext><m:mi>N</m:mi><m:mo stretchy="false">(</m:mo><m:mn>0,1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{Z "~" N \( 0,1 \) } {}</m:annotation></m:semantics></m:math>.
    </meaning>
  </definition>

<definition id="stat">
    <term>Statistic</term>
    <meaning>
A numerical characteristic of the sample. A statistic estimates the corresponding population parameter. For example, the average number of full-time students in a 7:30 a.m. class for this term (statistic) is an estimate for the average number of full-time students in any class this term (parameter).
    </meaning>
  </definition>

<definition id="stats">
    <term>Statistics</term>
    <meaning>
The science of collecting, organizing, analyzing, and interpreting numerical data.
    </meaning>
  </definition>

<definition id="stratrandsamp">
    <term>Stratified Random Sampling</term>
    <meaning>
A population is divided into groups (called strata) and then a sample is selected from each stratum.
    </meaning>
  </definition>

<definition id="studenttdist">
    <term>Student-<emphasis>t</emphasis> Distribution</term>
    <meaning>
Investigated and reported by William S. Gosset in 1908 and published under the pseudonym Student. The major characteristics of the random variable (RV) are: 

<list type="bulleted" id="tdist1"><item>It is continuous and can assume any real value. </item><item>The pdf is symmetrical about its mean of zero. However, it is more spread out and flatter at the apex than the normal distribution. </item><item>  It approaches the standard normal distribution as n gets larger. </item><item>  There is a "family" of t distributions: each member of the family is completely defined by its number of degrees of freedom, which is one less than the number of data values.</item></list>

    </meaning>
  </definition>

<definition id="systsamp">
    <term>Systematic Sampling</term>
    <meaning>
A population is arranged in some standard list (for example, alphabetically) and then every m-th (for example, every fifth) member of the list is taken into the sample, starting from a randomly chosen initial member.
    </meaning>
  </definition>

<definition id="tstat">
    <term><emphasis>t</emphasis> statistic</term>
    <meaning>
A statistic calculated from the data according to the Student-t distribution and used to conduct the test and make a statistical inference about the whole population. If the data contain n observations, then the number of degrees of freedom of the Student-t distribution is n-1. The t statistic is used, for example, when the population standard deviation is unknown, when n is small, and when samples are dependent.
    </meaning>
  </definition>

<definition id="teststat">
    <term>Test Statistic</term>
    <meaning>
A value calculated from the sample that is used to conduct the test and make a statistical inference about the whole population. The calculation depends on the choice of an appropriate distribution, which is often reflected in the name of the statistic: z-score, <emphasis>t-</emphasis>statistic, <emphasis>F-</emphasis>statistic, etc.
    </meaning>
  </definition>

<definition id="treediagram">
    <term>Tree Diagram</term>
    <meaning>
A useful visual representation of a sample space and events in the form of a “tree” whose branches are labeled with the possible outcomes together with their associated probabilities (frequencies, relative frequencies).
    </meaning>
  </definition>


<definition id="type1err">
    <term>Type 1 Error</term>
    <meaning>
The decision is to reject the null hypothesis when, in fact, the null hypothesis is true.
    </meaning>
  </definition>


<definition id="type2err">
    <term>Type 2 Error</term>
    <meaning>
The decision is not to reject the null hypothesis when, in fact, the null hypothesis is false.
    </meaning>
  </definition>


<definition id="unidist">
    <term>Uniform Distribution</term>
    <meaning>
A continuous random variable (RV) that has equally likely outcomes over the domain, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mrow><m:mi>a</m:mi><m:mo stretchy="false">&lt;</m:mo><m:mi>x</m:mi></m:mrow><m:mo stretchy="false">&lt;</m:mo><m:mi>b</m:mi></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{a&lt;x&lt;b} {}</m:annotation></m:semantics></m:math>. Often referred to as the <emphasis>Rectangular distribution</emphasis> because the graph of its pdf has the form of a rectangle. Notation: 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>X</m:mi><m:mtext>~</m:mtext><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:mi>a</m:mi><m:mi>,</m:mi><m:mi>b</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X "~" U \( a,b \) } {}</m:annotation></m:semantics></m:math>. The mean is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>μ</m:mi><m:mo stretchy="false">=</m:mo><m:mfrac><m:mrow><m:mi>a</m:mi><m:mo stretchy="false">+</m:mo><m:mi>b</m:mi></m:mrow><m:mn>2</m:mn></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{μ= {  {a+b}  over  {2} } } {}</m:annotation></m:semantics></m:math>, and the variance is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:msup><m:mi>σ</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mn>2</m:mn></m:mrow></m:mstyle></m:msup><m:mo stretchy="false">=</m:mo><m:mfrac><m:mrow><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>b</m:mi><m:mo stretchy="false">−</m:mo><m:mi>a</m:mi></m:mrow><m:msup><m:mo stretchy="false">)</m:mo><m:mstyle fontsize="8pt"><m:mrow><m:mn>2</m:mn></m:mrow></m:mstyle></m:msup></m:mrow><m:mtext>12</m:mtext></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{s rSup { size 8{2} } = {  { \( b-a \)  rSup { size 8{2} } }  over  {"12"} } } {}</m:annotation></m:semantics></m:math>, the probability density function is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>f</m:mi><m:mo stretchy="false">(</m:mo><m:mi>x</m:mi><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mfrac><m:mn>1</m:mn><m:mrow><m:mi>b</m:mi><m:mo stretchy="false">−</m:mo><m:mi>a</m:mi></m:mrow></m:mfrac></m:mrow><m:mi>,</m:mi><m:mtext/><m:mrow><m:mrow><m:mi>a</m:mi><m:mo stretchy="false">≤</m:mo><m:mi>x</m:mi></m:mrow><m:mo stretchy="false">≤</m:mo><m:mi>b</m:mi></m:mrow></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{f \( x \) = {  {1}  over  {b-a} } ,"   "a &lt;= x &lt;= b} {}</m:annotation></m:semantics></m:math>, and the cumulative distribution is 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:mrow><m:mi>X</m:mi><m:mo stretchy="false">≤</m:mo><m:mi>x</m:mi></m:mrow><m:mrow><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">=</m:mo><m:mfrac><m:mrow><m:mi>x</m:mi><m:mo stretchy="false">−</m:mo><m:mi>a</m:mi></m:mrow><m:mrow><m:mi>b</m:mi><m:mo stretchy="false">−</m:mo><m:mi>a</m:mi></m:mrow></m:mfrac></m:mrow></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{P \( X &lt;= x \) = {  {x-a}  over  {b-a} } } {}</m:annotation></m:semantics></m:math>.
    </meaning>
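<example id="unisketch"><para id="unisketch1">
The four formulas above translate directly into code. The following Python sketch is an added illustration; the function names and the endpoints a = 2 and b = 10 are hypothetical. The cumulative distribution is clamped to 0 below a and to 1 above b, which is an assumption added here because the formula in the definition applies only between a and b.
</para>
<code type="block">
```python
def uniform_mean(a, b):
    """Mean of X ~ U(a, b): (a + b) / 2."""
    return (a + b) / 2

def uniform_variance(a, b):
    """Variance of X ~ U(a, b): (b - a)^2 / 12."""
    return (b - a) ** 2 / 12

def uniform_pdf(a, b):
    """Height of the rectangle: the constant density 1 / (b - a)."""
    return 1 / (b - a)

def uniform_cdf(x, a, b):
    """P(X is at most x) = (x - a) / (b - a) between a and b.

    Assumption added for this sketch: the value is clamped
    to 0 below a and to 1 above b.
    """
    return min(1.0, max(0.0, (x - a) / (b - a)))

# Hypothetical endpoints.
a, b = 2, 10
print(uniform_mean(a, b))       # 6.0
print(uniform_variance(a, b))   # 64 / 12
print(uniform_pdf(a, b))        # 0.125
print(uniform_cdf(4, a, b))     # 0.25
```
</code>
<para id="unisketch2">
The last value can be read off the rectangle: the area to the left of x = 4 is the width 4 - 2 = 2 times the height 0.125.
</para></example>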
  </definition>


<definition id="variable">
    <term>Variable (Random Variable)</term>
    <meaning>
A characteristic of interest in a population being studied. The common notation for variables is upper-case Latin letters 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math>, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>Y</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{Y} {}</m:annotation></m:semantics></m:math>, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>Z</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{Z} {}</m:annotation></m:semantics></m:math>,...; the common notation for a specific value from the domain (the set of all possible values of a variable) is lower-case Latin letters 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>x</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{x} {}</m:annotation></m:semantics></m:math>, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>y</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{y} {}</m:annotation></m:semantics></m:math>, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>z</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{z} {}</m:annotation></m:semantics></m:math>,.... For example, if 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math> is the number of children in a family, then the domain is {0, 1, 2, ..., 20} and 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>x</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{x} {}</m:annotation></m:semantics></m:math> represents any integer from 0 to 20. A variable in statistics differs from a variable in intermediate algebra in the following two ways. 

<list type="bulleted" id="arrvee">
<item> The domain of a random variable (RV) is not necessarily a numerical set; it can be a set of words. For example, if 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math> = hair color, then the domain is {black, blond, gray, green, orange}. </item><item> We can tell which specific value 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>x</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{x} {}</m:annotation></m:semantics></m:math> the variable 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>X</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X} {}</m:annotation></m:semantics></m:math> takes only after performing the experiment. </item></list>Before the experiment, any value from the domain is possible. For example, without an ultrasound we cannot tell the gender of a baby that is about to be delivered, but after delivery the gender is evident. More precisely, every value from the domain is accompanied by some number
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>p</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{p} {}</m:annotation></m:semantics></m:math>, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mrow><m:mn>0</m:mn><m:mo stretchy="false">≤</m:mo><m:mi>p</m:mi></m:mrow><m:mo stretchy="false">≤</m:mo><m:mn>1</m:mn></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{0 &lt;= p &lt;= 1} {}</m:annotation></m:semantics></m:math>, that characterizes the chance to have this value as an outcome of the experiment. In the example with gender, 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>p</m:mi><m:mo stretchy="false">=</m:mo><m:mfrac><m:mn>1</m:mn><m:mn>2</m:mn></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{p= {  {1}  over  {2} } } {}</m:annotation></m:semantics></m:math>. That is why statisticians use the more exact name <emphasis>“Random variable” (RV)</emphasis> instead of variable. Moreover, they use the word “distribution” when referring to the RV, meaning the pairing (value, probability of the value). 
    </meaning>
  </definition>



<definition id="variance">
    <term>Variance</term>
    <meaning>
Mean of the squared deviations from the mean. Square of the standard deviation.
    </meaning>
  </definition>



<definition id="vendiagram">
    <term>Venn Diagram</term>
    <meaning>
A useful visual representation of a sample space and events in the form of circles or ovals showing their intersections.
    </meaning>
  </definition>



<definition id="zscore">
    <term>z-score</term>
    <meaning>
Consider the linear transformation of the form 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>z</m:mi><m:mo stretchy="false">=</m:mo><m:mfrac><m:mrow><m:mi>x</m:mi><m:mo stretchy="false">−</m:mo><m:mi>μ</m:mi></m:mrow><m:mi>σ</m:mi></m:mfrac></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{z= {  {x-μ}  over  {σ} } } {}</m:annotation></m:semantics></m:math>. If this transformation is applied to any normal distribution
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>X</m:mi><m:mtext>~</m:mtext><m:mi>N</m:mi><m:mo stretchy="false">(</m:mo><m:mi>μ</m:mi><m:mi>,</m:mi><m:msup><m:mi>σ</m:mi><m:mstyle fontsize="8pt"><m:mrow><m:mn>2</m:mn></m:mrow></m:mstyle></m:msup><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{X "~" N \( μ,σ rSup { size 8{2} }  \) } {}</m:annotation></m:semantics></m:math>, the result is the standard normal distribution 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mrow><m:mi>Z</m:mi><m:mtext>~</m:mtext><m:mi>N</m:mi><m:mo stretchy="false">(</m:mo><m:mn>0,1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{Z "~" N \( 0,1 \) } {}</m:annotation></m:semantics></m:math>. If this transformation is applied to any specific value 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>x</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{x} {}</m:annotation></m:semantics></m:math> of RV with mean 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>μ</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{μ} {}</m:annotation></m:semantics></m:math> and standard deviation 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>σ</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{σ} {}</m:annotation></m:semantics></m:math>, the result is called the z-score of 
<m:math><m:semantics><m:mrow><m:mstyle fontsize="12pt"><m:mrow><m:mi>x</m:mi></m:mrow></m:mstyle><m:mrow/></m:mrow><m:annotation encoding="StarMath 5.0"> size 12{x} {}</m:annotation></m:semantics></m:math>. The z-score makes it possible to compare data that are normally distributed but scaled differently.
    </meaning>
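<example id="zscoresketch"><para id="zscoresketch1">
The comparison described above can be sketched in a few lines of Python. This is an added illustration: the function name <code>z_score</code> and the two sets of exam figures are hypothetical, and both score distributions are assumed to be normal.
</para>
<code type="block">
```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical exam A: mean 70, standard deviation 8, observed score 82.
z_a = z_score(82, 70, 8)

# Hypothetical exam B: mean 500, standard deviation 120, observed score 650.
z_b = z_score(650, 500, 120)

print(z_a)  # 1.5
print(z_b)  # 1.25
```
</code>
<para id="zscoresketch2">
Although the raw scores are on different scales, the z-scores show that the score on exam A lies farther above its mean (1.5 standard deviations) than the score on exam B (1.25 standard deviations).
</para></example>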
  </definition>



</glossary>
</document>
