<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/technology/cnxml/schema/dtd/0.5/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:bib="http://bibtexml.sf.net/" xmlns:m="http://www.w3.org/1998/Math/MathML" id="new">
  <name>Linear Regression and Correlation: Summary</name>
  <metadata>
  <md:version>1.3</md:version>
  <md:created>2008/06/23 16:50:14 GMT-5</md:created>
  <md:revised>2008/07/15 11:49:04.445 GMT-5</md:revised>
  <md:authorlist>
      <md:author id="billowsky">
      <md:firstname>Barbara</md:firstname>
      
      <md:surname>Illowsky</md:surname>
      <md:email>illowskybarbara@deanza.edu</md:email>
    </md:author>
      <md:author id="sdean">
      <md:firstname>Susan</md:firstname>
      
      <md:surname>Dean</md:surname>
      <md:email>deansusan@deanza.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="cnxorg">
      <md:firstname/>
      
      <md:surname>Connexions</md:surname>
      <md:email>cnx@cnx.org</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist>
    <md:keyword>elementary</md:keyword>
    <md:keyword>statistics</md:keyword>
  </md:keywordlist>

  <md:abstract>This module summarizes Linear Regression and Correlation as part of the Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.</md:abstract>
</metadata>
  <content>
    <para id="delete_me"><emphasis>Bivariate Data:</emphasis> Each data point has two values. The form is <m:math><m:mo>(</m:mo><m:mi>x</m:mi><m:mo>,</m:mo><m:mi>y</m:mi><m:mo>)</m:mo></m:math>.</para><para id="element-590"><emphasis>Line of Best Fit or Least Squares Line (LSL):</emphasis>
<m:math>
<m:mover><m:mi>y</m:mi><m:mo>^</m:mo></m:mover>
<m:mo>=</m:mo>
<m:mi>a</m:mi>
<m:mo>+</m:mo>
<m:mi>b</m:mi><m:mi>x</m:mi>
</m:math>
</para><para id="element-19"><m:math><m:mi>x</m:mi></m:math> = independent variable; 
<m:math><m:mi>y</m:mi></m:math> = dependent variable</para><para id="element-806"><emphasis>Residual:</emphasis> <m:math><m:mtext>Actual y value</m:mtext><m:mo>-</m:mo><m:mtext>predicted y value</m:mtext><m:mo>=</m:mo><m:mi>y</m:mi><m:mo>-</m:mo><m:mover><m:mi>y</m:mi><m:mo>^</m:mo></m:mover></m:math></para><list id="element-274" type="enumerated"><name>Correlation Coefficient r:</name><item>Measures the strength and direction of the linear relationship between <m:math><m:mi>x</m:mi></m:math> and <m:math><m:mi>y</m:mi></m:math>; used to judge whether the line of best fit is good for prediction.</item>
<item>Between -1 and 1 inclusive. The closer <m:math><m:mi>r</m:mi></m:math> is to 1 or -1, the closer the original points are to a straight line.</item>
<item>If <m:math><m:mi>r</m:mi></m:math> is negative, the slope is negative. If <m:math><m:mi>r</m:mi></m:math> is positive, the slope is positive.</item>
<item>If <m:math><m:mi>r</m:mi><m:mo>=</m:mo><m:mn>0</m:mn></m:math>, then the line of best fit is horizontal (its slope is 0) and the data show no linear correlation.</item></list><para id="element-745"><emphasis>Sum of Squared Errors (SSE):</emphasis> The sum of the squared residuals. The smaller the <emphasis>SSE</emphasis>, the better the line of best fit fits the original set of
points.</para><para id="element-802"><emphasis>Outlier:</emphasis> A point that does not seem to fit the rest of the data.</para>
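    <para id="element-sketch"><emphasis>Worked sketch (illustrative):</emphasis> The quantities summarized in this module can be computed directly from bivariate data. What follows is a minimal Python sketch, not part of the original module. The function name <code>least_squares_summary</code> and the five data points are hypothetical, chosen only for illustration. The sketch assumes the standard least squares formulas: the slope b is the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept a is chosen so that the line passes through the point of means.
<code type="block">
```python
import math


def least_squares_summary(points):
    """Summarize bivariate data [(x, y), ...] with a least squares line.

    Returns the intercept a, slope b, correlation coefficient r,
    the residuals (y minus y-hat), and the sum of squared errors (SSE).
    Assumes at least two points and that neither x nor y is constant.
    """
    n = len(points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)

    b = sxy / sxx                    # slope of y-hat = a + bx
    a = y_bar - b * x_bar            # intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient

    # Residual = actual y value minus predicted y value.
    residuals = [y - (a + b * x) for x, y in points]
    sse = sum(e ** 2 for e in residuals)
    return {"a": a, "b": b, "r": r, "sse": sse, "residuals": residuals}


# Hypothetical data, made up for illustration only.
data = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
summary = least_squares_summary(data)
print(summary)
```
</code>
For this made-up data set the sketch gives approximately a = 1.3, b = 0.9, r = 0.9, and SSE = 1.9, up to floating-point rounding. The slope and r share the same sign, and the point (3, 5) has the largest residual, about 1.0.</para>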
  </content>
  
</document>
