Textbook by: Barbara Illowsky, Ph.D., Susan Dean.

Summary

Summary: This module provides a summary of Linear Regression and Correlation as part of the Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.

Bivariate Data: Each data point has two values. The form is (x, y).

Line of Best Fit or Least Squares Line (LSL): ŷ = a + bx

x = independent variable; y = dependent variable

Residual: Actual y value - predicted y value = y - ŷ
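As an illustration of the least squares line and the residual defined above, here is a minimal Python sketch. It assumes the standard least-squares formulas for the slope b and intercept a, which this summary does not state. The function names and the small data set are made up for the example.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y_hat = a + b*x that minimizes the squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-deviations and squared x-deviations about the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


def residuals(xs, ys, a, b):
    """Residual for each point: actual y minus predicted y."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical bivariate data, each point of the form (x, y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
res = residuals(xs, ys, a, b)
print(f"y_hat = {a:.2f} + {b:.2f}x")
print("residuals:", [round(e, 2) for e in res])
```

For a least squares line that includes an intercept, the residuals add up to zero (up to rounding), which makes a quick sanity check on the fit.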

Correlation Coefficient r:

1. Used to determine whether a line of best fit is good for prediction.
2. Between -1 and 1 inclusive. The closer r is to 1 or -1, the closer the original points are to a straight line.
3. If r is negative, the slope is negative. If r is positive, the slope is positive.
4. If r = 0, then the line is horizontal.
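The properties listed above can be checked numerically. The sketch below assumes the usual (Pearson) formula for the correlation coefficient, which this summary does not spell out. The function name and data sets are hypothetical.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r of paired data, computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]

r_scattered = correlation(xs, [2, 4, 5, 4, 5])   # positive trend with scatter: 0 < r < 1
r_rising = correlation(xs, [3, 5, 7, 9, 11])     # points exactly on a rising line: r = 1
r_falling = correlation(xs, [5, 4, 3, 2, 1])     # points exactly on a falling line: r = -1
r_flat = correlation([1, 2, 3], [1, 2, 1])       # no linear trend: r = 0, horizontal line

for name, r in [("scattered", r_scattered), ("rising", r_rising),
                ("falling", r_falling), ("flat", r_flat)]:
    print(f"{name}: r = {r:.3f}")
```

The sign of r matches the sign of the slope because both share the same numerator, the sum of cross-deviations.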

Sum of Squared Errors (SSE): The sum of the squared residuals over all data points. The smaller the SSE, the better the original set of points fits the line of best fit.
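To make the SSE concrete, the sketch below computes it for two candidate lines through the same made-up data. The coefficients a = 2.2 and b = 0.6 are the least-squares values for this particular data set under the standard formulas. The second line is an arbitrary alternative chosen for comparison, and the function name is hypothetical.

```python
def sse(xs, ys, a, b):
    """Sum of squared residuals for the line y_hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sse_best = sse(xs, ys, a=2.2, b=0.6)    # least squares line for this data
sse_other = sse(xs, ys, a=2.0, b=0.7)   # some other line, for comparison

print(f"SSE, least squares line: {sse_best:.2f}")
print(f"SSE, alternative line:   {sse_other:.2f}")
```

Any line other than the least squares line gives a larger SSE for the same points, which is what "least squares" refers to.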

Outlier: A point that does not seem to fit the rest of the data.
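The definition above is informal. One common rule of thumb, which is an assumption here and is not stated in this summary, flags a point as a potential outlier when its residual is more than two standard deviations s from the line, with s = sqrt(SSE / (n - 2)). The sketch below applies that rule to made-up data; the function name and the cutoff parameter k are hypothetical.

```python
import math


def flag_outliers(xs, ys, k=2.0):
    """Return the points whose residual exceeds k*s in absolute value.

    Assumed rule of thumb: s = sqrt(SSE / (n - 2)) and k = 2.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Fit the least squares line y_hat = a + b*x.
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    res = [y - (a + b * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(e ** 2 for e in res) / (n - 2))
    return [(x, y) for x, y, e in zip(xs, ys, res) if abs(e) > k * s]


# Hypothetical data: every point lies on y = x except (5, 12).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 2, 3, 4, 12, 6, 7, 8, 9]

flagged = flag_outliers(xs, ys)
print("potential outliers:", flagged)
```

A flagged point is only a candidate. Whether to keep or drop it depends on whether it reflects a recording error or a genuine observation.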

