Summary: Note: This module is currently under revision, and its content is subject to change. This module is being prepared as part of a statistics textbook that will be available for the Fall 2008 semester.
Before we take up the discussion of linear regression and correlation, we need to examine a
way to display the relation between two variables: the scatterplot.
From an article in the Wall Street Journal: In Europe and Asia,
m-commerce is becoming more popular. M-commerce users have special mobile
phones that work like electronic wallets as well as provide phone and Internet services.
Users can do everything from paying for parking to buying a TV set or soda from a
machine to banking to checking sports scores on the Internet. In the next few years, will
there be a relationship between the year and the number of m-commerce users?
Construct a scatterplot. Let x = the year and let y = the number of m-commerce users.
| x | y |
|---|---|
| 2000 | 0.5 |
| 2002 | 20.0 |
| 2003 | 33.0 |
| 2004 | 47.0 |
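
To see what the scatterplot of this table looks like, the points can be placed on a rough character grid. The following is a minimal sketch using only the Python standard library; the function name `text_scatter` and the grid size are illustrative assumptions, not part of the original module, and in practice a plotting library would normally be used instead.

```python
# Data from the table above: x = year, y = value from the y column
# (the number of m-commerce users; the text does not state the units).
years = [2000, 2002, 2003, 2004]
users = [0.5, 20.0, 33.0, 47.0]


def text_scatter(xs, ys, width=40, height=10):
    """Return a rough character-grid scatterplot as a list of strings.

    Hypothetical helper for illustration. Assumes xs and ys each contain
    at least two distinct values, so the ranges below are nonzero.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each coordinate to a column or row index on the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    # Left border for the y-axis, bottom border for the x-axis.
    return ["|" + "".join(line) for line in grid] + ["+" + "-" * width]


plot_lines = text_scatter(years, users)
print("\n".join(plot_lines))
```

The four stars rise from the lower left to the upper right, which is the pattern to look for when asking whether the number of users increases with the year.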
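A scatterplot shows the direction and strength of a relationship between the variables. A clear direction happens when either high values of one variable occur with high values of the other variable (and low values with low values), or high values of one variable occur with low values of the other variable.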
You can determine the strength of the relationship by looking at the scatterplot and seeing how close the points are to a line, a power function, an exponential function, or to some other type of function.
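Judging closeness to a line by eye can be supplemented with a number. One common summary, which this section does not define, is the Pearson correlation coefficient r: values near 1 or -1 mean the points lie close to a straight line, and values near 0 mean they do not. The sketch below computes r for the m-commerce table; the function name `pearson_r` is an assumption made for illustration.

```python
import math

# Data from the m-commerce table: x = year, y = value from the y column.
years = [2000, 2002, 2003, 2004]
users = [0.5, 20.0, 33.0, 47.0]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists.

    Hypothetical helper for illustration. Assumes neither list is constant,
    so the denominator is nonzero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(years, users)
print(f"r = {r:.3f}")
```

For these four points r comes out close to 1, which agrees with the visual impression of a strong positive linear pattern.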
When you look at a scatterplot, you want to notice the overall pattern and any deviations from the pattern. The following scatterplot examples illustrate these concepts.
[Figures: six example scatterplots, titled Positive Linear Pattern (Strong), Negative Linear Pattern (Strong), Exponential Growth Pattern, No Pattern, Linear Pattern with One Deviation, and Negative Linear Pattern (Weak).]
In this chapter, we are interested in scatterplots that show a linear pattern. Linear patterns
are quite common. The linear relationship is strong if the points are close to a straight line.
If we think that the points show a linear relationship, we would like to draw a line on the
scatterplot. This line can be calculated through a process called linear regression.
However, we only calculate a regression line if one of the variables helps to explain or
predict the other variable. If