Linear Regression and Correlation: Scatter Plots

Module by: Susan Dean, Dr. Barbara Illowsky

Summary: This module provides an overview of Linear Regression and Correlation: Scatter Plots as a part of Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.

Before we take up the discussion of linear regression and correlation, we need to examine a way to display the relation between two variables x and y. The most common and easiest way is a scatter plot. The following example illustrates a scatter plot.

Example 1

From an article in the Wall Street Journal: In Europe and Asia, m-commerce is becoming more popular. M-commerce users have special mobile phones that work like electronic wallets as well as provide phone and Internet services. Users can do everything from paying for parking to buying a TV set or soda from a machine to banking to checking sports scores on the Internet. In the next few years, will there be a relationship between the year and the number of m-commerce users? Construct a scatter plot. Let x = the year and let y = the number of m-commerce users, in millions.

Figure 1

Subfigure 1.1: Table showing the number of m-commerce users (in millions) by year.

x (year)    y (# of users, in millions)
2000        0.5
2002        20.0
2003        33.0
2004        47.0

Subfigure 1.2: Scatter plot showing the number of m-commerce users (in millions) by year. The x-axis represents the year and the y-axis represents the number of m-commerce users in millions. Four points are plotted, at (2000, 0.5), (2002, 20.0), (2003, 33.0), and (2004, 47.0).
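The construction in Example 1 can be sketched in a few lines of Python. This is only an illustration, not part of the original module: the function name `ascii_scatter` and the grid size are hypothetical choices, and the plot is drawn as plain text so the sketch needs no plotting library. It assumes the x values are not all equal and the y values are not all equal.

```python
# Data from the table in Example 1: year and m-commerce users (in millions).
years = [2000, 2002, 2003, 2004]
users = [0.5, 20.0, 33.0, 47.0]


def ascii_scatter(xs, ys, width=41, height=11):
    """Return a rough text scatter plot; each '*' marks one (x, y) pair.

    Hypothetical helper for illustration. Assumes xs and ys each contain
    at least two distinct values, so the scaling never divides by zero.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each coordinate onto the character grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


plot = ascii_scatter(years, users)
print(plot)
```

The printed grid places the year along the horizontal axis and the number of users along the vertical axis, so the four points rise from the lower left to the upper right.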

A scatter plot shows the direction and strength of a relationship between the variables. There is a clear direction when either:

  • High values of one variable occur with high values of the other variable, and low values of one variable occur with low values of the other (a positive direction).
  • High values of one variable occur with low values of the other variable (a negative direction).

You can determine the strength of the relationship by looking at the scatter plot and seeing how close the points are to a line, a power function, an exponential function, or to some other type of function.
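For the linear case, one common way to put a number on direction and strength is the correlation coefficient r, which this section does not define; the sketch below is offered as a preview under that assumption. The function name `pearson_r` is a hypothetical choice. The sign of r gives the direction, and a value close to 1 or -1 indicates points close to a straight line. It says nothing about closeness to a power or exponential curve.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient of paired data (hypothetical helper).

    Assumes xs and ys have the same length and that neither is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Data from Example 1: year and m-commerce users (in millions).
years = [2000, 2002, 2003, 2004]
users = [0.5, 20.0, 33.0, 47.0]

r = pearson_r(years, users)
print(f"r = {r:.3f}")  # positive and close to 1: a strong positive linear pattern
```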

When you look at a scatter plot, you want to notice the overall pattern and any deviations from the pattern. The following scatter plot examples illustrate these concepts.

Figure 2
Subfigure 2.1: Positive Linear Pattern (Strong). Scatter plot of 6 points in a straight ascending line from lower left to upper right.
Subfigure 2.2: Linear Pattern w/ One Deviation. Scatter plot of 6 points in a straight ascending line from lower left to upper right, with one additional point in the upper left corner.

Figure 3
Subfigure 3.1: Negative Linear Pattern (Strong). Scatter plot of 6 points in a straight descending line from upper left to lower right.
Subfigure 3.2: Negative Linear Pattern (Weak). Scatter plot of 8 points in a wobbly descending line from upper left to lower right.

Figure 4
Subfigure 4.1: Exponential Growth Pattern. Scatter plot of 7 points in an exponential curve that runs along the x-axis on the left and ascends toward the upper right.
Subfigure 4.2: No Pattern. Scatter plot of many points scattered everywhere.

In this chapter, we are interested in scatter plots that show a linear pattern. Linear patterns are quite common. The linear relationship is strong if the points are close to a straight line. If we think that the points show a linear relationship, we would like to draw a line on the scatter plot. This line can be calculated through a process called linear regression. However, we only calculate a regression line if one of the variables helps to explain or predict the other variable. If x is the independent variable and y the dependent variable, then we can use a regression line to predict y for a given value of x.
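As a rough preview of how such a line might be calculated, the sketch below fits a line to the Example 1 data. It assumes the usual least-squares criterion, which this section does not spell out, and the function name `fit_line` is a hypothetical choice. The prediction is made at a year inside the range of the data.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept.

    Hypothetical helper; assumes the xs are not all equal.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the point of means
    return slope, intercept


# x = year (independent variable), y = m-commerce users in millions (dependent).
years = [2000, 2002, 2003, 2004]
users = [0.5, 20.0, 33.0, 47.0]

slope, intercept = fit_line(years, users)

# Use the line to predict y for a given value of x.
predicted_2003 = slope * 2003 + intercept
print(f"slope = {slope:.2f} million users per year")
print(f"predicted users in 2003 = {predicted_2003:.1f} million")
```

The predicted value can be compared with the observed 33.0 million for 2003 to see how closely the line tracks the data.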
