Linear Regression and Correlation: Facts About the Correlation Coefficient for Linear Regression

Module by: Dr. Barbara Illowsky, Susan Dean

Summary: This module provides an overview of Facts About the Correlation Coefficient for Linear Regression as a part of Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.

  • A positive r means that when x increases, y tends to increase, and when x decreases, y tends to decrease (positive correlation).
  • A negative r means that when x increases, y tends to decrease, and when x decreases, y tends to increase (negative correlation).
  • An r of zero means there is absolutely no linear relationship between x and y (no correlation).
  • High correlation does not suggest that x causes y or that y causes x. We say "correlation does not imply causation." For example, every person who learned math in the 17th century is dead. However, learning math does not necessarily cause death!
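
The sign behavior described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original module: the function name `pearson_r` and all data values are made up for the example, and the function uses the usual sample correlation formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]       # y tends to increase with x
falling = [6, 4, 5, 4, 2]      # y tends to decrease as x increases
symmetric = [4, 1, 0, 1, 4]    # y = (x - 3)^2, a curved pattern

print(pearson_r(x, rising))     # positive
print(pearson_r(x, falling))    # negative
print(pearson_r(x, symmetric))  # zero
```

The last data set shows why the third bullet says "linear": y depends on x through a curve, yet r is zero because there is no straight-line trend.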

Figure 1: Three scatter plots. Subfigure 1.1 (Positive Correlation): points ascending from the lower left to the upper right. Subfigure 1.2 (Negative Correlation): points descending from the upper left to the lower right. Subfigure 1.3 (Zero Correlation): points in a horizontal configuration.

The 95% Critical Values of the Sample Correlation Coefficient Table at the end of this chapter (before the Summary) may be used to give you a good idea of whether the computed value of r is significant or not. Compare r to the appropriate critical value in the table. If r is significant, then you may want to use the line for prediction.
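
As background that this module does not state, critical values of this kind are consistent with the standard two-tailed t test of zero population correlation at the 5% level. In the sketch below, t* is assumed to be the two-tailed 5% critical value of Student's t distribution with df = n - 2, and r_crit is a label introduced here for the tabulated critical value.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
\qquad\Longrightarrow\qquad
r_{\text{crit}} = \frac{t^{*}}{\sqrt{(t^{*})^{2} + df}},
\qquad df = n - 2 .
```

For df = 8, t* is approximately 2.306, which gives r_crit ≈ 2.306 / sqrt(2.306² + 8) ≈ 0.632, the value used in Example 1.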

Example 1

Suppose you computed r = 0.801 using n = 10 data points. Then df = n - 2 = 10 - 2 = 8. The critical values associated with df = 8 are -0.632 and +0.632. If r < negative critical value or r > positive critical value, then r is significant. Since r = 0.801 and 0.801 > 0.632, r is significant and the line may be used for prediction. Viewing this example on a number line may help.

Figure 2: r is not significant between -0.632 and +0.632. r = 0.801 > +0.632. Therefore, r is significant.
(Horizontal number line marking -1, -0.632, 0, 0.632, 0.801, and 1. A dashed line above -0.632, 0, and 0.632 indicates the values that are not significant.)
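
The comparison in Example 1 can be written as a small check. This is a sketch only: the helper name `is_significant` is hypothetical, and the critical value still has to be read from the table for the appropriate df.

```python
def is_significant(r, critical_value):
    """True when r lies outside the interval from -critical_value to +critical_value."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    return abs(r) > critical_value

# Example 1: r = 0.801 with n = 10 data points.
n = 10
df = n - 2                 # degrees of freedom
critical_value = 0.632     # read from the table for df = 8
print(df, is_significant(0.801, critical_value))
```

Comparing |r| with the positive critical value is equivalent to the two separate comparisons against the negative and positive critical values, because the two are symmetric about zero.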

Example 2

Suppose you computed r = -0.624 with 14 data points. Then df = 14 - 2 = 12. The critical values are -0.532 and 0.532. Since -0.624 < -0.532, r is significant and the line may be used for prediction.

Figure 3: r = -0.624 < -0.532. Therefore, r is significant.
(Horizontal number line marking -0.624, -0.532, and 0.532.)

Example 3

Suppose you computed r = 0.776 and n = 6. Then df = 6 - 2 = 4. The critical values are -0.811 and 0.811. Since -0.811 < 0.776 < 0.811, r is not significant and the line should not be used for prediction.

Figure 4: -0.811 < r = 0.776 < 0.811. Therefore, r is not significant.
(Horizontal number line marking -0.811, 0.776, and 0.811.)

Note:

If r is -1 or r is +1, then all the data points lie exactly on a straight line. If the line is significant, then within the range of the x-values, the line can be used to predict a y value. As an illustration, consider the third exam/final exam example. The line of best fit is ŷ = -173.51 + 4.83x, with r = 0.6631.

Can the line be used for prediction? Given a third exam score (x value), can we successfully predict the final exam score (predicted y value)? Test r = 0.6631 against its appropriate critical value.

Using the table with df = 11 - 2 = 9, the critical values are -0.602 and +0.602. Since 0.6631 > 0.602, r is significant. Because r is significant and the scatter plot shows a reasonable linear trend, the line can be used to predict final exam scores.
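
Once r is found significant, prediction with the line can be sketched as follows. The function name `predict_final_exam` is hypothetical, and the bounds `x_min` and `x_max` passed in the usage line are placeholders rather than values taken from this module; they stand for the smallest and largest third exam scores in the data set.

```python
def predict_final_exam(x, x_min, x_max):
    """Predict y-hat from the line of best fit, refusing to extrapolate."""
    if not x_min <= x <= x_max:
        raise ValueError(f"x={x} is outside the observed range [{x_min}, {x_max}]")
    # Line of best fit from the third exam/final exam example.
    return -173.51 + 4.83 * x

# x_min=65 and x_max=75 are assumed bounds for illustration only;
# substitute the actual range of third exam scores.
print(round(predict_final_exam(73, x_min=65, x_max=75), 2))
```

The range check reflects the note above: the line is meant for predictions within the range of the observed x-values, not beyond it.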

Example 4

Suppose you computed the following correlation coefficients. Using the table at the end of the chapter, determine whether r is significant and whether the line of best fit associated with each r can be used to predict a y value. If it helps, draw a number line.

  • r = -0.567 and the sample size, n, is 19. Then df = n - 2 = 17. The critical value is -0.456. Since -0.567 < -0.456, r is significant.
  • r = 0.708 and the sample size, n, is 9. Then df = n - 2 = 7. The critical value is 0.666. Since 0.708 > 0.666, r is significant.
  • r = 0.134 and the sample size, n, is 14. Then df = 14 - 2 = 12. The critical value is 0.532. Since 0.134 is between -0.532 and 0.532, r is not significant.
  • r = 0 and the sample size, n, is 5. No matter what the df is, r = 0 is between the two critical values, so r is not significant.
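
The first three cases of Example 4 can be checked mechanically. This is a rough sketch: the dictionary holds only the critical values quoted in this module (not the full table), and the names `CRITICAL_VALUES_95` and `check` are made up for illustration.

```python
# Critical values quoted in this module, keyed by df = n - 2.
# Partial copy only; consult the full table for other df values.
CRITICAL_VALUES_95 = {4: 0.811, 7: 0.666, 8: 0.632, 9: 0.602, 12: 0.532, 17: 0.456}

def check(r, n, table=CRITICAL_VALUES_95):
    """Return (df, critical value, whether r is significant)."""
    df = n - 2
    if df not in table:
        raise KeyError(f"no critical value recorded for df={df}")
    critical = table[df]
    return df, critical, abs(r) > critical

for r, n in [(-0.567, 19), (0.708, 9), (0.134, 14)]:
    df, critical, significant = check(r, n)
    verdict = "significant" if significant else "not significant"
    print(f"r={r}, n={n}, df={df}, critical=+/-{critical}: {verdict}")
```

The fourth case needs no lookup: with r = 0, |r| cannot exceed any positive critical value, so r is not significant whatever the df.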
