Summary: This module provides a practice of Linear Regression and Correlation as a part of Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.
Note: Your browser may not currently support MathML. See our browser support page for additional details. You can always view the correct math in the PDF version.
The data below are real. Keep in mind that these are only reported figures. (Source: Centers for Disease Control and Prevention, National Center for HIV, STD, and TB Prevention, October 24, 2003)
| Year | # AIDS cases diagnosed | # AIDS deaths |
| Pre-1981 | 91 | 29 |
| 1981 | 319 | 121 |
| 1982 | 1,170 | 453 |
| 1983 | 3,076 | 1,482 |
| 1984 | 6,240 | 3,466 |
| 1985 | 11,776 | 6,878 |
| 1986 | 19,032 | 11,987 |
| 1987 | 28,564 | 16,162 |
| 1988 | 35,447 | 20,868 |
| 1989 | 42,674 | 27,591 |
| 1990 | 48,634 | 31,335 |
| 1991 | 59,660 | 36,560 |
| 1992 | 78,530 | 41,055 |
| 1993 | 78,834 | 44,730 |
| 1994 | 71,874 | 49,095 |
| 1995 | 68,505 | 49,456 |
| 1996 | 59,347 | 38,510 |
| 1997 | 47,149 | 20,736 |
| 1998 | 38,393 | 19,005 |
| 1999 | 25,174 | 18,454 |
| 2000 | 25,522 | 17,347 |
| 2001 | 25,643 | 17,402 |
| 2002 | 26,464 | 16,371 |
| Total | 802,118 | 489,093 |
Graph “year” vs. “# AIDS cases diagnosed.” Plot the points on the graph located below in the section titled "Plot" . Do not include pre-1981. Label both axes with words. Scale both axes.
Enter your data into your calculator or computer. The pre-1981 data should not be included. Why is that so?
Write the linear equation below, rounding to 4 decimal places:
Calculate the following:
equation:
Solve.
Plot the 2 above points on the graph below. Then, connect the 2 points to form the regression line.

Obtain the graph on your calculator or computer.
Look at the graph above.
Does the line seem to fit the data? Why or why not?
Do you think a linear fit is best? Why or why not?
Hand draw a smooth curve on the graph above that shows the flow of the data.
What does the correlation imply about the relationship between time (years) and the number of diagnosed AIDS cases reported in the U.S.?
Why is “year” the independent variable and “# AIDS cases diagnosed.” the dependent variable (instead of the reverse)?
Solve.
"Collaborative Statistics was written by two faculty members at De Anza College in Cupertino, California. This book is intended for introductory statistics courses being taken by students at two- […]"