Inside Collection (Textbook): Collaborative Statistics (with edits: Teegarden)
Summary: This module describes how the chi-square distribution can be used to test for independence.
Tests of independence involve using a contingency table of observed (data) values. You first saw a contingency table when you studied probability in the Probability Topics chapter.
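The test statistic for a test of independence is similar to that of a goodness-of-fit test:
$$\chi^2 = \sum_{(i \cdot j)} \frac{(O - E)^2}{E}$$
where:
$O$ = observed values
$E$ = expected values
$i$ = the number of rows in the table
$j$ = the number of columns in the table
There are $i \cdot j$ terms of the form $\frac{(O - E)^2}{E}$.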
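As a minimal sketch of this sum, the Python function below adds one term $(O - E)^2 / E$ per cell of a table. The function name `chi_square_statistic` and the small 2-by-2 tables are hypothetical illustrations, not data from this module.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of two equally shaped tables."""
    total = 0.0
    for observed_row, expected_row in zip(observed, expected):
        for o, e in zip(observed_row, expected_row):
            total += (o - e) ** 2 / e
    return total


# Hypothetical 2-by-2 tables, used only to exercise the function.
observed = [[12, 8], [8, 12]]
expected = [[10, 10], [10, 10]]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 4))
```

Each of the four cells differs from its expected value by 2, so each contributes 4/10 to the sum.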
A test of independence determines whether two factors are independent or not. You first encountered the term independence in Chapter 3. As a review, consider the following example.
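Suppose $A$ = a driver received a speeding violation in the last year and $B$ = a driver uses a cell phone while driving, and a sample of $n$ drivers is surveyed.
Let $y$ = the expected number of drivers in the sample who use a cell phone while driving and received a speeding violation.
If $A$ and $B$ are independent, then $P(A \text{ AND } B) = P(A)P(B)$. By substitution, $\frac{y}{n} = P(A) \cdot P(B)$, with $P(A)$ and $P(B)$ estimated by the sample proportions.
Solve for $y$: $y = n \cdot P(A) \cdot P(B)$.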
About 28 people from the sample are expected to be cell phone users while driving and to receive speeding violations.
In a test of independence, we state the null and alternate hypotheses in words. Since the contingency table consists of two factors, the null hypothesis states that the factors are independent and the alternate hypothesis states that they are not independent (dependent). If we do a test of independence using the example above, then the null hypothesis is:
$H_0$: Being a cell phone user while driving and receiving a speeding violation are independent events.
If the null hypothesis were true, we would expect about 28 people to be cell phone users while driving and to receive a speeding violation.
The test of independence is always right-tailed because of the way the test statistic is calculated. If the expected and observed values are not close together, then the test statistic is very large and lies far out in the right tail of the chi-square curve, just as in a goodness-of-fit test.
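The degrees of freedom for the test of independence are:
$df = (\text{number of columns} - 1)(\text{number of rows} - 1)$
The following formula calculates the expected number ($E$):
$E = \frac{(\text{row total})(\text{column total})}{\text{total number surveyed}}$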
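The two calculations in this passage, the degrees of freedom and the expected count for a single cell, can be sketched in Python as follows. The function names and the numbers passed to them are hypothetical and serve only to illustrate the arithmetic.

```python
def degrees_of_freedom(num_rows, num_columns):
    """Degrees of freedom for a test of independence: (columns - 1)(rows - 1)."""
    return (num_columns - 1) * (num_rows - 1)


def expected_count(row_total, column_total, total_surveyed):
    """Expected cell count under independence: (row total)(column total) / total."""
    return row_total * column_total / total_surveyed


# Hypothetical inputs for illustration only.
df_example = degrees_of_freedom(2, 3)
expected_example = expected_count(40, 50, 200)
print(df_example, expected_example)
```

The row and column totals include every cell in that row or column, so the expected count depends only on the margins of the table and the sample size.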
In a volunteer group, adults 21 and older volunteer from one to nine hours each week to spend time with a disabled senior citizen. The program recruits among community college students, four-year college students, and nonstudents. The following table is a sample of the adult volunteers and the number of hours they volunteer per week.
| Type of Volunteer | 1-3 Hours | 4-6 Hours | 7-9 Hours | Row Total |
|---|---|---|---|---|
| Community College Students | 111 | 96 | 48 | 255 |
| Four-Year College Students | 96 | 133 | 61 | 290 |
| Nonstudents | 91 | 150 | 53 | 294 |
| Column Total | 298 | 379 | 162 | 839 |
Is the number of hours volunteered independent of the type of volunteer?
The observed table and the question at the end of the problem, "Is the number of hours volunteered independent of the type of volunteer?", tell you this is a test of independence. The two factors are number of hours volunteered and type of volunteer. This test is always right-tailed.
The expected table is:
| Type of Volunteer | 1-3 Hours | 4-6 Hours | 7-9 Hours |
|---|---|---|---|
| Community College Students | 90.57 | 115.19 | 49.24 |
| Four-Year College Students | 103.00 | 131.00 | 56.00 |
| Nonstudents | 104.42 | 132.81 | 56.77 |
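For example, the calculation for the expected frequency for the top left cell is
$E = \frac{(\text{row total})(\text{column total})}{\text{total number surveyed}} = \frac{255 \cdot 298}{839} = 90.57$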
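The expected table above can be reproduced from the observed counts alone. The following Python sketch computes every cell as (row total)(column total) / (total surveyed); the function name `expected_table` is an illustrative choice, and the counts are copied from the observed table in this example.

```python
# Observed counts (rows: type of volunteer; columns: 1-3, 4-6, 7-9 hours).
observed = [
    [111, 96, 48],   # Community College Students
    [96, 133, 61],   # Four-Year College Students
    [91, 150, 53],   # Nonstudents
]


def expected_table(observed):
    """Return expected counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    total_surveyed = sum(row_totals)
    return [
        [row_total * column_total / total_surveyed for column_total in column_totals]
        for row_total in row_totals
    ]


expected = expected_table(observed)
for row in expected:
    print([round(value, 2) for value in row])
```

Rounded to two decimal places, the printed rows should match the expected table in the text.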
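Calculate the test statistic: $\chi^2 = 12.99$ (calculator or computer)
Distribution for the test: $\chi^2_4$, with $df = (3 \text{ columns} - 1)(3 \text{ rows} - 1) = (2)(2) = 4$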
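Graph: [Figure: right-tailed chi-square curve with 4 degrees of freedom; the p-value is the area to the right of the test statistic.]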
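Probability statement: p-value $= P(\chi^2 > 12.99) = 0.0113$
Compare $\alpha$ and the p-value: $\alpha = 0.05$ and p-value $= 0.0113$, so $\alpha >$ p-value.
Make a decision: Since $\alpha >$ p-value, reject $H_0$. This means that the factors are not independent.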
Conclusion: At a 5% level of significance, from the data, there is sufficient evidence to conclude that the number of hours volunteered and the type of volunteer are dependent on one another.
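The whole test for the volunteer example can be sketched in plain Python. The function names below are illustrative choices. The right-tail probability uses a closed form that holds only when the degrees of freedom are even (here df = 4); for other cases a statistics library or calculator would be needed. The significance level of 0.05 matches the 5% level used in the conclusion.

```python
import math

# Observed counts (rows: type of volunteer; columns: 1-3, 4-6, 7-9 hours).
observed = [
    [111, 96, 48],
    [96, 133, 61],
    [91, 150, 53],
]


def independence_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for an observed table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * column_totals[j] / total  # expected count
            statistic += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(column_totals) - 1)
    return statistic, df


def right_tail_even_df(x, df):
    """P(chi-square > x); closed form valid only for even, positive df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires an even, positive df")
    half = x / 2
    term, series = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        series += term
    return math.exp(-half) * series


statistic, df = independence_statistic(observed)
p_value = right_tail_even_df(statistic, df)
alpha = 0.05
print(round(statistic, 4), df, round(p_value, 4), p_value < alpha)
```

Because the p-value is smaller than alpha, the sketch reaches the same decision as the text: reject the null hypothesis of independence.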
For the above example, if there had been another type of volunteer, teenagers, what would the degrees of freedom be?
TI-83+ and TI-84 calculator: Press the MATRX key and arrow over to EDIT. Press 1:[A]. Press 3 ENTER 3 ENTER. Enter the table values by row from Example 11-6. Press ENTER after each. Press 2nd QUIT. Press STAT and arrow over to TESTS. Arrow down to C:χ2-TEST. Press ENTER. You should see Observed:[A] and Expected:[B]. Arrow down to Calculate. Press ENTER. The test statistic is 12.9909 and the p-value = 0.0113. Repeat the procedure, but arrow down to Draw instead of Calculate.
De Anza College is interested in the relationship between anxiety level and the need to succeed in school. A random sample of 400 students took a test that measured anxiety level and need to succeed in school. The table shows the results. De Anza College wants to know if anxiety level and need to succeed in school are independent events.
| Need to Succeed in School | High Anxiety | Med-high Anxiety | Medium Anxiety | Med-low Anxiety | Low Anxiety | Row Total |
|---|---|---|---|---|---|---|
| High Need | 35 | 42 | 53 | 15 | 10 | 155 |
| Medium Need | 18 | 48 | 63 | 33 | 31 | 193 |
| Low Need | 4 | 5 | 11 | 15 | 17 | 52 |
| Column Total | 57 | 95 | 127 | 63 | 58 | 400 |
How many high anxiety level students are expected to have a high need to succeed in school?
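The column total for a high anxiety level is 57. The row total for high need to succeed in school is 155. The sample size or total surveyed is 400.
$E = \frac{(\text{row total})(\text{column total})}{\text{total surveyed}} = \frac{155 \cdot 57}{400} = 22.09$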
The expected number of students who have a high anxiety level and a high need to succeed in school is about 22.
If the two variables are independent, how many students do you expect to have a low need to succeed in school and a med-low level of anxiety?
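The column total for a med-low anxiety level is 63. The row total for a low need to succeed in school is 52. The sample size or total surveyed is 400.
$E = \frac{(\text{row total})(\text{column total})}{\text{total surveyed}} = \frac{52 \cdot 63}{400} = 8.19$
The expected number of students who have a low need to succeed in school and a med-low anxiety level is about 8.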
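Both expected counts asked about in this example come from the same formula, so they can be checked with a short Python sketch. The dictionary layout and the function name `expected_cell` are illustrative choices; the totals are copied from the table above.

```python
# Row and column totals from the anxiety / need-to-succeed table.
row_totals = {"High Need": 155, "Medium Need": 193, "Low Need": 52}
column_totals = {
    "High Anxiety": 57,
    "Med-high Anxiety": 95,
    "Medium Anxiety": 127,
    "Med-low Anxiety": 63,
    "Low Anxiety": 58,
}
total_surveyed = 400


def expected_cell(need, anxiety):
    """Expected count under independence: (row total)(column total) / total surveyed."""
    return row_totals[need] * column_totals[anxiety] / total_surveyed


high_need_high_anxiety = expected_cell("High Need", "High Anxiety")
low_need_med_low_anxiety = expected_cell("Low Need", "Med-low Anxiety")
print(round(high_need_high_anxiety, 2), round(low_need_med_low_anxiety, 2))
```

The two values round to about 22 and about 8 students, respectively.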