Summary: This module is the complementary teacher's guide for the "The Chi-Square Distribution" chapter of the Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.
This chapter covers three chi-square applications: goodness-of-fit, independence, and single variance. We rely on technology to do the calculations, especially for goodness-of-fit and for independence. However, the first example in the chapter (the number of absences on each day of the week) has the student calculate the chi-square statistic in steps. The same could be done for the chi-square statistic in a test of independence.
The chi-square distribution is generally skewed to the right. There is a different chi-square curve for each number of degrees of freedom (df). When the df's are 90 or more, the chi-square curve is closely approximated by a normal curve. For the chi-square distribution, the mean is $\mu = df$ and the standard deviation is $\sigma = \sqrt{2 \cdot df}$.
A goodness-of-fit hypothesis test is used to determine whether or not data "fit" a particular distribution.
In a past issue of the magazine GEICO Direct, there was an article concerning the percentage of teenage motor vehicle deaths and time of day. The following percentages were given from a sample.
| Time of Day | Death Rate |
|---|---|
| 12 a.m. to 3 a.m. | 17% |
| 3 a.m. to 6 a.m. | 8% |
| 6 a.m. to 9 a.m. | 8% |
| 9 a.m. to 12 noon | 6% |
| 12 noon to 3 p.m. | 10% |
| 3 p.m. to 6 p.m. | 16% |
| 6 p.m. to 9 p.m. | 15% |
| 9 p.m. to 12 a.m. | 19% |
For the purpose of this example, suppose another sample of 100 produced the same percentages. We hypothesize that the data from this new sample fit a uniform distribution. The level of significance is 1% ($\alpha = 0.01$).
The distribution for the hypothesis test is $\chi^2_7$, since there are 8 cells and df = 8 - 1 = 7.
The table contains the observed percentages. For the sample of 100, the observed (O) numbers are 17, 8, 8, 6, 10, 16, 15 and 19. The expected (E) numbers are each 12.5 for a uniform distribution (100 divided by 8 cells). The chi-square test statistic is calculated using $\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over the 8 cells, which gives 13.6.
If you are using a TI-84 series graphing calculator, some models have a function in STAT TESTS called $\chi^2$GOF-Test that performs the goodness-of-fit test.
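If you are using the TI-83 series, enter the observed numbers in list1 and the expected numbers in list2. In list3 (go to the list name), enter (list1-list2)^2/list2 and press enter. Add the values in list3 (this is the test statistic). Then go to 2nd DISTR, choose $\chi^2$cdf, and enter the test statistic as the lower bound, a very large number (such as 1E99) as the upper bound, and 7 for the df. The result is the p-value.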
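Probability Statement: p-value $= P(\chi^2 > 13.6) = 0.0588$
(Always a right-tailed test)
Since $\alpha <$ p-value ($0.01 < 0.0588$), we do not reject the null hypothesis.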
We conclude that there is not sufficient evidence to reject the null hypothesis. At the 1% level, the data are consistent with teenage motor vehicle deaths being uniformly distributed over the day: the sample does not show that the time of day or night matters. However, if the level of significance were 10%, we would reject the null hypothesis and conclude that the distribution of deaths does not fit a uniform distribution.
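The right-tail area and the two decisions (at 1% and at 10%) can be checked with a short Python sketch. The helper `chi2_right_tail` is a hypothetical name introduced here; it approximates the area with a standard series for the incomplete gamma function and is meant only as a stand-in for the calculator's distribution function, not as the chapter's method.

```python
import math

def chi2_right_tail(x, df):
    """Approximate P(chi-square with df degrees of freedom > x)."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

chi_square = 13.6   # test statistic for the 8 time-of-day cells
df = 7              # 8 cells - 1

p_value = chi2_right_tail(chi_square, df)
print(f"p-value = {p_value:.4f}")

# Compare the p-value with both levels of significance discussed in the text.
for alpha in (0.01, 0.10):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

Running the loop over both values of alpha makes the point that the same data can lead to different decisions at different levels of significance.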
A test of independence compares two factors to determine if they are independent (i.e. one factor does not affect the happening of a second factor).
The following table shows a random sample of 100 hikers and the area of hiking preferred.
| Gender | The Coastline | Near Lakes and Streams | On Mountain Peaks |
|---|---|---|---|
| Female | 18 | 16 | 11 |
| Male | 16 | 25 | 14 |
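The distribution for the hypothesis test is $\chi^2_2$.
The df's are equal to: (number of rows - 1)(number of columns - 1) = (2 - 1)(3 - 1) = 2
The chi-square statistic is calculated using $\sum \frac{(O - E)^2}{E}$, summed over all 6 cells of the table.
Each expected (E) value is calculated using $\frac{(\text{row total})(\text{column total})}{\text{total number surveyed}}$.
The first expected value (female, the coastline) is $\frac{45 \cdot 34}{100} = 15.3$.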
The expected values are: 15.3, 18.45, 11.25, 18.7, 22.55, 13.75
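As a check on the expected values, the row-total times column-total rule can be sketched in Python. The layout of the nested list and the variable names are assumptions made for illustration.

```python
# Observed counts: rows are Female, Male; columns are
# The Coastline, Near Lakes and Streams, On Mountain Peaks.
observed = [
    [18, 16, 11],
    [16, 25, 14],
]

row_totals = [sum(row) for row in observed]          # [45, 55]
col_totals = [sum(col) for col in zip(*observed)]    # [34, 41, 25]
grand_total = sum(row_totals)                        # 100

# E = (row total)(column total) / (total number surveyed)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# df = (number of rows - 1)(number of columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

for row in expected:
    print([round(e, 2) for e in row])
print("df =", df)
```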
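The chi-square statistic is: $\sum \frac{(O - E)^2}{E} \approx 1.47$
The TI-83/84 series have the function $\chi^2$-Test in STAT TESTS to perform this test. Enter the observed values into a matrix first (2nd MATRIX, then EDIT).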
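Probability Statement: p-value $= P(\chi^2 > 1.47) = 0.4800$
Since the p-value is larger than any commonly used level of significance (for example, $0.05 < 0.4800$), we do not reject the null hypothesis.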
There is not sufficient evidence to conclude that gender and hiking preference are not independent.
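The statistic and p-value for the hiker example can be reproduced with the sketch below. It relies on the fact that for df = 2 (and only for df = 2) the right-tail area of the chi-square distribution is exactly $e^{-x/2}$; the variable names are assumptions.

```python
import math

# Observed and expected counts, read across the rows of the table.
observed = [18, 16, 11, 16, 25, 14]
expected = [15.3, 18.45, 11.25, 18.7, 22.55, 13.75]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 2 only, P(chi-square > x) has the closed form exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.2f}")
print(f"p-value = {p_value:.4f}")
```

With a p-value this large, the decision does not depend on whether the class uses 1%, 5%, or 10% as the level of significance.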
Sometimes you might be interested in how something varies. A test of a single variance is the type of hypothesis test you could run in order to determine variability.
A vending machine company that produces coffee vending machines claims that its machine pours an 8 ounce cup of coffee, on average, with a standard deviation of 0.3 ounces. A college that uses the vending machines claims that the standard deviation is more than 0.3 ounces, causing the coffee to spill out of the cup. The college sampled 30 cups of coffee and found that the standard deviation was 1 ounce. At the 1% level of significance, test the claim made by the vending machine company.
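The distribution for the hypothesis test is $\chi^2_{29}$, where df = 30 - 1 = 29.
The test statistic is $\chi^2 = \frac{(n - 1) s^2}{\sigma^2} = \frac{(30 - 1) \cdot 1^2}{0.3^2} = 322.22$.
Probability Statement: p-value $= P(\chi^2 > 322.22) \approx 0$
Since $\alpha >$ p-value ($0.01 > 0$), we reject the null hypothesis that the standard deviation is 0.3 ounces.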
There is sufficient evidence to conclude that the standard deviation is more than 0.3 ounces of coffee. The vending machine company needs to adjust its machines to prevent spillage.
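To see why the p-value is essentially zero, the sketch below computes the test statistic and then measures how far it lies above the mean of the chi-square distribution, using the facts that the mean equals df and the standard deviation equals $\sqrt{2 \cdot df}$. This is a rough illustration rather than an exact p-value calculation, and the variable names are assumptions.

```python
import math

n = 30           # cups of coffee sampled
s = 1.0          # sample standard deviation, in ounces
sigma_0 = 0.3    # standard deviation claimed by the company, in ounces

df = n - 1
chi_square = df * s ** 2 / sigma_0 ** 2   # (n - 1) s^2 / sigma^2

# A chi-square distribution has mean df and standard deviation sqrt(2 * df).
mean = df
sd = math.sqrt(2 * df)
sds_above_mean = (chi_square - mean) / sd

print(f"chi-square = {chi_square:.2f}, df = {df}")
print(f"about {sds_above_mean:.1f} standard deviations above the mean")
```

A statistic dozens of standard deviations above the mean leaves practically no area in the right tail, which is why the null hypothesis is rejected at the 1% level.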
Have the students do Practice 1, Practice 2, and Practice 3 in class collaboratively.
Assign Homework. Suggested homework: 3, 5, 7 (GOF), 9, 13, 15 (Test of Indep.), 17, 19, 23 (Variance), 24 - 37 (General)