Textbook: Collaborative Statistics, by Barbara Illowsky, Ph.D., and Susan Dean.

# Test of Independence

Summary: This module describes how the chi-square distribution can be used to test for independence.

Tests of independence involve using a contingency table of observed (data) values. You first saw a contingency table when you studied probability in the Probability Topics chapter.

The test statistic for a test of independence is similar to that of a goodness-of-fit test:

$$\sum_{(i \cdot j)} \frac{(O - E)^2}{E} \tag{1}$$

where:

• $O$ = observed values
• $E$ = expected values
• $i$ = the number of rows in the table
• $j$ = the number of columns in the table

There are $i \cdot j$ terms of the form $\frac{(O - E)^2}{E}$.
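The sum over all cells can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the toy 2x2 tables are assumptions made for this sketch and do not come from the textbook.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of two same-shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Toy tables with made-up numbers (i = 2 rows, j = 2 columns, so 4 terms).
toy_observed = [[10, 20], [30, 40]]
toy_expected = [[12, 18], [28, 42]]

print(chi_square_statistic(toy_observed, toy_expected))
```

Each cell contributes one term, so a table with $i$ rows and $j$ columns contributes $i \cdot j$ terms to the sum.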

A test of independence determines whether two factors are independent or not. You first encountered the term independence in Chapter 3. As a review, consider the following example.

## Note:

The expected value for each cell needs to be at least 5 in order to use this test.

## Example 1

Suppose $A$ = a speeding violation in the last year and $B$ = a cell phone user while driving. If $A$ and $B$ are independent, then $P(A \text{ AND } B) = P(A)P(B)$. $A \text{ AND } B$ is the event that a driver received a speeding violation last year and is also a cell phone user while driving. Suppose that, in a study of speeding violations and cell phone use while driving, 755 people were surveyed. Out of the 755, 70 had a speeding violation and 685 did not; 305 were cell phone users while driving and 450 were not.

Let $y$ = expected number of drivers that use a cell phone while driving and received speeding violations.

If $A$ and $B$ are independent, then $P(A \text{ AND } B) = P(A)P(B)$. By substitution,

$$\frac{y}{755} = \left(\frac{70}{755}\right)\left(\frac{305}{755}\right)$$

Solve for $y$: $y = \frac{(70)(305)}{755} = 28.3$

About 28 people from the sample are expected to be cell phone users while driving and to receive speeding violations.
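The arithmetic in this example can be checked with a short Python sketch. The variable names are chosen here for illustration; the numbers are the ones given in the example.

```python
# Survey figures from Example 1.
n = 755            # drivers surveyed
speeding = 70      # drivers with a speeding violation (event A)
cell_phone = 305   # drivers who use a cell phone while driving (event B)

# Under independence, P(A AND B) = P(A) * P(B).
p_a = speeding / n
p_b = cell_phone / n

# Expected number of drivers in both groups: y = n * P(A) * P(B).
expected_both = n * p_a * p_b

print(round(expected_both, 1))  # 28.3
```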

In a test of independence, we state the null and alternate hypotheses in words. Since the contingency table consists of two factors, the null hypothesis states that the factors are independent and the alternate hypothesis states that they are not independent (dependent). If we do a test of independence using the example above, then the null hypothesis is:

$H_o$: Being a cell phone user while driving and receiving a speeding violation are independent events.

If the null hypothesis were true, we would expect about 28 people to be cell phone users while driving and to receive a speeding violation.

The test of independence is always right-tailed because of the way the test statistic is calculated. If the expected and observed values are not close together, then the test statistic is large and falls far out in the right tail of the chi-square curve, just as in a goodness-of-fit test.

The degrees of freedom for the test of independence are:

$$df = (\text{number of columns} - 1)(\text{number of rows} - 1)$$

The following formula calculates the expected number ($E$):

$$E = \frac{(\text{row total})(\text{column total})}{\text{total number surveyed}}$$
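As a rough sketch, both formulas can be applied to a whole contingency table at once. The helper names `expected_counts` and `degrees_of_freedom` and the toy table are assumptions made for this illustration, not part of the textbook.

```python
def expected_counts(observed):
    """Return the table of E = (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(observed):
    """Return (number of columns - 1)(number of rows - 1)."""
    n_rows = len(observed)
    n_cols = len(observed[0])
    return (n_cols - 1) * (n_rows - 1)


# Toy 2x2 table with made-up counts.
toy = [[10, 20], [30, 40]]
print(expected_counts(toy))     # [[12.0, 18.0], [28.0, 42.0]]
print(degrees_of_freedom(toy))  # 1
```

The table passed in should contain only the observed cells, without the row-total and column-total margins.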

## Example 2

In a volunteer group, adults 21 and older volunteer from one to nine hours each week to spend time with a disabled senior citizen. The program recruits among community college students, four-year college students, and nonstudents. The following table is a sample of the adult volunteers and the number of hours they volunteer per week.

Table 1: Number of Hours Worked Per Week by Volunteer Type (Observed). The table contains observed ($O$) values (data).

| Type of Volunteer | 1-3 Hours | 4-6 Hours | 7-9 Hours | Row Total |
|---|---|---|---|---|
| Community College Students | 111 | 96 | 48 | 255 |
| Four-Year College Students | 96 | 133 | 61 | 290 |
| Nonstudents | 91 | 150 | 53 | 294 |
| Column Total | 298 | 379 | 162 | 839 |

### Problem 1

Are the number of hours volunteered independent of the type of volunteer?

#### Solution

The observed table and the question at the end of the problem, "Are the number of hours volunteered independent of the type of volunteer?" tell you this is a test of independence. The two factors are number of hours volunteered and type of volunteer. This test is always right-tailed.

$H_o$: The number of hours volunteered is independent of the type of volunteer.

$H_a$: The number of hours volunteered is dependent on the type of volunteer.

The expected table is:

Table 2: Number of Hours Worked Per Week by Volunteer Type (Expected). The table contains expected ($E$) values.

| Type of Volunteer | 1-3 Hours | 4-6 Hours | 7-9 Hours |
|---|---|---|---|
| Community College Students | 90.57 | 115.19 | 49.24 |
| Four-Year College Students | 103.00 | 131.00 | 56.00 |
| Nonstudents | 104.42 | 132.81 | 56.77 |

For example, the calculation for the expected frequency for the top left cell is

$$E = \frac{(\text{row total})(\text{column total})}{\text{total number surveyed}} = \frac{(255)(298)}{839} = 90.57$$

Calculate the test statistic: $\chi^2 = 12.99$ (calculator or computer)

Distribution for the test: $\chi^2_4$

$df = (3 \text{ columns} - 1)(3 \text{ rows} - 1) = (2)(2) = 4$

Graph: (figure not reproduced here)

Probability statement: $p\text{-value} = P(\chi^2 > 12.99) = 0.0113$

Compare $\alpha$ and the $p\text{-value}$: Since no $\alpha$ is given, assume $\alpha = 0.05$. $p\text{-value} = 0.0113$. $\alpha > p\text{-value}$.

Make a decision: Since $\alpha > p\text{-value}$, reject $H_o$. This means that the factors are not independent.

Conclusion: At a 5% level of significance, from the data, there is sufficient evidence to conclude that the number of hours volunteered and the type of volunteer are dependent on one another.
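The whole test for this example can be reproduced with a short, dependency-free Python sketch. It is an illustration, not the textbook's method (the text uses a calculator). The helper `chi2_right_tail_even_df` is a name invented here, and it relies on a closed-form expression for the right-tail area that holds only when the degrees of freedom are even, which is the case for $df = 4$.

```python
import math

# Observed counts from Table 1 (margins left out).
observed = [
    [111, 96, 48],   # community college students
    [96, 133, 61],   # four-year college students
    [91, 150, 53],   # nonstudents
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts: (row total)(column total) / (total number surveyed).
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Test statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed[0]) - 1) * (len(observed) - 1)


def chi2_right_tail_even_df(x, df):
    """P(chi-square > x) for EVEN df only: exp(-x/2) * sum_k (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


p_value = chi2_right_tail_even_df(chi2, df)
alpha = 0.05

print(f"chi2 = {chi2:.4f}, df = {df}, p-value = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```

For odd degrees of freedom the closed form above does not apply, and a statistics library or a chi-square table would be needed instead.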

For the above example, if there had been another type of volunteer, teenagers, what would the degrees of freedom be?

##### Note:
Calculator instructions follow.

TI-83+ and TI-84 calculator: Press the MATRX key and arrow over to EDIT. Press 1:[A]. Press 3 ENTER 3 ENTER. Enter the observed table values by row from Example 2 (Table 1), pressing ENTER after each. Press 2nd QUIT. Press STAT and arrow over to TESTS. Arrow down to C:χ2-TEST. Press ENTER. You should see Observed:[A] and Expected:[B]. Arrow down to Calculate. Press ENTER. The test statistic is 12.9909 and the p-value = 0.0113. Do the procedure a second time, but arrow down to Draw instead of Calculate.

## Example 3

De Anza College is interested in the relationship between anxiety level and the need to succeed in school. A random sample of 400 students took a test that measured anxiety level and need to succeed in school. The table shows the results. De Anza College wants to know if anxiety level and need to succeed in school are independent events.

Table 3: Need to Succeed in School vs. Anxiety Level

| Need to Succeed in School | High Anxiety | Med-high Anxiety | Medium Anxiety | Med-low Anxiety | Low Anxiety | Row Total |
|---|---|---|---|---|---|---|
| High Need | 35 | 42 | 53 | 15 | 10 | 155 |
| Medium Need | 18 | 48 | 63 | 33 | 31 | 193 |
| Low Need | 4 | 5 | 11 | 15 | 17 | 52 |
| Column Total | 57 | 95 | 127 | 63 | 58 | 400 |

### Problem 1

How many high anxiety level students are expected to have a high need to succeed in school?

#### Solution

The column total for a high anxiety level is 57. The row total for high need to succeed in school is 155. The sample size or total surveyed is 400.

$$E = \frac{(\text{row total})(\text{column total})}{\text{total surveyed}} = \frac{(155)(57)}{400} = 22.09$$

The expected number of students who have a high anxiety level and a high need to succeed in school is about 22.

### Problem 2

If the two variables are independent, how many students do you expect to have a low need to succeed in school and a med-low level of anxiety?

#### Solution

The column total for a med-low anxiety level is 63. The row total for a low need to succeed in school is 52. The sample size or total surveyed is 400.

##### Problem 3
• a. $E = \frac{(\text{row total})(\text{column total})}{\text{total surveyed}} =$
• b. The expected number of students who have a med-low anxiety level and a low need to succeed in school is about:

###### Solution
• a. $E = \frac{(\text{row total})(\text{column total})}{\text{total surveyed}} = \frac{(52)(63)}{400} = 8.19$
• b. 8
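The two expected counts in this example can be checked with a small Python sketch built from Table 3. The helper name `expected_cell` and the zero-based row and column positions are choices made for this illustration. The last lines also check the earlier note that every expected value should be at least 5.

```python
# Observed counts from Table 3 (margins left out).
# Columns: high, med-high, medium, med-low, low anxiety.
observed = [
    [35, 42, 53, 15, 10],   # high need
    [18, 48, 63, 33, 31],   # medium need
    [4, 5, 11, 15, 17],     # low need
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)


def expected_cell(row, col):
    """E = (row total)(column total) / (total surveyed), zero-based indices."""
    return row_totals[row] * col_totals[col] / n


high_need_high_anxiety = expected_cell(0, 0)    # (155)(57) / 400
low_need_med_low_anxiety = expected_cell(2, 3)  # (52)(63) / 400

print(f"{high_need_high_anxiety:.4f}")
print(f"{low_need_med_low_anxiety:.4f}")

# Smallest expected count in the table, to compare against the threshold of 5.
smallest_expected = min(
    expected_cell(r, c)
    for r in range(len(row_totals))
    for c in range(len(col_totals))
)
print(smallest_expected >= 5)
```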

## Glossary

Contingency Table:
The method of displaying a frequency distribution as a table with rows and columns to show how two variables may be dependent (contingent) upon each other. The table provides an easy way to calculate conditional probabilities.
