Twenty students were asked how many hours they worked per day. Their responses, in hours, are listed below, followed by a frequency table listing the different data values in ascending order and their frequencies.
5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3
Table 1: Frequency Table of Student Work Hours
| Data Value | Frequency |
|---|---|
| 2 | 3 |
| 3 | 5 |
| 4 | 3 |
| 5 | 6 |
| 6 | 2 |
| 7 | 1 |
A frequency is the number of times a given datum occurs in a data set. According to the table above, three students work 2 hours, five students work 3 hours, and so on. The total of the frequency column, 20, is the total number of students included in the sample.

A relative frequency is the fraction of times an answer occurs. To find the relative frequencies, divide each frequency by the total number of students in the sample, in this case 20. Relative frequencies can be written as fractions, percents, or decimals.

Cumulative relative frequency is the accumulation of the previous relative frequencies. To find the cumulative relative frequencies, add all the previous relative frequencies to the relative frequency for the current row.
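The two definitions can also be written in symbols. The notation here is an assumption introduced only for illustration: $f_i$ is the frequency of the $i$-th distinct value in ascending order, $n$ is the sample size, and $RF_i$ and $CRF_i$ are the relative and cumulative relative frequencies for that row. The last line works the row for the value 4 using the counts 3, 5, and 3 from Table 1.

```latex
RF_i = \frac{f_i}{n}, \qquad CRF_i = \sum_{j \le i} RF_j = \frac{1}{n} \sum_{j \le i} f_j

CRF_3 = \frac{3}{20} + \frac{5}{20} + \frac{3}{20} = \frac{11}{20} = 0.55
```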
Table 2: Frequency Table of Student Work Hours with Relative and Cumulative Relative Frequency
| Data Value | Frequency | Relative Frequency | Cumulative Relative Frequency |
|---|---|---|---|
| 2 | 3 | 3/20 or 0.15 | 0.15 |
| 3 | 5 | 5/20 or 0.25 | 0.15 + 0.25 = 0.40 |
| 4 | 3 | 3/20 or 0.15 | 0.40 + 0.15 = 0.55 |
| 5 | 6 | 6/20 or 0.30 | 0.55 + 0.30 = 0.85 |
| 6 | 2 | 2/20 or 0.10 | 0.85 + 0.10 = 0.95 |
| 7 | 1 | 1/20 or 0.05 | 0.95 + 0.05 = 1.00 |
- The sum of the relative frequency column is 20/20, or 1.
- The last entry of the cumulative relative frequency column is one, indicating that one hundred percent of the data has been accumulated.
- Because of rounding, the relative frequency column may not always sum to one and the last entry in the cumulative relative frequency column may not be one. However, they each should be close to one.
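The rounding effect can be seen in a small sketch. The counts below are hypothetical (they are not from the student data) and were chosen only because their relative frequencies do not have exact two-decimal representations.

```r
# Hypothetical counts, not from the text: three values, 7 observations in total
freq <- c(a = 2, b = 2, c = 3)

# Exact relative frequencies: 2/7, 2/7, 3/7
rel.exact <- freq / sum(freq)

# Relative frequencies rounded to two decimals, as they would appear in a table
rel.rounded <- round(rel.exact, 2)

sum(rel.exact)        # the unrounded values add up to 1
sum(rel.rounded)      # 0.29 + 0.29 + 0.43, slightly more than 1
cumsum(rel.rounded)   # the last cumulative entry is likewise not exactly 1
```

Keeping the unrounded values for calculations and rounding only for display is one way to avoid this drift.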
As you can imagine, it is pretty straightforward to do something like this in R. The following functions apply:
- Frequencies: table().
- Relative frequencies: table() divided by length() (which tells us how many items are in an R object).
- Cumulative frequencies: First we create intervals using cut() and count them with table(), then we use cumsum(). Note: Set your cut() breaks ranging from one below your minimum value to your maximum value, because by default each interval excludes its lower endpoint.
- Cumulative relative frequencies: First, calculate your cumulative frequencies (see the previous item) and divide them by the total number of observations (obtained using length()).
```r
# Entering the data
hours.worked <- c(5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6,
                  5, 4, 4, 3, 5, 2, 5, 3)

# A general frequency table
table(hours.worked)
## hours.worked
## 2 3 4 5 6 7 
## 3 5 3 6 2 1 

# Relative frequency table
table(hours.worked) / length(hours.worked)
## hours.worked
##    2    3    4    5    6    7 
## 0.15 0.25 0.15 0.30 0.10 0.05 

# To get cumulative frequencies, we need to put
# the hours into different intervals
x <- table(cut(hours.worked, breaks = 1:7))

# Cumulative frequencies
cumsum(x)
## (1,2] (2,3] (3,4] (4,5] (5,6] (6,7] 
##     3     8    11    17    19    20 

# Cumulative relative frequencies
cumsum(x) / length(hours.worked)
## (1,2] (2,3] (3,4] (4,5] (5,6] (6,7] 
## 0.15 0.40 0.55 0.85 0.95 1.00 
```
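The separate results can also be collected into one object that mirrors Table 2. The following is a minimal sketch, not the only way to do it: the column names and the name summary.table are made up for illustration, and it applies cumsum() directly to table(), which appears to be sufficient here because every response is a whole number of hours.

```r
hours.worked <- c(5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6,
                  5, 4, 4, 3, 5, 2, 5, 3)

freq <- table(hours.worked)           # counts, ordered by data value
rel  <- freq / length(hours.worked)   # relative frequencies

# Illustrative column names; one row per distinct data value
summary.table <- data.frame(
  value    = as.numeric(names(freq)),
  freq     = as.vector(freq),
  rel.freq = as.vector(rel),
  cum.rel  = as.vector(cumsum(rel))
)

summary.table
```

With grouped or non-integer data the cut() step shown above would still be needed before counting.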
The following table represents the heights, in inches, of a sample of 100 male semiprofessional soccer players.
Table 3: Frequency Table of Soccer Player Height
| HEIGHTS (INCHES) | FREQUENCY | RELATIVE FREQUENCY | CUMULATIVE RELATIVE FREQUENCY |
|---|---|---|---|
| 59.95 - 61.95 | 5 | 5/100 = 0.05 | 0.05 |
| 61.95 - 63.95 | 3 | 3/100 = 0.03 | 0.05 + 0.03 = 0.08 |
| 63.95 - 65.95 | 15 | 15/100 = 0.15 | 0.08 + 0.15 = 0.23 |
| 65.95 - 67.95 | 40 | 40/100 = 0.40 | 0.23 + 0.40 = 0.63 |
| 67.95 - 69.95 | 17 | 17/100 = 0.17 | 0.63 + 0.17 = 0.80 |
| 69.95 - 71.95 | 12 | 12/100 = 0.12 | 0.80 + 0.12 = 0.92 |
| 71.95 - 73.95 | 7 | 7/100 = 0.07 | 0.92 + 0.07 = 0.99 |
| 73.95 - 75.95 | 1 | 1/100 = 0.01 | 0.99 + 0.01 = 1.00 |
| | Total = 100 | Total = 1.00 | |
The data in this table has been grouped into the following intervals:
- 59.95 - 61.95 inches
- 61.95 - 63.95 inches
- 63.95 - 65.95 inches
- 65.95 - 67.95 inches
- 67.95 - 69.95 inches
- 69.95 - 71.95 inches
- 71.95 - 73.95 inches
- 73.95 - 75.95 inches
This example is used again in the Descriptive Statistics chapter, where the method used to compute the intervals will be explained.
In this sample, there are 5 players whose heights fall within the interval 59.95 - 61.95 inches, 3 players whose heights fall within the interval 61.95 - 63.95 inches, 15 players whose heights fall within the interval 63.95 - 65.95 inches, 40 players whose heights fall within the interval 65.95 - 67.95 inches, 17 players whose heights fall within the interval 67.95 - 69.95 inches, 12 players whose heights fall within the interval 69.95 - 71.95 inches, 7 players whose heights fall within the interval 71.95 - 73.95 inches, and 1 player whose height falls within the interval 73.95 - 75.95 inches. All heights fall between the endpoints of an interval and not at the endpoints.
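Since only the grouped counts are given for the soccer players, the relative and cumulative relative frequency columns can be rebuilt from the frequency column alone. This is a rough sketch under that assumption; the object and column names are illustrative.

```r
# Lower and upper endpoints of the eight height intervals (inches)
lower <- 59.95 + 2 * (0:7)
upper <- lower + 2

# Frequencies copied from the table of soccer player heights
freq <- c(5, 3, 15, 40, 17, 12, 7, 1)

rel.freq <- freq / sum(freq)   # divide by the sample size, 100
cum.rel  <- cumsum(rel.freq)   # running total of the relative frequencies

heights <- data.frame(lower, upper, freq, rel.freq, cum.rel)
heights
```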
From the table, find the percentage of heights that are less than 65.95 inches.
If you look at the first, second, and third rows, the heights are all less than 65.95 inches. There are 5 + 3 + 15 = 23 males whose heights are less than 65.95 inches. The percentage of heights less than 65.95 inches is then 23/100 or 23%. This percentage is the cumulative relative frequency entry in the third row.
From the table, find the percentage of heights that fall between 61.95 and 65.95 inches.
Add the relative frequencies in the second and third rows: 0.03 + 0.15 = 0.18 or 18%.
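Both answers can be checked by summing the relevant rows of the relative frequency column. A short sketch, assuming the frequencies are typed in by hand in table order (the variable names are illustrative):

```r
# Frequencies for the eight height intervals, in table order
freq <- c(5, 3, 15, 40, 17, 12, 7, 1)
rel.freq <- freq / sum(freq)

# Heights less than 65.95 inches: rows 1 to 3
below <- sum(rel.freq[1:3])

# Heights between 61.95 and 65.95 inches: rows 2 and 3
between <- sum(rel.freq[2:3])

100 * below     # percentage below 65.95 inches
100 * between   # percentage between 61.95 and 65.95 inches

# The second answer is also a difference of cumulative relative frequencies
cum.rel <- cumsum(rel.freq)
100 * (cum.rel[3] - cum.rel[1])
```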
Use the table of heights of the 100 male semiprofessional soccer players. Fill in the blanks and check your answers.
- The percentage of heights that are from 67.95 to 71.95 inches is:
- The percentage of heights that are from 67.95 to 73.95 inches is:
- The percentage of heights that are more than 65.95 inches is:
- The number of players in the sample who are between 61.95 and 71.95 inches tall is:
- What kind of data are the heights?
- Describe how you could gather this data (the heights) so that the data are characteristic of all male semiprofessional soccer players.
Remember, you count frequencies. To find the relative frequency, divide the frequency by the total number of data values. To find the cumulative relative frequency, add all of the previous relative frequencies to the relative frequency for the current row.
- 29%
- 36%
- 77%
- 87
- quantitative continuous
- get rosters from each team and choose a simple random sample from each
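The four numeric answers can be verified with a small helper that adds up the intervals lying between two endpoints. The function count.between() is hypothetical, written only for this check, and it assumes both arguments are interval endpoints from the table.

```r
# Nine interval endpoints: 59.95, 61.95, ..., 75.95
breaks <- 59.95 + 2 * (0:8)
freq   <- c(5, 3, 15, 40, 17, 12, 7, 1)

# Hypothetical helper: number of players in the intervals between lo and hi
count.between <- function(lo, hi) {
  lower <- breaks[-length(breaks)]
  upper <- breaks[-1]
  # A small tolerance guards against floating-point error in the endpoints
  keep <- lower > lo - 0.001 & upper < hi + 0.001
  sum(freq[keep])
}

n <- sum(freq)
100 * count.between(67.95, 71.95) / n   # percentage from 67.95 to 71.95
100 * count.between(67.95, 73.95) / n   # percentage from 67.95 to 73.95
100 * count.between(65.95, 75.95) / n   # percentage above 65.95
count.between(61.95, 71.95)             # number of players, not a percentage
```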
- Frequency: The number of times a value of the data occurs.
- Relative Frequency: The ratio of the number of times a value of the data occurs in the set of all outcomes to the number of all outcomes.
- Cumulative Relative Frequency: The term applies to an ordered set of observations from smallest to largest. The Cumulative Relative Frequency is the sum of the relative frequencies for all values that are less than or equal to the given value.