*The next three questions refer to the following information.* We are interested in the number of years students in a particular elementary statistics class have lived in California.
The information in the following table is from the entire section.

Number of years | Frequency |
---|---|
7 | 1 |
14 | 3 |
15 | 1 |
18 | 1 |
19 | 4 |
20 | 3 |
22 | 1 |
23 | 1 |
26 | 1 |
40 | 2 |
42 | 2 |
Total | 20 |

### Exercise 24

What is the IQR?

**A.** 8 **B.** 11 **C.** 15 **D.** 35

#### Solution

A
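The answer can be checked numerically. The following is a minimal sketch, assuming the median-of-halves convention for quartiles (Q1 is the median of the lower half of the sorted data, Q3 the median of the upper half); other quartile conventions can give slightly different values. The names `freq`, `data`, and `quartiles` are illustrative, not part of the exercise.

```python
from statistics import median

# Frequency table from the exercise: years lived in California -> count.
freq = {7: 1, 14: 3, 15: 1, 18: 1, 19: 4, 20: 3,
        22: 1, 23: 1, 26: 1, 40: 2, 42: 2}

# Expand the table into a sorted list of the 20 individual values.
data = sorted(x for x, f in freq.items() for _ in range(f))


def quartiles(values):
    """Return (Q1, median, Q3) for sorted values, median-of-halves convention."""
    n = len(values)
    half = n // 2
    lower = values[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return median(lower), median(values), median(upper)


q1, med, q3 = quartiles(data)
iqr = q3 - q1
print(f"Q1 = {q1}, median = {med}, Q3 = {q3}, IQR = {iqr}")
```

Under this convention Q1 = 16.5 and Q3 = 24.5, so the IQR is 8, matching choice A.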

### Exercise 25

What is the mode?

**A.** 19 **B.** 19.5 **C.** 14 and 20 **D.** 22.65

#### Solution

A
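The mode is the value with the highest frequency in the table. As a quick check, here is a small sketch using Python's standard `statistics.multimode`, which returns every value tied for the highest count; the variable names are illustrative.

```python
from statistics import multimode

# Frequency table from the exercise: years lived in California -> count.
freq = {7: 1, 14: 3, 15: 1, 18: 1, 19: 4, 20: 3,
        22: 1, 23: 1, 26: 1, 40: 2, 42: 2}

# Expand the table into the individual data values.
data = [x for x, f in freq.items() for _ in range(f)]

# multimode lists all values that share the largest frequency.
modes = multimode(data)
print(modes)
```

Only 19 appears four times; 14 and 20 each appear three times, so they are not modes.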

### Exercise 26

Is this a sample or the entire population?

**A.** sample **B.** entire population **C.** neither

#### Solution

B

*The next two questions refer to the following table.*

X | Frequency |
---|---|
0 | 3 |
1 | 12 |
2 | 33 |
3 | 28 |
4 | 11 |
5 | 9 |
6 | 4 |

### Exercise 27

The 80th percentile is:

**A.** 5 **B.** 80 **C.** 3 **D.** 4

#### Solution

D
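One way to see this is through cumulative frequencies. The sketch below assumes the percentile is taken as the smallest value whose cumulative frequency reaches at least the given percent of the data; the helper `percentile` and the other names are hypothetical.

```python
from itertools import accumulate

# Frequency table from the exercise: X -> count.
freq = {0: 3, 1: 12, 2: 33, 3: 28, 4: 11, 5: 9, 6: 4}

values = sorted(freq)
cumulative = list(accumulate(freq[x] for x in values))
total = cumulative[-1]


def percentile(p):
    """Smallest value whose cumulative frequency covers at least p percent."""
    for x, c in zip(values, cumulative):
        # Integer comparison avoids floating-point rounding issues.
        if c * 100 >= p * total:
            return x


p80 = percentile(80)
print(cumulative, p80)
```

The cumulative counts are 3, 15, 48, 76, 87, 96, 100 out of 100 values. Only 76% of the data are at or below 3, while 87% are at or below 4, so the 80th percentile is 4.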

### Exercise 28

The number that is 1.5 standard deviations BELOW the mean is approximately:

**A.** 0.7 **B.** 4.8 **C.** -2.8 **D.** Cannot be determined

#### Solution

A
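The calculation can be reproduced as follows. This is a sketch that assumes the sample standard deviation (dividing by n - 1); using the population standard deviation instead changes the result only slightly and leads to the same answer choice. Variable names are illustrative.

```python
from statistics import mean, stdev

# Frequency table from the exercise: X -> count.
freq = {0: 3, 1: 12, 2: 33, 3: 28, 4: 11, 5: 9, 6: 4}

# Expand the table into the 100 individual data values.
data = [x for x, f in freq.items() for _ in range(f)]

xbar = mean(data)   # sample mean
s = stdev(data)     # sample standard deviation (assumption: n - 1 divisor)

# The value 1.5 standard deviations below the mean.
target = xbar - 1.5 * s
print(f"mean = {xbar:.2f}, s = {s:.2f}, mean - 1.5s = {target:.2f}")
```

The mean is 2.75 and the standard deviation is about 1.37, so the value is approximately 2.75 - 1.5(1.37) ≈ 0.7.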

*The next two questions refer to the following histogram.* Frederico recently opened a "designer" T-shirt store near the beach. During the first month of operation, he conducted a marketing survey of a random sample of 111 customers. One of the questions asked the customer how many T-shirts he/she owns that cost more than $19 each.

### Exercise 29

The percent of people that own at most three (3) T-shirts costing more than $19 each is approximately:

**A.** 21 **B.** 59 **C.** 41 **D.** Cannot be determined

#### Solution

C

### Exercise 30

If the data were collected by asking the first 111 people who entered the store, then the type of sampling is:

**A.** cluster **B.** simple random **C.** stratified **D.** convenience

#### Solution

D

### Exercise 31

A music school has budgeted to purchase 3 musical instruments. They plan to purchase a piano costing $3,000, a guitar costing $550, and a drum set costing $600. The average cost for a piano is $4,000 with a standard deviation of $2,500. The average cost for a guitar is $500 with a standard deviation of $200. The average cost for drums is $700 with a standard deviation of $100. Which cost is the lowest when compared to other instruments of the same type? Which cost is the highest when compared to other instruments of the same type? Justify your answer numerically.

#### Solution

For pianos, the cost of the piano is 0.4 standard deviations BELOW average. For guitars, the cost of the guitar is 0.25 standard deviations ABOVE average. For drums, the cost of the drum set is 1.0 standard deviations BELOW average. Of the three, the drum set has the lowest cost in comparison to the cost of other instruments of the same type. The guitar has the highest cost in comparison to the cost of other instruments of the same type.
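These comparisons are z-scores: (cost - mean) / standard deviation. A minimal sketch of the arithmetic, with illustrative names:

```python
# (planned cost, average cost, standard deviation) for each instrument type,
# taken from the exercise statement.
instruments = {
    "piano": (3000, 4000, 2500),
    "guitar": (550, 500, 200),
    "drums": (600, 700, 100),
}

# z-score: how many standard deviations the cost lies from the average.
z = {name: (cost - mu) / sigma
     for name, (cost, mu, sigma) in instruments.items()}

lowest = min(z, key=z.get)    # most below its own average
highest = max(z, key=z.get)   # most above its own average

for name, score in z.items():
    print(f"{name}: z = {score:+.2f}")
print("lowest:", lowest, "| highest:", highest)
```

Because each cost is measured in standard deviations relative to its own instrument type, the three values can be compared directly even though the dollar amounts are on very different scales.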

### Exercise 32

Suppose that a publisher conducted a survey asking adult consumers the number of fiction paperback books they had purchased in the previous month. The results are summarized in the table below. (Note that this is the data presented for publisher B in homework exercise 13).

# of books | Freq. | Rel. Freq. |
---|---|---|
0 | 18 | |
1 | 24 | |
2 | 24 | |
3 | 22 | |
4 | 15 | |
5 | 10 | |
7 | 5 | |
9 | 1 | |

- (a) Are there any outliers in the data? Use an appropriate numerical test involving the IQR to identify outliers, if any, and clearly state your conclusion.
- (b) If a data value is identified as an outlier, what should be done about it?
- (c) Are any data values further than 2 standard deviations away from the mean? In some situations, statisticians may use this criterion to identify data values that are unusual compared to the other data values. (Note that this criterion is most appropriate for data that is mound-shaped and symmetric, rather than for skewed data.)
- (d) Do parts (a) and (c) of this problem give the same answer?
- (e) Examine the shape of the data. Which part, (a) or (c), of this question gives a more appropriate result for this data?
- (f) Based on the shape of the data, which is the most appropriate measure of center for this data: mean, median, or mode?

#### Solution

- (a) IQR = Q3 – Q1 = 4 – 1 = 3; Q1 – 1.5 × IQR = 1 – 1.5(3) = -3.5; Q3 + 1.5 × IQR = 4 + 1.5(3) = 8.5. The data value of 9 is larger than 8.5, so the purchase of 9 books in one month is an outlier.
- (b) The outlier should be investigated to see if there is an error or some other problem in the data; then a decision whether to include or exclude it should be made based on the particular situation. If it is a correct value, then it should remain in the data set. If there is a problem with this data value, then it should be corrected or removed from the data. For example, if the data value was recorded incorrectly (perhaps a 9 was miscoded and the correct value was 6), then it should be corrected. If it was an error but the correct value is not known, it should be removed from the data set.
- (c) x̄ – 2s = 2.45 – 2(1.88) = -1.31; x̄ + 2s = 2.45 + 2(1.88) = 6.21. Using this method, the five data values of 7 books purchased and the one data value of 9 books purchased would be considered unusual.
- (d) No: part (a) identifies only the value of 9 as an outlier, but part (c) identifies both 7 and 9.
- (e) The data is skewed (to the right). It would be more appropriate to use the method involving the IQR in part (a), identifying only the one value of 9 books purchased as an outlier. Note that part (c) remarks that identifying unusual data values by the criterion of being further than 2 standard deviations away from the mean is most appropriate when the data are mound-shaped and symmetric.
- (f) The data are skewed to the right. For skewed data, it is more appropriate to use the median as a measure of center.
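The numerical parts of this solution can be reproduced with a short script. This is a sketch under two assumptions: quartiles are found with the median-of-halves convention (the middle value is excluded from both halves when the count is odd), and s is the sample standard deviation. All variable names are illustrative.

```python
from statistics import mean, median, stdev

# Frequency table from the exercise: books purchased -> count.
freq = {0: 18, 1: 24, 2: 24, 3: 22, 4: 15, 5: 10, 7: 5, 9: 1}
data = sorted(x for x, f in freq.items() for _ in range(f))
n = len(data)

# IQR test: quartiles as medians of the lower and upper halves.
half = n // 2
lower = data[:half]
upper = data[half + 1:] if n % 2 else data[half:]
q1, q3 = median(lower), median(upper)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = sorted({x for x in data if x < low_fence or x > high_fence})

# Two-standard-deviation test.
xbar, s = mean(data), stdev(data)
sd_unusual = sorted({x for x in data if abs(x - xbar) > 2 * s})

print(f"Q1 = {q1}, Q3 = {q3}, fences = ({low_fence}, {high_fence})")
print("IQR outliers:", iqr_outliers)
print(f"mean = {xbar:.2f}, s = {s:.2f}")
print("beyond 2 standard deviations:", sd_unusual)
```

The IQR test flags only 9, while the two-standard-deviation test flags both 7 and 9, which is the disagreement discussed in the solution.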