Summary: This module describes a number of statistical measures used to describe data, such as percentiles, spread, and skewness.
The most common measure of spread is the standard deviation. The standard deviation is a number that measures how far data values are from their mean. For example, if the mean of a set of data containing 7 is 5 and the standard deviation is 2, then the value 7 is one (1) standard deviation from its mean because 5 + (1)(2) = 7.
The number line may help you understand standard deviation. If we were to put 5 and 7 on a number line, 7 is to the right of 5. We say, then, that 7 is one standard deviation to the right of 5. If 1 were also part of the data set, then 1 is two standard deviations to the left of 5 because 5 + (-2)(2) = 1.
1 = 5 + (-2)(2) ; 7 = 5 + (1)(2)

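Formula: value = $\bar{x}$ + (#ofSTDEVs)($s$), where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation.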
Generally, a value = mean + (#ofSTDEVs)(standard deviation), where #ofSTDEVs = the number of standard deviations.
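The formula can be checked with a few lines of code. What follows is a minimal Python sketch using the numbers from the example above (mean 5, standard deviation 2). The function names `value_at` and `stdevs_from_mean` are illustrative only and are not part of the original module.

```python
def value_at(mean, stdev, k):
    """Return the value that lies k standard deviations from the mean."""
    return mean + k * stdev


def stdevs_from_mean(value, mean, stdev):
    """Solve value = mean + (#ofSTDEVs)(stdev) for #ofSTDEVs."""
    return (value - mean) / stdev


# Numbers from the example: mean 5, standard deviation 2.
print(value_at(5, 2, 1))           # 7
print(value_at(5, 2, -2))          # 1
print(stdevs_from_mean(7, 5, 2))   # 1.0
print(stdevs_from_mean(1, 5, 2))   # -2.0
```

A negative result means the value lies to the left of the mean on the number line; a positive result means it lies to the right.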
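If $x$ is a data value and $\bar{x}$ is the sample mean, then $x - \bar{x}$ is called a deviation. A data set has as many deviations as it has data values.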
To calculate the standard deviation, calculate the variance first. The variance is the average of the squares of the deviations. The standard deviation is the square root of the variance. You can think of the standard deviation as a special average of the deviations (the differences between each data value and the mean).
In a fifth grade class, the teacher was interested in the average age and the standard deviation of the ages of her students. What follows are the ages of her students to the nearest half year:
9; 9.5; 9.5; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5
The average age is 10.525 years, or 10.53 years rounded to two decimal places.
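The variance may be calculated by using a table. Then the standard deviation is calculated by taking the square root of the variance. We will explain the parts of the table after calculating the sample standard deviation, $s$.
| Data $x$ | Freq. $f$ | Deviations $(x - \bar{x})$ | Deviations² $(x - \bar{x})^2$ | (Freq.)(Deviations²) $f(x - \bar{x})^2$ |
|---|---|---|---|---|
| 9 | 1 | 9 - 10.525 = -1.525 | 2.325625 | 1 × 2.325625 = 2.325625 |
| 9.5 | 2 | 9.5 - 10.525 = -1.025 | 1.050625 | 2 × 1.050625 = 2.101250 |
| 10 | 4 | 10 - 10.525 = -0.525 | 0.275625 | 4 × 0.275625 = 1.102500 |
| 10.5 | 4 | 10.5 - 10.525 = -0.025 | 0.000625 | 4 × 0.000625 = 0.002500 |
| 11 | 6 | 11 - 10.525 = 0.475 | 0.225625 | 6 × 0.225625 = 1.353750 |
| 11.5 | 3 | 11.5 - 10.525 = 0.975 | 0.950625 | 3 × 0.950625 = 2.851875 |
The sample variance, $s^2$, is the sum of the last column (9.7375) divided by the total number of data values minus one (20 - 1): $s^2 = \frac{9.7375}{20 - 1} = 0.5125$.
The sample standard deviation, $s$, is the square root of the sample variance: $s = \sqrt{0.5125} \approx 0.7159$, which is 0.72 rounded to two decimal places.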
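The table-based calculation can also be carried out on a computer. The following is a minimal Python sketch, not a prescribed method: it weights each squared deviation by its frequency, divides by the number of data values minus one, and takes the square root. The variable names are illustrative; the ages and frequencies come from the class data above.

```python
import math
import statistics

# Ages and their frequencies, taken from the class data above.
freq = {9: 1, 9.5: 2, 10: 4, 10.5: 4, 11: 6, 11.5: 3}

n = sum(freq.values())                            # number of students
mean = sum(x * f for x, f in freq.items()) / n    # sample mean

# Sum of (frequency) x (squared deviation) over all distinct ages.
weighted_sq_dev = sum(f * (x - mean) ** 2 for x, f in freq.items())

sample_variance = weighted_sq_dev / (n - 1)
sample_stdev = math.sqrt(sample_variance)

print(n, mean)                     # 20 10.525
print(round(weighted_sq_dev, 4))   # 9.7375
print(round(sample_variance, 4))   # 0.5125
print(round(sample_stdev, 2))      # 0.72

# Cross-check against the standard library on the expanded list of ages.
ages = [x for x, f in freq.items() for _ in range(f)]
assert math.isclose(statistics.stdev(ages), sample_stdev)
```

Only the printed values are rounded; the intermediate results are kept at full precision.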
Typically, you do the calculation for the standard deviation on your calculator or computer. The intermediate results are not rounded. This is done for accuracy.
Verify the mean and standard deviation calculated above on your calculator or computer. Find the median and mode.
Find the value that is one standard deviation above the mean, $\bar{x} + 1s$.
Find the value that is two standard deviations below the mean, $\bar{x} - 2s$.
Find the values that are 1.5 standard deviations from (below and above) the mean.
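If you work on a computer rather than a calculator, one way to check your answers is Python's standard `statistics` module. This is a sketch under that assumption; any statistics package would do. It uses the unrounded mean and standard deviation, so results may differ in the last digit from answers built on the rounded values 10.53 and 0.72.

```python
import statistics

ages = [9, 9.5, 9.5, 10, 10, 10, 10, 10.5, 10.5, 10.5, 10.5,
        11, 11, 11, 11, 11, 11, 11.5, 11.5, 11.5]

mean = statistics.mean(ages)
s = statistics.stdev(ages)   # sample standard deviation (divides by n - 1)

print("mean:", mean)
print("s:", round(s, 2))
print("median:", statistics.median(ages))
print("mode:", statistics.mode(ages))

# Values k standard deviations from the mean, rounded only for display.
for k in (1, -2, -1.5, 1.5):
    print(k, "standard deviations:", round(mean + k * s, 2))
```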
Explanation of the table: The deviations show how spread out the data are about the mean. The value 11.5 is farther from the mean than 11, as the deviations 0.975 and 0.475 indicate. If you add the deviations, the sum is always zero. (For this example, there are 20 deviations.) So you cannot simply add the deviations to get the spread of the data. By squaring the deviations, you make them positive numbers (or zero). The variance, then, is the average squared deviation. It is small if the values are close to the mean and large if the values are far from the mean.
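To see why the deviations always add to zero, a short derivation helps. The notation here is an assumption added for illustration: $x_1, \dots, x_n$ are the data values, $n$ is the number of values, and $\bar{x}$ is their mean, so that the sum of the data values equals $n\bar{x}$.

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```

The positive and negative deviations cancel exactly, which is why the deviations are squared before they are averaged.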
The variance is a squared measure and does not have the same units as the data. Taking the square root solves the problem. The standard deviation measures the spread in the same units as the data.
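For the sample variance, we divide by the total number of data values minus one ($n - 1$) rather than by $n$. The sample variance is an estimate of the population variance, and dividing by $n - 1$ gives a better estimate.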
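The effect of the divisor can be seen by computing the variance both ways. In this minimal sketch, Python's `statistics.variance` divides by the number of values minus one and `statistics.pvariance` divides by the number of values. The ages are the class data from the example above.

```python
import statistics

ages = [9, 9.5, 9.5, 10, 10, 10, 10, 10.5, 10.5, 10.5, 10.5,
        11, 11, 11, 11, 11, 11, 11.5, 11.5, 11.5]

sample_var = statistics.variance(ages)        # sum of squared deviations / (n - 1)
population_var = statistics.pvariance(ages)   # sum of squared deviations / n

print(round(sample_var, 4))       # 0.5125
print(round(population_var, 4))   # 0.4869
```

Dividing by the smaller number always gives the larger result, and the gap shrinks as the number of data values grows.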
Your concentration should be on what the standard deviation does, not on the arithmetic. The standard deviation is a number which measures how far the data are spread from the mean. Let a calculator or computer do the arithmetic.
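The sample standard deviation, $s$, is always zero or greater. It is zero only when every data value is the same, so that there is no spread at all.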
The standard deviation, when first presented, can seem unclear. By graphing your data, you can get a better "feel" for the deviations and the standard deviation. You will find that in symmetrical distributions, the standard deviation can be very helpful but in skewed distributions, the standard deviation may not be much help. The reason is that the two sides of a skewed distribution have different spreads. In a skewed distribution, it is better to look at the first quartile, the median, the third quartile, the smallest value, and the largest value. Because numbers can be confusing, always graph your data.
Use the following data (first exam scores) from Susan Dean's spring pre-calculus class:
33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100
| Data | Frequency | Relative Frequency | Cumulative Relative Frequency |
|---|---|---|---|
| 33 | 1 | 0.032 | 0.032 |
| 42 | 1 | 0.032 | 0.064 |
| 49 | 2 | 0.065 | 0.129 |
| 53 | 1 | 0.032 | 0.161 |
| 55 | 2 | 0.065 | 0.226 |
| 61 | 1 | 0.032 | 0.258 |
| 63 | 1 | 0.032 | 0.290 |
| 67 | 1 | 0.032 | 0.322 |
| 68 | 2 | 0.065 | 0.387 |
| 69 | 2 | 0.065 | 0.452 |
| 72 | 1 | 0.032 | 0.484 |
| 73 | 1 | 0.032 | 0.516 |
| 74 | 1 | 0.032 | 0.548 |
| 78 | 1 | 0.032 | 0.580 |
| 80 | 1 | 0.032 | 0.612 |
| 83 | 1 | 0.032 | 0.644 |
| 88 | 3 | 0.097 | 0.741 |
| 90 | 1 | 0.032 | 0.773 |
| 92 | 1 | 0.032 | 0.805 |
| 94 | 4 | 0.129 | 0.934 |
| 96 | 1 | 0.032 | 0.966 |
| 100 | 1 | 0.032 | 0.998 (Why isn't this value 1?) |

[Figure: box plot and histogram of the exam scores]
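The chart can be rebuilt on a computer. The sketch below is one possible approach in Python, with illustrative variable names. It keeps two running totals: one adds the relative frequencies after rounding them to three decimal places, as the chart does, and the other keeps an exact running count that is divided by the number of scores only when printed. Comparing the last row of the two columns shows the effect of rounding.

```python
from collections import Counter

scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

counts = Counter(scores)
n = len(scores)

cumulative_rounded = 0.0   # running sum of the rounded relative frequencies
cumulative_count = 0       # exact running count of scores seen so far

print("data  freq  rel.freq  cum(rounded)  cum(exact)")
for value in sorted(counts):
    rel = round(counts[value] / n, 3)
    cumulative_rounded += rel
    cumulative_count += counts[value]
    print(value, counts[value], f"{rel:.3f}",
          f"{cumulative_rounded:.3f}", f"{cumulative_count / n:.3f}")
```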
The long left whisker in the box plot is reflected in the left side of the histogram. The spread of the exam scores in the lower 50% is greater (73 - 33 = 40) than the spread in the upper 50% (100 - 73 = 27). The histogram, box plot, and chart all reflect this. There are a substantial number of A and B grades (80s, 90s, and 100). The histogram clearly shows this. The box plot shows us that the middle 50% of the exam scores (IQR = 29) are Ds, Cs, and Bs. The box plot also shows us that the lower 25% of the exam scores are Ds and Fs.
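The numbers quoted in this discussion can be reproduced in a few lines. This is a minimal Python sketch that assumes one common textbook convention for quartiles: the first and third quartiles are the medians of the lower and upper halves of the ordered data, leaving out the middle value when the number of values is odd. Other software may use a different quartile rule and give slightly different results.

```python
import statistics

scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

ordered = sorted(scores)
n = len(ordered)

median = statistics.median(ordered)
lower_half = ordered[: n // 2]          # values below the middle position
upper_half = ordered[(n + 1) // 2 :]    # values above the middle position
q1 = statistics.median(lower_half)
q3 = statistics.median(upper_half)

print("min, Q1, median, Q3, max:", ordered[0], q1, median, q3, ordered[-1])
print("IQR:", q3 - q1)                              # 29
print("spread of lower 50%:", median - ordered[0])  # 40
print("spread of upper 50%:", ordered[-1] - median) # 27
```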
Two students, John and Ali, from different high schools, wanted to find out who had the higher G.P.A. when compared to his school. Which student had the higher G.P.A. when compared to his school?
| Student | GPA | School Mean GPA | School Standard Deviation |
|---|---|---|---|
| John | 2.85 | 3.0 | 0.7 |
| Ali | 77 | 80 | 10 |
Use the formula value = mean + (#ofSTDEVs)(stdev) and solve for #ofSTDEVs for each student (stdev = standard deviation):
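#ofSTDEVs = (value - mean) / stdev
For John, #ofSTDEVs = $\frac{2.85 - 3.0}{0.7} \approx -0.21$.
For Ali, #ofSTDEVs = $\frac{77 - 80}{10} = -0.3$.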
John has the better G.P.A. when compared to his school because his G.P.A. is 0.21 standard deviations below his school's mean while Ali's G.P.A. is 0.3 standard deviations below his school's mean.
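The same comparison can be sketched in Python. The helper name `stdevs_from_mean` is illustrative; the numbers come from the table above. Because the two schools use different G.P.A. scales, only the number of standard deviations from each school's mean is compared, never the raw G.P.A. values.

```python
def stdevs_from_mean(value, mean, stdev):
    """Solve value = mean + (#ofSTDEVs)(stdev) for #ofSTDEVs."""
    return (value - mean) / stdev


# (GPA, school mean GPA, school standard deviation) from the table above.
students = {
    "John": (2.85, 3.0, 0.7),
    "Ali": (77, 80, 10),
}

z_scores = {name: stdevs_from_mean(*row) for name, row in students.items()}
for name, z in z_scores.items():
    print(name, round(z, 2))

# The larger (less negative) result is the better standing relative to school.
best = max(z_scores, key=z_scores.get)
print("Better relative G.P.A.:", best)
```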