This demonstration illustrates the effect of restricting the range of scores on the correlation between variables. When the demonstration begins, the dataset "Strength" is displayed. The X-axis contains scores on a measure of grip strength; the Y-axis contains scores on a measure of arm strength. Notice that there is a strong relationship. Also note that the two scatterplots are identical and both show that, for the entire dataset of 147 subjects, the correlation is 0.63.
The two vertical bars on the upper graph can be used to select a subset of scores to be plotted on the lower graph. If you click on the left-hand bar and drag it to the right, scores to the left of the bar will not be included on the lower graph. The correlation shown for the lower graph is computed only from the data in the subset.
Similarly, if you drag the right-hand bar to the left, scores to the right of the bar will be excluded.
If you click the "Data outside bars" button, the lower graph will, as you might expect, only include scores to the left of the left-hand bar and to the right of the right-hand bar.
First look at the correlation for just those subjects who scored over 100 on Grip Strength. Do this by moving the left-hand bar to the right until the right-most portion of the bar is on 100. This will exclude all scores below 100 from the graph at the bottom of the screen. Check how many subjects are included on the bottom graph. It should be 95. On this graph, the relationship between the two measures of strength appears weaker. The correlation is 0.50. This is lower than the correlation of 0.63 found with the whole sample.
Now look at the correlation if only scores above 140 are included. The correlation drops to 0.39.
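The same effect can be reproduced outside the demonstration. The following is a minimal Python sketch, not the demonstration's own code. It assumes synthetic, normally distributed scores with a population correlation of 0.63 rather than the actual Strength data, and the score scale (mean 100 and standard deviation 30 for grip) is likewise an assumption. The names pearson_r, simulate_scores and r_for are hypothetical helpers introduced for illustration. Because the data are simulated, the printed values will not match the figures reported by the demonstration; only the direction of the change should.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_scores(n=20000, rho=0.63, seed=1):
    """Hypothetical (grip, arm) pairs; NOT the real Strength dataset."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        zx = rng.gauss(0, 1)
        # Mix zx with independent noise so the population correlation is rho.
        zy = rho * zx + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((100 + 30 * zx, 80 + 20 * zy))  # assumed score scales
    return pairs


def r_for(pairs):
    """Correlation between grip (x) and arm (y) for a list of pairs."""
    xs, ys = zip(*pairs)
    return pearson_r(xs, ys)


pairs = simulate_scores()
r_full = r_for(pairs)
# Restrict the range: keep only subjects with grip scores above a cutoff.
r_above_100 = r_for([p for p in pairs if p[0] > 100])
r_above_140 = r_for([p for p in pairs if p[0] > 140])

print(f"all scores:      r = {r_full:.2f}")
print(f"grip above 100:  r = {r_above_100:.2f}")
print(f"grip above 140:  r = {r_above_140:.2f}")
```

A large simulated sample is used here only so that the ordering of the three correlations is stable; with a sample as small as the 147 subjects in the demonstration, sampling error is more noticeable.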
Now consider the correlation when you exclude the middle portion of the distribution. Move the left bar to 80 and the right bar to 140. Then click the button "Data outside bars." This will include only the data outside the bars on the plot below. Observe the correlation when only these values are considered.
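Excluding the middle of the distribution can also be tried in code. The following Python sketch is an illustration under stated assumptions, not the demonstration's implementation: the data are synthetic normal scores with a population correlation of 0.63 (not the real Strength dataset), the score scales are assumed, and pearson_r is a hypothetical helper. The bar positions of 80 and 140 are taken from the step above.

```python
import math
import random


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical synthetic (grip, arm) scores; NOT the real Strength dataset.
rng = random.Random(2)
rho = 0.63  # assumed population correlation
pairs = []
for _ in range(20000):
    zx = rng.gauss(0, 1)
    zy = rho * zx + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    pairs.append((100 + 30 * zx, 80 + 20 * zy))  # assumed score scales

left_bar, right_bar = 80, 140
# "Data inside bars": only the middle of the grip distribution.
inside = [p for p in pairs if left_bar <= p[0] <= right_bar]
# "Data outside bars": only the low and high extremes.
outside = [p for p in pairs if p[0] < left_bar or p[0] > right_bar]

r_full = pearson_r(pairs)
r_inside = pearson_r(inside)
r_outside = pearson_r(outside)

print(f"all scores:        r = {r_full:.2f}")
print(f"inside the bars:   r = {r_inside:.2f}")
print(f"outside the bars:  r = {r_outside:.2f}")
```

Keeping only the extremes spreads the selected grip scores more widely than in the full sample, which is why the correlation for the outside subset should come out higher than the full-sample value.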
Choose another dataset such as SAT and see if the same results occur.
Restricting the range of a variable decreases the correlation of that variable with other variables. If only the highest and lowest values are included, the correlation is higher than if all the values are included.
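One common way to express both results in a single formula is sketched below. It rests on assumptions the demonstration does not state: the relationship between X and Y is linear, and the scatter around the regression line is the same across the range of X. The symbols are introduced here for illustration: r is the correlation in the full sample, u is the ratio of the standard deviation of X in the selected subset to its standard deviation in the full sample, and r_sel is the correlation expected in the subset.

```latex
r_{\text{sel}} = \frac{r\,u}{\sqrt{1 - r^{2} + r^{2}u^{2}}},
\qquad
u = \frac{s_{X,\text{sel}}}{s_{X,\text{full}}}
```

When the range is restricted, u is less than 1 and r_sel is smaller in magnitude than r. When only the extremes are kept, u is greater than 1 and r_sel is larger in magnitude than r. Real data will follow this only approximately.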