This content is affiliated with:

  • Siyavula: Mathematics

    This collection is included in Lens: Siyavula Textbooks: Maths
    By: Free High School Science Texts Project

  • Bookshare

    This collection is included in Lens: Bookshare's Lens
    By: Bookshare - A Benetech Initiative

    Comments:

    "Accessible versions of this collection are available at Bookshare. DAISY and BRF provided."

  • FETMaths

    This module and collection are included in Lens: Siyavula: Mathematics (Gr. 10-12)
    By: Siyavula

    Module Review Status: In Review
    Collection Review Status: In Review

Summarising Data

If the data set is very large, it is useful to be able to summarise the data set by calculating a few quantities that give information about how the data values are spread and about the central values in the data set.

Measures of Central Tendency

Mean or Average

The mean (also known as the arithmetic mean) is simply the arithmetic average of a group of numbers (or data set) and is shown using the bar symbol. So the mean of the variable $x$ is $\bar{x}$, pronounced "x-bar". The mean of a set of values is calculated by adding up all the values in the set and dividing by the number of items in that set. The mean is calculated from the raw, ungrouped data.

Definition 1: Mean

The mean of a data set, $x$, denoted by $\bar{x}$, is the average of the data values, and is calculated as:

$$\bar{x} = \frac{\text{sum of all values}}{\text{number of values}} = \frac{x_1 + x_2 + x_3 + \ldots + x_n}{n}$$
(1)

Method: Calculating the mean

  1. Find the total of the data values in the data set.
  2. Count how many data values there are in the data set.
  3. Divide the total by the number of data values.
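
The three steps above can be sketched in Python as follows. This is only an illustrative sketch; the function name `mean` is our own choice and is not part of the original text.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("the mean needs at least one data value")
    total = sum(values)   # Step 1: find the total of the data values
    count = len(values)   # Step 2: count the data values
    return total / count  # Step 3: divide the total by the count


print(mean([10, 20, 30, 40, 50]))  # 30.0
```
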
Exercise 1: Mean

What is the mean of $x = \{10, 20, 30, 40, 50\}$?

Solution
  1. Step 1. Find the total of the data values :
    $10 + 20 + 30 + 40 + 50 = 150$
    (2)
  2. Step 2. Count the number of data values in the data set :

    There are 5 values in the data set.

  3. Step 3. Divide the total by the number of data values. :
    $150 \div 5 = 30$
    (3)
  4. Step 4. Answer :

    The mean of the data set $x = \{10, 20, 30, 40, 50\}$ is 30.

Median

Definition 2: Median

The median of a set of data is the data value in the central position, when the data set has been arranged from highest to lowest or from lowest to highest. There are an equal number of data values on either side of the median value.

The median is calculated from the raw, ungrouped data, as follows.

Method: Calculating the median

  1. Order the data from smallest to largest or from largest to smallest.
  2. Count how many data values there are in the data set.
  3. Find the data value in the central position of the set.
Exercise 2: Median

What is the median of $\{10, 14, 86, 2, 68, 99, 1\}$?

Solution
  1. Step 1. Order the data set from lowest to highest :

    1,2,10,14,68,86,99

  2. Step 2. Count the number of data values in the data set :

    There are 7 points in the data set.

  3. Step 3. Find the central position of the data set :

    The central position of the data set is position 4, which leaves 3 data values on either side.

  4. Step 4. Find the data value in the central position of the ordered data set. :

    14 is in the central position of the data set.

  5. Step 5. Answer :

    14 is the median of the data set $\{1, 2, 10, 14, 68, 86, 99\}$.

This example has highlighted a potential problem with determining the median. It is very easy to determine the median of a data set with an odd number of data values, but what happens when there is an even number of data values in the data set?

When there is an even number of data values, the median is the mean of the two middle points.

Tip:
Finding the Central Position of a Data Set

An easy way to determine the central position or positions for any ordered data set is to take the total number of data values, add 1, and then divide by 2. If the number you get is a whole number, then that is the central position. If the number you get is a fraction, take the two whole numbers on either side of the fraction, as the positions of the data values that must be averaged to obtain the median.
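
The central-position rule in the tip can be sketched in Python as follows. This is a minimal illustration under our own naming (`median` and `position` are not from the original text): it adds 1 to the number of data values, divides by 2, and averages the two neighbouring values when the result is a fraction.

```python
def median(values):
    """Return the median using the (n + 1) / 2 central-position rule."""
    if not values:
        raise ValueError("the median needs at least one data value")
    ordered = sorted(values)      # order the data from lowest to highest
    n = len(ordered)              # count the data values
    position = (n + 1) / 2        # 1-based central position
    if position.is_integer():
        # Odd number of values: a single central position exists.
        return ordered[int(position) - 1]
    # Even number of values: average the values on either side.
    lower = int(position)         # whole number just below the fraction
    return (ordered[lower - 1] + ordered[lower]) / 2


print(median([10, 14, 86, 2, 68, 99, 1]))      # 14
print(median([11, 10, 14, 86, 2, 68, 99, 1]))  # 12.5
```
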

Exercise 3: Median

What is the median of $\{11, 10, 14, 86, 2, 68, 99, 1\}$?

Solution
  1. Step 1. Order the data set from lowest to highest :

    1,2,10,11,14,68,86,99

  2. Step 2. Count the number of data values in the data set :

    There are 8 points in the data set.

  3. Step 3. Find the central position of the data set :

    The central position of the data set is between positions 4 and 5.

  4. Step 4. Find the data values around the central position of the ordered data set. :

    11 is in position 4 and 14 is in position 5.

  5. Step 5. Answer :

    The median of the data set $\{1, 2, 10, 11, 14, 68, 86, 99\}$ is

    $(11 + 14) \div 2 = 12{,}5$
    (4)

Mode

Definition 3: Mode

The mode is the data value that occurs most often, i.e. it is the most frequent value or most common value in a set.

Method: Calculating the mode

  1. Count how many times each data value occurs.
  2. The mode is the data value that occurs the most.

The mode is calculated from grouped data, or single data items.

Exercise 4: Mode

Find the mode of the data set $x = \{1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10\}$.

Solution
  1. Step 1. Count how many times each data value occurs. :
    Table 1
    data value   frequency   data value   frequency
    1            1           6            1
    2            1           7            1
    3            1           8            2
    4            3           9            1
    5            1           10           2
  2. Step 2. Find the data value that occurs most often. :

    4 occurs most often.

  3. Step 3. Answer :

    The mode of the data set $x = \{1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10\}$ is 4, since the number 4 appears most frequently.

A data set can have more than one mode. For example, both 2 and 3 are modes in the set 1, 2, 2, 3, 3. If all points in a data set occur with equal frequency, it is equally accurate to describe the data set as having many modes or no mode.
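
Counting frequencies and picking the most frequent value or values can be sketched in Python as follows. The function name `modes` is our own. Returning a list is a design choice made here so that data sets with more than one mode are handled too.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent value(s) in the data."""
    if not values:
        raise ValueError("the mode needs at least one data value")
    counts = Counter(values)        # how many times each value occurs
    highest = max(counts.values())  # the largest frequency
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10]))  # [4]
print(modes([1, 2, 2, 3, 3]))                               # [2, 3]
```

If every value occurs equally often, this sketch returns all of them, which corresponds to the "many modes" reading described above.
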

Figure 1
Khan academy video on statistics

Measures of Dispersion

The mean, median and mode are measures of central tendency, i.e. they provide information on the central data values in a set. When describing data it is sometimes useful (and in some cases necessary) to determine the spread of a distribution. Measures of dispersion provide information on how the data values in a set are distributed around the mean value. Some measures of dispersion are range, percentiles and quartiles.

Range

Definition 4: Range

The range of a data set is the difference between the lowest value and the highest value in the set.

Method: Calculating the range

  1. Find the highest value in the data set.
  2. Find the lowest value in the data set.
  3. Subtract the lowest value from the highest value. The difference is the range.
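
A short Python sketch of these steps follows. The name `data_range` is our own, chosen to avoid clashing with Python's built-in `range`.

```python
def data_range(values):
    """Return the difference between the highest and lowest data values."""
    if not values:
        raise ValueError("the range needs at least one data value")
    highest = max(values)    # Step 1: find the highest value
    lowest = min(values)     # Step 2: find the lowest value
    return highest - lowest  # Step 3: subtract lowest from highest


print(data_range([1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10]))  # 9
```
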
Exercise 5: Range

Find the range of the data set $x = \{1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10\}$.

Solution
  1. Step 1. Find the highest and lowest values. :

    10 is the highest value and 1 is the lowest value.

  2. Step 2. Subtract the lowest value from the highest value to calculate the range. :
    $10 - 1 = 9$
    (5)
  3. Step 3. Answer :

    For the data set $x = \{1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10\}$, the range is 9.

Quartiles

Definition 5: Quartiles

Quartiles are the three data values that divide an ordered data set into four groups containing equal numbers of data values. The median is the second quartile.

The quartiles of a data set are the median together with the two boundaries on either side of it, which divide the set into four equal sections. The lowest 25% of the data is found below the first quartile value, also called the lower quartile. The median, or second quartile, divides the set into two equal sections. The lowest 75% of the data set is found below the third quartile, also called the upper quartile. For example:

Table 2
22    24    48    51    60    72    73    75    80    88    90
            Lower quartile    Median            Upper quartile
            ($Q_1$)           ($Q_2$)           ($Q_3$)

Method: Calculating the quartiles

  1. Order the data from smallest to largest or from largest to smallest.
  2. Count how many data values there are in the data set.
  3. Divide the number of data values by 4. The result is the number of data values per group.
  4. Determine the data values corresponding to the first, second and third quartiles using the number of data values per quartile.
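
One way to sketch the quartile calculation in Python is shown below. Note the assumption: this sketch takes each quartile as the median of the lower half, the whole set, and the upper half of the ordered data (leaving the middle value out of both halves when the count is odd). That convention agrees with the group-counting method above when the number of data values is a multiple of 4, but other conventions exist and can give slightly different answers. The function names are our own.

```python
def _median_of_sorted(ordered):
    """Median of a list that is already sorted and non-empty."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    if len(values) < 2:
        raise ValueError("quartiles need at least two data values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value when forming the upper half.
    upper = ordered[half:] if len(ordered) % 2 == 0 else ordered[half + 1:]
    return (_median_of_sorted(lower),
            _median_of_sorted(ordered),
            _median_of_sorted(upper))


print(quartiles([3, 5, 1, 8, 9, 12, 25, 28, 24, 30, 41, 50]))
# (6.5, 18.0, 29.0)
```
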
Exercise 6: Quartiles

What are the quartiles of $\{3, 5, 1, 8, 9, 12, 25, 28, 24, 30, 41, 50\}$?

Solution
  1. Step 1. Order the data set from lowest to highest :

    $\{1, 3, 5, 8, 9, 12, 24, 25, 28, 30, 41, 50\}$

  2. Step 2. Count the number of data values in the data set :

    There are 12 values in the data set.

  3. Step 3. Divide the number of data values by 4 to find the number of data values per quartile. :
    $12 \div 4 = 3$
    (6)
  4. Step 4. Find the data values corresponding to the quartiles. :
    Table 3
    1   3   5   |   8   9   12   |   24   25   28   |   30   41   50
                Q1               Q2                 Q3

    The first quartile occurs between data position 3 and 4 and is the average of data values 5 and 8. The second quartile occurs between positions 6 and 7 and is the average of data values 12 and 24. The third quartile occurs between positions 9 and 10 and is the average of data values 28 and 30.

  5. Step 5. Answer :

    The first quartile ($Q_1$) = 6,5.

    The second quartile ($Q_2$) = 18.

    The third quartile ($Q_3$) = 29.

Inter-quartile Range

Definition 6: Inter-quartile Range

The interquartile range is a measure which provides information about the spread of a data set. It is calculated by subtracting the first quartile from the third quartile, i.e. $Q_3 - Q_1$. This gives the range of the middle half of the data set, trimming off the lowest and highest quarters.

The semi-interquartile range is half the interquartile range, i.e. $\frac{Q_3 - Q_1}{2}$.
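
As a small illustration, the two formulas can be written in Python as follows. The function names are our own, and the quartiles are assumed to have been calculated already. The sample values $Q_1 = 6{,}5$ and $Q_3 = 29$ are the ones found in Exercise 6.

```python
def interquartile_range(q1, q3):
    """Return Q3 - Q1, the width of the middle half of the data."""
    return q3 - q1


def semi_interquartile_range(q1, q3):
    """Return half of the interquartile range."""
    return (q3 - q1) / 2


print(interquartile_range(6.5, 29))       # 22.5
print(semi_interquartile_range(6.5, 29))  # 11.25
```
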

Exercise 7: Medians, Quartiles and the Interquartile Range

A class of 12 students writes a test and the results are as follows: 20, 39, 40, 43, 43, 46, 53, 58, 63, 70, 75, 91. Find the range, quartiles and the Interquartile Range.

Solution
  1. Step 1. Mark the quartile positions in the ordered data set :
    Table 4
    20   39   40   |   43   43   46   |   53   58   63   |   70   75   91
                   Q1                 M                  Q3
  2. Step 2. The Range :

    The range = 91 - 20 = 71. This tells us that the marks are quite widely spread.

  3. Step 3. The median lies between the 6th and 7th mark :

    i.e. $M = \frac{46 + 53}{2} = \frac{99}{2} = 49{,}5$

  4. Step 4. The lower quartile lies between the 3rd and 4th mark :

    i.e. $Q_1 = \frac{40 + 43}{2} = \frac{83}{2} = 41{,}5$

  5. Step 5. The upper quartile lies between the 9th and 10th mark :

    i.e. $Q_3 = \frac{63 + 70}{2} = \frac{133}{2} = 66{,}5$

  6. Step 6. Analysing the quartiles :

    The quartiles are 41,5, 49,5 and 66,5. These quartiles tell us that 25% of the marks are less than 41,5; 50% of the marks are less than 49,5 and 75% of the marks are less than 66,5. They also tell us that 50% of the marks lie between 41,5 and 66,5.

  7. Step 7. The Interquartile Range :

    The Interquartile Range = 66,5 - 41,5 = 25. This tells us that the width of the middle 50% of the data values is 25.

  8. Step 8. The Semi-interquartile Range :

    The Semi-interquartile Range = $\frac{25}{2}$ = 12,5

Percentiles

Definition 7: Percentiles

Percentiles are the 99 data values that divide a data set into 100 groups.

The calculation of percentiles is identical to the calculation of quartiles, except the aim is to divide the data values into 100 groups instead of the 4 groups required by quartiles.

Method: Calculating the percentiles

  1. Order the data from smallest to largest or from largest to smallest.
  2. Count how many data values there are in the data set.
  3. Divide the number of data values by 100. The result is the number of data values per group.
  4. Determine the data values corresponding to each percentile using the number of data values per group.
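
For larger data sets, percentiles are usually calculated by software. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later) with `n=100` to obtain the 99 cut points. Note the assumption: that function interpolates between neighbouring data values, so its results may differ slightly from a hand calculation based on counting values per group. The wrapper name `percentiles` is our own.

```python
import statistics


def percentiles(values):
    """Return the 99 cut points that split the data into 100 groups."""
    # statistics.quantiles sorts the data itself and needs at least
    # two data values; it interpolates between neighbouring values.
    return statistics.quantiles(values, n=100)


marks = list(range(1, 101))  # hypothetical data: the numbers 1 to 100
cuts = percentiles(marks)
print(len(cuts))   # 99
print(cuts[49])    # the 50th percentile, which is the median: 50.5
```
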

Exercises - Summarising Data

  1. Three sets of data are given:
    1. Data set 1: 9 12 12 14 16 22 24
    2. Data set 2: 7 7 8 11 13 15 16 16
    3. Data set 3: 11 15 16 17 19 19 22 24 27
    For each one find:
      1. the range
      2. the lower quartile
      3. the interquartile range
      4. the semi-interquartile range
      5. the median
      6. the upper quartile
  2. There is 1 sweet in one jar, and 3 in the second jar. The mean number of sweets in the first two jars is 2.
    1. If the mean number in the first three jars is 3, how many are there in the third jar?
    2. If the mean number in the first four jars is 4, how many are there in the fourth jar?
  3. Find a set of five ages for which the mean age is 5, the modal age is 2 and the median age is 3 years.
  4. Four friends each have some marbles. They work out that the mean number of marbles they have is 10. One of them leaves. She has 4 marbles. How many marbles do the remaining friends have together?

Exercise 8: Mean, Median and Mode for Grouped Data

Consider the following grouped data and calculate the mean, the modal group and the median group.

Table 5
Mass (kg) Frequency
41 - 45 7
46 - 50 10
51 - 55 15
56 - 60 12
61 - 65 6
  Total = 50
Solution
  1. Step 1. Calculating the mean :

    To calculate the mean we need to add up all the masses and divide by 50. We do not know the actual masses, so we approximate by choosing the midpoint of each group. We then multiply each midpoint by the frequency of its group, and add these products together to find the approximate total of the masses. This is shown in the table below.

    Table 6
    Mass (kg)   Midpoint           Frequency    Midpt × Freq
    41 - 45     (41+45)/2 = 43     7            43 × 7 = 301
    46 - 50     48                 10           480
    51 - 55     53                 15           795
    56 - 60     58                 12           696
    61 - 65     63                 6            378
                                   Total = 50   Total = 2650
  2. Step 2. Answer :

    The mean = $\frac{2650}{50} = 53$.

    The modal group is the group 51 - 55 because it has the highest frequency.

    The median group is the group 51 - 55, since the 25th and 26th terms are contained within this group.
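
The grouped-data calculation can be sketched in Python as follows. The function name `grouped_summary` and the `(lower, upper, frequency)` tuple layout are our own assumptions for illustration. As in the worked solution, the mean is only an approximation because each group is represented by its midpoint.

```python
def grouped_summary(groups):
    """Return (approximate mean, modal group, median group).

    groups is a list of (lower, upper, frequency) tuples in ascending order.
    """
    total_freq = sum(freq for _, _, freq in groups)
    if total_freq == 0:
        raise ValueError("at least one data value is needed")
    # Approximate total: midpoint of each group times its frequency.
    approx_total = sum((lo + hi) / 2 * freq for lo, hi, freq in groups)
    approx_mean = approx_total / total_freq
    # Modal group: the group with the highest frequency.
    lo, hi, _ = max(groups, key=lambda g: g[2])
    modal_group = (lo, hi)
    # Median group: first group whose running total reaches the middle.
    # If the two middle terms fall in different groups, this sketch
    # reports the group that holds the later of the two.
    middle = (total_freq + 1) / 2
    running = 0
    for lo, hi, freq in groups:
        running += freq
        if running >= middle:
            median_group = (lo, hi)
            break
    return approx_mean, modal_group, median_group


masses = [(41, 45, 7), (46, 50, 10), (51, 55, 15), (56, 60, 12), (61, 65, 6)]
print(grouped_summary(masses))  # (53.0, (51, 55), (51, 55))
```
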

More mean, modal and median group exercises.

In each data set given, find the mean, the modal group and the median group.

  1. Times recorded when learners played a game.
    Table 7
    Time in seconds   Frequency
    36 - 45           5
    46 - 55           11
    56 - 65           15
    66 - 75           26
    76 - 85           19
    86 - 95           13
    96 - 105          6
  2. The following data were collected from a group of learners.
    Table 8
    Mass in kilograms   Frequency
    41 - 45             3
    46 - 50             5
    51 - 55             8
    56 - 60             12
    61 - 65             14
    66 - 70             9
    71 - 75             7
    76 - 80             2
