Statistics: distribution of data

Module by: Free High School Science Texts Project. Part of the collection "FHSST: Grade 11 Maths".

Distribution of Data

Symmetric and Skewed Data

The shape of a data set is important to know.

Definition 1: Shape of a data set

This describes how the data is distributed relative to the mean and median.

  • Symmetrical data sets are balanced on either side of the median.
    Figure 1 (MG11C18_005.png)
  • Skewed data is spread out on one side more than on the other. It can be skewed right or skewed left.
    Figure 2 (MG11C18_006.png)

Relationship of the Mean, Median, and Mode

The relationship of the mean, median, and mode to each other can provide some information about the relative shape of the data distribution. If the mean, median, and mode are approximately equal to each other, the distribution can be assumed to be approximately symmetrical. With both the mean and median known, the following can be concluded:

  • If (mean - median) ≈ 0, then the data is symmetrical.
  • If (mean - median) > 0, then the data is positively skewed (skewed to the right). This means that the median is close to the start of the data set.
  • If (mean - median) < 0, then the data is negatively skewed (skewed to the left). This means that the median is close to the end of the data set.
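The three rules above can be sketched in a few lines of Python. This is only an illustration: the function name `describe_shape`, the `tolerance` used to decide when the difference counts as "approximately zero", and the small data sets are all assumptions made for this example and do not come from the text.

```python
from statistics import mean, median

def describe_shape(data, tolerance=0.5):
    """Classify a data set by the sign of (mean - median).

    `tolerance` is a hypothetical cut-off for "approximately zero";
    a sensible value depends on the scale of the data.
    """
    difference = mean(data) - median(data)
    if abs(difference) <= tolerance:
        return "symmetrical"
    if difference > 0:
        return "skewed right"
    return "skewed left"

# Made-up data sets, one for each case.
print(describe_shape([1, 2, 3, 4, 5]))       # mean 3, median 3
print(describe_shape([1, 2, 3, 4, 20]))      # mean 6, median 3
print(describe_shape([1, 17, 18, 19, 20]))   # mean 15, median 18
```

The sign of the difference only suggests a shape; a box and whisker plot or histogram of the data remains the better check.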

Distribution of Data

  1. Three sets of 12 pupils each had their test scores recorded. The test was out of 50. Use the given data to answer the following questions.
    Table 1: Test scores (out of 50) for the three sets.
    Set 1   Set 2   Set 3
    25      32      43
    47      34      47
    15      35      16
    17      32      43
    16      25      38
    26      16      44
    24      38      42
    27      47      50
    22      43      50
    24      29      44
    12      18      43
    31      25      42
    1. For each of the sets calculate the mean and the five number summary.
    2. For each of the sets find the difference between the mean and the median. Make box and whisker plots on the same set of axes.
    3. State which of the three are skewed (either right or left).
    4. Is Set 1 skewed or symmetrical?
    5. Is Set 3 symmetrical? Why or why not?
  2. Two data sets have the same range and interquartile range, but one is skewed right and the other is skewed left. Sketch the box and whisker plots and then invent data (6 points in each set) that meets the requirements.
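For checking work of this kind, a five number summary can be computed along the following lines. This is a minimal sketch that assumes the "median of each half" convention for the quartiles (other conventions exist and can give slightly different quartiles). The function name `five_number_summary` and the sample data are made up for illustration and are not the exercise data.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumes the quartiles are the medians of the lower and upper
    halves of the sorted data; for an odd number of values the
    middle value is left out of both halves. Needs at least 2 values.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return (values[0], median(lower), median(values), median(upper), values[-1])

# Hypothetical data, not taken from the exercise above.
print(five_number_summary([8, 1, 6, 3, 5, 2, 7, 4]))
```

The five values are exactly what is needed to draw a box and whisker plot: the box runs from Q1 to Q3, with a line at the median and whiskers out to the minimum and maximum.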

Scatter Plots

A scatter plot is a graph that shows the relationship between two variables. Data of this kind is called bivariate data, and we plot it as ordered pairs, taking one value from each variable. For example, we could have mass on the horizontal axis (first variable) and height on the vertical axis (second variable), or we could have current on the horizontal axis and voltage on the vertical axis.

Ohm's Law is an important relationship in physics. Ohm's law describes the relationship between current and voltage in a conductor, like a piece of wire. When we measure the voltage (dependent variable) that results from a certain current (independent variable) in a wire, we get the data points as shown in Table 2.

Table 2: Values of current and voltage measured in a wire.
Current Voltage Current Voltage
0 0,4 2,4 1,4
0,2 0,3 2,6 1,6
0,4 0,6 2,8 1,9
0,6 0,6 3 1,9
0,8 0,4 3,2 2
1 1 3,4 1,9
1,2 0,9 3,6 2,1
1,4 0,7 3,8 2,1
1,6 1 4 2,4
1,8 1,1 4,2 2,4
2 1,3 4,4 2,5
2,2 1,1 4,6 2,5

When we plot this data as points, we get the scatter plot shown in Figure 3.

Figure 3 (MG11C18_007.png)

If we look for a function that best describes this data, a straight line is the most suitable choice: the points rise steadily and scatter closely around a line.

Ohm's Law

Ohm's Law describes the relationship between current and voltage in a conductor. The gradient of the graph of voltage vs. current is known as the resistance of the conductor.
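As an illustration, a straight line can be fitted to the values in Table 2 and its gradient read off as an estimate of the resistance. The sketch below assumes the least-squares method for choosing the line; the text itself does not name a fitting method, and the helper name `fit_line` is made up for this example. Decimal commas from the table are written as decimal points.

```python
# Values copied from Table 2 (decimal commas written as points).
current = [0, 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2,
           2.4, 2.6, 2.8, 3, 3.2, 3.4, 3.6, 3.8, 4, 4.2, 4.4, 4.6]
voltage = [0.4, 0.3, 0.6, 0.6, 0.4, 1, 0.9, 0.7, 1, 1.1, 1.3, 1.1,
           1.4, 1.6, 1.9, 1.9, 2, 1.9, 2.1, 2.1, 2.4, 2.4, 2.5, 2.5]

def fit_line(xs, ys):
    """Least-squares straight line y = gradient * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

gradient, intercept = fit_line(current, voltage)
print(f"voltage = {gradient:.2f} * current + {intercept:.2f}")
```

For this data the gradient comes out at roughly 0.5, in whatever units the table uses, which would be the estimated resistance of the wire.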

Research Project: Scatter Plot

The function that best describes a set of data can take any form. We will restrict ourselves to the forms already studied, that is, linear, quadratic or exponential. Plot the following sets of data as scatter plots and deduce the type of function (linear, quadratic or exponential) that best describes each set.

  1. Table 3
    x y x y x y x y
    -5 9,8 0 14,2 -2,5 11,9 2,5 49,3
    -4,5 4,4 0,5 22,5 -2 6,9 3 68,9
    -4 7,6 1 21,5 -1,5 8,2 3,5 88,4
    -3,5 7,9 1,5 27,5 -1 7,8 4 117,2
    -3 7,5 2 41,9 -0,5 14,4 4,5 151,4
  2. Table 4
    x y x y x y x y
    -5 75 0 5 -2,5 27,5 2,5 7,5
    -4,5 63,5 0,5 3,5 -2 21 3 11
    -4 53 1 3 -1,5 15,5 3,5 15,5
    -3,5 43,5 1,5 3,5 -1 11 4 21
    -3 35 2 5 -0,5 7,5 4,5 27,5
  3. Table 5
    Height (cm) 147 150 152 155 157 160 163 165 168 170 173 175 178 180 183
    Weight (kg) 52 53 54 56 57 59 60 61 63 64 66 68 70 72 74
Definition 2: Outlier

An outlier is a point on a scatter plot that is widely separated from the other points, or a result that differs greatly from the others in the same sample.

The following simulation allows you to plot scatter plots and fit a curve to the plot. Ignore the error bars (blue lines) on the points.

Figure 4: PhET simulation for scatter plots

Scatter Plots

  1. A class's results for a test were recorded along with the amount of time spent studying for it. The results are given below.
    Table 6
    Score (percent)   Time spent studying (minutes)
    67                100
    55                85
    70                150
    90                180
    45                70
    75                160
    50                80
    60                90
    84                110
    30                60
    66                96
    96                200
    1. Draw a diagram labelling horizontal and vertical axes.
    2. State, with reasons, the cause or independent variable and the effect or dependent variable.
    3. Plot the data pairs.
    4. What do you observe about the plot?
    5. Is there any pattern emerging?
  2. The rankings of eight tennis players are given along with the time they spend practising.
    Table 7
    Practice time (min)   Ranking
    154                   5
    390                   1
    130                   6
    70                    8
    240                   3
    280                   2
    175                   4
    103                   7
    1. Construct a scatter plot and explain how you chose the independent (cause) and dependent (effect) variables.
    2. What pattern or trend do you observe?
  3. Eight children's sweet consumption and sleep habits were recorded. The data is given in the following table.
    Table 8
    Number of sweets (per week)   Average sleeping time (per day)
    15                            4
    12                            4,5
    5                             8
    3                             8,5
    18                            3
    23                            2
    11                            5
    4                             8
    1. What is the independent (cause) variable?
    2. What is the dependent (effect) variable?
    3. Construct a scatter plot of the data.
    4. What trend do you observe?
