OpenStax-CNX

Bias, error and misuse

Module by: Free High School Science Texts Project.

Bias and error in measurements

All measurements have some error associated with them. Random errors, sometimes known as non-systematic errors, occur in all data sets. They can arise from the estimation of data values, the imprecision of instruments, and so on. For example, if you are reading lengths off a ruler, a random error will arise in each measurement because you have to estimate where the length lies between two adjacent marks.

Bias is sometimes known as systematic error. A data set is biased when its values are consistently under- or overestimated. Bias can arise from forgetting to take a correction factor into account or from instruments that are not properly calibrated (calibration is the process of marking an instrument's scale against known, predefined measurements). Bias leads to a sample mean that is consistently either lower or higher than the true mean.
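The difference between random error and bias can be illustrated with a small numerical sketch. All numbers here are hypothetical and chosen only for illustration: a true length of 12.0 cm, six ruler readings scattered around it, and a made-up calibration fault that adds 0.5 cm to every reading.

```python
from statistics import mean

TRUE_LENGTH_CM = 12.0  # hypothetical true length of the object

# Hypothetical readings with random error only:
# they are scattered on both sides of the true value.
random_only = [12.1, 11.9, 12.0, 12.2, 11.8, 12.0]

# The same readings taken with a badly calibrated ruler (assumed fault):
# every reading comes out 0.5 cm too high.
OFFSET_CM = 0.5
biased = [reading + OFFSET_CM for reading in random_only]

mean_random = mean(random_only)
mean_biased = mean(biased)

print(f"true length:                 {TRUE_LENGTH_CM:.2f} cm")
print(f"mean with random error only: {mean_random:.2f} cm")
print(f"mean with bias added:        {mean_biased:.2f} cm")
```

In this sketch the random errors largely cancel when the readings are averaged, while the bias shifts the mean away from the true value by the full offset.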

Data interpretation

Many people take statistics and blindly apply them to everyday life or quote them without question. This is not wise, because the data that led to the statistics also need to be considered. A well-known example of several data sets that lead to the same statistical analysis (the process of examining data and determining values such as central tendency, etc.) but are in fact very different is Anscombe's quartet. This is shown in Figure 1. In Grade 11 you will learn about the methods used to represent data graphically. For now, however, you should simply appreciate the fact that we can plot data values on the Cartesian plane in a similar way to plotting graphs.

If each of the data sets in Anscombe's quartet is analysed statistically, one finds that the mean, variance, correlation and linear regression line (these terms will be explained in later grades) are practically identical. If, instead of analysing the data statistically, we simply plot the data points, we can see that the data sets are very different. This example shows us that it is very important to consider the underlying data set as well as the statistics that we obtain from the data. We cannot assume that just because we know the statistics of a data set, we know what the data set is telling us. For general interest, some of the ways that statistics and data can be misinterpreted are given in the following extension section.

Figure 1: Anscombe's quartet (anscombe.png)
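The claim about Anscombe's quartet can be checked with a short calculation. The sketch below uses the commonly published values of the four data sets (they are not listed in this module, so treat them as an outside assumption) and computes only the means and the correlation coefficient. The helper functions `mean` and `correlation` are written here for illustration.

```python
from math import sqrt

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Commonly published Anscombe data (assumed, not taken from this module).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Summarise each data set as (mean of x, mean of y, correlation), rounded to 2 places.
summary = {}
for name, (xs, ys) in quartet.items():
    summary[name] = (round(mean(xs), 2), round(mean(ys), 2), round(correlation(xs, ys), 2))
    print(name, summary[name])
```

Rounded to two decimal places, the four summaries agree, even though the raw lists of values are clearly different. This is why the data themselves must be examined as well as the statistics.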

Misuse of Statistics - For enrichment, not in CAPS

In many cases groups can gain an advantage by misleading people with the misuse of statistics. Companies misuse statistics to attempt to show that they are performing better than a competitor, advertisers abuse statistics to try to convince you to buy their product, researchers misuse statistics to attempt to show that their data is of better quality than it really is, etc.

Common techniques used include:

  • Three dimensional graphs.
  • Axes that do not start at zero.
  • Axes without scales.
  • Graphic images that convey a negative or positive mood.
  • Assumption that a correlation shows a necessary causality.
  • Using statistics that are not truly representative of the entire population.
  • Using misconceptions about mathematical concepts.

For example, the following pairs of graphs show identical information but look very different. Explain why.

Figure 2
Figure 2 (MG10C16_010.png)
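One of the techniques listed above, an axis that does not start at zero, can be checked with simple arithmetic. The sketch below uses hypothetical sales figures (not read from the figures in this module) and a hypothetical helper, `bar_height_ratio`, to compare how tall two bars are drawn relative to each other for different axis starting values.

```python
def bar_height_ratio(value_a, value_b, axis_start):
    """Ratio of the drawn bar heights (b to a) when the vertical axis begins at axis_start."""
    return (value_b - axis_start) / (value_a - axis_start)

# Hypothetical sales figures for two consecutive years.
sales_year_1 = 50
sales_year_2 = 55

# Axis starting at zero: the bars are drawn in their true proportion.
honest_ratio = bar_height_ratio(sales_year_1, sales_year_2, axis_start=0)

# Axis starting at 48: the same data, but the second bar looks far taller.
truncated_ratio = bar_height_ratio(sales_year_1, sales_year_2, axis_start=48)

print(f"bar height ratio, axis from 0:  {honest_ratio:.2f}")
print(f"bar height ratio, axis from 48: {truncated_ratio:.2f}")
```

With these numbers a 10% increase is drawn as a bar three and a half times as tall once the axis starts at 48, although the underlying data have not changed.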

Exercises - Misuse of Statistics

  1. A company has tried to give a visual representation of the increase in their earnings from one year to the next. Does the graph below convince you? Critically analyse the graph.
    Figure 3
    Figure 3 (MG10C16_011.png)
  2. In a study conducted on a busy highway, data were collected about drivers breaking the speed limit and the colour of the car they were driving. The data were collected during a 20 minute interval in the middle of the day, and are presented in a table and pie chart below.
    Conclusions made by a novice based on the data are summarised as follows:
    • “People driving white cars are more likely to break the speed limit.”
    • “Drivers in blue and red cars are more likely to stick to the speed limit.”
    Do you agree with these conclusions? Explain.
  3. A record label produces a graphic showing their advantage in sales over their competitors. Identify at least three devices they have used to influence and mislead the reader's impression.
    Figure 4
    Figure 4 (MG10C16_013.png)
  4. In an effort to discredit their competition, a tour bus company prints the graph shown below. Their claim is that the competitor is losing business. Can you think of a better explanation?
    Figure 5
    Figure 5 (MG10C16_014.png)
  5. To test a theory, 8 different offices were monitored for noise levels and productivity of the employees in the office. The results are graphed below.
    Figure 6
    Figure 6 (MG10C16_015.png)
    The following statement was then made: “If an office environment is noisy, this leads to poor productivity.” Explain the flaws in this thinking.

End of chapter summary

  • Data types can be divided into primary and secondary data. Primary data may be further divided into qualitative and quantitative data.
  • We use the following as measures of central tendency:
    • Mean: The mean of a data set, $x$, denoted by $\bar{x}$, is the average of the data values, and is calculated as:
      $\bar{x} = \frac{\text{sum of values}}{\text{number of values}}$    (1)
    • Median: The median is the centre data value in a data set that has been ordered from lowest to highest.
    • Mode: The mode is the data value that occurs most often in a data set.
  • The following are measures of dispersion:
    • Range: The range of a data set is the difference between the lowest value and the highest value in the set.
    • Quartiles: Quartiles are the three data values that divide an ordered data set into four groups containing equal numbers of data values. The median is the second quartile.
    • Percentiles: Percentiles are the 99 data values that divide an ordered data set into 100 groups containing equal numbers of data values.
    • Interquartile range: The interquartile range is a measure of the spread of a data set. It is calculated by subtracting the first quartile from the third quartile, i.e. $Q_3 - Q_1$, which gives the range of the middle half of the data set after trimming off the lowest and highest quarters. Half of this value is the semi-interquartile range.
  • The five number summary is a way to summarise data. A box and whisker plot is a graphical representation of the five number summary.
  • Random errors are found in all sets of data and arise from, for example, estimating data values or using imprecise instruments. Bias, or systematic error, occurs when data values are consistently under- or overestimated.
  • You must always consider both the underlying data and the statistics that summarise the data.
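The measures of central tendency and dispersion summarised above can be computed as in the following sketch. The data set is hypothetical. The `quartiles` helper is an illustrative function that assumes the "median of each half" convention (for an odd number of values, the middle value is left out of both halves). Other conventions exist and can give slightly different quartiles.

```python
from statistics import mean, median, mode

# Hypothetical data set, ordered from lowest to highest.
data = sorted([7, 4, 11, 2, 9, 4, 5])

def quartiles(ordered):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    For an odd number of values the middle value is excluded from both halves.
    """
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[(n + 1) // 2 :]
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles(data)
data_range = max(data) - min(data)   # highest value minus lowest value
iqr = q3 - q1                        # interquartile range
semi_iqr = iqr / 2                   # semi-interquartile range

print("mean:", mean(data), "median:", median(data), "mode:", mode(data))
print("range:", data_range)
print("quartiles:", (q1, q2, q3), "IQR:", iqr, "semi-IQR:", semi_iqr)
```

The second quartile returned by `quartiles` is the median, in agreement with the summary above.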

Exercises

  1. Calculate the mean, median, and mode of Data Set 3.
  2. The tallest 7 trees in a park have heights in metres of 41, 60, 47, 42, 44, 42, and 47. Find the median of their heights.
  3. The students in Bjorn's class have the following ages: 5, 6, 7, 5, 4, 6, 6, 6, 7, 4. Find the mode of their ages.
  4. An engineering company has designed two different types of engines for motorbikes. The two different motorbikes are tested for the time it takes (in seconds) for them to accelerate from 0 km/h to 60 km/h.
    Table 1
              Test 1  Test 2  Test 3  Test 4  Test 5  Test 6  Test 7  Test 8  Test 9  Test 10  Average
    Bike 1    1.55    1.00    0.92    0.80    1.49    0.71    1.06    0.68    0.87    1.09
    Bike 2    0.9     1.0     1.1     1.0     1.0     0.9     0.9     1.0     0.9     1.1
    1. What measure of central tendency should be used for this information?
    2. Calculate the average you chose in the previous question for each motorbike.
    3. Which motorbike would you choose based on this information? Take note of the accuracy of the numbers from each set of tests.
  5. The heights of 40 learners are given below.
    Table 2
    154  140  145  159  150  132  149  150  138  152
    141  132  169  173  139  161  163  156  157  171
    168  166  151  152  132  142  170  162  146  152
    142  150  161  138  170  131  145  146  147  160
    1. Set up a frequency table using 6 intervals.
    2. Calculate the approximate mean.
    3. Determine the mode.
    4. How many learners are taller than the approximate mean you calculated in question 2?
  6. In a traffic survey, a random sample of 50 motorists were asked the distance they drove to work daily. This information is shown in the table below.
    Table 3
    Distance in km   1-5   6-10   11-15   16-20   21-25   26-30   31-35   36-40   41-45
    Frequency        4     5      9       10      7       8       3       2       2
    1. Find the approximate mean.
    2. What percentage of the motorists in the sample drove
      1. less than 16 km?
      2. more than 30 km?
      3. between 16 km and 30 km daily?
  7. A company wanted to evaluate the training programme in its factory. They gave the same task to trained and untrained employees and timed each one in seconds.
    Table 4
    Trained     121  137  131  135  130
                128  130  126  132  127
                129  120  118  125  134
    Untrained   135  142  126  148  145
                156  152  153  149  145
                144  134  139  140  142
    1. Find the medians and quartiles for both sets of data.
    2. Find the Interquartile Range for both sets of data.
    3. Comment on the results.
  8. A small firm employs nine people. The annual salaries of the employees are:
    Table 5
    R600 000   R250 000   R200 000
    R120 000   R100 000   R100 000
    R100 000   R90 000    R80 000
    1. Find the mean of these salaries.
    2. Find the mode.
    3. Find the median.
    4. Of these three figures, which would you use for negotiating salary increases if you were a trade union official? Why?
  9. The marks for a particular class test are listed here:
    Table 6
    67  58  91  67  58  82  71  51  60  84
    31  67  96  64  78  71  87  78  89  38
    69  62  60  73  60  87  71  49

    Complete the frequency table using the given class intervals.

    Table 7
    Class   Tally   Frequency   Mid-point   Freq × Midpt
    30-39                       34,5
    40-49                       44,5
    50-59
    60-69
    70-79
    80-89
    90-99
                    Sum =                   Sum =
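The columns of a frequency table like the one above (mid-point and frequency × mid-point) lead to an approximate mean for grouped data. The sketch below shows the calculation on a small hypothetical table, not on the class test marks, so the exercise is left for you to complete. The class limits and frequencies are made up for illustration.

```python
# Hypothetical grouped data: (lower class limit, upper class limit, frequency).
classes = [(0, 9, 3), (10, 19, 5), (20, 29, 2)]

total_frequency = 0
total_freq_times_midpoint = 0.0

for lower, upper, frequency in classes:
    # The mid-point of a class is halfway between its two class limits.
    midpoint = (lower + upper) / 2
    total_frequency += frequency
    total_freq_times_midpoint += frequency * midpoint
    print(f"{lower}-{upper}: frequency {frequency}, mid-point {midpoint}, "
          f"freq x midpt {frequency * midpoint}")

# Approximate mean = sum of (frequency x mid-point) / sum of frequencies.
approximate_mean = total_freq_times_midpoint / total_frequency
print("approximate mean:", approximate_mean)
```

The result is only approximate because every value in a class is treated as if it were equal to the class mid-point.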

