# Mean, Variance, and Histograms

Module by: Kathryn Ward.

Summary: This module introduces the concepts of mean, variance, and histograms. This work was done as part of the TaTGAP program, a VIGRE PFUG that pairs Rice students with local high school students. VIGRE is a program of Vertically Integrated Grants for Research and Education in the Mathematical Sciences under the direction of the National Science Foundation. A PFUG is a group of Postdocs, Faculty, Undergraduates and Graduate students formed around the study of a common problem.

By the end of this module, you should be able to summarize a large data set using its mean, variance, standard deviation, and histogram.

Key Concepts:

1. EEG
2. Mean
3. Variance
4. Standard Deviation
5. Histogram

## Motivation from Epilepsy

An EEG (electroencephalogram) is a tool that tracks and records electrical activity (voltage) in the brain. Data from an EEG can be used to diagnose and monitor seizure disorders.

### Exercise 1.1

Go to the webpage EEG in Common Epilepsy Syndromes. This page shows the EEG signals recorded from patients with various types of epilepsy. What do you notice about these signals, particularly media files 3, 5, 6, 9, and 10? Can you spot the period in the data where the seizures occur?

While in some cases it may be easy to recognize a seizure from a plot of EEG data, the EEG itself produces a large set of numbers corresponding to the voltage at discrete points in time. If the sampling rate were 250 Hz, then the EEG would record the voltage every 4 ms. This means that to record the activity in the brain for 10 minutes, the EEG would give 150,000 voltage readings. It would be very difficult to look at such a large set of data and draw conclusions about what types of seizures, if any, are present. This module will give you the basic tools needed to analyze large sets of data, such as the data taken from an EEG. Below is a summary of the concepts you will learn to compute in this module.
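The arithmetic in this paragraph can be checked with a few lines of Matlab. This is only a sketch: the variable names are chosen here for illustration and do not come from any EEG software.

```matlab
fs = 250;                      % sampling rate in Hz (samples per second)
dt_ms = 1000 / fs;             % time between samples in milliseconds: 4
duration_s = 10 * 60;          % 10 minutes expressed in seconds
n_samples = fs * duration_s;   % total number of voltage readings: 150000

fprintf('%g ms between samples, %d readings\n', dt_ms, n_samples);
```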

Summary of Key Concepts

| Term | Matlab command | Definition |
|---|---|---|
| Mean | mean | Average number |
| Variance | var | Measure of the spread of the data |
| Standard Deviation | std | Square root of the variance |
| Histogram | hist | Proportion of numbers that falls within given intervals |

## Mean

The mean of a set of numbers is the average. If x is a vector of n numbers, then the mean of x is given by

$$\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k = \frac{x_1 + x_2 + \dots + x_n}{n}. \tag{1}$$

In Matlab, the mean of the vector x can be computed by typing mean(x).

### Example 2.1

Let x = [1, 7, 2, 5, 9, 6]. Then $\bar{x} = (1+7+2+5+9+6)/6 = 5$.
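Example 2.1 can be checked in Matlab both by applying Equation (1) directly and by calling the built-in command. The sketch below is illustrative; the names `n`, `m_formula`, and `m_builtin` are chosen here and are not part of the module.

```matlab
% Vector from Example 2.1
x = [1, 7, 2, 5, 9, 6];

n = numel(x);              % number of entries
m_formula = sum(x) / n;    % Equation (1): add the entries, then divide by n
m_builtin = mean(x);       % built-in command

disp([m_formula, m_builtin])   % both entries are 5
```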

### Example 2.2

Suppose you are interviewing for a job where the employees make the following daily salaries:

• 16 general employees: \$100 each
• 3 managers: \$900 each
• 1 owner: \$1700

Then, the mean daily salary is given by

$$\bar{x} = \frac{16 \cdot 100 + 3 \cdot 900 + 1700}{16 + 3 + 1} = 6000/20 = 300. \tag{2}$$

Notice that the mean is \$300, which is 3 times the daily salary of 80% of the workers! Thus, if you are simply told the mean of the daily salaries, you will not have an accurate idea of how much money you would be making if you got the job. The variance and standard deviation are two measures that can give you an idea of how well the mean represents the data.
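The salary calculation can be reproduced in Matlab in two equivalent ways: as a weighted sum, mirroring Equation (2), or by building a vector with one entry per person and calling mean. The variable names below are illustrative assumptions.

```matlab
% Daily salaries from Example 2.2
values = [100, 900, 1700];   % salary levels in dollars
counts = [16, 3, 1];         % number of people at each level

% Weighted form, as in Equation (2)
m_weighted = sum(counts .* values) / sum(counts);

% Same result from a vector with one entry per person
salaries = [repmat(100, 1, 16), repmat(900, 1, 3), 1700];
m_vector = mean(salaries);

% Fraction of workers earning the lowest salary
share_general = counts(1) / sum(counts);

fprintf('mean = %g, fraction earning 100 = %g\n', m_vector, share_general);
```

The expanded `salaries` vector is convenient because it can be passed unchanged to other commands such as var, std, and hist.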

### Exercise 2.1

Compute the mean of the vector y = [3, 8, 2, 5, 5, 7], both on paper and using Matlab.

### Exercise 2.2

Suppose you have set the goal of making an A in your math class. If your class grades consist of 4 tests, and you have made a 98, 80, and 90 on your first three tests, what do you need to make on your last test so that the mean of your grades is 90?

### Exercise 2.3

(for the advanced) Suppose that, for the same class, you have already computed the mean of the first three tests when you receive your fourth test grade. Instead of computing the mean of all four tests from scratch, it's possible to update the mean that you've already computed. Write Matlab code that takes two inputs, the mean of your first three tests and the grade of your fourth test, and computes the mean of all four tests.

## Variance and Standard Deviation

As you saw in "Example 2.2", the mean is not always representative of the data, and other measures are needed to analyze the spread of the data. The variance measures how far the numbers lie from the mean, using the squared distance of each number from the mean. Given a vector x of n numbers with mean value $\bar{x}$, the variance of x is given by

$$\mathrm{var}(x) = \frac{1}{n-1}\sum_{k=1}^{n} (x_k - \bar{x})^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n-1}. \tag{3}$$

The standard deviation of the data is related to the variance and is given by

$$\mathrm{std}(x) = \sqrt{\mathrm{var}(x)}. \tag{4}$$

You can compute the variance and standard deviation of x in Matlab by typing the commands var(x) and std(x).

### Example 3.1

Consider the vector given in "Example 2.1", x = [1, 7, 2, 5, 9, 6]. Recall that the mean of x is 5.

$$\mathrm{var}(x) = \frac{(1-5)^2 + (7-5)^2 + (2-5)^2 + (5-5)^2 + (9-5)^2 + (6-5)^2}{5} = 9.2 \tag{5}$$

$$\mathrm{std}(x) = \sqrt{\mathrm{var}(x)} \approx 3.03 \tag{6}$$
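The numbers in Example 3.1 can be reproduced in Matlab either from the formulas or with the built-in commands. Note that var and std divide by n - 1 by default, which matches the formula for the variance given in this module. The variable names in this sketch are illustrative.

```matlab
% Vector from Example 2.1
x = [1, 7, 2, 5, 9, 6];

n = numel(x);
xbar = mean(x);                             % 5

% Variance from the formula: sum of squared deviations divided by n - 1
v_formula = sum((x - xbar).^2) / (n - 1);   % 46 / 5 = 9.2

% Standard deviation: square root of the variance
s_formula = sqrt(v_formula);                % about 3.03

% Built-in commands give the same values
v_builtin = var(x);
s_builtin = std(x);

fprintf('var = %.4f, std = %.4f\n', v_builtin, s_builtin);
```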

### Example 3.2

Consider the data from "Example 2.2", where the mean $\bar{x} = 300$. The variance is

$$\mathrm{var}(x) = \frac{16 \cdot (100 - 300)^2 + 3 \cdot (900 - 300)^2 + (1700 - 300)^2}{19} \approx 193{,}684 \tag{7}$$

and the standard deviation is

$$\mathrm{std}(x) = \sqrt{\mathrm{var}(x)} \approx 440 \tag{8}$$

Because the standard deviation is considerably larger than the mean, we can conclude that the mean is not very representative of the data.
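As a check on Example 3.2, the sketch below builds the 20 daily salaries as a vector and calls var and std. With n = 20 people the sum of squared deviations, 3,680,000, is divided by n - 1 = 19. The vector name `salaries` is an illustrative choice.

```matlab
% Daily salaries from Example 2.2, one entry per person (20 in total)
salaries = [repmat(100, 1, 16), repmat(900, 1, 3), 1700];

m = mean(salaries);   % 300
v = var(salaries);    % 3680000 / 19, about 193,684
s = std(salaries);    % about 440

fprintf('mean = %g, variance = %.0f, std = %.0f\n', m, v, s);
```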

### Exercise 3.1

Compute the variance and standard deviation of y = [3, 8, 2, 5, 5, 7], using both the formulas and the Matlab commands.

### Exercise 3.2

Suppose that in the situation of "Example 2.2", there are 50 general employees instead of 16. Compute the mean and variance of the daily salary. Is the mean more or less representative of the data than it was in "Example 2.2"?

## Histograms

Although the mean, variance, and standard deviation provide information about the data, it is often useful to visualize the data. A histogram is a tool that allows you to visualize the proportion of numbers that fall within a given bin, or interval. To compute the histogram of a set of data, x, follow the algorithm below.

1. Choose the bin size $\Delta x$. The bins are the intervals $[0, \Delta x]$, $(\Delta x, 2\Delta x]$, $(2\Delta x, 3\Delta x]$, and so on.
2. For each bin, count the number of data points that lie within the bin.
3. Create a bar graph showing the number of data points within each bin.

### Example 4.1

Consider again the vector from "Example 2.1", x = [1, 7, 2, 5, 9, 6]. Using a bin size $\Delta x = 2$, there are 5 bins.

• Bin 1 = [0, 2] has 2 elements of x
• Bin 2 = (2, 4] has 0 elements of x
• Bin 3 = (4, 6] has 2 elements of x
• Bin 4 = (6, 8] has 1 element of x
• Bin 5 = (8, 10] has 1 element of x
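The three-step algorithm can be written out by hand in Matlab to reproduce the bin counts of Example 4.1. This is a minimal sketch that assumes the data are nonnegative, so that the first bin can start at 0; the variable names are illustrative.

```matlab
x  = [1, 7, 2, 5, 9, 6];       % vector from Example 2.1
dx = 2;                        % step 1: choose the bin size
nbins = ceil(max(x) / dx);     % enough bins to cover the largest value: 5

counts = zeros(1, nbins);
for b = 1:nbins
    lo = (b - 1) * dx;         % left end of bin b
    hi = b * dx;               % right end of bin b
    if b == 1
        in_bin = (x >= lo) & (x <= hi);   % first bin is closed: [0, dx]
    else
        in_bin = (x > lo) & (x <= hi);    % later bins are half-open: (lo, hi]
    end
    counts(b) = sum(in_bin);   % step 2: count the points in the bin
end

disp(counts)                   % 2 0 2 1 1

% Step 3: bar graph, with each bar placed at the center of its bin
bar(((1:nbins) - 0.5) * dx, counts)
```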

In Matlab, you can plot the histogram of a vector x by typing hist(x). Matlab will automatically use 10 bins. If you'd like to specify the bin centers, type hist(x,c), where c is a vector of bin centers. The histogram of "Example 4.1" was generated by the Matlab command hist(x, [1, 3, 5, 7, 9]).
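The two ways of calling hist described above can be tried as follows. When hist is called with an output argument it returns the bin counts instead of drawing a plot, which is useful for inspecting the numbers behind the picture. The variable names are illustrative.

```matlab
x = [1, 7, 2, 5, 9, 6];            % vector from Example 2.1
centers = [1, 3, 5, 7, 9];         % bin centers for bins of width 2

counts_default = hist(x);          % counts for the default 10 bins (no plot)
counts_custom  = hist(x, centers); % counts for the chosen bin centers (no plot)

disp(numel(counts_default))        % 10
disp(counts_custom)                % one count per bin center

hist(x, centers)                   % without outputs, hist draws the bar graph
```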

### Exercise 4.1

Plot the histogram of the vector y = [3, 8, 2, 5, 5, 7], both on paper and in Matlab.

### Exercise 4.2

Plot the histogram of the daily salaries from "Example 2.2". For this example, does the histogram or the mean give you a better idea of what salary you would be making if you got the job?
