OpenStax-CNX

Statistics

Module by: Ananda Mahto.

Based on: Sampling and Data: Statistics by Susan Dean, Barbara Illowsky, Ph.D.

Summary: This module introduces the concept of statistics, specifically the ability to use statistics to describe data (descriptive statistics) as well as draw conclusions (inferential statistics). An optional classroom exercise is included.

The science of statistics deals with the collection, analysis, interpretation, and presentation of data. We see and use data in our everyday lives.

Optional Collaborative Classroom Exercise

In your classroom, try this exercise. Have class members write down the average time (in hours, to the nearest half-hour) they sleep per night. Your instructor will record the data. Then create a simple graph (called a dot plot) of the data. A dot plot consists of a number line and dots (or points) positioned above the number line. For example, consider the following data:

5, 5.5, 6, 6, 6, 6.5, 6.5, 6.5, 6.5, 7, 7, 8, 8, 9

The dot plot for this data would be as follows:

Figure 1: Frequency of Average Time (in Hours) Spent Sleeping per Night (dot plot with hours of sleep on the x-axis and frequency on the y-axis)

Does your dot plot look the same as or different from the example? Why? If you did the same example in an English class with the same number of students, do you think the results would be the same? Why or why not?

Where do your data appear to cluster? How could you interpret the clustering?

The questions above ask you to analyze and interpret your data. With this example, you have begun your study of statistics.
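
One way to see where the data cluster is to count how often each value occurs. The following is a minimal sketch in R using the example data from the exercise above; the variable names sleep.counts and most.common are chosen here for illustration only.

```r
# Example data from the exercise (hours of sleep per night)
hours.sleep <- c(5, 5.5, 6, 6, 6, 6.5, 6.5,
                 6.5, 6.5, 7, 7, 8, 8, 9)

# Count how many times each value occurs
sleep.counts <- table(hours.sleep)
print(sleep.counts)

# The value(s) with the highest count show where the data cluster
most.common <- names(sleep.counts)[sleep.counts == max(sleep.counts)]
print(most.common)
```

With your own class data, replace the values in hours.sleep and compare the counts with the height of each stack in your dot plot.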

In this course, you will learn how to organize and summarize data. Organizing and summarizing data is called descriptive statistics. Two ways to summarize data are by graphing and by numbers (for example, finding an average). After you have studied probability and probability distributions, you will use formal methods for drawing conclusions from "good" data. The formal methods are called inferential statistics. Statistical inference uses probability to determine how confident we can be that the conclusions are correct.
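
As a small illustration of summarizing data by numbers, the sketch below computes a few descriptive statistics for the example sleep data. It assumes the same 14 values used in the classroom exercise; the variable names are illustrative.

```r
# Example data from the classroom exercise (hours of sleep per night)
hours.sleep <- c(5, 5.5, 6, 6, 6, 6.5, 6.5,
                 6.5, 6.5, 7, 7, 8, 8, 9)

# The mean (the usual "average"): sum of the values divided by their count
sleep.mean <- mean(hours.sleep)

# The median: the middle value of the sorted data
sleep.median <- median(hours.sleep)

# The smallest and largest values
sleep.range <- range(hours.sleep)

print(sleep.mean)
print(sleep.median)
print(sleep.range)
```

These numbers describe the data you collected; they do not by themselves support conclusions about other classes, which is the job of inferential statistics.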

Effective interpretation of data (inference) is based on good procedures for producing data and thoughtful examination of the data. You will encounter what will seem to be too many mathematical formulas for interpreting data. The goal of statistics is not to perform numerous calculations using the formulas, but to gain an understanding of your data. The calculations can be done using a calculator or a computer. The understanding must come from you. If you can thoroughly grasp the basics of statistics, you can be more confident in the decisions you make in life.

Creating dot plots in R

A dot plot is a very basic graph that quickly shows how your data are distributed, and it is most useful with small datasets. In R, a dot plot is referred to as a strip chart and is plotted using the stripchart() function.

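By default, the stripchart() function over-plots points on top of each other, so repeated values appear as a single point. To override this behavior, pass optional arguments to the function: method = "stack" stacks repeated values, and offset controls the spacing between the stacked points.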



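# Hours of sleep reported by each class member
hours.sleep <- c(5, 5.5, 6, 6, 6, 6.5, 6.5,
                 6.5, 6.5, 7, 7, 8, 8, 9)
# method = "stack" piles repeated values on top of each other;
# offset sets the spacing between stacked points, and at sets
# the vertical position of the row of dots
stripchart(hours.sleep, method = "stack",
           offset = 1, frame.plot = FALSE,
           at = .25)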

Figure 2: Dot plot with hours of sleep on the x-axis and frequency on the y-axis
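
To see why the method argument matters, it may help to draw the same data with each of the three methods that stripchart() accepts ("overplot", "jitter", and "stack"). The sketch below is one possible way to compare them; the panel layout and titles are choices made here for illustration and are not part of the original module.

```r
# Example data: 14 observations
hours.sleep <- c(5, 5.5, 6, 6, 6, 6.5, 6.5,
                 6.5, 6.5, 7, 7, 8, 8, 9)

# Several observations share a value, so with over-plotting they coincide
n.obs <- length(hours.sleep)
n.distinct <- length(unique(hours.sleep))
print(c(observations = n.obs, distinct.values = n.distinct))

# Draw one panel per method, stacked vertically
methods <- c("overplot", "jitter", "stack")
old.par <- par(mfrow = c(3, 1))
for (m in methods) {
  stripchart(hours.sleep, method = m, frame.plot = FALSE,
             xlab = "Hours of sleep")
  title(main = paste("method =", m))
}
par(old.par)  # restore the previous plot layout
```

The "jitter" method moves points by a small random amount, so its panel looks slightly different each time the code is run; "stack" gives the classic dot plot.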

Glossary

Data:
A set of observations (a set of possible outcomes). Most data can be put into two groups: qualitative (hair color, ethnic group, and other attributes of the population) and quantitative (distance traveled to college, number of children in a family, etc.). Quantitative data can be separated into two subgroups: discrete and continuous. Data are discrete if they are the result of counting (the number of students of a given ethnic group in a class, the number of books on a shelf, etc.). Data are continuous if they are the result of measuring (distance traveled, weight of luggage, etc.).
Statistic:
A numerical characteristic of the sample. A statistic estimates the corresponding population parameter. For example, the average number of full-time students in a 7:30 a.m. class for this term (statistic) is an estimate for the average number of full-time students in any class this term (parameter).
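
The relationship between a statistic and a parameter can be sketched in R. The example below makes a strong simplifying assumption for illustration only: it treats the 14 example sleep values as the entire population, then draws a random sample of 5 from it. The sample size and the seed are arbitrary choices, not part of the original module.

```r
# ASSUMPTION: treat the 14 example values as the whole population
hours.sleep <- c(5, 5.5, 6, 6, 6, 6.5, 6.5,
                 6.5, 6.5, 7, 7, 8, 8, 9)

# Parameter: the mean of the entire population
population.mean <- mean(hours.sleep)

# Draw a random sample of 5 values without replacement
set.seed(1)  # arbitrary seed, used only to make the sample repeatable
sleep.sample <- sample(hours.sleep, size = 5)

# Statistic: the mean of the sample, used to estimate the parameter
sample.mean <- mean(sleep.sample)

print(population.mean)
print(sample.mean)
```

Changing the seed gives a different sample and usually a different sample mean, while the population mean stays the same.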
