Hypothesis Testing: Two Population Means and Two Population Proportions: Comparing Two Independent Population Means with Unknown Population Standard Deviations

Module by: Barbara Illowsky, Ph.D., Susan Dean.

Summary: Note: This module is currently under revision, and its content is subject to change. This module is being prepared as part of a statistics textbook that will be available for the Fall 2008 semester.


  1. The two independent samples are simple random samples from two distinct populations.
  2. Both populations are normally distributed with the population means and standard deviations unknown.

The comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples. To account for the variation, we take the difference of the sample means, $\bar{X}_1 - \bar{X}_2$, and divide by the standard error (shown below) to standardize the difference. The result is a t-score test statistic (shown below).

Because we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. For the hypothesis test, we calculate the estimated standard deviation, or standard error, of the difference in sample means, $\bar{X}_1 - \bar{X}_2$.

The standard error is:

$$\sqrt{\frac{(S_1)^2}{n_1} + \frac{(S_2)^2}{n_2}}$$

The test statistic (t-score) is calculated as follows:


$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{(S_1)^2}{n_1} + \frac{(S_2)^2}{n_2}}}$$


  • $s_1$ and $s_2$, the sample standard deviations, are estimates of $\sigma_1$ and $\sigma_2$, respectively.
  • $\sigma_1$ and $\sigma_2$ are the unknown population standard deviations.
  • $\bar{x}_1$ and $\bar{x}_2$ are the sample means.
  • $\mu_1$ and $\mu_2$ are the population means.
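The standard error and t-score formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the module's own method: the function name `two_sample_t` and its parameter names are made up for this sketch, and the inputs are the summary statistics from Example 1 later in this module, with the girls' sample standard deviation taken as $\sqrt{0.75}$ (the value consistent with the df and test statistic reported there).

```python
import math

def two_sample_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Return (standard_error, t_score) for two independent samples.

    The sample variances are NOT pooled, matching the formulas above.
    hypothesized_diff is the value of mu1 - mu2 under the null hypothesis.
    """
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    t_score = ((mean1 - mean2) - hypothesized_diff) / standard_error
    return standard_error, t_score

# Example 1 summary statistics: girls first, boys second.
se, t = two_sample_t(2.0, math.sqrt(0.75), 9, 3.2, 1.0, 16)
print(f"standard error = {se:.4f}, t = {t:.2f}")
# prints: standard error = 0.3819, t = -3.14
```

Working from summary statistics (mean, standard deviation, sample size) rather than raw data mirrors how the examples in this module are stated.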

The degrees of freedom (df) require a somewhat complicated calculation, but a computer or calculator handles it easily. The df is not always a whole number. The distribution of the test statistic calculated above is approximated by a Student-t distribution with df as follows:

Degrees of freedom

$$df = \frac{\left[\frac{(s_1)^2}{n_1} + \frac{(s_2)^2}{n_2}\right]^2}{\frac{1}{n_1 - 1} \cdot \left[\frac{(s_1)^2}{n_1}\right]^2 + \frac{1}{n_2 - 1} \cdot \left[\frac{(s_2)^2}{n_2}\right]^2}$$

When both sample sizes $n_1$ and $n_2$ are five or larger, the Student-t approximation is very good. Notice that the sample variances $s_1^2$ and $s_2^2$ are not pooled. (If the question comes up, do not pool the variances.)


It is not necessary to compute this by hand. A calculator or computer easily computes it.
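For readers without a calculator at hand, the degrees-of-freedom formula above can be written as a short Python function. This is a sketch under stated assumptions: `unpooled_df` is a hypothetical name chosen here, and the inputs are the Example 1 summary statistics from later in this module (girls' standard deviation taken as $\sqrt{0.75}$).

```python
import math

def unpooled_df(s1, n1, s2, n2):
    """Degrees of freedom for the two-sample t-test without pooling.

    v1 and v2 are the per-sample terms s^2 / n that also appear
    under the square root in the standard error.
    """
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Example 1 summary statistics: girls first, boys second.
df = unpooled_df(math.sqrt(0.75), 9, 1.0, 16)
print(round(df, 4))  # prints: 18.8462
```

The result is not a whole number, which is expected for this formula.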

Example 1: Independent groups

The average amount of time boys and girls ages 7 through 11 spend playing sports each day is believed to be the same. An experiment is done and data are collected, resulting in the table below:

Table 1
         Sample Size    Average Number of Hours Playing Sports Per Day    Sample Standard Deviation
Girls    9              2 hours                                           $\sqrt{0.75}$
Boys     16             3.2 hours                                         1.00

Problem 1

Is there a difference in the average amount of time boys and girls ages 7 through 11 play sports each day? Test at the 5% level of significance.


The population standard deviations are not known. Let $g$ be the subscript for girls and $b$ be the subscript for boys. Then, $\mu_g$ is the population mean for girls and $\mu_b$ is the population mean for boys. This is a test of two independent groups, two population means.

Random variable: $\bar{X}_g - \bar{X}_b$ = difference in the average amount of time girls and boys play sports each day.

$H_o$: $\mu_g = \mu_b$ ($\mu_g - \mu_b = 0$)

$H_a$: $\mu_g \ne \mu_b$ ($\mu_g - \mu_b \ne 0$)

The words "the same" tell you $H_o$ has an "=". Since there are no other words to indicate $H_a$, assume "is different." This is a two-tailed test.

Distribution for the test: Use $t_{df}$ where $df$ is calculated using the df formula for independent groups, two population means. Using a calculator, $df$ is approximately 18.8462. Do not pool the variances.

Calculate the p-value using a Student-t distribution: p-value = 0.0054


Figure 1 (hyptest22_cmp1.png)

$s_g = \sqrt{0.75}$

$s_b = 1$

So, $\bar{x}_g - \bar{x}_b = 2 - 3.2 = -1.2$

Half the p-value is below -1.2 and half is above 1.2.

Make a decision: Since $\alpha >$ p-value, reject $H_o$.

This means you reject $\mu_g = \mu_b$. The means are different.

Conclusion: At the 5% level of significance, the sample data show there is sufficient evidence to conclude that the average number of hours that girls and boys aged 7 through 11 play sports per day is different.

TI-83+ and TI-84: Press STAT. Arrow over to TESTS and press 4:2-SampTTest. Arrow over to Stats and press ENTER. Arrow down and enter 2 for the first sample mean, $\sqrt{0.75}$ (about 0.866) for Sx1, 9 for n1, 3.2 for the second sample mean, 1 for Sx2, and 16 for n2. Arrow down to μ1: and arrow to ≠μ2. Press ENTER. Arrow down to Pooled: and No. Press ENTER. Arrow down to Calculate and press ENTER. The p-value is p = 0.0054, the df is approximately 18.8462, and the test statistic is -3.14. Do the procedure again, but instead of Calculate do Draw.
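The calculator steps above can be mirrored in plain Python as a rough cross-check. This is a sketch under stated assumptions, not the calculator's algorithm: the helper names `t_pdf` and `t_upper_tail` are hypothetical, the tail area is obtained by numerically integrating the Student-t density with Simpson's rule (using only the standard library), and the girls' sample standard deviation is taken as $\sqrt{0.75}$.

```python
import math

def t_pdf(x, df):
    """Student-t density; df may be a non-integer."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3          # area beyond |t| in one tail
    return upper if t >= 0 else 1.0 - upper

# Example 1 summary statistics (girls = g, boys = b).
mean_g, s_g, n_g = 2.0, math.sqrt(0.75), 9
mean_b, s_b, n_b = 3.2, 1.0, 16

v_g, v_b = s_g**2 / n_g, s_b**2 / n_b            # unpooled variance terms
t = (mean_g - mean_b) / math.sqrt(v_g + v_b)     # hypothesized difference is 0
df = (v_g + v_b) ** 2 / (v_g**2 / (n_g - 1) + v_b**2 / (n_b - 1))
p_value = 2 * t_upper_tail(abs(t), df)           # two-tailed test

alpha = 0.05
print(f"t = {t:.2f}, df = {df:.4f}, p-value = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```

The printed t-score, df, and p-value should land close to the calculator values quoted above; doubling the one-tail area reflects the two-tailed alternative hypothesis.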

Example 2

A study is done by a community group in two neighboring colleges to determine which one graduates students with more math classes. College A samples 11 graduates. Their average is 4 math classes with a standard deviation of 1.5 math classes. College B samples 9 graduates. Their average is 3.5 math classes with a standard deviation of 1 math class. The community group believes that a student who graduates from college A has taken more math classes, on the average. Test at a 1% significance level. Answer the following questions.

Problem 1

Is this a test of two means or two proportions?


two means

Problem 2

Are the population standard deviations known or unknown?


unknown

Problem 3

Which distribution do you use to perform the test?


Student-t

Problem 4

What is the random variable?


$\bar{X}_A - \bar{X}_B$

Problem 5

What are the null and alternate hypotheses?


  • $H_o: \mu_A \le \mu_B$
  • $H_a: \mu_A > \mu_B$

Problem 6

Is this test right, left, or two tailed?


right-tailed

Problem 7

What is the p-value?


0.1928

Problem 8

Do you reject or not reject the null hypothesis?


Do not reject.


At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a student who graduates from college A has taken more math classes, on the average, than a student who graduates from college B.
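As a rough check on Example 2, the summary statistics can be substituted into the standard error, test statistic, and degrees-of-freedom formulas from earlier in this module. The hypothesized difference of 0 used here is the boundary value of the null hypothesis, and the intermediate values are rounded, so the results are approximations.

```latex
\begin{align*}
\text{standard error} &= \sqrt{\frac{(1.5)^2}{11} + \frac{(1)^2}{9}}
  \approx \sqrt{0.2045 + 0.1111} \approx 0.5618 \\[4pt]
t &= \frac{(4 - 3.5) - 0}{0.5618} \approx 0.89 \\[4pt]
df &= \frac{(0.2045 + 0.1111)^2}{\frac{1}{10}(0.2045)^2 + \frac{1}{8}(0.1111)^2}
  \approx 17.4
\end{align*}
```

The p-value is the area to the right of this t-score under a Student-t curve with about 17.4 degrees of freedom. A t-score this small leaves a right-tail area far larger than 0.01, which agrees with the decision not to reject the null hypothesis.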


Degrees of Freedom (df):
The number of objects in a sample that are free to vary.
Standard Deviation:
A number that is equal to the square root of the variance and measures how far data values are from their mean. Notations: $s$ for sample standard deviation and $\sigma$ for population standard deviation.
Variable (Random Variable):
A characteristic of interest in a population being studied. Common notation for variables is upper case Latin letters $X$, $Y$, $Z$, ...; common notation for a specific value from the domain (the set of all possible values of a variable) is lower case Latin letters $x$, $y$, $z$, .... For example, if $X$ is the number of children in a family, then the domain is the set of integers from 0 to 20, and $x$ represents any one of those integers. A variable in statistics differs from a variable in intermediate algebra in the following two ways.
  • The domain of a random variable (RV) is not necessarily a numerical set; it can be a set of words. For example, if $X$ = hair color, then the domain is {black, blond, gray, green, orange}.
  • We can tell which specific value $x$ the variable $X$ takes only after performing the experiment.
Before the experiment, any value from the domain is possible. For example, without ultrasound we cannot tell the gender of a baby that is about to be delivered, but after delivery the gender is evident. More precisely, every value from the domain is accompanied by a number $p$, $0 \le p \le 1$, that characterizes the chance of that value being the outcome of the experiment. In the gender example, $p = \frac{1}{2}$. That is why statisticians use the more exact name "random variable" (RV) instead of "variable." Moreover, they use the word "distribution" with the RV in mind, meaning the pairing (value, probability of the value).
