
Applied Statistics in Business & Economics

David P. Doane and Lori E. Seward

Vũ Võ
vu.vo@ueh.edu.vn

Chapter 10
Two-Sample Hypothesis Tests

Chapter Contents

10.1 Two-Sample Tests


10.2 Comparing Two Means: Independent Samples
10.3 Confidence Interval for the Difference of Two Means, µ1 − µ2
10.4 Comparing Two Means: Paired Samples
10.5 Comparing Two Proportions
10.6 Confidence Interval for the Difference of Two Proportions, π1 − π2
10.7 Comparing Two Variances

Chapter Learning Objectives (LOs)

LO10-1: Recognize and perform a test for two means.


LO10-2: Explain the assumptions underlying the two-sample
test of means.
LO10-3: Construct a confidence interval for µ1 − µ2.
LO10-4: Recognize paired data and be able to perform a
paired t test.
LO10-5: Perform a test to compare two proportions using z.


LO10-6: Check whether normality may be assumed for two
proportions.
LO10-7: Construct a confidence interval for π1− π2.
LO10-8: Carry out a test of two variances using the
F distribution.

10.1 Two-Sample Tests
What Is a Two-Sample Test?
• A two-sample test compares two sample estimates with each
other.
• A one-sample test compares a sample estimate to a non-sample
benchmark.

Basis of Two-Sample Tests


• Two-sample tests are especially useful because they possess a
built-in point of comparison.
• The logic of two-sample tests is based on the fact that two
samples drawn from the same population may yield different
estimates of a parameter due to chance.

• If the two sample statistics differ by more than the amount
attributable to chance, then we conclude that the samples came
from populations with different parameter values.


Test Procedure

• State the hypotheses.
• Set up the decision rule.
• Insert the sample statistics.
• Make a decision based on the critical values or using p-values.

10.2 Comparing Two Means:
Independent Samples

LO10-1: Recognize and perform a test for two means.


Format of Hypotheses
• The hypotheses for comparing two independent
population means µ1 and µ2 are:

Note: D0 is the hypothesized difference in the means (often D0 = 0).
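Assuming the usual convention that the null hypothesis carries the equality, the three standard forms of these hypotheses (left-tailed, two-tailed, right-tailed) can be sketched as follows. The column layout is an assumption for illustration.

```latex
\begin{array}{lll}
\text{Left-tailed} & \text{Two-tailed} & \text{Right-tailed} \\
H_0: \mu_1 - \mu_2 \ge D_0 & H_0: \mu_1 - \mu_2 = D_0 & H_0: \mu_1 - \mu_2 \le D_0 \\
H_1: \mu_1 - \mu_2 < D_0 & H_1: \mu_1 - \mu_2 \ne D_0 & H_1: \mu_1 - \mu_2 > D_0
\end{array}
```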


LO10-2: Explain the assumptions underlying the
two-sample test of means.

Case 1: Known Variances

• When the variances are known, use the normal distribution for the
test (assuming normal populations). The test statistic is:
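A minimal sketch of the Case 1 calculation, assuming the standard known-variance z statistic (difference in sample means minus D0, divided by the square root of σ1²/n1 + σ2²/n2). The function name `two_sample_z` and all numbers are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def two_sample_z(x1_bar, x2_bar, var1, var2, n1, n2, d0=0.0):
    """z statistic for H0: mu1 - mu2 = d0 with known population variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (x1_bar - x2_bar - d0) / standard_error


# Hypothetical summary data (not from the text).
z = two_sample_z(x1_bar=52.0, x2_bar=50.0, var1=16.0, var2=9.0, n1=40, n2=36)

# Two-tailed p-value from the standard normal distribution.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 3), round(p_two_tailed, 4))
```

For a one-tailed test, only the tail in the direction of H1 would be used instead of doubling.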


Case 2: Unknown Variances, Assumed Equal

• Since the variances are unknown, they must be estimated
and the Student’s t distribution is used to test the means.
• Assuming the population variances are equal, s1² and s2²
can be used to estimate a common pooled variance sp².
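A minimal sketch of the Case 2 calculation, assuming the usual pooled-variance formula, sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2), with n1 + n2 − 2 degrees of freedom. The function name `pooled_t` and the sample values are hypothetical.

```python
import math


def pooled_t(x1_bar, x2_bar, s1, s2, n1, n2, d0=0.0):
    """Return (t, df) for the equal-variance (pooled) two-sample t test."""
    df = n1 + n2 - 2
    # Pooled variance: a weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = math.sqrt(sp2 / n1 + sp2 / n2)
    return (x1_bar - x2_bar - d0) / standard_error, df


# Hypothetical summary data (not from the text).
t_stat, df = pooled_t(x1_bar=20.0, x2_bar=18.0, s1=3.0, s2=4.0, n1=10, n2=12)
print(round(t_stat, 3), df)
```

The resulting t would then be compared with a Student's t critical value for `df` degrees of freedom.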


Case 3: Unknown Variances, Assumed Unequal

• If the unknown variances are assumed to be unequal, they
are not pooled together.
• In this case, the distribution of the random variable x̄1 − x̄2
is not certain (Behrens-Fisher problem).
• Use the Welch-Satterthwaite test, which replaces σ1² and
σ2² with s1² and s2² in the known-variance z formula, then
use the Student’s t test with adjusted degrees of freedom.


• Welch-Satterthwaite test

• A conservative quick rule for degrees of freedom is to
use min(n1 – 1, n2 – 1).
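A minimal sketch of the Case 3 calculation, assuming the standard Welch-Satterthwaite degrees-of-freedom formula. The function names `welch_t` and `quick_df` and the sample values are hypothetical.

```python
import math


def welch_t(x1_bar, x2_bar, s1, s2, n1, n2, d0=0.0):
    """Return (t, df) for the unequal-variance (Welch) two-sample t test."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (x1_bar - x2_bar - d0) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite adjusted degrees of freedom (usually not an integer).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


def quick_df(n1, n2):
    """Conservative quick rule for the degrees of freedom."""
    return min(n1 - 1, n2 - 1)


# Hypothetical summary data (not from the text).
t_stat, df_welch = welch_t(20.0, 18.0, 3.0, 4.0, 10, 12)
df_quick = quick_df(10, 12)
print(round(t_stat, 3), round(df_welch, 2), df_quick)
```

The quick rule never exceeds the Welch-Satterthwaite value, which is why it is described as conservative.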
Summary for the Test Statistic
• For the common situation of testing for a zero difference (D0 = 0) in
two population means the possible pairs of null and alternative
hypotheses are:

• If the population variances σ1² and σ2² are known, then use
the normal distribution.
• If population variances are unknown and estimated using
s1² and s2², then use the Student’s t distribution.


Steps in Testing Two Means (See text for examples)

• Step 1: State the hypotheses.
• Step 2: Specify the decision rule.
Choose α (the level of significance) and determine the critical
value(s).
• Step 3: Calculate the test statistic.
• Step 4: Make the decision. Reject H0 if the test statistic falls in the
rejection region(s) as defined by the critical value(s).
• Step 5: Take action based on the decision.


Which Assumption Is Best?


• If the sample sizes are equal, the Case 2 and Case 3 test statistics
will be identical, although the degrees of freedom may differ.
• If the variances are similar, the two tests will usually agree.
• If no information about the population variances is available, then
the best choice is Case 3.
• The fewer assumptions, the better.
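The first bullet above can be checked numerically. The sketch below assumes the standard pooled (Case 2) and Welch-Satterthwaite (Case 3) formulas, uses hypothetical data with equal sample sizes but quite different standard deviations, and shows identical t statistics with different degrees of freedom.

```python
import math


def pooled_and_welch(x1_bar, x2_bar, s1, s2, n1, n2):
    """Return (t_pooled, df_pooled, t_welch, df_welch) for a zero-difference test."""
    # Case 2: pooled variance.
    df_pooled = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df_pooled
    t_pooled = (x1_bar - x2_bar) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    # Case 3: separate variances with adjusted degrees of freedom.
    v1, v2 = s1**2 / n1, s2**2 / n2
    t_welch = (x1_bar - x2_bar) / math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_pooled, df_pooled, t_welch, df_welch


# Hypothetical data: equal n, unequal standard deviations.
t_p, df_p, t_w, df_w = pooled_and_welch(20.0, 18.0, 3.0, 6.0, 15, 15)
print(round(t_p, 4), df_p, round(t_w, 4), round(df_w, 2))
```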

Must Sample Sizes Be Equal?


• Unequal sample sizes are common and the formulas still apply.

Large Samples
• For unknown variances, if both samples are large (n1 ≥ 30 and
n2 ≥ 30) and the populations are not badly skewed, use the following
formula with Appendix C:
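A sketch of the usual large-sample statistic, assuming the sample variances simply stand in for the population variances and that the result is compared with standard normal critical values:

```latex
z_{\text{calc}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```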


Caution: Three Issues


1. Are the populations skewed? Are there outliers?

Check using histograms and/or dot plots of each sample.


t tests are okay if moderately skewed, especially if samples are
large. Outliers are more serious.

2. Are the sample sizes large (n ≥ 30)?


If the samples are small, the mean is not a reliable indicator of
central tendency, and the test may lack power.
3. Is the difference important as well as significant?
A small difference in means or proportions could be significant if
the sample size is large.
10.3 Confidence Interval for the
Difference of Two Means µ1 − µ2
LO10-3: Construct a confidence interval for µ1 − µ2.

Confidence Intervals for the Difference of Two Means



Example (Marketing Teams)

Do teams that collaborate virtually feel they get along better
than teams that collaborate face-to-face? A study was
conducted with senior marketing majors at a large business
school. Students were randomly assigned to a team that
collaborated online or a team that collaborated face-to-face.
Both teams were given five cases to analyze. At the end of the
study, each team member was asked to rate how well he or she
felt the team got along by responding to the statement “I felt our
members got along well together.” The response scale was a
1–5 Likert scale with “1” = strongly disagree and “5” = strongly
agree.


Table 10.4 shows the means and standard deviations for the two
groups. The population variances are unknown but will be
assumed equal (note the similar standard deviations). For a
confidence level of 90 percent, we use Student’s t with d.f. = 44
+ 42 − 2 = 84. From Appendix D we obtain t.05 = 1.664 (using 80
degrees of freedom, the next lower value).

The confidence interval is

Because this confidence interval does not include zero, we can say with 90
percent confidence that there is a difference between the means (i.e., the
virtual team’s mean differs from the face-to-face team’s mean). If we had not
assumed equal variances, the results would be the same in this case because
the samples are large and of similar size, and the variances do not differ
greatly. But when you have small, unequal sample sizes or unequal variances,
the methods can yield different conclusions.
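A minimal sketch of the pooled-variance confidence interval used in this example, assuming the usual form (x̄1 − x̄2) ± t·sqrt(sp²/n1 + sp²/n2). The sample sizes (44 and 42) and t.05 = 1.664 come from the example; the means and standard deviations below are hypothetical placeholders, not the Table 10.4 values.

```python
import math


def pooled_ci(x1_bar, x2_bar, s1, s2, n1, n2, t_crit):
    """Confidence interval for mu1 - mu2, assuming equal population variances."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    margin = t_crit * math.sqrt(sp2 / n1 + sp2 / n2)
    diff = x1_bar - x2_bar
    return diff - margin, diff + margin


# Means and standard deviations are hypothetical (NOT from Table 10.4).
low, high = pooled_ci(
    x1_bar=4.0, x2_bar=3.6, s1=0.8, s2=0.9, n1=44, n2=42, t_crit=1.664
)
includes_zero = low <= 0 <= high
print(round(low, 3), round(high, 3), includes_zero)
```

If `includes_zero` is false, the interval supports a difference between the means at the chosen confidence level.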
Note:

Because the calculations for the comparison of two sample
means are time-consuming, it is helpful to use software. See the
Software Supplement at the end of this chapter in the text for an
illustration of MegaStat’s menu for comparing two sample means.

Should Sample Sizes Be Equal?


Many people instinctively try to choose equal sample sizes for
tests of means. It is preferable to avoid unbalanced sample sizes
to increase the power of the test, but it is not necessary. Unequal
sample sizes are common, and the formulas still apply.

10.4 Comparing Two Means: Paired
Samples
LO10-4: Recognize paired data and be able to perform a
paired t test.
Paired Data
• Data occur in matched pairs when the same item is observed twice but
under different circumstances.
• For example, blood pressure is taken before and after a treatment is
given.
• Paired data are typically displayed in columns.


Paired t Test
• Paired data typically come from a before/after
experiment.
• In the paired t test, the difference between x1 and x2 is
measured as d = x1 – x2
• The mean and standard deviation for the differences d
are given below.

Because the population variance of d is unknown, we will do a
paired t test using Student’s t with n − 1 degrees of freedom to
compare the sample mean difference d̄ with a hypothesized
difference μd (usually μd = 0). The test is really a one-sample
t test, just like those in Chapter 9.

Test Statistic for Paired Sample
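A minimal sketch of the paired calculation, assuming the standard one-sample form t = (d̄ − μd) / (sd / √n) applied to the differences d = x1 − x2. The function name `paired_t` and the before/after readings are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(x1, x2, mu_d=0.0):
    """Return (t, df) for a paired t test on matched observations."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]  # differences d = x1 - x2
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation (n - 1 in the denominator)
    return (d_bar - mu_d) / (s_d / math.sqrt(n)), n - 1


# Hypothetical before/after readings for six subjects.
before = [140, 152, 138, 147, 160, 155]
after = [135, 150, 136, 140, 158, 149]
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)
```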


Steps in Testing Paired Data

• Step 1: State the hypotheses, for example:


H0: µd = 0
H1: µd ≠ 0
• Step 2: Specify the decision rule.
Choose α (the level of significance) and determine the critical
values from Appendix D or with the use of technology.
• Step 3: Calculate the test statistic t.
• Step 4: Make the decision.
Reject H0 if the test statistic falls in the rejection region(s) as
defined by the critical values.


Analogy to Confidence Interval


A two-tailed test for a zero difference is equivalent to asking
whether the confidence interval for the true mean difference
µd includes zero.

Why Not Treat Paired Data as Independent Samples?


When observations are matched pairs, the paired t test is
more powerful because it utilizes information that is
ignored if we treat the samples separately.

10.5 Comparing Two Proportions

LO10-5: Perform a test to compare two proportions using z.

Testing for Zero Difference: π1 − π2 = 0

• To compare two population proportions, π1 and π2,
use the following hypotheses:


Sample Proportions

• The sample proportion p1 is a point estimate
of π1 and p2 is a point estimate of π2:


Pooled Proportion
• If H0 is true, there is no difference between π1 and π2,
so the samples are pooled (or averaged) in order to
estimate the common population proportion.

Test Statistic
• If the samples are large, p1 – p2 may be assumed normally
distributed.
• The test statistic is the difference of the sample proportions divided
by the standard error of the difference.
• The standard error is calculated by using the pooled proportion.
• The test statistic for the hypothesis π1 − π2 = 0 is:
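A minimal sketch of this test, assuming the standard pooled form: p1 = x1/n1, p2 = x2/n2, pooled proportion (x1 + x2)/(n1 + n2), and a standard error built from the pooled proportion. The function name `two_proportion_z` and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, pooled proportion) for H0: pi1 - pi2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error, p_pooled


# Hypothetical counts of "successes" in two independent samples.
z, p_c = two_proportion_z(x1=60, n1=200, x2=45, n2=250)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed
print(round(z, 3), round(p_c, 4), round(p_value, 4))
```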


Steps in Testing Two Proportions

• Step 1: State the hypotheses.
• Step 2: Specify the decision rule.
Choose α (the level of significance) and determine the critical
value(s).
• Step 3: Calculate the test statistic. Assuming that π1 = π2, use a
pooled estimate of the common proportion.
• Step 4: Make the decision. Reject H0 if the test statistic falls in the
rejection region(s) as defined by the critical value(s).

LO10-6: Check whether normality may be assumed for
two proportions.
Checking for Normality
• We have assumed a normal distribution for the statistic p1 – p2.
• This assumption can be checked.
• For a test of two proportions, the criterion for normality is nπ ≥ 10 and
n(1 − π) ≥ 10 for each sample, using each sample proportion in place
of π.
• If either sample proportion cannot be assumed normal, their difference cannot
safely be assumed normal.
• The sample size rule of thumb is equivalent to requiring that each
sample contains at least 10 “successes” and at least 10 “failures.”
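The rule of thumb above can be sketched as a simple count check. The function names `normality_ok` and `both_samples_ok` are hypothetical, and the threshold of 10 is the one stated in the bullets.

```python
def normality_ok(x, n, threshold=10):
    """True if a sample has at least `threshold` successes and failures."""
    successes, failures = x, n - x
    return successes >= threshold and failures >= threshold


def both_samples_ok(x1, n1, x2, n2):
    """Both samples must pass before p1 - p2 is treated as normal."""
    return normality_ok(x1, n1) and normality_ok(x2, n2)


# Hypothetical counts: the first pair passes, the second fails (only 4 successes).
print(both_samples_ok(60, 200, 45, 250))
print(both_samples_ok(60, 200, 4, 50))
```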

Testing for Non-Zero Difference

10.6 Confidence Interval for the Difference
of Two Proportions π1 − π2
LO10-7: Construct a confidence interval for π1 − π2.

• If the confidence interval does not include 0, then we
will reject the null hypothesis of no difference in the
proportions.
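A minimal sketch of such an interval, assuming the common unpooled form (p1 − p2) ± z·sqrt(p1(1 − p1)/n1 + p2(1 − p2)/n2). The function name `proportion_diff_ci` and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2 using separate (unpooled) proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * standard_error, diff + z_crit * standard_error


# Hypothetical counts (not from the text).
low, high = proportion_diff_ci(60, 200, 45, 250)
reject_no_difference = not (low <= 0 <= high)
print(round(low, 4), round(high, 4), reject_no_difference)
```

Unlike the test of zero difference, the interval does not pool the samples, because it does not assume π1 = π2.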

10.7 Comparing Two Variances
LO10-8: Carry out a test of two variances using the
F distribution
Format of Hypotheses
• To test whether two population means are equal, we may also
need to test whether two population variances are equal.

The F Test
• The test statistic is the ratio of the sample variances:

• If the population variances are equal, this ratio should be near unity
(F ≈ 1), which is consistent with the null hypothesis.


• If the test statistic is far below 1 or far above 1, we would
reject the hypothesis of equal population variances.
• The numerator s1² has degrees of freedom df1 = n1 – 1
and the denominator s2² has degrees of freedom
df2 = n2 – 1.
• The F distribution is right-skewed, with its mean > 1 and its
mode < 1.


The F Test: Critical Values

• Critical values for the F test are denoted
FL (left tail) and FR (right tail).
• A right-tail critical value FR may be found from
Appendix F using df1 and df2 degrees of freedom:
FR = F(df1, df2)
• A left-tail critical value FL may be found by reversing
the numerator and denominator degrees of freedom,
finding the critical value from Appendix F and taking its
reciprocal: FL = 1/F(df2, df1)

Comparison of Variances: Two Tailed Test

• Step 1: State the hypotheses, for example:
H0: σ1² = σ2²
H1: σ1² ≠ σ2²
• Step 2: Specify the decision rule.
Degrees of freedom are:
Numerator: df1 = n1 – 1
Denominator: df2 = n2 – 1
Choose α and find the left-tail and right-tail critical
values from Appendix F.


• Step 3: Calculate the test statistic Fcalc = s1²/s2².
• Step 4: Make the decision.
Reject H0 if the test statistic falls in the rejection
regions as defined by the critical values FL and FR.
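The two-tailed procedure can be sketched as follows. The function name `f_test_two_tailed` and the sample variances are hypothetical. The right-tail critical value of about 4.03 for df = (9, 9) at α/2 = .025 is quoted from standard F tables and should be verified against Appendix F.

```python
def f_test_two_tailed(s1_sq, s2_sq, f_left, f_right):
    """Return (F_calc, reject_H0) for H0: equal population variances."""
    f_calc = s1_sq / s2_sq
    reject = f_calc < f_left or f_calc > f_right
    return f_calc, reject


# Hypothetical sample variances with n1 = n2 = 10, so df1 = df2 = 9.
f_right = 4.03           # right-tail critical value for df = (9, 9), alpha/2 = .025
f_left = 1 / 4.03        # reciprocal with reversed df (same here, since df1 = df2)
f_calc, reject = f_test_two_tailed(12.0, 30.0, f_left, f_right)
print(round(f_calc, 3), round(f_left, 3), f_right, reject)
```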


Comparison of Variances: One Tailed Test

• Step 1: State the hypotheses, for example:
H0: σ1² = σ2²
H1: σ1² < σ2²
• Step 2: State the decision rule.
Degrees of freedom are:
Numerator: df1 = n1 – 1
Denominator: df2 = n2 – 1
Choose α and find the left-tail critical value from
Appendix F.


• Step 3: Calculate the test statistic Fcalc = s1²/s2².
• Step 4: Make the decision.
Reject H0 if the test statistic falls in the left-tail rejection
region as defined by the critical value.

Folded F Test
• We can make the two-tailed test for equal variances
into a right-tailed test, so it is easier to look up the
critical values in Appendix F. This method requires that
we put the larger observed variance in the numerator,
and then look up the critical value for α/2 instead of the
chosen α.

• The test statistic for the folded F test is
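A minimal sketch of the folded F calculation, assuming the statistic is the larger sample variance divided by the smaller, with the degrees of freedom following whichever sample lands in the numerator. The function name `folded_f` and the numbers are hypothetical.

```python
def folded_f(s1_sq, n1, s2_sq, n2):
    """Return (F_calc, df_numerator, df_denominator) with the larger variance on top."""
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, n1 - 1, n2 - 1
    return s2_sq / s1_sq, n2 - 1, n1 - 1


# Hypothetical sample variances and sizes: sample 2 has the larger variance.
f_calc, df_num, df_den = folded_f(s1_sq=12.0, n1=10, s2_sq=30.0, n2=16)
print(f_calc, df_num, df_den)
```

The result is always at least 1, so only the right-tail critical value for α/2 with (`df_num`, `df_den`) degrees of freedom is needed.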
