
STAT 2000 – Unit 2
Carrie Madden

Unit 2 – Inference for the Means of Two Populations

Paired Data
We will often encounter a situation where data are collected in pairs. In

such situations, we will usually be interested not in the individual
observations, but rather in the differences in values of some variable for
each pair.

Paired Data
The term paired data means that the data have been observed in natural
pairs. There are several ways in which paired data can occur:
Two different variables are measured for each individual. We are

interested in the difference between the values of the two variables.
Each individual is measured twice. The two measurements of the
same characteristic are made under different conditions (or at
different times).
Similar individuals are placed in pairs and each member of the pair
then receives a different treatment. The same response variable is
measured and compared for the two individuals in each pair.

Matched Pairs t Procedures
The matched pairs t procedures will be used to help us detect and

estimate any differences between responses to the two treatments. Rather
than making just one comparison for the variables of interest, we will make
one comparison for each of the n pairs.

Matched Pairs t Procedures
The parameter of interest is µd , the true mean of the differences of all

pairs in the population.
As such, in using the t procedures, our distributional assumption is that
the differences follow a normal distribution with mean µd and standard
deviation σd . We must also assume that our pairs represent an SRS of all
possible pairs from the population.
The methods for constructing confidence intervals and conducting
hypothesis tests in the matched pairs setting are the same as the
one-sample case, only now we must keep in mind that we are examining
differences rather than individual observations.

Example – Mileage
Many drivers buy premium gasoline instead of regular in the belief that
they will get better gas mileage. To test this belief, we obtain a sample of
8 cars. Each car is run for one tank on regular gas and one tank on
premium gas, with the order randomly determined. The mileage (in miles
per gallon) is measured for each car for each type of gasoline.

Example – Mileage (Cont’d)
The data are as follows:
Car   Regular   Premium   Difference (P − R)
1       20        24             4
2       26        26             0
3       23        25             2
4       22        23             1
5       23        25             2
6       19        18            −1
7       22        27             5
8       26        29             3

R-Code – Gas Mileage

Here is the R code to find x¯d and sd :
reg<-c(20,26,23,22,23,19,22,26)
prem<-c(24,26,25,23,25,18,27,29)
diff<- prem-reg
diff
## [1] 4 0 2 1 2 -1 5 3
mean(diff)
## [1] 2
sd(diff)
## [1] 2
Example – Gas Mileage
We will construct a 95% confidence interval for the true mean difference in
mileage between cars run with regular and premium gas.
To do this we assume the differences follow a normal distribution.
We estimate the population mean difference µd by the sample mean
difference x̄d and the population standard deviation σd by the sample
standard deviation of differences sd .

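A 95% confidence interval is:

$$\bar{x}_d \pm t^*_{n-1} \frac{s_d}{\sqrt{n}}$$

From the previous R code, $\bar{x}_d = 2$ and $s_d = 2$, and the critical value is $t^*_7 = 2.365$.
Therefore:

$$2 \pm (2.365)\left(\frac{2}{\sqrt{8}}\right) = 2 \pm 1.67 = (0.33, 3.67)$$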

Example – Gas Mileage – R Code
mean<-mean(diff)
sd<-sd(diff)
cv<-qt(0.975,7)
moe<-cv*(sd/sqrt(8))
left<-mean-moe
left
## [1] 0.3279582
right<-mean+moe
right
## [1] 3.672042

Interpretation
If we took repeated samples of eight cars and calculated the interval in a
similar manner, then 95% of all such intervals would contain the true
mean difference in fuel economy for premium and regular gasoline.
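
The "repeated samples" interpretation can be checked by simulation. The sketch below is only an illustration: it assumes a hypothetical population of differences that is normal with mean 2 and standard deviation 2 (values chosen for convenience, not estimated from the notes), draws many samples of 8 differences, and records how often the 95% interval captures the true mean.

```r
# Hypothetical population of differences (illustrative assumption only)
set.seed(2000)
mu_d    <- 2      # assumed true mean difference
sigma_d <- 2      # assumed true standard deviation of differences
n       <- 8      # pairs per sample
reps    <- 10000  # number of repeated samples

covers <- replicate(reps, {
  d <- rnorm(n, mean = mu_d, sd = sigma_d)
  moe_sim <- qt(0.975, df = n - 1) * sd(d) / sqrt(n)
  # TRUE when this sample's interval contains the true mean difference
  (mean(d) - moe_sim <= mu_d) && (mu_d <= mean(d) + moe_sim)
})

coverage <- mean(covers)  # proportion of intervals containing mu_d
coverage
```

The proportion should land close to 0.95, which is what the confidence level describes: a property of the procedure over many samples, not of any single interval.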
Note that we could have defined the differences the other way (i.e., Regular −
Premium). The signs of the differences (positive or negative) would simply
switch in this case, and the interval would give us the same information.
Now suppose we would like to test whether these data provide convincing
evidence that premium gasoline provides better fuel economy on average.
We will use the P-value method.

Example – Gas Mileage – Hypothesis Test
1 Let α = 0.05.
2 We are testing the hypotheses
H0 : µd = 0
Ha : µd > 0
– Note that we are testing whether the mean difference is greater
than zero. This is because if premium gasoline really is better, we
expect the average difference to be positive (since we defined the
differences as d = P − R). If we had defined the differences as
d = R − P, then this would be a lower-tailed test.

3 Reject H0 if the P-value ≤ α = 0.05.
4 The test statistic is

$$t = \frac{\bar{x}_d - \mu_{d0}}{s_d/\sqrt{n}} = \frac{2.0 - 0}{2.0/\sqrt{8}} = 2.83$$

5 The P-value is P(T(7) ≥ 2.83). We see from Table 3 that
P(T(7) ≥ 2.517) = 0.02 and P(T(7) ≥ 2.998) = 0.01.
Since 2.517 < t = 2.83 < 2.998, the P-value is between 0.01 and
0.02.
For any P-value between 0.01 and 0.02, we reject the null hypothesis.
6 We have sufficient evidence that premium gasoline provides better
fuel economy on average than regular gasoline.
The interpretation of the P-value is analogous to the one-sample case:

Interpretation
If there was no difference in average fuel economy for regular and premium
gasoline, the probability of getting a sample mean difference at least as
high as 2.0 mpg would be between 0.01 and 0.02.
Suppose we had instead used the critical value method to conduct the
test. The decision rule would be to reject H0 if t ≥ t ∗ = 1.895. Our
conclusion would still remain the same, as we would reject H0 since
t = 2.83 > t ∗ = 1.895.

Example – Gas Mileage – Hypothesis Test – R Code
t.test(diff, mu = 0, alternative = 'greater')
##
## One Sample t-test
##
## data: diff
## t = 2.8284, df = 7, p-value = 0.01273
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
## 0.6603306 Inf
## sample estimates:
## mean of x
## 2

Example – Gas Mileage – Critical Value
cv<-qt(0.05, 7, lower.tail = FALSE)

cv
## [1] 1.894579
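
As a cross-check, the same test can be run by passing the two mileage vectors to `t.test` with `paired = TRUE`, which forms the differences as first argument minus second. The sketch below also reverses the order to show the lower-tailed version mentioned earlier; the object names `one_sample`, `paired`, and `reversed` are just labels chosen for this illustration.

```r
reg  <- c(20, 26, 23, 22, 23, 19, 22, 26)
prem <- c(24, 26, 25, 23, 25, 18, 27, 29)

# One-sample test on the differences d = P - R (the approach used above)
one_sample <- t.test(prem - reg, mu = 0, alternative = "greater")

# Same test, letting R form the differences (prem - reg)
paired <- t.test(prem, reg, paired = TRUE, alternative = "greater")

# Reversing the order gives d = R - P, so the test becomes lower-tailed
reversed <- t.test(reg, prem, paired = TRUE, alternative = "less")

c(one_sample$statistic, paired$statistic, reversed$statistic)
c(one_sample$p.value, paired$p.value, reversed$p.value)
```

The first two calls give the same statistic and P-value; the reversed call flips the sign of t but leaves the P-value unchanged.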

Example – Pharmaceutical Company
A pharmaceutical company is testing a new blood pressure medication.

The systolic blood pressures of 20 patients suffering from hypertension are
measured before and after taking the medication and we calculate the
differences d = After − Before. We calculate the sample mean difference
to be x̄d = −10.9 and the sample standard deviation of differences to be
sd = 3.7. Researchers would like to conduct a hypothesis test to determine
whether the medication is successful in reducing systolic blood pressure.
We will assume that the differences follow a normal distribution. We will
use the critical value method.

1 Let α = 0.10.
2
H0 : µ d = 0
Ha : µ d < 0
3 Reject H0 if t ≤ −t∗ = −1.328, where −t∗ = −1.328 is the lower 0.10
critical value from the t distribution with 19 df.
4 The test statistic is

$$t = \frac{\bar{x}_d - \mu_{d0}}{s_d/\sqrt{n}} = \frac{-10.9 - 0}{3.7/\sqrt{20}} = -13.17$$

Since t = −13.17 < −t∗ = −1.328, we reject the null hypothesis.

5 We have sufficient evidence that the medication reduces the mean
blood pressure of patients with hypertension.
Suppose we had instead used the P-value method to conduct the test.
The decision rule would be to reject H0 if the P-value ≤ α = 0.10.

The P-value is P(T(19) ≤ −13.17) = P(T(19) ≥ 13.17) by the symmetry
of the t distributions. We see from Table 3 that
P(T(19) ≥ 3.883) = 0.0005.
Since 13.17 > 3.883, the P-value is less than 0.0005. Therefore, we would
reject H0.
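
When only summary statistics are available, `t.test` cannot be used directly, but the same quantities can be computed with `qt` and `pt`. The following is a minimal sketch for the blood pressure example; the variable names are chosen here for illustration.

```r
xbar_d <- -10.9  # sample mean of differences (After - Before)
s_d    <- 3.7    # sample standard deviation of differences
n      <- 20     # number of patients
alpha  <- 0.10

t_stat  <- (xbar_d - 0) / (s_d / sqrt(n))  # test statistic
crit    <- qt(alpha, df = n - 1)           # lower-tail critical value, -t*
p_value <- pt(t_stat, df = n - 1)          # lower-tailed P-value

t_stat
crit
p_value
t_stat <= crit    # critical value method: TRUE means reject H0
p_value <= alpha  # P-value method: TRUE means reject H0
```

Both decision rules lead to rejecting H0, in agreement with the hand calculations.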

Example – Psychology
Many psychology studies have examined the differences between the first
and second children in a family. Suppose we would like to compare their
academic performance. We examine the high school GPAs of a sample of
30 pairs of siblings and calculate the difference in GPAs,
d = first child − second child. We calculate an average difference of
x̄d = 0.12 and a standard deviation of sd = 0.31. We will assume that
the differences follow a normal distribution. Calculate a 99% confidence
interval for the true mean difference in GPAs.

A 99% confidence interval for the true mean difference in GPAs is

$$\bar{x}_d \pm t^* \frac{s_d}{\sqrt{n}} = 0.12 \pm 2.756 \left(\frac{0.31}{\sqrt{30}}\right) = (-0.036, 0.276)$$

Example – Psychology – R Code
xbard<-0.12
sd<-0.31
cv<-qt(0.005, 29, lower.tail = FALSE)
cv
## [1] 2.756386

Example – Psychology – R Code
se<-sd/sqrt(30)   # standard error of the mean difference
lower<-xbard-(cv*se)
upper<-xbard+(cv*se)
lower
## [1] -0.03600592
upper
## [1] 0.2760059
I printed out the critical value, upper and lower bound so you can see the
calculations.

We will now conduct a hypothesis test to determine whether there is
evidence of a difference in the average high school academic performance
between the first and second children in a family. We will use the P-value
method.
1 Let α = 0.01.
2
H0 : µ d = 0
Ha : µd ̸= 0
3 Reject H0 if the P-value ≤ α = 0.01.
4 The test statistic is

$$t = \frac{\bar{x}_d - \mu_{d0}}{s_d/\sqrt{n}} = \frac{0.12 - 0}{0.31/\sqrt{30}} = 2.12$$

5 The P-value is 2P(T(29) ≥ 2.12). We see from Table 3 that
P(T(29) ≥ 2.045) = 0.025 and P(T(29) ≥ 2.150) = 0.02.
Since 2.045 < t = 2.12 < 2.150, P(T(29) ≥ 2.12) is between 0.02
and 0.025, so the P-value is between 2(0.02) = 0.04 and
2(0.025) = 0.05.
For any P-value between 0.04 and 0.05, we fail to reject the null
hypothesis.
6 We have insufficient evidence that there is a difference in average
high school academic performance between the first and second
children in a family.

Suppose we had instead used the critical value method to conduct the
test. The decision rule would be to reject H0 if |t| ≥ t∗ = 2.756, i.e., if
t ≥ 2.756 or t ≤ −2.756, where t∗ = 2.756 is the upper 0.005 critical
value for the t distribution with 29 df.
Our conclusion would be the same as we fail to reject H0 , since
−2.756 < t = 2.12 < 2.756.

Since this was a two-sided test with a 1% level of significance, we could

have used the 99% confidence interval to perform the test.
Since µd0 = 0 is contained in the 99% confidence interval for µd , we fail
to reject H0 at the 1% level of significance.
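
The agreement between the two-sided test and the confidence interval can be verified numerically. The sketch below recomputes the sibling GPA example from its summary statistics and compares the two decisions; `reject_by_p` and `reject_by_ci` are names introduced here for illustration.

```r
xbar_d <- 0.12  # sample mean difference in GPA
s_d    <- 0.31  # sample standard deviation of differences
n      <- 30    # number of sibling pairs
alpha  <- 0.01

se      <- s_d / sqrt(n)
t_stat  <- (xbar_d - 0) / se
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)  # two-sided

t_star <- qt(1 - alpha / 2, df = n - 1)
ci     <- xbar_d + c(-1, 1) * t_star * se  # 99% confidence interval

reject_by_p  <- p_value <= alpha
reject_by_ci <- (0 < ci[1]) || (0 > ci[2])  # TRUE when 0 lies outside the CI

p_value
ci
c(reject_by_p, reject_by_ci)
```

The two decisions match because the interval uses confidence level 1 − α and the test is two-sided.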

Practice Question
The left-hand and right-hand gripping strengths of six
left-handed people are measured. We will conduct a hypothesis test to
determine whether left-handed people have a greater gripping strength in
their left hand than their right hand on average. Using the critical value
approach, with a 5% level of significance, the null hypothesis should be
rejected if:
1 t > 1.796
2 t < −2.571 or t > 2.571
3 t > 1.943
4 t < −2.201 or t > 2.201
5 t > 2.015

Practice Question
Some information that may be helpful is shown below:
             Left    Right    Difference (L − R)
mean         109     104      5
std. dev.    15      13       6
The value of the appropriate test statistic is:

1 1.57
2 2.04
3 4.26
4 0.83
5 3.11

Practice Question Cont’d
Which of the following statements is/are true?

1 For each person, left-hand and right-hand gripping strength are
dependent.
2 For each person, left-hand and right-hand gripping strength are
independent.
3 In order to conduct a matched pairs t test, we must assume that
left-hand and right-hand gripping strengths are both normally
distributed.
4 In order to conduct a matched pairs t test, we must assume that the
differences in left-hand and right-hand gripping strength are normally
distributed.

Practice Question Cont’d
1 2
2 1 and 3
3 1 and 4
4 2 and 3
5 2 and 4

Comparing Means of Two Independent Samples
In a matched pairs setting, the observations in each pair are dependent.

We now turn our attention to the case of comparing two population means
using independent samples.

Comparing Means of Two Independent Samples
We will consider the difference between the means for Population 1 and
Population 2
µ1 − µ2

Comparing Means of Independent Samples
“How is this different from the matched pairs case?”
In the matched pairs case, samples were dependent. It made sense to

examine each pair of individuals separately before making an overall
comparison. In the two sample case, observations are not in pairs. The
first observation from the first group in the sample is not related to the
first observation from the second group any more than it is to any other
observation in the second group.
As such, it makes no sense to look at individual differences. Rather than
examining the mean difference (as we did in the matched pairs t
procedures), we examine the difference in means.

Two-Sample t Procedures
The values of the population standard deviations are usually unknown

(since we don’t have information about the entire population). In this case,
we estimate the population standard deviations σ1 and σ2 by the sample
standard deviations s1 and s2 respectively. Analogous to the one-sample
case, we substitute these values into the formula for the two-sample case.

The result is the two-sample t statistic:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

The inference procedures we will use will differ slightly, depending on

whether the population standard deviations σ1 and σ2 are equal or not.

The Case of Unequal Population Standard Deviations

When the population standard deviations σ1 and σ2 are not equal, the
two-sample t statistic

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

does not have an exact t distribution.


A t distribution replaces the standard normal distribution only when a
single standard deviation σ is replaced by the sample standard deviation s.
In this case, we replace two standard deviations σ1 and σ2 by their
estimates s1 and s2 , which does not produce a statistic with an exact t
distribution.


We can, however, approximate the distribution of this two-sample t
statistic by using a t distribution with k degrees of freedom, where the
degrees of freedom k is approximated by
df = k = min{n1 − 1, n2 − 1} (1)


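R uses a different method to calculate degrees of freedom, known as the
Satterthwaite approximation:

$$df = k = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$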


We take a simple random sample of size n1 from Population 1 and an
independent simple random sample of size n2 from Population 2, where
the population standard deviations σ1 and σ2 are unknown and unequal. A
100(1 − α)% confidence interval for µ1 − µ2 is

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where t∗ is the upper α/2 critical value from the t distribution with degrees
of freedom k calculated using the formula shown in equation 1.


“How do we know whether the population standard deviations σ1 and σ2
are equal or not?”
We can never be sure, because we don’t have information about the whole
population.

There are formal hypothesis tests to assess the equality of variances for two
populations. However, in this course, we will use the following simple rule:
Rule of Thumb
We divide the higher of the two sample standard deviations by the lower:

$$\frac{\max s_i}{\min s_i} \quad (i = 1, 2)$$

If this quantity is:
2 or less, we can assume equal population standard deviations,
σ1 = σ2 .
greater than 2, we must assume that the population standard
deviations differ, σ1 ̸= σ2 .


In other words, if the higher of the two sample standard deviations is no
more than double the value of the lower sample standard deviation, we will
assume the population standard deviations are equal.
If the higher of the two sample standard deviations is more than double
the value of the lower sample standard deviation, we will assume the
population standard deviations are unequal.
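
The rule of thumb is easy to encode. In the sketch below, `assume_equal_sd` is a hypothetical helper written for these notes (it is not a built-in R function); the sample standard deviations passed to it are the ones that appear in the driving and gas price examples later in this unit.

```r
# Hypothetical helper: TRUE when the rule of thumb lets us assume
# equal population standard deviations (ratio of 2 or less)
assume_equal_sd <- function(s1, s2) {
  max(s1, s2) / min(s1, s2) <= 2
}

assume_equal_sd(10, 7)       # driving example: ratio 1.43, pooled method
assume_equal_sd(1.09, 2.36)  # gas price example: ratio 2.17, unequal method
```

A result of `TRUE` points to the pooled method; `FALSE` points to the method for unequal standard deviations.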

Inference for the Means of Two Populations – Example
We would like to estimate the difference in the mean gasoline price in
Winnipeg and Calgary.
The gasoline prices (in cents/litre) for a random sample of 8 Winnipeg gas
stations and 5 Calgary gas stations are recorded one day and are shown
below:
Winnipeg 119.9 122.4 121.7 120.9 121.0 122.9 119.9 121.7

Calgary 117.9 120.4 118.4 122.9 117.0

Example
Let X1 be the price of gas at a Winnipeg gas station and let X2 be the
price of gas for a Calgary gas station. From the data, we calculate
n1 = 8 x̄1 = 121.30 s1 = 1.09

n2 = 5 x̄2 = 119.32 s2 = 2.36
We see that

$$\frac{\max s_i}{\min s_i} = \frac{s_2}{s_1} = \frac{2.36}{1.09} = 2.17 > 2$$

so we will conduct our inference using the method for unequal standard
deviations. We must assume that gas prices in both cities follow a normal
distribution.

Example
We estimate the degrees of freedom k to be:

df = k = min{n1 − 1, n2 − 1} = min{7, 4} = 4

Example – Cont’d
A 95% confidence interval for the difference in mean gas prices for the two
cities, µ1 − µ2, is

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (121.30 - 119.32) \pm 2.776 \sqrt{\frac{(1.09)^2}{8} + \frac{(2.36)^2}{5}} = 1.98 \pm 3.12 = (-1.14, 5.10)$$

where t∗ = 2.776 is the upper 0.025 critical value of the t distribution
with 4 df.
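
The hand calculation can be reproduced in R from the raw data. This sketch deliberately uses the conservative degrees of freedom, min{n1 − 1, n2 − 1}, rather than the value R's `t.test` would choose, so that it matches the interval above; the variable names are illustrative.

```r
Wpg <- c(119.9, 122.4, 121.7, 120.9, 121.0, 122.9, 119.9, 121.7)
Cgy <- c(117.9, 120.4, 118.4, 122.9, 117.0)

n1 <- length(Wpg)
n2 <- length(Cgy)

k      <- min(n1 - 1, n2 - 1)                  # conservative df
se     <- sqrt(var(Wpg) / n1 + var(Cgy) / n2)  # unpooled standard error
t_star <- qt(0.975, df = k)                    # upper 0.025 critical value

ci <- (mean(Wpg) - mean(Cgy)) + c(-1, 1) * t_star * se
k
t_star
ci
```

Small differences in the last decimal place relative to the hand calculation come from rounding s1 and s2 before squaring.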

Example (Cont’d)
Interpretation
If we were to take repeated samples of 8 Winnipeg gas stations and 5
Calgary gas stations and calculate the interval in a similar manner, then
95% of such intervals would contain the true difference in mean gas prices
for the two cities.


Now suppose we would like to conduct a hypothesis test comparing the
means for Population 1 and 2. We will be testing the null hypothesis
H0 : µ1 = µ2 (or equivalently, H0 : µ1 − µ2 = 0).
Our test statistic is:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Under the null hypothesis, this test statistic has an approximate t
distribution with degrees of freedom calculated from the approximation
formula.

Example - Cont’d
Suppose we would like to test, at the 5% level of significance, whether

there is a difference in mean gas prices for the two cities. We will use the
P-value method.
1 Let α = 0.05.
2 We are testing the hypotheses
H0 : µ 1 = µ 2
Ha : µ1 ̸= µ2
3 We will reject H0 if the P-value ≤ α = 0.05.
4 The test statistic is

$$t = \frac{121.30 - 119.32}{\sqrt{\dfrac{(1.09)^2}{8} + \dfrac{(2.36)^2}{5}}} = 1.76$$

5 The P-value is 2P(T(4) ≥ 1.76). We see from Table 3 that
P(T(4) ≥ 1.533) = 0.10 and P(T(4) ≥ 2.132) = 0.05.
Since t = 1.76 is between these two values, it follows that
P(T(4) ≥ 1.76) is between 0.05 and 0.10. But our P-value is double
this probability, and so the P-value of the test is between 0.10 and
0.20.

Since the P-value > α = 0.05, we fail to reject H0 .

6 We have insufficient evidence to conclude that there is a difference in
mean gas prices for Winnipeg and Calgary.
Interpretation
If the true mean gas prices were the same for the two cities, the probability
of observing a difference in sample means at least as extreme as 1.98
would be between 0.10 and 0.20.

Suppose we had instead used the critical value method to conduct the
test. The decision rule would be to reject H0 if |t| ≥ 2.776 (i.e., if
t ≤ −2.776 or t ≥ 2.776), where t∗ = 2.776 is the upper 0.025 critical
value for the t distribution with 4 df.
Our conclusion would be the same as we fail to reject H0 , since
−t ∗ = −2.776 < t = 1.76 < t ∗ = 2.776.

Since this is a two-sided test, and since the confidence level of our
confidence interval and the significance level of our test add up to 1, we
could have used the 95% confidence interval to conduct this test.
Since the 95% confidence interval for µ1 − µ2 contains the value of 0, we
fail to reject H0 at the 5% level of significance.


By estimating our degrees of freedom as the smallest of n1 − 1 and n2 − 1,
we are taking a conservative approach in the sense that:
Confidence intervals we calculate will be slightly wider than the
confidence intervals we would obtain using the exact distribution (and
so our confidence level will be slightly higher than the specified value).
Critical values and P-values will be slightly higher than those we
would obtain using the exact distribution. If we reject H0 using the
estimated degrees of freedom, we would definitely reject H0 using the
exact distribution.

The Case of Unequal Population Standard Deviations
Example – R-Code
We would like to estimate the difference in the mean gasoline price in
Winnipeg and Calgary.
The gasoline prices (in cents/litre) for a random sample of 8 Winnipeg gas
stations and 5 Calgary gas stations are recorded one day and are shown
below:
Winnipeg 119.9 122.4 121.7 120.9 121.0 122.9 119.9 121.7

Calgary 117.9 120.4 118.4 122.9 117.0

Example – R-Code
First, I will create the two vectors in R:
Wpg<-c(119.9, 122.4, 121.7, 120.9, 121.0, 122.9, 119.9, 121.7)

Cgy<-c(117.9, 120.4, 118.4, 122.9, 117.0)

Example – R-Code
Hypothesis test and confidence interval for a two sided test. The degrees
of freedom are not found by min{n1 − 1, n2 − 1}
t.test(Wpg, Cgy, alternative = "two.sided", mu = 0,

paired = FALSE, var.equal=FALSE, conf.level = 0.95)
##
## Welch Two Sample t-test
##
## data: Wpg and Cgy
## t = 1.7647, df = 5.081, p-value = 0.137
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.89043 4.85043
## sample estimates:
## mean of x mean of y
## 121.30 119.32
Example – R Code – Output
From the previous slide, we notice:
t = 1.7647, which matches what we got when we did the hand
calculations.
df = 5.081 – this differs from the df = 4 we used by hand because
R uses the Satterthwaite approximation instead of min{n1 − 1, n2 − 1}.
This affects the t∗ value used in confidence intervals as well.
The p-value is 0.137, which falls within the range we found by hand
(between 0.10 and 0.20). R reports a single p-value based on its degrees
of freedom, rather than a range read from the table.
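
To see where R's degrees of freedom come from, the Satterthwaite formula can be coded directly. In the sketch below, `welch_df` is a hypothetical helper written for illustration (it is not part of base R); it is applied to the gas price data and compared with the conservative value.

```r
# Hypothetical helper implementing the Satterthwaite approximation
welch_df <- function(s1, n1, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}

Wpg <- c(119.9, 122.4, 121.7, 120.9, 121.0, 122.9, 119.9, 121.7)
Cgy <- c(117.9, 120.4, 118.4, 122.9, 117.0)

df_welch        <- welch_df(sd(Wpg), length(Wpg), sd(Cgy), length(Cgy))
df_conservative <- min(length(Wpg) - 1, length(Cgy) - 1)

df_welch                     # matches the df reported by t.test
qt(0.975, df_conservative)   # critical value used by hand
qt(0.975, df_welch)          # critical value used by R
```

The conservative degrees of freedom give the larger critical value, which is why the hand-calculated interval is wider than the one R reports.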

Practice Question
Carbon monoxide amounts, which are approximately normally distributed,

were measured for randomly selected filtered and non-filtered king-size
cigarettes to determine if filtering reduces the amount of carbon monoxide.
Let µF and µN represent the mean carbon monoxide levels for the filtered
and non-filtered cigarettes, respectively. Suppose we wish to test the claim
that the filters reduce the amount of carbon monoxide in cigarettes. The
appropriate hypotheses are:
A H0 : µF = µN vs. Ha : µF ̸= µN
B H0 : X̄F = X̄N vs. Ha : X̄F > X̄N
C H0 : µF = µN vs. Ha : µF > µN
D H0 : X̄F = X̄N vs. Ha : X̄F < X̄N
E H0 : µF = µN vs. Ha : µF < µN .

The Case of Equal Population Standard Deviations
When the population standard deviations σ1 and σ2 are equal, the

two-sample t statistic will have an exact t distribution. In this case, we
estimate the common variance σ² (= σ1² = σ2²) by the quantity:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

where $s_p^2$ is called the pooled sample variance of X1 and X2. Note that the
pooled variance is just a weighted average of the sample variances of X1 and X2,
where the weights are the respective degrees of freedom.

We take a simple random sample of size n1 from Population 1 and an

independent simple random sample of size n2 from Population 2, where
the population standard deviations σ1 and σ2 are unknown and equal. A
100(1 − α)% confidence interval for µ1 − µ2 is

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where t∗ is the upper α/2 critical value from the t distribution with
n1 + n2 − 2 degrees of freedom.

Now suppose we would like to conduct a hypothesis test comparing the

means from Population 1 and 2. We will be testing the null hypothesis
H0 : µ1 = µ2 (or equivalently, H0 : µ1 − µ2 = 0).
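Our test statistic is

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
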
Under the null hypothesis, the test statistic has a t distribution with
n1 + n2 − 2 degrees of freedom.

Example – Driving
A company would like to know whether men drive faster on average than women. The
company took a random sample of 52 cars driven by men on a highway
and found the mean speed to be 114 km/h with a standard deviation of 10
km/h. Another sample of 30 cars driven by women on the same highway
gave a mean speed of 108 km/h with a standard deviation of 7 km/h.

Example – Driving
Let us construct a 98% confidence interval for the true difference between
the mean speeds of cars driven by men and women on this highway. Let
X1 be the speed of a car driven by a man and let X2 be the speed of a car
driven by a woman. Since

$$\frac{\max s_i}{\min s_i} = \frac{10}{7} = 1.43 < 2$$

we will use the pooled method, assuming equal population standard
deviations.

Example – Driving
First, we calculate the pooled sample variance:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{51(10)^2 + 29(7)^2}{52 + 30 - 2} = \frac{6521}{80} = 81.51$$

Example – Driving
The 98% confidence interval for µ1 − µ2 is:

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (114 - 108) \pm 2.374 \sqrt{81.51 \left(\frac{1}{52} + \frac{1}{30}\right)} = 6 \pm 4.91 = (1.09, 10.91)$$

where t∗ = 2.374 is the upper 0.01 critical value for the t distribution with
52 + 30 − 2 = 80 df.
Interpretation
If we took repeated samples of the same sizes from the same populations
and constructed the interval in a similar manner, then 98% of such
intervals would contain the true difference in mean speeds for men
and women on this highway.
Example – Driving
We now conduct the hypothesis test to determine whether there is
sufficient evidence that the average speed for men on the highway is
greater than that for women. We will use the P-value method.
1 Let α = 0.01.
2 We are testing the hypotheses
H0 : µ 1 = µ 2
Ha : µ1 > µ2
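3 We will reject H0 if the P-value ≤ α = 0.01.
4 The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{114 - 108}{\sqrt{81.51 \left(\dfrac{1}{52} + \dfrac{1}{30}\right)}} = 2.90$$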

Example – Driving

5 The P-value is P(T(80) ≥ 2.90). We see from Table 3 that
P(T(80) ≥ 2.887) = 0.0025 and P(T(80) ≥ 3.195) = 0.001.
Since 2.887 < t = 2.90 < 3.195, it follows that the P-value is
between 0.001 and 0.0025.
Since the P-value < α = 0.01, we reject the null hypothesis.
6 We have sufficient evidence to conclude that the true mean speed of
male drivers on this highway is greater than that for females.
Interpretation
If the true mean speeds of men and women on this highway were equal,
the probability of observing a difference in sample means at least as high
as 6km/h would be between 0.001 and 0.0025.

Example – Driving
Suppose we had instead used the critical value method to conduct the
test. The decision rule would be to reject H0 if t ≥ t∗ = 2.374, where
t ∗ = 2.374 is the upper 0.01 critical value from the t distribution with 80
degrees of freedom.
Our decision would be to reject H0 , since t = 2.90 > t ∗ = 2.374.

Example – Driving (R Code)
We use p = 0.99 because the entire area to the left of t∗ is
0.98 + 0.01 = 0.99.
The R function is qt(p, df):
cv<-qt(0.99, 80)
cv
## [1] 2.373868
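
Because the driving example provides only summary statistics, the pooled calculations can be carried out directly from the formulas. The following is a minimal sketch of that arithmetic; the variable names are chosen here for illustration.

```r
n1 <- 52; xbar1 <- 114; s1 <- 10  # men
n2 <- 30; xbar2 <- 108; s2 <- 7   # women

df  <- n1 + n2 - 2
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df  # pooled sample variance
se  <- sqrt(sp2 * (1 / n1 + 1 / n2))             # pooled standard error

ci      <- (xbar1 - xbar2) + c(-1, 1) * qt(0.99, df) * se  # 98% CI
t_stat  <- (xbar1 - xbar2) / se
p_value <- pt(t_stat, df, lower.tail = FALSE)    # upper-tailed P-value

sp2
ci
t_stat
p_value
```

The P-value computed this way is a single number that falls inside the range found from the table.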

Practice Question
A researcher is conducting an experiment in order to compare two drugs –
a new drug and an old drug. She would like to determine whether there is
evidence that the new drug is better than the old drug. In this experiment,
the researcher will commit a Type I Error if:
A she concludes that the drugs are equal in effectiveness when in fact
the new drug is better.
B she concludes that the drugs are equal in effectiveness when in fact
the old drug is better.
C she concludes that the old drug is better when in fact the new drug is
better.
D she concludes that the new drug is better when in fact the drugs are
equal in effectiveness.
E she concludes that the old drug is better when in fact the drugs are
equal in effectiveness.
Practice Question
The following are the percentages of alcohol found in samples of two
brands of beer:
Brand A 5.4 5.6 5.7 5.2

Brand B 4.9 5.3 5.5 5.2
We calculate s1 = 0.22 and s2 = 0.25. We are interested in testing the
hypothesis that the average percentages of alcohol in the two brands of
beer are the same. We should use:
A a matched pairs t test with 3 degrees of freedom.
B a matched pairs t test with 6 degrees of freedom.
C a two-sample t test with 3 degrees of freedom.
D a two-sample t test with 4 degrees of freedom.
E a two-sample t test with 6 degrees of freedom.
Example – Air Pollution
An engineer suspects that a power plant in the city is increasing the air
pollution in the vicinity of the plant. A sample of measurements of carbon
monoxide (in mg/m3 of air) in the vicinity of the plant is obtained for
comparison with another sample obtained from other locations in the city.
The data are as follows:
Near plant:        40 16 44 47 26 64 35 53 45 52 31 38 44 29 45 47
Other locations:   34 28 36 43 35 37 36 39 32 25 37 40 44 38
Concentrations near the plant and at other locations in the city are both
assumed to follow normal distributions. We would like to test the
suspicion of the engineer at the 5% level of significance.

Solution
Let X1 be the concentration of carbon monoxide at a location near the
plant and let X2 be the concentration of carbon monoxide at another
location in the city. We calculate
n1 = 16 x̄1 = 41 s1 = 11.71
n2 = 14 x̄2 = 36 s2 = 5.19
Since

$$\frac{\max s_i}{\min s_i} = \frac{11.71}{5.19} = 2.26 > 2$$

we will use the method for unequal standard deviations. We will use the
critical value method.

Solution
1 Let α = 0.05.
2 We are testing the hypotheses
H0 : µ 1 = µ 2
Ha : µ1 > µ2
3 We estimate the degrees of freedom, k
df = k = min{n1 − 1, n2 − 1}
= min{15, 13} = 13

Example – Air Pollution R Code
R calculates the degrees of freedom for unequal standard deviations using
something known as the Satterthwaite approximation. You will see this in the
R code following the example:

$$df = k = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$

Example – Air Pollution (Cont’d)
Solution
4 We will reject H0 if t ≥ 1.771, where t∗ = 1.771 is the upper 0.05 critical
value from the t distribution with 13 degrees of freedom.
5 The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{41 - 36}{\sqrt{\dfrac{(11.71)^2}{16} + \dfrac{(5.19)^2}{14}}} = 1.54$$

Since t = 1.54 < t∗ = 1.771, we fail to reject H0.

6 We have insufficient evidence to conclude that the mean carbon
monoxide concentration near the plant is greater than the mean
concentration at other locations.

Example – Air Pollution R Code (Cont’d)

np<- c(40,16,44,47,26,64,35,53,45,52, 31, 38, 44, 29, 45, 47)
other<- c(34,28,36,43,35,37,36,39,32,25,37,40,44,38)
t.test(np, other, paired=FALSE, var.equal = FALSE,
alternative = "greater")
##
## Welch Two Sample t-test
##
## data: np and other
## t = 1.5438, df = 21.255, p-value = 0.0687
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## -0.570045 Inf
## sample estimates:
## mean of x mean of y
## 41 36
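
The conservative nature of the hand method shows up when the two P-values are compared. The sketch below takes the test statistic from `t.test` and recomputes the P-value with the conservative degrees of freedom; `p_conservative` is a name introduced here for illustration.

```r
np    <- c(40, 16, 44, 47, 26, 64, 35, 53, 45, 52, 31, 38, 44, 29, 45, 47)
other <- c(34, 28, 36, 43, 35, 37, 36, 39, 32, 25, 37, 40, 44, 38)

welch  <- t.test(np, other, var.equal = FALSE, alternative = "greater")
t_stat <- unname(welch$statistic)

# Conservative df used in the hand calculation
k <- min(length(np) - 1, length(other) - 1)
p_conservative <- pt(t_stat, df = k, lower.tail = FALSE)

c(df_hand = k, df_welch = unname(welch$parameter))
c(p_hand = p_conservative, p_welch = welch$p.value)
```

The P-value based on 13 degrees of freedom is somewhat larger than the one R reports, and both exceed α = 0.05, so the conclusion is the same either way.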
Practice Question
We wish to compare the carbohydrate content of two varieties of peaches
used for baby food. We take random samples of peaches of each variety.
The carbohydrate contents are measured and some summary statistics are
shown below:
Variety n mean std. dev
A 5 28.6 3.8
B 7 25.0 10.4
Assuming normally distributed populations, the value of the test statistic

to test the hypothesis that there is no difference in mean carbohydrate
content for the two varieties of peaches is:
A 0.84
B 2.40
C 0.73
D 0.79
E 2.31
Example – Memory
Nineteen people volunteer to participate in a memory experiment. The

volunteers are randomly divided into two groups. Group 1 (10 volunteers)
will be given a list of words to study and Group 2 (9 volunteers) will listen
to a reading of the same list of words. After half an hour, the subjects will
be asked to write down all words they can remember. The scores for each
of the subjects are shown below:
Group 1 52 71 43 39 42 63 50 48 47 35
Group 2 30 28 53 44 31 32 27 54 43
We would like to determine whether people’s memory differs depending on

whether they have read something or heard something. We will assume
scores follow a normal distribution for both groups.

Example – Memory
From the data, we calculate
x̄1 = 49 s1 = 10.93
x̄2 = 38 s2 = 10.68
Since

$$\frac{\max s_i}{\min s_i} = \frac{10.93}{10.68} = 1.02 < 2$$

we will use the pooled method, assuming equal standard deviations.
First, we calculate the pooled sample variance:

$$s_p^2 = \frac{9(10.93)^2 + 8(10.68)^2}{10 + 9 - 2} = \frac{1987.68}{17} = 116.92$$

Example – Memory
A 95% confidence interval for the difference in mean scores for the two
methods, µ1 − µ2, is

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (49 - 38) \pm 2.110 \sqrt{116.92 \left(\frac{1}{10} + \frac{1}{9}\right)} = 11 \pm 10.48 = (0.52, 21.48)$$

Example – Memory
1 Let α = 0.05.
2 We are testing the hypotheses:
H0 : µ 1 = µ 2
Ha : µ1 ̸= µ2
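3 We will reject H0 if |t| ≥ 2.110 (i.e., if t ≥ 2.110 or t ≤ −2.110),
where t∗ = 2.110 is the upper 0.025 critical value for the t
distribution with 17 df.
4 The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{49 - 38}{\sqrt{116.92 \left(\dfrac{1}{10} + \dfrac{1}{9}\right)}} = 2.21$$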

Example – Memory
Since t = 2.21 > t ∗ = 2.110, we reject H0 .

5 We have sufficient evidence to conclude that the true mean score for
the two methods differs.

Example – Memory
Suppose we had instead used the P-value method to conduct the test.
The decision rule would be to reject H0 if the P-value ≤ α = 0.05.

The P-value of the test is 2P(T(17) ≥ 2.21). We see from Table 3 that
P(T(17) ≥ 2.110) = 0.025 and P(T(17) ≥ 2.224) = 0.02.
It follows that P(T(17) ≥ 2.21) is between 0.02 and 0.025, but our
P-value is double this probability, so our P-value is between 0.04 and 0.05.
Our conclusion would remain the same, as we reject H0 since the P-value
is less than α = 0.05.

Example – Memory
Since this is a two-sided test, and since the confidence level of our
confidence interval and the significance level of our test add up to 1, we
could have used the 95% confidence interval to conduct the test. Since
the 95% confidence interval for µ1 − µ2 does not contain the value 0, we
reject H0 at the 5% level of significance.

Example – Memory – R-Code
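The first thing we can double-check is that
\[ \frac{\max s_i}{\min s_i} = \frac{10.93}{10.68} = 1.02 < 2. \]
The output on the next slide reports the ratio of the sample variances,
1.0487; its square root, 1.02, is the ratio of the standard deviations.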


g1 <- c(52, 71, 43, 39, 42, 63, 50, 48, 47, 35)
g2 <- c(30, 28, 53, 44, 31, 32, 27, 54, 43)
var.test(g1, g2, alternative = "two.sided")
##
## F test to compare two variances
##
## data: g1 and g2
## F = 1.0487, num df = 9, denom df = 8, p-value = 0.9573
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.2406878 4.3018561
## sample estimates:
## ratio of variances
## 1.048733

So now we know we have to conduct our inference using the pooled
method
t.test(g1, g2, paired = FALSE, var.equal = TRUE, conf.level = 0.95)
##
## Two Sample t-test
##
## data: g1 and g2
## t = 2.2139, df = 17, p-value = 0.04079
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.5170438 21.4829562
## sample estimates:
## mean of x mean of y
## 49 38
You can see the 95% confidence interval is the same as the one we got by
hand. You can also see that the p-value, 0.04079, is within the range
(0.04 to 0.05) that we found using the tables.

Practice Question
Which of the following conditions is NOT required for a pooled
two-sample t test to be valid?
A We have two independent random samples.
B The population variances must be known.
C The population variances must be equal.
D The populations must be normally distributed.
E The distributions of the variables we are comparing must both be
normally distributed.

Practice Question
In a survey of gas stations across Canada done on July 16, 2002, the price
(in cents/litre) of regular gasoline was recorded from 10 randomly selected
gas stations in Western Canada, and 13 in Eastern Canada. We will
assume that gas prices in both Western and Eastern Canada are normally
distributed. Some summary statistics are shown below:
Region    n   mean   std. dev.
Western   10  71.71  4.25
Eastern   13  70.89  3.50
We conduct a hypothesis test to determine whether there is evidence that
the true mean gas price in the West was greater than in the East. With a
5% level of significance, the decision rule is to reject H0 if:

Practice Question – Cont’d
A t > 1.782
B t < −2.262 or t > 2.262
C t > 1.721
D t > 2.833
E t < −2.080 or t > 2.080

Robustness of the Two-Sample t Procedures
The two-sample t procedures are more robust in the case of non-normal
distributions than the one-sample t procedures. If the two sample sizes are
equal and the distributions of the two populations have similar shapes, our
P-values will be quite accurate even when the sample sizes are low. When
the two population distributions have different shapes, larger samples are
needed.
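One way to get a feel for this robustness claim is a small simulation. The sketch below is an illustration only and is not part of the course material: it assumes both populations are exponential (strongly right-skewed) with the same mean, so H0 is true, and it uses an arbitrary seed, replication count, and sample size.

```r
set.seed(2000)   # arbitrary seed, used only for reproducibility
n_reps <- 5000   # number of simulated experiments (assumed)
n <- 10          # equal, small sample size in each group (assumed)

# Both samples come from the same skewed population, so H0: mu1 = mu2
# holds and every rejection at the 5% level is a Type I error.
p_values <- replicate(n_reps, {
  x <- rexp(n, rate = 1)
  y <- rexp(n, rate = 1)
  t.test(x, y, var.equal = TRUE)$p.value
})

# Proportion of simulated tests that reject H0 at alpha = 0.05
type1_rate <- mean(p_values <= 0.05)
```

If the procedure is robust in this setting, type1_rate should land close to the nominal 0.05 even though neither population is normal. Drawing x and y from differently shaped distributions with equal means is a natural follow-up for exploring the second half of the claim.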
