Professional Documents
Culture Documents
Estimation and Hypothesis Testing: Two Populations: Prem Mann, Introductory Statistics, 7/E
Estimation and Hypothesis Testing: Two Populations: Prem Mann, Introductory Statistics, 7/E
ESTIMATION AND
HYPOTHESIS TESTING:
TWO POPULATIONS
Definition
Two samples drawn from two populations
are independent if the selection of one
sample from one population does not
affect the selection of the second sample
from the second population. Otherwise,
the samples are dependent.
2 2
x1 x2 1 2 and x1 x 2 1 2
n1 n2
( x1 x 2 ) z x1 x2
The value of z is obtained from the normal
distribution table for the given confidence
level. The value of x1 x2 is calculated as
explained earlier. Here x1 x 2 is the point
estimator of μ1 – μ2
(a)
Point estimate of μ1 – μ2 = x 1 x 2
= $10,235 - $9342
= $893
(b)
2
2
(2800) 2
(2500) 2
x1 x 2 1
2
n1 n2 1200 1400
$104.8695335
x1 x 2
The value of μ1 – μ2 is substituted from H0.
The value of x1 x2 is calculated as earlier in
this section.
Step 1:
H0: μ1 – μ2 = 0 (The two population
means are not different.)
H1: μ1 – μ2 ≠ 0 (The two population
means are different.)
Step 2:
Population standard deviations, σ1 and σ2
are known
Both samples are large; n1 > 30 and
n2 > 30
Therefore, we use the normal distribution
to perform the hypothesis test
Step 3:
α = .01.
The ≠ sign in the alternative hypothesis
indicates that the test is two-tailed
Area in each tail = α / 2 = .01 / 2 = .005
12 22 (2800)2 (2500)2
x1 x 2 $104.8695335
n1 n2 1200 1400
( x1 x 2 ) ( 1 2 ) (10,235 9342) 0
z 8.52
x1 x 2 104.8695335
From H0
Interval Estimation of μ1 – μ2
Hypothesis Testing About μ1 – μ2
1 1
s x1 x 2 s p
n1 n2
Step 1:
H0: μ1 – μ2 = 0 (The mean numbers of
calories are not different.)
H1: μ1 – μ2 ≠ 0 (The mean numbers of
calories are different.)
Step 2:
The two samples are independent
σ1 and σ2 are unknown but equal
The sample sizes are small but both
populations are normally distributed
We will use the t distribution
Step 3:
The ≠ sign in the alternative hypothesis
indicates that the test is two-tailed.
α = .01.
Area in each tail = α / 2 = .01 / 2 = .005
df = n1 + n2 – 2 = 14 + 16 – 2 = 28
Critical values of t are -2.763 and 2.763.
Step 1:
H0: μ1 – μ2 = 0
H1: μ1 – μ2 > 0
Step 2:
The two samples are independent
Standard deviations of the two populations
are unknown but assumed to be equal
Both samples are large
We use the t distribution to make the test
Step 3:
α = .025
Area in the right tail of the t distribution =
α = .025
df = n1 + n2 – 2 = 40 + 35 – 2 = 73
Critical value of t is 1.993
Step 4:
Step 5:
The value of the test statistic t = 5.048
It falls in the rejection region
Therefore, we reject the null hypothesis H0
Hence, we conclude that children in New
York State spend more time, on average,
watching TV than children in California.
Interval Estimation of μ1 – μ2
Hypothesis Testing About μ1 – μ2
Degrees of Freedom If
The two samples are independent
The standard deviations σ1 and σ2 of are
unknown and unequal
At least one of the following two conditions is
fulfilled:
i. Both sample are large (i.e., n1 ≥ 30 and n2 ≥
30)
ii. If either one or both sample sizes are small,
then both populations from which the samples
are drawn are normally distributed
Degrees of Freedom
then the t distribution is used to make inferences about μ1 –
μ2 and the degrees of freedom for the t distribution are
given by 2
s12 s22
n n
df 12 2
2
s1
2
s2
2
n
1 2 n
n1 1 n2 1
2 2
s s
s x1 x 2 1 2
n1 n2
( x 1 x 2 ) ts x1 x 2
Where the value of t is obtained from the t
distribution table for a given confidence
level and the degrees of freedom are given
by the formula mentioned earlier.
1 2
n n2 15 12
df 1
2 2
2 2
21.42 21
s12 s22 (5)2 (6)2
n
1 2 n 15 12
n1 1 n2 1 15 1 12 1
t 2.080
( x1 x 2 ) ts x1 x2 (80 77) 2.080(2.16024690)
3 4.49 1.49 to 7.49
Step 1:
H0: μ1 – μ2 = 0
The mean numbers of calories are not different
H1: μ1 – μ2 ≠ 0
The mean numbers of calories are different
Step 2:
The two samples are independent
Standard deviations of the two populations
are unknown and unequal
Both populations are normally distributed
We use the t distribution to make the test
Step 3:
The ≠ in the alternative hypothesis
indicates that the test is two-tailed.
α = .01
Area in each tail = α / 2 = .01 / 2 =.005
The critical values of t are -2.771 and
2.771
2 2
s 2
s 2
(3)2
(4) 2
1
2
n n 14 16
df 12 2
2
2
2
27.41 27
s12 s22 (3)2 (4)2
n
1 2 n 15 12
n1 1 n2 1 14 1 16 1
Step 4:
2 2 2 2
s s (3) (14)
s x1 x 2 1
2
1.28173989
n1 n2 14 16
( x1 x 2 ) ( 1 2 ) (23 25) 0
t 1.560
s x1 x 2 1.28173989
Step 5:
The test statistic t = -1.560
It falls in the nonrejection region
Therefore, we fail to reject the null
hypothesis
Hence, there is no difference in the mean
numbers of calories per can for the two
brands of diet soda.
Interval Estimation of μd
Hypothesis Testing About μd
Definition
Two samples are said to be paired or
matched samples when for each data
value collected from one sample there is a
corresponding data value collected from
the second sample, and both these data
values are collected from the same source.
d
d
n
d n
2 ( d )2
sd
n 1
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
INFERENCES ABOUT THE DIFFERENCE
BETWEEN TWO POPULATION MEANS FOR
PAIRED SAMPLES
Sampling Distribution, Mean, and
Standard Deviation of d
If d is known and either the sample size
is large (n ≥ 30) or the population is
normally distributed, then the sampling
distribution of d is approximately
normal with its mean and standard
deviation given as, respectively.
d
d d and d
n
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
INFERENCES ABOUT THE DIFFERENCE
BETWEEN TWO POPULATION MEANS FOR
PAIRED SAMPLES
d
d 35
5.00
n 7
d 2
( d ) 2
873
(35) 2
sd n 7 10.78579312
n 1 7 1
TestStatistic t for d
The value of the test statistic t for d is
computed as follows:
d d
t
sd
d
d 25
4.17
n 6
d 2
( d ) 2
139
( 25) 2
sd n 6 2.63944439
n 1 6 1
sd 2.63944439
sd 1.07754866
n 6
Step 1:
H0: μd = 0
μ1 – μ2 = 0 or the mean weekly sales do not
increase
H1: μd < 0
μ1 – μ2 < 0 or the mean weekly sales do
increase
Step 2:
d is unknown
The sample size is small (n < 30)
The population of paired differences is
normal
Therefore, we use the t distribution to
conduct the test
Step 3:
α = .01.
Area in left tail = α = .01
df = n – 1 = 6 – 1 = 5
The critical value of t is -3.365
d d 4.17 0
t 3.870
sd 1.07754866
Step 5:
The value of the test statistic t = -3.870
It falls in the rejection region
Therefore, we reject the null hypothesis
Consequently, we conclude that the mean
weekly sales for all salespersons increase
as a result of this course.
d
d
35
5.00
n 7
d 2
( d )2
873
(35) 2
sd n 7 10.78579312
n 1 7 1
sd 10.78579312
sd 4.07664661
n 7
Step 1:
H0: μd = 0
The mean of the paired differences is not
different from zero
H1: μd ≠ 0
The mean of the paired differences is different
from zero
Step 2:
d is unknown
The sample size is small
The population of paired differences is
(approximately) normal
We use the t distribution to make the test.
Step 3:
α = .05
Area in each tail = α / 2 = .05 / 2 = .025
df = n – 1 = 7 – 1 = 6
d d 5.00 0
t 1.226
sd 4.07664661
Step 5:
The value of the test statistic t = 1.226
It falls in the nonrejection region
Therefore, we fail to reject the null
hypothesis
Hence, we conclude that the mean of the
population paired difference is not different
from zero.
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
INFERENCES ABOUT THE DIFFERENCE BETWEEN TWO
POPULATION PROPORTIONS FOR LARGE AND
INDEPENDENT SAMPLES
p1q1 p2q2
s pˆ1 pˆ 2
n1 n2
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
Example 10-13
A researcher wanted to estimate the difference
between the percentages of users of two
toothpastes who will never switch to another
toothpaste. In a sample of 500 users of
Toothpaste A taken by this researcher, 100 said
that they will never switch to another toothpaste.
In another sample of 400 users of Toothpaste B
taken by the same researcher, 68 said that they
will never switch to another toothpaste.
a) Point estimate of p1 – p2
= pˆ 1 pˆ 2
= .20 – .17 = .03
b)
n1pˆ 1 500(.20) 100
n1qˆ 1 500(.80) 400
n2 pˆ 2 400(.17) 68
n2qˆ 2 400(.83) 332
b)
Each of these values is greater than 5
Both sample sizes are large
Consequently, we use the normal
distribution to make a confidence interval
for p1 – p2.
pˆ 1qˆ 1 pˆ 2qˆ 2
s pˆ1 pˆ 2 .02593742
n1 n2
z = 2.17
( pˆ 1 pˆ 2 ) zs pˆ1 pˆ 2 (.20 .17) 2.17(.02593742)
.03 .056 .026 to .086
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
Hypothesis Testing About p1 – p2
x1 x 2 n1pˆ 1 n2 pˆ 2
p or
n1 n2 n1 n2
1 1
s pˆ 1 pˆ 2 pq
n1 n2
where q 1 p
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
Example 10-14
Reconsider Example 10-13 about the percentages
of users of two toothpastes who will never switch
to another toothpaste. At the 1% significance
level, can we conclude that the proportion of
users of Toothpaste A who will never switch to
another toothpaste is higher than the proportion
of users of Toothpaste B who will never switch to
another toothpaste?
H1: p1 – p2 > 0
p1 is greater than p2
Step 1:
H0: p1 – p2 = 0
The two population proportions are not different
H1: p1 – p2 ≠ 0
The two population proportions are different
Step 2:
The samples are large and independent
We apply the normal distribution
ˆ 2 are all greater
n1 p̂1 ,n1 q̂1 , n 2 p̂ 2 and n 2 q
than 5 (These should be checked)
Step 3:
The sign in the alternative hypothesis
indicates that the test is two-tailed.
α = .01
The critical values of z are -2.58 and 2.58