Test hypotheses and construct confidence intervals about the difference in two population means using the z statistic.
Test hypotheses and construct confidence intervals about the difference in two population means using the t statistic.
Test hypotheses and construct confidence intervals about the difference in two related populations.
Test hypotheses and construct confidence intervals about the differences in two population proportions.
Test hypotheses and construct confidence intervals about two population variances using the F statistic.
Hypothesis Testing; Confidence Intervals - Difference in Means using z Statistic (Population Variances Known)
The difference in the two sample means is used to test the difference in the two population means.
The central limit theorem states that the difference in two sample means is normally distributed for large sample sizes (both n1 and n2 > 30) regardless of the shape of the populations.

Hypothesis Testing for Differences Between Means: The Growth Example
As a specific example, suppose we want to conduct a hypothesis test to determine whether the average annual growth μ1 of one animal species is different from the average annual growth μ2 of another species. Because we are testing to determine whether the means are different, it might seem logical that the null and alternative hypotheses would be
Ho: μ1 = μ2
Ha: μ1 ≠ μ2
Hypothesis Testing for Differences Between Means
The hypotheses can be written as
Ho: μ1 = μ2
Ha: μ1 ≠ μ2
or, equivalently, as
Ho: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0
The analysis is testing whether there is a difference in the annual growth. This is a two-tailed test with critical values Zc = ±1.96.
If z < -1.96 or z > 1.96, reject Ho.
If -1.96 ≤ z ≤ 1.96, do not reject Ho.
Hypothesis Testing for Differences Between Means
Sample values (first three rows of each sample):
Species 1                     Species 2
74.256  57.791  71.115        69.962  77.136  43.649
96.234  65.145  67.574        55.052  66.035  63.369
89.807  96.767  59.621        57.828  54.335  59.676
Species 1: n1 = 32, x̄1 = 70.700, σ1 = 16.253, σ1² = 264.160
Species 2: n2 = 34, x̄2 = 62.187, σ2² = 166.411
Hypothesis Testing for Differences Between Means
Ho: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0
Each tail of the rejection region has area α/2 = .025, so the critical values are Zc = ±1.96.

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

$$z = \frac{(70.700 - 62.187) - (0)}{\sqrt{\dfrac{264.160}{32} + \dfrac{166.411}{34}}} = 2.35$$

Since z = 2.35 > 1.96, reject Ho.
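The z computation for the growth example can be reproduced with a short Python sketch. The function name `two_sample_z` is a hypothetical helper introduced here for illustration; the inputs are the summary figures shown on the slides, and the critical value 1.96 is the two-tailed value used above.

```python
import math

def two_sample_z(mean1, mean2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """z statistic for the difference in two means, population variances known."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error

# Summary figures from the growth example (species 1 vs. species 2)
z = two_sample_z(70.700, 62.187, 264.160, 166.411, 32, 34)

critical = 1.96  # two-tailed critical value used on the slide
reject = abs(z) > critical
print(f"z = {z:.2f}, reject Ho: {reject}")
```

The statistic rounds to 2.35, which is beyond 1.96, so the sketch reaches the same decision as the slide.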
Demonstration Problem
A sample of 87 men showed that the average calcium depletion per year is 3352 µg. The population standard deviation is 1100 µg. A sample of 76 women showed that the average calcium depletion per year is 5727 µg, with a population standard deviation of 1700 µg. A researcher wants to "prove" that women lose more calcium. If they use α = .001 and these sample data, will they be able to reject a null hypothesis that women annually lose as much calcium as men do, or less?

With men as group 1 and women as group 2:
Ho: μ1 - μ2 = 0
Ha: μ1 - μ2 < 0
This is a one-tailed test with α = .001 in the lower tail and critical value Zc = -3.08.
Demonstration Problem
Men:   x̄1 = 3,352 µg, σ1 = 1,100 µg, n1 = 87
Women: x̄2 = 5,727 µg, σ2 = 1,700 µg, n2 = 76
Critical value: Zc = -3.08 (α = .001, lower tail). If z < -3.08, reject Ho.

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(3352 - 5727) - (0)}{\sqrt{\dfrac{1100^2}{87} + \dfrac{1700^2}{76}}} = -10.42$$

Since z = -10.42 < -3.08, reject Ho.
The evidence is substantial that women, on average, lose more calcium than men.
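As a rough check of the one-tailed test in the calcium problem, the sketch below recomputes z and obtains the lower-tail critical value from Python's standard library instead of a printed table. The variable names are illustrative, and men are taken as group 1 and women as group 2, as on the slide. The computed critical value can differ slightly from the tabled -3.08.

```python
import math
from statistics import NormalDist

# Summary figures from the calcium problem (men = group 1, women = group 2)
mean_men, sigma_men, n_men = 3352, 1100, 87
mean_women, sigma_women, n_women = 5727, 1700, 76
alpha = 0.001

standard_error = math.sqrt(sigma_men**2 / n_men + sigma_women**2 / n_women)
z = ((mean_men - mean_women) - 0) / standard_error

# Lower-tail critical value from the standard normal distribution
z_critical = NormalDist().inv_cdf(alpha)

reject = z < z_critical  # Ha: mu1 - mu2 < 0, so reject in the lower tail
print(f"z = {z:.2f}, critical value = {z_critical:.2f}, reject Ho: {reject}")
```

Because z is far below either version of the critical value, the decision does not depend on the small rounding difference.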
Confidence Interval
Sometimes the solution is to take a random sample from each of the two populations and study the difference in the two samples.
Designating one group as group one and the other as group two is an arbitrary decision.
Formula for confidence interval to estimate (µ1 - µ2):

$$(\bar{x}_1 - \bar{x}_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Demonstration Problem
Men:   x̄1 = 3,352 µg, σ1 = 1,100 µg, n1 = 87
Women: x̄2 = 5,727 µg, σ2 = 1,700 µg, n2 = 76
95% confidence: z = 1.96
Calculate it!
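One way to carry out the calculation is sketched below in Python. The helper name `z_interval` is hypothetical; the inputs are the calcium figures and the 95% value z = 1.96 given above.

```python
import math

def z_interval(mean1, mean2, sigma1, sigma2, n1, n2, z):
    """Confidence interval for mu1 - mu2 with known population standard deviations."""
    diff = mean1 - mean2
    margin = z * math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return diff - margin, diff + margin

# Men = group 1, women = group 2, 95% confidence (z = 1.96)
lower, upper = z_interval(3352, 5727, 1100, 1700, 87, 76, z=1.96)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

Both limits are negative (roughly -2822 µg and -1928 µg), which agrees with the earlier finding that women lose more calcium on average.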
Hypothesis Testing
t Test for Differences in Population Means
Each of the two populations is normally distributed.
The two samples are independent.
The values of the population variances are unknown.
The variances of the two populations are equal: σ1² = σ2².

Hypothesis Testing
t Formula to Test the Difference in Means Assuming σ1² = σ2²

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
Shrimp weights
Critical values: ±t.025,25 = ±2.060
If t < -2.060 or t > 2.060, reject Ho.
If -2.060 ≤ t ≤ 2.060, do not reject Ho.
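Shrimp hatching methods
Hatching Method A        Hatching Method B
56  51  45               59  57  53
47  52  43               52  56  65
42  53  52               53  55  53
50  42  48               54  64  57
47  44  44
n1 = 15, x̄1 = 47.73, s1² = 19.495
n2 = 12, x̄2 = 56.5, s2² = 18.273

t formula without pooling the sample variances, with its degrees of freedom:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

Shrimp hatching methods
With the variances assumed equal, df = n1 + n2 - 2 = 25:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(47.73 - 56.50) - 0}{\sqrt{\dfrac{(19.495)(14) + (18.273)(11)}{15 + 12 - 2}}\,\sqrt{\dfrac{1}{15} + \dfrac{1}{12}}} = -5.20$$

If t < -2.060 or t > 2.060, reject Ho.
If -2.060 ≤ t ≤ 2.060, do not reject Ho.
Since t = -5.20 < -2.060, reject Ho.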
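The pooled-variance t test for the hatching example can be sketched in Python as follows. The function name `pooled_t` is a hypothetical helper, and the assignment of values to Method A and Method B is an assumption read off the slide's side-by-side table; it reproduces the stated means (47.73 and 56.5) and variances (19.495 and 18.273).

```python
import math
from statistics import mean, variance

# Assumed split of the slide's data into the two hatching methods
method_a = [56, 51, 45, 47, 52, 43, 42, 53, 52, 50, 42, 48, 47, 44, 44]
method_b = [59, 57, 53, 52, 56, 65, 53, 55, 53, 54, 64, 57]

def pooled_t(sample1, sample2, hypothesized_diff=0.0):
    """t statistic and degrees of freedom, assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)  # sample variances (n - 1)
    pooled_var = (s1_sq * (n1 - 1) + s2_sq * (n2 - 1)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    t = ((mean(sample1) - mean(sample2)) - hypothesized_diff) / standard_error
    return t, n1 + n2 - 2

t, df = pooled_t(method_a, method_b)
critical = 2.060  # t.025,25 taken from the slide
print(f"t = {t:.2f}, df = {df}, reject Ho: {abs(t) > critical}")
```

Working from the raw values instead of the rounded summary statistics gives essentially the same t, about -5.20 with 25 degrees of freedom.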
Shrimp hatching methods
The conclusion is that there is a significant difference in the effectiveness of the hatching methods.

Confidence Interval to Estimate μ1 - μ2 when σ1² and σ2² are unknown and σ1² = σ2²

$$(\bar{x}_1 - \bar{x}_2) \pm t\,\sqrt{\frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\,\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where df = n1 + n2 - 2
Demonstration Problem
A coffee manufacturer is interested in estimating the difference in the average daily coffee consumption of regular-coffee drinkers and decaffeinated-coffee drinkers. Its researcher randomly selects 13 regular-coffee drinkers and asks how many cups of coffee per day they drink. He randomly locates 15 decaffeinated-coffee drinkers and asks how many cups of coffee per day they drink. The average for the regular-coffee drinkers is 4.35 cups, with a standard deviation of 1.20 cups. The average for the decaffeinated-coffee drinkers is 6.84 cups, with a standard deviation of 1.42 cups. The researcher assumes, for each population, that the daily consumption is normally distributed, and he constructs a 95% confidence interval to estimate the difference in the averages of the two populations.

Demonstration Problem
n1 = 13, n2 = 15
x̄1 = 4.35, x̄2 = 6.84
s1 = 1.20, s2 = 1.42
α = 0.05, t0.025,26 = 2.056
Demonstration Problem

$$(4.35 - 6.84) \pm 2.056\,\sqrt{\frac{(1.20)^2(12) + (1.42)^2(14)}{13 + 15 - 2}}\,\sqrt{\frac{1}{13} + \frac{1}{15}}$$

-2.49 ± 1.03
-3.52 ≤ μ1 - μ2 ≤ -1.46
The researcher is 95% confident that the difference in population average daily consumption of cups of coffee between regular- and decaffeinated-coffee drinkers is between 1.46 cups and 3.52 cups, with decaffeinated-coffee drinkers drinking more.

Statistical Inferences for Two Related Populations
Dependent samples
Used in before and after studies
After measurement is not independent of the before measurement
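The coffee interval can be checked with a small Python sketch. `pooled_t_interval` is a hypothetical helper name; it takes the sample standard deviations and the tabled value t.025,26 = 2.056 from the problem as inputs.

```python
import math

def pooled_t_interval(mean1, mean2, s1, s2, n1, n2, t_critical):
    """Confidence interval for mu1 - mu2, assuming equal population variances."""
    pooled_var = (s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2)
    margin = t_critical * math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Regular-coffee drinkers = group 1, decaffeinated-coffee drinkers = group 2
lower, upper = pooled_t_interval(4.35, 6.84, 1.20, 1.42, 13, 15, t_critical=2.056)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

Both limits are negative because group 1 (regular) has the smaller sample mean.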
Hypothesis Testing
Researcher must determine if the two samples are related to each other
The technique for related samples is different from the technique used to analyze independent samples
Matched pairs test requires the two samples be the same size

Dependent Samples
Before and after measurements on the same individual
Studies of twins
Studies of spouses
Individual  Before  After
1           32      39
2           11      15
3           21      35
4           17      13
5           30      41
6           38      39
7           14      22
Hypothesis Testing
The following t test for dependent measures uses the sample difference, d, between individual matched samples as the basic measurement of analysis
An analysis of d converts the problem from a two sample problem to a single sample of differences

Formulas for Dependent Samples

$$t = \frac{\bar{d} - D}{s_d / \sqrt{n}}, \qquad df = n - 1$$

$$\bar{d} = \frac{\sum d}{n}$$

$$s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d^2 - \dfrac{(\sum d)^2}{n}}{n - 1}}$$

n = number of pairs
d = sample difference in pairs
D = mean population difference
Hypothesis Testing
Analysis of data by this method involves calculating a t value with a critical value obtained from the table
n in the degrees of freedom (n - 1) is the number of matched pairs of scores

W/H Ratios for Nine Randomly Selected Ethnic Groups
Suppose a stock market investor is interested in determining whether there is a significant difference in the W/H (weight to height) ratio for 2 year old children of different ethnic groups in Vietnam. In an effort to study this question, the investor randomly samples nine ethnic groups from Vietnam and records the W/H ratios for each of these groups at the end of year 1 and at the end of year 2.
W/H Ratios for Nine Randomly Selected Groups
Groups  Year 1 W/H Ratio  Year 2 W/H Ratio
1       8.9               12.7
2       38.1              45.4
3       43.0              10.0
4       34.0              27.2
5       34.5              22.8
6       15.2              24.1
7       20.3              32.3

Hypothesis Testing with Dependent Samples: W/H Ratios for Nine Groups
Ho: D = 0
Ha: D ≠ 0
α = .01, α/2 = .005
df = n - 1 = 9 - 1 = 8
t.005,8 = 3.355
If t < -3.355 or t > 3.355, reject Ho.
Hypothesis Testing with Dependent Samples: W/H Ratios for Nine Groups
Ho: D = 0
Ha: D ≠ 0
d̄ = -5.033
sd = 21.599

$$t = \frac{\bar{d} - D}{s_d / \sqrt{n}} = \frac{-5.033 - 0}{21.599 / \sqrt{9}} = -0.70$$

Since t = -0.70 lies between -3.355 and 3.355, do not reject Ho.

Hypothesis Testing with Dependent Samples: W/H Ratios for Nine Groups
t-Test: Paired Two Sample for Means
                     Year 1 W/H Ratio  Year 2 W/H Ratio
Mean                 30.64             35.68
Variance             268.1             837.5
Observations         9                 9
Pearson Correlation  0.674
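The paired t statistic for the W/H example can be recomputed from its summary values, as in the Python sketch below. `paired_t` is a hypothetical helper name, and the sketch assumes d is taken as Year 1 minus Year 2, so that the mean difference is -5.033 (the Year 1 mean, 30.64, is below the Year 2 mean, 35.68).

```python
import math

def paired_t(mean_diff, sd_diff, n_pairs, hypothesized_diff=0.0):
    """t statistic and degrees of freedom for matched-pairs data."""
    t = (mean_diff - hypothesized_diff) / (sd_diff / math.sqrt(n_pairs))
    return t, n_pairs - 1

# Assumed sign convention: d = Year 1 - Year 2
t, df = paired_t(mean_diff=-5.033, sd_diff=21.599, n_pairs=9)

critical = 3.355  # t.005,8 from the slide
reject = abs(t) > critical
print(f"t = {t:.2f}, df = {df}, reject Ho: {reject}")
```

The statistic is about -0.70, well inside ±3.355, so the null hypothesis of no mean difference is not rejected.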
Confidence Intervals
Researcher can be interested in estimating the mean difference in two populations for related samples
This requires a confidence interval of D (the mean population difference of two related samples) to be constructed

Confidence Intervals for Mean Difference for Related Samples

$$\bar{d} - t\,\frac{s_d}{\sqrt{n}} \le D \le \bar{d} + t\,\frac{s_d}{\sqrt{n}}$$

df = n - 1
Difference in Number of Bacteria Colonies
Strain  Without treatment  With treatment  d
1       8                  11              -3
2       19                 30              -11
3       5                  6               -1
4       9                  13              -4
5       3                  5               -2
6       0                  4               -4
7       13                 15              -2
8       11                 17              -6
9       9                  12              -3
10      5                  12              -7
11      8                  6               2
12      2                  5               -3
13      11                 10              1
14      14                 22              -8
15      7                  8               -1
16      12                 15              -3
17      6                  12              -6
18      10                 10              0

Confidence Interval for Mean Difference in Number of Bacteria Colonies
d̄ = -3.39
sd = 3.27
df = n - 1 = 18 - 1 = 17
t.005,17 = 2.898

$$\bar{d} - t\,\frac{s_d}{\sqrt{n}} \le D \le \bar{d} + t\,\frac{s_d}{\sqrt{n}}$$

$$-3.39 - 2.898\,\frac{3.27}{\sqrt{18}} \le D \le -3.39 + 2.898\,\frac{3.27}{\sqrt{18}}$$

-3.39 - 2.23 ≤ D ≤ -3.39 + 2.23
-5.62 ≤ D ≤ -1.16

The analyst estimates with a 99% level of confidence that the average difference in the number of bacteria colonies (without treatment minus with treatment) is between -5.62 and -1.16 colonies.
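The bacteria-colony interval can be rebuilt from the paired data with a short Python sketch. The variable names are illustrative, the rows for strains 5 to 7 are read from the slide's scattered columns (an assumption), and the value 2.898 is t.005,17 as given above.

```python
import math
from statistics import mean, stdev

# Colony counts per strain, in strain order 1..18
without_treatment = [8, 19, 5, 9, 3, 0, 13, 11, 9, 5, 8, 2, 11, 14, 7, 12, 6, 10]
with_treatment = [11, 30, 6, 13, 5, 4, 15, 17, 12, 12, 6, 5, 10, 22, 8, 15, 12, 10]

# d = without treatment - with treatment, one difference per strain
differences = [a - b for a, b in zip(without_treatment, with_treatment)]

n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation (n - 1 in the denominator)

t_critical = 2.898  # t.005,17 for a 99% interval, from the slide
margin = t_critical * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}")
print(f"{lower:.2f} <= D <= {upper:.2f}")
```

Because the sketch does not round d̄ and sd before computing the margin, its limits can differ from the slide's -5.62 and -1.16 in the second decimal place.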
Statistical Inference about Two Population Proportions
Sample statistic used is the difference in sample proportions, (p̂1 - p̂2)

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}}$$

p̂1 = proportion from sample 1
p̂2 = proportion from sample 2
n1 = size of sample 1
n2 = size of sample 2
q1 = 1 - p1
q2 = 1 - p2

Hypothesis Testing
Because the population proportions are unknown, an estimate of the standard deviation of the difference in two sample proportions is made by using the sample proportions as point estimates of the population proportions
Z Formula to Test the Difference in Population Proportions

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}$$

Testing the Difference in Population Proportions
Ho: p1 - p2 = 0
Ha: p1 - p2 ≠ 0
α = .01, α/2 = .005
Critical values: Zc = ±2.575
If z < -2.575 or z > 2.575, reject Ho.
If -2.575 ≤ z ≤ 2.575, do not reject Ho.
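Testing the Difference in Population Proportions

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{24 + 39}{100 + 95} = .323, \qquad \bar{q} = 1 - \bar{p} = .677$$

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\left(\dfrac{24}{100} - \dfrac{39}{95}\right) - 0}{\sqrt{(.323)(.677)\left(\dfrac{1}{100} + \dfrac{1}{95}\right)}} = \frac{-.17}{.067} = -2.54$$

Since -2.575 ≤ z = -2.54 ≤ 2.575, do not reject Ho.

Sampling Distribution of Differences in Sample Proportions
4. $n_2\hat{q}_2 > 5$, where $\hat{q} = 1 - \hat{p}$

$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 \quad \text{and} \quad \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$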
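The pooled-proportion z test can be sketched in Python using the counts from the example (24 of 100 and 39 of 95). `two_proportion_z` is a hypothetical helper name, and 2.575 is the two-tailed critical value for α = .01 used on the slides.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for Ho: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    q_bar = 1 - p_bar
    standard_error = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error, p_bar

z, p_bar = two_proportion_z(24, 100, 39, 95)

critical = 2.575  # two-tailed critical value for alpha = .01
reject = abs(z) > critical
print(f"p_bar = {p_bar:.3f}, z = {z:.3f}, reject Ho: {reject}")
```

The unrounded statistic is close to the slide's -2.54; the small difference comes from rounding the intermediate values. Either way it stays inside ±2.575, so Ho is not rejected.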
Confidence Interval to Estimate p1 - p2

$$(\hat{p}_1 - \hat{p}_2) - z\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} \le p_1 - p_2 \le (\hat{p}_1 - \hat{p}_2) + z\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

Example Problem:
n1 = 400, n2 = 480
x1 = 48, x2 = 187
p̂1 = 48/400 = .12, p̂2 = 187/480 = .39
q̂1 = 1 - p̂1 = .88, q̂2 = 1 - p̂2 = .61
For a 98% level of confidence, z = 2.33.

$$(.12 - .39) - 2.33\sqrt{\frac{(.12)(.88)}{400} + \frac{(.39)(.61)}{480}} \le p_1 - p_2 \le (.12 - .39) + 2.33\sqrt{\frac{(.12)(.88)}{400} + \frac{(.39)(.61)}{480}}$$

-.27 - .064 ≤ p1 - p2 ≤ -.27 + .064
-.334 ≤ p1 - p2 ≤ -.206
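The example interval can be reproduced with the Python sketch below. `proportion_diff_interval` is a hypothetical helper name; it uses the unpooled sample proportions, as the confidence interval formula does, and the 98% value z = 2.33 from the example.

```python
import math

def proportion_diff_interval(x1, n1, x2, n2, z):
    """Confidence interval for p1 - p2 from two independent samples."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat
    margin = z * math.sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

lower, upper = proportion_diff_interval(48, 400, 187, 480, z=2.33)
print(f"{lower:.3f} <= p1 - p2 <= {upper:.3f}")
```

Because the sketch does not round the second sample proportion to .39 first, its limits can differ from -.334 and -.206 in the third decimal place.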
F Test for Two Population Variances

$$F = \frac{s_1^2}{s_2^2}$$

$$df_{numerator} = v_1 = n_1 - 1, \qquad df_{denominator} = v_2 = n_2 - 1$$

Sheet Metal Example
Suppose a machine produces metal sheets that are specified to be 22 millimeters thick. Because of the machine, the operator, the raw material, the manufacturing environment, and other factors, there is variability in the thickness. Two machines produce these sheets. Operators are concerned about the consistency of the two machines. To test consistency, they randomly sample 10 sheets produced by machine 1 and 12 sheets produced by machine 2. The thickness measurements of sheets from each machine are given in the table on the following page. Assume sheet thickness is normally distributed in the population. How can we test to determine whether the variance from each sample comes from the same population variance (population variances are equal) or from different population variances (population variances are not equal)?
Sheet Metal Example: Hypothesis Test for Equality of Two Population Variances
Ho: σ1² = σ2²
Ha: σ1² ≠ σ2²
α = 0.05
n1 = 10, n2 = 12
df numerator = v1 = n1 - 1 = 9
df denominator = v2 = n2 - 1 = 11
Upper critical value: F.025,9,11 = 3.59
Lower critical value: 0.28
If F < 0.28 or F > 3.59, reject Ho.
If 0.28 ≤ F ≤ 3.59, do not reject Ho.

Sheet Metal Example
Machine 1              Machine 2
22.3  21.8  22.2       22.0  22.2  22.0
21.8  21.9  21.6       22.1  22.0  22.1
22.3  22.4             21.8  21.7  21.9
21.6  22.5             21.9  21.9  22.1
n1 = 10, s1² = 0.1138
n2 = 12, s2² = 0.0202

$$F = \frac{s_1^2}{s_2^2} = \frac{0.1138}{0.0202} = 5.63$$

Since F = 5.63 > 3.59, reject Ho.
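The F ratio for the sheet metal example can be recomputed from the thickness measurements with a short Python sketch. The variable names are illustrative, the split of values between the two machines is an assumption read off the slide's table (it reproduces the stated variances 0.1138 and 0.0202), and the critical values 0.28 and 3.59 are taken from the slide.

```python
from statistics import variance

# Thickness measurements in millimeters (assumed split between machines)
machine_1 = [22.3, 21.8, 22.2, 21.8, 21.9, 21.6, 22.3, 22.4, 21.6, 22.5]
machine_2 = [22.0, 22.2, 22.0, 22.1, 22.0, 22.1,
             21.8, 21.7, 21.9, 21.9, 21.9, 22.1]

s1_sq = variance(machine_1)  # sample variance, n - 1 in the denominator
s2_sq = variance(machine_2)
f_stat = s1_sq / s2_sq
df_num, df_den = len(machine_1) - 1, len(machine_2) - 1

lower_critical, upper_critical = 0.28, 3.59  # two-tailed, alpha = .05
reject = f_stat < lower_critical or f_stat > upper_critical
print(f"s1^2 = {s1_sq:.4f}, s2^2 = {s2_sq:.4f}")
print(f"F = {f_stat:.2f} with df = ({df_num}, {df_den}), reject Ho: {reject}")
```

The ratio exceeds the upper critical value, so the sketch agrees with rejecting the hypothesis of equal population variances.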