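$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
The confidence interval is
$\bar{x} \pm Z_{\alpha/2}\,\sigma_{\bar{x}}$
The test statistic is
$Z = \frac{\bar{x} - \mu_{H_0}}{\sigma_{\bar{x}}}$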
Example: The temperature (degrees C) of a cooled storage unit is
taken on 8 consecutive days.
4.5 4.8 5.2 4.7 3.8 3.7 4.1 3.9
Temperatures for this type of storage unit are known to be
Normally distributed with a standard deviation of $\sigma = 0.35$.
Construct a 90% confidence interval for the true mean temperature.
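Calculate the sample mean:
$\bar{x} = 4.3375$
For $\alpha = 0.10$, $Z_{\alpha/2} = 1.6449$
Calculate the standard error:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.35}{\sqrt{8}} = 0.1237$
$\bar{x} \pm Z_{\alpha/2}\,\sigma_{\bar{x}}$
$4.3375 \pm 1.6449(0.1237)$
$4.3375 \pm 0.2035$
4.1340 to 4.5410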
We are 90% sure that the true population mean is in this
interval.
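The interval above can be reproduced with a short Python sketch. It uses only the standard library; the function name `z_confidence_interval` is illustrative and does not come from the original notes.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(data, sigma, confidence):
    """Return (lower, upper) for the population mean when sigma is known."""
    n = len(data)
    x_bar = mean(data)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}, about 1.6449 for 90%
    margin = z * sigma / sqrt(n)             # Z_{alpha/2} times the standard error
    return x_bar - margin, x_bar + margin

temps = [4.5, 4.8, 5.2, 4.7, 3.8, 3.7, 4.1, 3.9]
low, high = z_confidence_interval(temps, sigma=0.35, confidence=0.90)
print(f"{low:.4f} to {high:.4f}")
```

The printed bounds should agree with the hand calculation to the rounding shown in the notes.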
Test the hypothesis that the mean temperature is 4 degrees.
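$H_0: \mu = 4$
$H_1: \mu \neq 4$
For $\alpha = 0.10$, $\alpha/2 = 0.05$ and $Z_{\alpha/2} = 1.6449$
[Figure: Normal curve centered at $\mu = 4$; "Accept H0" region between Z = -1.6449 and Z = 1.6449, "Reject H0" regions in both tails.]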
Reject the null if Z > 1.6449 or Z < -1.6449
$Z = \frac{\bar{x} - \mu_{H_0}}{\sigma_{\bar{x}}}$, where $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$Z = \frac{4.3375 - 4}{0.1237} = 2.73$
Reject the null and conclude that the
average temperature is not 4 degrees.
p-value = P(Z > 2.73) + P(Z < -2.73)
= 0.0032 + 0.0032
= 0.0064
If the true mean is 4, there is only a 0.64% chance of drawing a sample
whose mean is at least this far from 4.
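As a check on the Z test and its p-value, here is a minimal sketch using the standard library's `NormalDist`. The helper name `z_test_two_sided` is hypothetical. Because it does not round Z to two decimals before looking up the tail area, the p-value may differ from the table-based figure in the last digit.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(x_bar, mu_0, sigma, n):
    """Return (Z, two-sided p-value) for H0: mu = mu_0 with sigma known."""
    std_error = sigma / sqrt(n)
    z = (x_bar - mu_0) / std_error
    # Area in both tails beyond |Z| under the standard Normal curve.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = z_test_two_sided(x_bar=4.3375, mu_0=4, sigma=0.35, n=8)
print(f"Z = {z:.2f}, p-value = {p:.4f}")
print("Reject H0" if abs(z) > 1.6449 else "Accept H0")
```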
Often, we don't know the population standard deviation.
We can no longer use the Z table.
II. The t-distribution (aka Student's t-distribution)
Fun origin: A chemist at the Guinness brewery in Dublin invented
the t-distribution in order to monitor quality in brewing, using
small samples from Normal populations with $\sigma$ unknown.
If random samples of size n are selected from a Normal population
with mean $\mu$ and $\sigma$ unknown, then the distribution of sample means
is a t-distribution.
$\bar{x} \sim t\left(\mu, s_{\bar{x}}\right)_{n-1}$, where $s_{\bar{x}} = \frac{s}{\sqrt{n}}$
(n-1) refers to the degrees of freedom
The t-distribution is similar to the Normal distribution in several
ways:
it is bell shaped
it is symmetrical about the mean
$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$ is the number of standard errors between the
sample mean and population mean
Ex: find the critical t-value that leaves a tail area of 5% when the
sample size is 10.
10 - 1 = 9 degrees of freedom
Tail area = 0.05
Critical t-value is 1.8331
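The table lookup can be checked numerically. The sketch below is only an illustration: it integrates the Student t density with Simpson's rule and finds the critical value by bisection. The function names are hypothetical, and statistical packages use faster, more precise routines for this.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the Student t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=1000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule area over [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(tail_area, df):
    """Bisect for the t-value whose upper-tail area equals tail_area."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_area:
            lo = mid  # tail still too large, so the critical value is further out
        else:
            hi = mid
    return (lo + hi) / 2

t_crit = t_critical(0.05, df=9)
print(f"{t_crit:.4f}")
```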
In large samples, when $\sigma$ is unknown, we often use Z instead of t.
When samples are large, Z and t are close.
Statistical software always uses t when $\sigma$ is unknown, even for
large samples.
The confidence interval for a small sample from a Normal
population with $\sigma$ unknown is
$\bar{x} \pm t_{n-1,\,\alpha/2}\, s_{\bar{x}}$
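$\bar{x} = 18.12$
[Figure: t-distribution centered at $\mu = 20$; "Accept H0" region below the critical value t = 1.5332, "Reject H0" region in the upper tail; test statistic t = -1.0067 marked.]
Accept H0 because
-1.0067 < 1.5332
Accept the null and validate the claim that the average wait time
is at most 20 minutes.
4. p-value is the area to the right of -1.0067
(rarely looked up in a t-distribution table; use software)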
Example: The temperature (degrees C) of a cooled storage unit is
taken on 8 consecutive days.
4.5 4.8 5.2 4.7 3.8 3.7 4.1 3.9
At the 90% level, test the hypothesis that the mean temperature is
4 degrees.
$H_0: \mu = 4$
$H_1: \mu \neq 4$
xi     xi - mean    (xi - mean)^2
4.5     0.1625       0.0264
4.8     0.4625       0.2139
5.2     0.8625       0.7439
4.7     0.3625       0.1314
3.8    -0.5375       0.2889
3.7    -0.6375       0.4064
4.1    -0.2375       0.0564
3.9    -0.4375       0.1914
sum                  2.0588

mean      4.3375
variance  0.294107
st dev    0.542316
Let's verify the output:
$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{0.542316}{\sqrt{8}} = 0.1917$
$\bar{x} \pm t_{n-1,\,\alpha/2}\, s_{\bar{x}}$
$t_{7,\,0.05} = 1.8946$
$4.3375 \pm (1.8946)(0.19174)$
$4.3375 \pm 0.3633$
3.9742 to 4.7008
$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$
$t = \frac{4.3375 - 4}{0.19174} = 1.76$
This is a two-tail test.
-1.8946 < 1.76 < 1.8946
Accept the null.
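The same steps can be sketched in Python with the standard library. The critical value is not computed here: it is the table value $t_{7,\,0.05} = 1.8946$ quoted in these notes, stored in a constant whose name is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT = 1.8946  # t_{7, 0.05}, table value quoted in the notes (not computed)

temps = [4.5, 4.8, 5.2, 4.7, 3.8, 3.7, 4.1, 3.9]
mu_0 = 4.0                       # mean claimed under H0

n = len(temps)
x_bar = mean(temps)
s = stdev(temps)                 # sample standard deviation (divides by n - 1)
std_error = s / sqrt(n)

# 90% confidence interval for the mean.
margin = T_CRIT * std_error
ci_low, ci_high = x_bar - margin, x_bar + margin

# Two-tail test of H0: mu = 4.
t_stat = (x_bar - mu_0) / std_error
reject_h0 = abs(t_stat) > T_CRIT

print(f"CI: {ci_low:.4f} to {ci_high:.4f}")
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Note that the interval contains 4, which matches the decision not to reject the null in the two-tail test.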
If we had rejected the null, the p-value would have told us the
smallest level of significance at which the null can be rejected.
III. Difference Between Means from Small, Independent Samples
Example: Promoters of e-learning software design a test for
effectiveness of an online course based on typing tutor software.
Two groups are randomly selected. Group 1 consists of 10 subjects
who have completed a course that did not use supporting software.
Group 2 consists of 8 subjects who used the online software.
The typing speeds (wpm) are as follows.
Group 1: 23, 35, 37, 12, 26, 60, 13, 24, 27, 53
Group 2: 56, 30, 55, 48, 35, 40, 33, 23
Construct a 90% confidence interval for the difference in mean
typing speed between the two groups. Can you conclude that those
who used the online software can type faster?
Group 1                             Group 2
xi   xi - mean   (xi - mean)^2      xi   xi - mean   (xi - mean)^2
23      -8            64            56       16           256
35       4            16            30      -10           100
37       6            36            55       15           225
12     -19           361            48        8            64
26      -5            25            35       -5            25
60      29           841            40        0             0
13     -18           324            33       -7            49
24      -7            49            23      -17           289
27      -4            16
53      22           484
sum                 2216            sum                  1008
mean      31                        mean      40
variance  246.2222                  variance  144
st dev    15.69147                  st dev    12
We'll need to construct a pooled estimate of variance.
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
$s_p^2 = \frac{(10 - 1)(15.69147)^2 + (8 - 1)(12)^2}{10 + 8 - 2} = 201.5$
Use the pooled estimate of variance to find the standard error.
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{201.5\left(\frac{1}{10} + \frac{1}{8}\right)} = 6.7333$
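The pooled variance and the standard error of the difference can be computed directly from the raw typing speeds. This is a minimal sketch; the function names `pooled_variance` and `se_difference` are illustrative only.

```python
from math import sqrt
from statistics import variance

group1 = [23, 35, 37, 12, 26, 60, 13, 24, 27, 53]  # no supporting software
group2 = [56, 30, 55, 48, 35, 40, 33, 23]          # used the online software

def pooled_variance(a, b):
    """Weight each sample variance by its degrees of freedom."""
    n1, n2 = len(a), len(b)
    return ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)

def se_difference(a, b):
    """Standard error of the difference between the two sample means."""
    return sqrt(pooled_variance(a, b) * (1 / len(a) + 1 / len(b)))

sp2 = pooled_variance(group1, group2)
se = se_difference(group1, group2)
print(f"pooled variance = {sp2:.1f}, standard error = {se:.4f}")
```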
Find the critical t value:
degrees of freedom = $n_1 + n_2 - 2$
= 16
$\alpha/2 = 0.05$
$t_{16,\,0.05} = 1.7459$
Construct the interval:
$(40 - 31) \pm 1.7459(6.7333)$
$9 \pm 11.7557$
-2.7557 to 20.7557
The interval contains 0, so at the 90% level we cannot rule out that the
difference between means is zero.
The data do not show that typing speeds differ between the 2 groups.
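A small sketch of the interval calculation, working from the summary figures in these notes rather than the raw data. The helper `difference_interval` is a hypothetical name, and the critical value is the table value quoted above.

```python
T_CRIT = 1.7459  # t_{16, 0.05}, table value quoted in the notes

def difference_interval(mean_a, mean_b, std_error, t_crit):
    """Confidence interval for (mean_b - mean_a)."""
    diff = mean_b - mean_a
    margin = t_crit * std_error
    return diff - margin, diff + margin

# Group 1 mean = 31, Group 2 mean = 40, standard error = 6.7333 (from above).
low, high = difference_interval(31, 40, 6.7333, T_CRIT)
contains_zero = low <= 0 <= high

print(f"{low:.4f} to {high:.4f}")
print("Interval contains 0:", contains_zero)
```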
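At the 95% level, test the hypothesis that the mean typing speed is
faster for those who used the software (Group 2).
$H_0: \mu_2 = \mu_1$
$H_1: \mu_2 > \mu_1$
one-tailed test
$\alpha = 0.05$
$t_{16,\,0.05} = 1.7459$
[Figure: t-distribution centered at $\mu_1 = \mu_2$; "Accept H0" region below the critical value t = 1.7459, "Reject H0" region in the upper tail; test statistic t = 1.3366 marked.]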
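The test statistic is (Group 2, the software group, minus Group 1)
$t = \frac{(\bar{x}_2 - \bar{x}_1) - (\mu_2 - \mu_1)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
$t = \frac{(40 - 31) - (0)}{6.7333} = 1.3366$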
Accept the null hypothesis: 1.3366 < 1.7459, so the data do not show
that the software group types faster.
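The pooled two-sample test can be sketched from the raw data as follows. It assumes equal population variances, as the notes do, and takes the critical value from the t table rather than computing it; `pooled_t_statistic` is an illustrative name.

```python
from math import sqrt
from statistics import mean, variance

T_CRIT = 1.7459  # t_{16, 0.05}, table value quoted in the notes

group1 = [23, 35, 37, 12, 26, 60, 13, 24, 27, 53]  # no supporting software
group2 = [56, 30, 55, 48, 35, 40, 33, 23]          # used the online software

def pooled_t_statistic(a, b):
    """t for H0: mean(b) - mean(a) = 0, using the pooled variance."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    std_error = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(b) - mean(a)) / std_error

t_stat = pooled_t_statistic(group1, group2)
reject_h0 = t_stat > T_CRIT  # upper-tail test

print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```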
Assumptions made in solving this problem:
1. independent samples
2. random samples from Normal populations
3. the variance is the same for both populations
IV. The F-test for equality of two variances
To figure out if two populations have similar variances, we will look
at the sample variances.
If the ratio of the sample variances is close to 1, then the hypothesis
that the populations have equal variance is plausible.
The sampling distribution of $\frac{s_1^2}{s_2^2}$ is an F-distribution, when the
samples are independent and selected from Normal populations
with equal variances.
The F-distribution is not symmetrical and depends on the
degrees of freedom in each sample.
v1 = n1 - 1     v2 = n2 - 1
Ex: Suppose sample 1 has 10 observations and sample 2 has 8
observations. Find the critical F-value for the 5% level.
v1 = 9 v2 = 7
If we wanted the 2.5% level, we'd need a different table.
Example: Using the data from the typing example, test whether
the population variances are equal at the 95% level.
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
this is a 2-tail test
$\alpha/2 = 0.025$
F: v1 = 10-1 = 9 v2 = 8-1 = 7
F = 4.82
Calculate the test statistic
$\frac{s_1^2}{s_2^2} = \frac{246.22}{144} = 1.7099$
F = 1.7099
Accept the null hypothesis (1.7099 < 4.82): the data are consistent
with equal population variances.
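A minimal sketch of the two-tail variance-ratio test on the typing data. The critical value 4.82 is the table value quoted in the notes for v1 = 9, v2 = 7 at the 2.5% upper tail; the constant and variable names are illustrative.

```python
from statistics import variance

F_CRIT = 4.82  # upper 2.5% point of F with (9, 7) degrees of freedom, from the notes

group1 = [23, 35, 37, 12, 26, 60, 13, 24, 27, 53]
group2 = [56, 30, 55, 48, 35, 40, 33, 23]

s1_sq = variance(group1)  # sample variance, n - 1 divisor
s2_sq = variance(group2)

# Larger sample variance is in the numerator here, so only the upper
# critical value needs to be checked.
f_stat = s1_sq / s2_sq
reject_h0 = f_stat > F_CRIT

print(f"F = {f_stat:.4f}, reject H0: {reject_h0}")
```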
Instead, test the hypothesis that the variance of population 1
exceeds the variance of population 2.
$H_0: \sigma_1^2 \leq \sigma_2^2$
$H_1: \sigma_1^2 > \sigma_2^2$
this is a 1-tail test, upper tail
$\alpha = 0.05$
F: v1 = 10-1 = 9 v2 = 8-1 = 7
F = 3.68
Calculate the test statistic
$\frac{s_1^2}{s_2^2} = \frac{246.22}{144} = 1.7099$
F = 1.7099
Accept the null hypothesis: F is below the critical value, so the data
do not show that the variance of population 1 exceeds the variance of
population 2.
V. Difference between Means, Paired Samples
Paired t-tests are used when data consists of pairs of measurements
on the same subjects.
ex: before and after
Example: The typing speeds for 7 people are recorded before and
after completing a course using typing tutor software.
Person Before After Difference
JM 32 46 14
AC 10 18 8
TB 65 58 -7
AF 39 50 11
AO 24 36 12
PD 10 24 14
FF 24 21 -3
Construct a 90% confidence interval for the difference between
average typing speed before and after the course.
$\alpha/2 = 0.05$
degrees of freedom = 7-1 = 6
$t_{6,\,0.05} = 1.9432$
Calculate the mean of the differences:
49 / 7 = 7
Calculate the sample standard deviation:
Person   Difference   dif - mean   (dif - mean)^2
JM          14            7             49
AC           8            1              1
TB          -7          -14            196
AF          11            4             16
AO          12            5             25
PD          14            7             49
FF          -3          -10            100
sum                                    436
variance 72.6667
st dev 8.5245
Calculate the sample standard error:
$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{8.5245}{\sqrt{7}} = 3.2219$
Construct the interval:
$7 \pm 1.9432(3.2219)$
$7 \pm 6.2608$
0.7392 to 13.2608
We are 90% confident that the true difference in average typing
speeds is between 0.7392 words per minute and 13.2608 words
per minute.
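The paired interval can be sketched by first reducing each person to a single difference and then treating the differences as one sample. The variable names are illustrative, and the critical value is the table value $t_{6,\,0.05} = 1.9432$ quoted in the notes. Because the code does not round intermediate results, the bounds may differ from the hand calculation in the fourth decimal place.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT = 1.9432  # t_{6, 0.05}, table value quoted in the notes

before = [32, 10, 65, 39, 24, 10, 24]
after = [46, 18, 58, 50, 36, 24, 21]

# One difference per person (after minus before).
diffs = [a - b for a, b in zip(after, before)]

d_bar = mean(diffs)
std_error = stdev(diffs) / sqrt(len(diffs))
margin = T_CRIT * std_error

low, high = d_bar - margin, d_bar + margin
print(f"mean difference = {d_bar}, interval: {low:.4f} to {high:.4f}")
```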
Now at the 2.5% level, test the hypothesis that typing speeds have
increased after taking the course.
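$H_0: \mu_d \leq 0$
$H_1: \mu_d > 0$
one-sided test
$\alpha = 0.025$
degrees of freedom = 6
$t_{6,\,0.025} = 2.447$
[Figure: t-distribution centered at $\mu_d = 0$; "Accept H0" region below the critical value t = 2.447, "Reject H0" region in the upper tail; test statistic t = 2.1726 marked.]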
Calculate the test statistic:
$t = \frac{\text{estimate} - H_0\ \text{claim}}{\text{st. error}}$
$t = \frac{7 - 0}{3.2219} = 2.1726$
Accept the null hypothesis: 2.1726 < 2.447, so at the 2.5% level the
data do not show that typing speeds improved during the course.
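The general form of the statistic (estimate minus the value claimed under the null, divided by the standard error) can be written as one small helper and applied to the paired differences. The name `t_statistic` is hypothetical; the critical value is the table value quoted in the notes.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT = 2.447  # t_{6, 0.025}, table value quoted in the notes

def t_statistic(estimate, h0_claim, std_error):
    """Number of standard errors between the estimate and the H0 claim."""
    return (estimate - h0_claim) / std_error

diffs = [14, 8, -7, 11, 12, 14, -3]  # after minus before, one per person
std_error = stdev(diffs) / sqrt(len(diffs))

t_stat = t_statistic(mean(diffs), 0, std_error)
reject_h0 = t_stat > T_CRIT  # upper-tail test of H1: mean difference > 0

print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```

At the 5% level the one-sided critical value would be 1.9432, and the same statistic would then fall in the rejection region, so the conclusion depends on the significance level chosen.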
Concepts:
t-distribution
F-distribution
Skills:
Construct confidence interval and perform hypothesis test for
a mean from a small sample
Perform an F-test
Construct confidence interval and perform hypothesis test for the
difference between means from small, independent samples
Construct confidence interval and perform hypothesis test for the
difference between means from small, paired samples