Week 7: The T Distribution, Confidence Intervals and Tests

Inference for a Single Mean
To this point, when examining the mean of a population we have always
assumed that the population standard deviation (σ) was known.
In practice this is seldom the case.
We usually must estimate the population standard deviation σ with the
sample standard deviation s.
When we do this, the standardized sample mean no longer follows the
standard normal distribution, because of the extra variability introduced
by estimating σ with s.
Thus, instead of using Z, the standard normal distribution, we must
use a different distribution called the t distribution.
The t distributions
Although there is only one Z distribution, there are many, many t
distributions.
In fact, there is a different t distribution for each sample size used.
The shape of each t distribution is very similar to the Z distribution. The
shape is bell-shaped, the mean (center) is 0, but the spread is larger.
Since we are now estimating the true variance (standard deviation), we
must compensate by using a distribution with a larger spread (more
variability to compensate for guessing).
The larger the sample size, however, the better our estimate and the
closer the t distribution is to the Z distribution.
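The claim that the t distribution has a larger spread than Z, and approaches Z as the sample size grows, can be checked numerically. The sketch below is not part of the original slides: the helper names `t_pdf` and `t_upper_tail` are made up for illustration, and the tail area is approximated with Simpson's rule using only the Python standard library.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3        # symmetry: P(T > 0) = 0.5
    return upper if t >= 0 else 1.0 - upper

# Area to the right of 1.96 under Z, and under t for growing df.
z_tail = 1.0 - NormalDist().cdf(1.96)
dfs = [2, 10, 30, 100, 1000]
t_tails = [t_upper_tail(1.96, df) for df in dfs]

for df, tail in zip(dfs, t_tails):
    print(f"df = {df:4d}: P(T > 1.96) = {tail:.4f}")
print(f"Z        : P(Z > 1.96) = {z_tail:.4f}")
```

The t tail areas are larger than the Z tail area for every df and shrink toward it as df increases, which is the "larger spread" described above.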
The way we distinguish between various t distributions is by finding the
degrees of freedom (df) that correspond to the sample size.
For the one-sample case, the degrees of freedom are the sample size
minus one: df = n - 1.
We say that the one-sample t-statistic
\[ t_{n-1} = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
has the t distribution with n - 1 degrees of freedom. Notice the only
difference between the t and the z is that we use s instead of σ.
t distribution critical values can be found at
http://www.stat.tamu.edu/stat30x/zttables.php#ttable.
The degrees of freedom are listed in the left column.
The individual t-values are inside the table, rather than in the row/column
labels as in the Z table.
The probabilities on top (column headings) are areas to the right of the
t-values within the table, whereas the probabilities within the Z table are
areas to the left.
Confidence levels are given at the bottom of the table.
Make sure to get acquainted with this table and how it differs from the Z
table.
[Figure: differences between the Z and t tables]
How the Z and t distributions compare:
From
http://www.stat.tamu.edu/
jhardin/applets/signed/T.html
As the sample size n increases, the t distribution converges to the Z distribution.
Changing from σ to s forces us to change from z to \(t_{n-1}\) (for the one-sample
case), but the steps in producing confidence intervals and
hypothesis tests are the same as we have seen previously.
Remember from Week 1, s is calculated from the data using the
formula:
\[ s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} \]
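As a small illustration of the formula for s, the sketch below computes it by hand and compares it with `statistics.stdev` from the Python standard library, which also divides by n - 1. The data are only the four housing values that a later slide happens to list, used here purely as an assumption-free toy sample, not the full data set.

```python
import math
import statistics

# Four values listed on the housing slide; NOT the full sample of 85.
data = [6789, 8233, 4784, 5974]

n = len(data)
x_bar = sum(data) / n

# Sum of squared deviations from the mean, divided by n - 1, then square root.
s = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}")
print(f"statistics.stdev gives {statistics.stdev(data):.2f}")
```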
The one-sample Confidence Interval (CI) for μ with Unknown σ
The formula for a confidence interval for μ with unknown σ is
\[ \bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \]
\(t_{\alpha/2,\,n-1}\) is found in the t table. It must correspond to the appropriate
df = n - 1 row and the column for the desired confidence level, labeled at the bottom. It is
easier to find the confidence level at the bottom of the table and go up
to the correct df.
One-sample t Confidence Interval Example
An economist wants to determine the annual average amount that a
family of four in the United States spends on housing. He randomly
selects 85 families of size four and finds the amount they spent on
housing the previous year.
The economist wishes to estimate the mean with 99% confidence.
Information we have:
Sample Size: n = 85.
Data: $6,789, $8,233, $4,784, ..., $5,974 (85 numbers)
Calculated from the data: \(\bar{x}\) = $6,219, s = $1,978
Degrees of Freedom: df = n - 1 = 85 - 1 = 84
\[ \bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} = 6219 \pm 2.639 \cdot \frac{1978}{\sqrt{85}} = (5652.82,\ 6785.18) \]
\(t_{\alpha/2,\,n-1}\) is found in the t table. We first go to the 99% confidence level at
the bottom. Since 84 is NOT in the table, we then go up to 80 df
(always round df down, which means moving up in the table). Thus, \(t_{0.005,\,80} = 2.639\).
This is a 99% confidence interval for the true average amount a family
of four in the United States spends on housing annually.
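The arithmetic of the housing interval can be reproduced in a few lines. This is a minimal sketch that takes the critical value 2.639 from the t table exactly as the slide does; the variable names are illustrative.

```python
import math

n = 85
x_bar = 6219.0      # sample mean in dollars (from the slide)
s = 1978.0          # sample standard deviation in dollars (from the slide)
t_crit = 2.639      # table value t_{0.005, 80}, df rounded down from 84

margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"99% CI: ({lower:.2f}, {upper:.2f})")
```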
Test of Hypotheses Procedure for a 1-sample t test:
Testing Procedure:
1. State the null hypothesis, \(H_0\).
2. State the alternative hypothesis, \(H_A\).
3. Determine the level of significance (e.g., α = 0.05).
4. Calculate the test statistic (t-statistic):
\[ t_{n-1} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]
5. Determine the range of the p-value from the t table.
   For a greater than test, \(H_A: \mu > \mu_0\): p-value = \(P(T > t)\)
   For a less than test, \(H_A: \mu < \mu_0\): p-value = \(P(T < t)\)
   For a two-sided test, \(H_A: \mu \neq \mu_0\): p-value = \(P(T \geq |t| \text{ or } T \leq -|t|) = 2P(T \geq |t|)\)
   We cannot determine the exact p-value unless we use a computer, so it is
   more common to state a range for the p-value.
6. Reject or fail to reject \(H_0\) based on the p-value.
   If the p-value is less than or equal to α, reject \(H_0\).
   If the p-value is greater than α, fail to reject \(H_0\).
7. State your conclusion.
   If \(H_0\) is rejected with the two-sided alternative: "There is significant
   statistical evidence that the population mean is different from \(\mu_0\)."
   If \(H_0\) is not rejected: "There is NOT significant statistical evidence
   that the population mean is different from \(\mu_0\)."
Notice that the steps are exactly the same as for the case where σ is known
except for the test statistic formula and the p-value.
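Steps 5 and 6 of the procedure can be sketched in code. The helpers below (`t_upper_tail`, `t_p_value`, `decide`) are hypothetical names, and the tail area is approximated numerically with Simpson's rule rather than read from the table; the test statistic at the end is a made-up value, not one from the slides.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3        # symmetry: P(T > 0) = 0.5
    return upper if t >= 0 else 1.0 - upper

def t_p_value(t, df, alternative):
    """Step 5: p-value for a 'greater', 'less', or 'two-sided' alternative."""
    if alternative == "greater":
        return t_upper_tail(t, df)
    if alternative == "less":
        return 1.0 - t_upper_tail(t, df)
    if alternative == "two-sided":
        return 2.0 * t_upper_tail(abs(t), df)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

def decide(p_value, alpha=0.05):
    """Step 6: reject when the p-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Hypothetical test statistic and degrees of freedom, for illustration only.
t_stat, df = 2.0, 10
p_greater = t_p_value(t_stat, df, "greater")
p_less = t_p_value(t_stat, df, "less")
p_two = t_p_value(t_stat, df, "two-sided")

for name, p in [("greater", p_greater), ("less", p_less), ("two-sided", p_two)]:
    print(f"{name:9s}: p-value = {p:.4f} -> {decide(p)}")
```

The same statistic can lead to different decisions depending on the alternative, which is why the alternative must be stated before looking at the data.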
TV Example: Suppose that the data collected from a class survey is a
random sample from the entire university (which it obviously is not). We wish to
see if there is evidence that the average amount of television watched by
students here is more than 7 hours per week.
Sample Size: n = 38
\(\bar{x}\) = 8.05, s = 7.46
Degrees of Freedom: df = 37
State the null hypothesis: \(H_0: \mu = 7\)
State the alternative hypothesis: \(H_A: \mu > 7\)
State the level of significance: α = 0.05
Calculate the test statistic.
\[ t_{n-1} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{8.05 - 7}{7.46/\sqrt{38}} = 0.868 \]
Find the p-value.
p-value = \(P(T \geq t) = P(T \geq 0.868)\)
Since there isn't a row for df = 37, we use df = 30. Our test statistic is 0.868, which falls
between 0.854 and 1.055. So according to the top of the table, our p-value
falls between 0.15 and 0.20. Note: since this is a greater than test and the
areas at the top are areas greater than the corresponding t-values, we did
not subtract from 1 nor multiply by 2.
Do we reject or fail to reject \(H_0\) based on the p-value?
α = 0.05 < 0.15 < p-value < 0.20, so we fail to reject \(H_0\).
State the conclusion: There is NOT significant statistical evidence that
the average amount of television watched is more than 7 hours per
week at the 0.05 level of significance.
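The TV example can be traced in code using only the two table entries quoted on the slide for the df = 30 row. This is a sketch of the table-lookup reasoning, not an exact p-value calculation; the variable names are illustrative.

```python
import math

n, x_bar, s, mu_0 = 38, 8.05, 7.46, 7.0
alpha = 0.05

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Two entries from the df = 30 row of the t table, as (right-tail area, t-value).
(p_hi, t_lo), (p_lo, t_hi) = (0.20, 0.854), (0.15, 1.055)

# If the statistic lies between the two t-values, the p-value lies between the areas.
in_bracket = t_lo < t_stat < t_hi

# The whole p-value range sits above alpha, so we cannot reject.
fail_to_reject = in_bracket and p_lo > alpha

print(f"t = {t_stat:.3f}")
print(f"{p_lo} < p-value < {p_hi}: {in_bracket}")
print("fail to reject H0" if fail_to_reject else "decision needs a closer look")
```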
Inference for a Single Mean - Dependent Two Samples (Matched Pairs)
To this point, we have only looked at tests for a single sample.
Soon we will look at confidence intervals and hypothesis tests for
comparing two groups.
There is a special 2-sample case in which each individual can be given
both treatments. These two samples are not independent. We can
reduce the two samples to a single sample using a matched pairs
design and eliminate one source of variability.
To analyze matched pairs data, we first reduce the data from two
samples to one sample by taking the difference for each individual and
then use the 1-sample test.
Example of matched pairs design
Matched Pairs Examples:
Students are each given a pre-test and a post-test to determine the
amount of material learned in a given time interval.
To examine the effect of a new drug, a large group of identical twins is
identified. One twin is given a treatment and the other a placebo.
An ophthalmologist is examining the importance of the dominant eye in
reading. A large group of subjects is asked to read a passage with
the dominant eye covered and again with the non-dominant eye covered.
It can be seen in each of these examples that something pairs the two
responses.
The data is reduced from two samples to one by subtracting one of the
responses from the other.
We could subtract each pre-test score from each post-test score.
We could subtract each placebo response from each treatment
response.
We could subtract the time taken to read the passage with the
non-dominant eye from that with the dominant eye.
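Reducing paired data to one sample is a single subtraction per individual. The sketch below uses made-up pre-test and post-test scores (they do not come from the slides) to show the reduction and the summary statistics that the one-sample procedures then use.

```python
import statistics

# Hypothetical scores for six students; position i is the same student in both lists.
pre = [62, 70, 55, 81, 68, 74]
post = [68, 75, 59, 80, 75, 79]

# One difference per student: post-test minus pre-test.
diffs = [after - before for before, after in zip(pre, post)]

n_d = len(diffs)                    # number of differences, not 2 * 6 observations
x_bar_d = statistics.mean(diffs)    # mean of the differences
s_d = statistics.stdev(diffs)       # standard deviation of the differences (n_d - 1 divisor)

print(f"differences = {diffs}")
print(f"n_d = {n_d}, x_bar_d = {x_bar_d:.3f}, s_d = {s_d:.3f}, df = {n_d - 1}")
```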
Matched Pairs Confidence Interval
After reducing the data to a single sample, we use the same formula as
for a confidence interval for μ with unknown σ, namely,
\[ \bar{x}_d \pm t_{\alpha/2,\,n_d-1} \frac{s_d}{\sqrt{n_d}} \]
where \(\bar{x}_d\) and \(s_d\) are calculated from the differences, \(n_d\) is the number of
differences, and \(t_{\alpha/2,\,n_d-1}\) is from the t table with \(df = n_d - 1\). Note that this
\(n_d\) is actually half the total number of observations since there are two
observations per individual.
Golf Ball Example:
In the manufacture of golf balls two procedures are used. Method I
utilizes a liquid center and method II, a solid center. To compare the
distance obtained using both types of balls, 12 golfers are allowed to
drive a ball of each type, and the length of the drive (in yards) is
measured. (from Milton, McTeer, and Corbet, Introduction to Statistics,
1997)
The manufacturer wants to estimate the mean difference with 90%
confidence.
Sample Size: \(n_d = 12\)
\(\bar{x}_d = 9.52\)
\(s_d = 3.12\)
\(df = n_d - 1 = 11\)
\[ \bar{x}_d \pm t_{\alpha/2,\,n_d-1} \frac{s_d}{\sqrt{n_d}} = 9.52 \pm 1.796 \cdot \frac{3.12}{\sqrt{12}} = (7.90,\ 11.14) \]
To find \(t_{0.05,\,11}\), we first go to the 90% confidence level at the bottom.
Then we go up to find df = 11. Thus, \(t_{0.05,\,11} = 1.796\).
This is a 90% confidence interval for the true average difference in the
distance traveled for the two types of golf balls.
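The golf ball interval follows the same pattern as the one-sample interval, applied to the differences. A minimal sketch, taking the critical value 1.796 from the t table as on the slide:

```python
import math

n_d = 12
x_bar_d = 9.52     # mean difference in yards (from the slide)
s_d = 3.12         # standard deviation of the differences (from the slide)
t_crit = 1.796     # table value t_{0.05, 11}

margin = t_crit * s_d / math.sqrt(n_d)
lower, upper = x_bar_d - margin, x_bar_d + margin

print(f"90% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```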
Paired t test
Hypothesis Test for Paired Data:
Again reduce the data to a single sample, then use the 1-sample t-test.
\[ t_{n_d-1} = \frac{\bar{x}_d - \mu_d}{s_d/\sqrt{n_d}} \]
Again, note that the degrees of freedom are from the reduced one
sample.
Keyboard Example: Suppose we want to compare two brands of computer
keyboards, which we will denote as keyboard 1 and keyboard 2. Keyboard 1
is a standard keyboard, while keyboard 2 is specially designed so that the
keys need very little pressure to make them respond. The manufacturer of
keyboard 2 would like to claim that typing can be done faster using keyboard
2. A simple random sample of n = 30 teachers was selected from a
population of high-school teachers attending a national conference. Each
teacher typed the same page of text once using keyboard 1 and once using
keyboard 2. For each teacher the order in which the keyboards were used
was determined by the toss of a coin. The variable measured was the time
(in seconds) for each teacher to correctly type the page of text (from
Graybill, Iyer and Burdick, Applied Statistics, 1998).
Sample Size: \(n_d = 30\)
\(\bar{x}_d = \bar{x}_2 - \bar{x}_1 = -3.53\)
\(s_d = 8.56\)
\(df = n_d - 1 = 29\)
\(H_0: \mu_d = \mu_2 - \mu_1 = 0\)
\(H_A: \mu_d < 0\)
Why less than? The claim is that typing is faster on keyboard 2, so the
mean time difference \(\mu_2 - \mu_1\) would be negative.
Determine the level of significance: α = 0.05
Calculate the test statistic:
\[ t_{29} = \frac{\bar{x}_d - \mu_d}{s_d/\sqrt{n_d}} = \frac{-3.53 - 0}{8.56/\sqrt{30}} = -2.26 \]
Keyboard Example conclusion:
Find the p-value.
p-value = \(P(T \leq t) = P(T \leq -2.26)\), which is between 0.01 and 0.02.
Use 29 degrees of freedom.
0.01 < p-value < 0.02 < α = 0.05. Therefore, we reject \(H_0\) and
conclude in favor of \(H_A\).
State the conclusion: There is significant statistical evidence that the
average amount of time needed to type the passage is lower for
keyboard 2 than keyboard 1 at the 0.05 level of significance.
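With a computer the keyboard p-value does not have to be bracketed from the table. The sketch below assumes the mean difference is -3.53 (keyboard 2 minus keyboard 1, consistent with the less-than alternative) and approximates the lower-tail area numerically; `t_pdf` and `t_upper_tail` are hypothetical helper names, not library functions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3        # symmetry: P(T > 0) = 0.5
    return upper if t >= 0 else 1.0 - upper

n_d, x_bar_d, s_d, mu_d0 = 30, -3.53, 8.56, 0.0
alpha = 0.05

t_stat = (x_bar_d - mu_d0) / (s_d / math.sqrt(n_d))
df = n_d - 1

# Less-than alternative: the p-value is the area to the LEFT of the statistic.
p_value = 1.0 - t_upper_tail(t_stat, df)

print(f"t = {t_stat:.2f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```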
Comparing Two Means - Independent Two Samples w/ \(\sigma_1 \neq \sigma_2\)
Two Independent Populations Means
What about comparing two samples or populations?
First, we can compare the samples graphically.
Histograms or stemplots for each.
The easiest way is side-by-side boxplots since the datasets are
put on the same scale.
We can also compare means, medians, standard deviations, etc.
But we can't make any conclusions without statistical evidence.
To compare two independent population means, we test the difference
between the means. But we need the sampling distribution for the
difference. Suppose \(X \sim N(\mu_X, \sigma_X^2)\) and \(Y \sim N(\mu_Y, \sigma_Y^2)\) are
independent.
Then \(\mu_{X-Y} = \mu_X - \mu_Y\) and \(\sigma_{X-Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}\). In particular,
\[ X - Y \sim N\left(\mu_X - \mu_Y,\ \left(\sqrt{\sigma_X^2 + \sigma_Y^2}\right)^2\right) \]
Remember, you must find the variance first and then take the
square root to get the standard deviation. So whether we want the
sum or the difference, we always SUM the variances and then
take the square root.
For example:
Let X and Y denote the average scores on the first test for two different
sections, respectively. (In reality, these probably aren't independent if they
cover the same material.) It is known that
\[ X \sim N(80, 4^2), \qquad Y \sim N(70, 3^2) \]
Since the same formulas work for means:
\[ X - Y \sim N\left(80 - 70,\ \left(\sqrt{4^2 + 3^2}\right)^2\right) = N(10, 5^2) \]
If not given the distribution of the means, only the individuals, remember the
variance for the mean, \(\sigma_{\bar{X}}^2 = \sigma_X^2 / n\).
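The rule "sum the variances, then take the square root" can be checked with the test-score example. Python's `statistics.NormalDist` supports subtracting two independent normal variables, so the sketch below compares that result with the hand calculation.

```python
import math
from statistics import NormalDist

X = NormalDist(mu=80, sigma=4)   # X ~ N(80, 4^2)
Y = NormalDist(mu=70, sigma=3)   # Y ~ N(70, 3^2)

# By hand: subtract the means, SUM the variances, then take the square root.
mean_diff = X.mean - Y.mean
sd_diff = math.sqrt(X.variance + Y.variance)

# NormalDist subtraction treats X and Y as independent.
diff = X - Y

print(f"by hand   : N({mean_diff:.0f}, {sd_diff:.0f}^2)")
print(f"NormalDist: N({diff.mean:.0f}, {diff.stdev:.0f}^2)")
```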
When we are interested in comparing two population means and we are
estimating the population standard deviations \(\sigma_1\) and \(\sigma_2\) with \(s_1\) and \(s_2\),
the conservative two-sample t-statistic is then
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \]
with degrees of freedom equal to the smaller of \(n_1 - 1\) and \(n_2 - 1\), i.e.,
\(df = \min\{n_1 - 1,\ n_2 - 1\}\).
Two Sample t-tests
The null hypothesis can be any of the following:
\(H_0: \mu_1 = \mu_2\)
\(H_0: \mu_1 \leq \mu_2\)
\(H_0: \mu_1 \geq \mu_2\)
The alternative hypothesis can be any of the following:
\(H_A: \mu_1 \neq \mu_2\)
\(H_A: \mu_1 > \mu_2\)
\(H_A: \mu_1 < \mu_2\)
The other steps are the same as those used for the tests we have
looked at previously.
Tomato Example
There has been some discussion among amateur gardeners about the
virtues of black plastic versus newspapers as weed inhibitors for
growing tomatoes. To compare the two, several rows of tomatoes are
planted. Black plastic is used around nine randomly selected plants and
newspaper around the remaining ten. All plants start at virtually the
same height and receive the same care. The response of interest is the
height in feet after a month of growth. (from Milton, McTeer, and Corbet,
Introduction to Statistics, 1997).
Perform a test to see if there is any difference between the average
heights with significance level 0.10.
Sample Sizes: \(n_1 = 9\), \(n_2 = 10\)
\(\bar{x}_1 = 1.87\), \(\bar{x}_2 = 1.49\)
\(s_1 = 0.63\), \(s_2 = 0.43\)
\(df = n_1 - 1 = 9 - 1 = 8\) because \(n_1\) is smaller than \(n_2\).
\(H_0: \mu_1 = \mu_2\), i.e., \(\mu_1 - \mu_2 = 0\)
\(H_A: \mu_1 \neq \mu_2\)
Calculate the test statistic.
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{(1.87 - 1.49) - 0}{\sqrt{0.63^2/9 + 0.43^2/10}} = 1.519 \]
Note: We will use the computer for this calculation. The formula is only
provided for completeness.
Find the p-value.
p-value = \(2P(T \geq |t|) = 2P(T \geq 1.519)\) = 2(between 0.05 and 0.10)
= between 0.10 and 0.20
Use df = 8.
α = 0.10 < p-value < 0.20, so we fail to reject \(H_0\).
State the conclusion: There is not significant statistical evidence that
the average tomato plant heights are different for the two types of weed
inhibitors at the 0.10 level of significance.
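The conservative two-sample test for the tomato data can be sketched as follows. The degrees of freedom use the conservative rule min(n1 - 1, n2 - 1) from the slides, and the p-value is approximated numerically; `t_pdf` and `t_upper_tail` are hypothetical helper names. Statistical software often uses a different df approximation for this test, so its p-value may differ somewhat.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3        # symmetry: P(T > 0) = 0.5
    return upper if t >= 0 else 1.0 - upper

n1, x_bar1, s1 = 9, 1.87, 0.63     # black plastic
n2, x_bar2, s2 = 10, 1.49, 0.43    # newspaper
alpha = 0.10

std_err = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = ((x_bar1 - x_bar2) - 0.0) / std_err
df = min(n1 - 1, n2 - 1)           # conservative degrees of freedom

# Two-sided alternative: double the area beyond |t|.
p_value = 2.0 * t_upper_tail(abs(t_stat), df)

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```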
Two-Sample t Confidence Interval
Confidence Intervals
The confidence interval for the difference of two population means
\((\mu_1 - \mu_2)\) is
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
where \(t_{\alpha/2,\,df}\) corresponds to the desired confidence level and
\(df = \min\{n_1 - 1,\ n_2 - 1\}\).
Commercial Example:
There is some concern that TV commercial breaks are becoming
longer. The observations on the following slide are obtained on the
length in minutes of commercial breaks for the 1984 viewing season
and the current season. (from Milton, McTeer, and Corbet, Introduction
to Statistics, 1997)
Find a 95% confidence interval for the difference between the true
averages of the two seasons.
Sample Sizes: \(n_1 = 16\), \(n_2 = 16\)
\(\bar{x}_1 = 2.01\), \(\bar{x}_2 = 2.36\)
\(s_1 = 0.49\), \(s_2 = 0.19\)
\(df = n_1 - 1 = 16 - 1 = 15\) because \(n_1\) and \(n_2\) are the same.
\(t_{0.025,\,15} = 2.131\). Go to the 95% confidence level at the bottom. Then go up to df = 15.
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{0.025,\,15} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (2.01 - 2.36) \pm 2.131 \sqrt{\frac{0.49^2}{16} + \frac{0.19^2}{16}} = (-0.63,\ -0.07) \]
This is a 95% confidence interval for the true difference in average
length in minutes of commercial breaks between 1984 and the present.
At the 5% significance level, we could conclude that there is a difference
in the average length of commercial breaks between 1984 and the present.
Why? Because the interval does not contain 0.
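The commercial-break interval, and the check of whether it contains 0, can be reproduced directly. A minimal sketch using the table value 2.131 from the slide:

```python
import math

n1, x_bar1, s1 = 16, 2.01, 0.49    # 1984 season
n2, x_bar2, s2 = 16, 2.36, 0.19    # current season
t_crit = 2.131                     # table value t_{0.025, 15}

std_err = math.sqrt(s1**2 / n1 + s2**2 / n2)
margin = t_crit * std_err
lower = (x_bar1 - x_bar2) - margin
upper = (x_bar1 - x_bar2) + margin

# A 95% interval that excludes 0 matches rejecting "no difference" at the 5% level.
contains_zero = lower <= 0.0 <= upper

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"contains 0: {contains_zero}")
```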
Comparing Two Means - Independent Two Samples w/ \(\sigma_1 = \sigma_2\)
Pooled Estimator
Added Assumption of Equal Variances
Previously, we discussed two-sample t procedures from two
populations with two unknown standard deviations. We then used the
sample standard deviations to estimate the population standard
deviations. But what if the two populations have the same standard
deviation? Then both samples estimate the same variance \(\sigma^2\), and we can
combine them into a single estimate. This estimate, \(s_p^2\), is called the
pooled estimator of \(\sigma^2\) because it
combines (pools) the information in both samples.
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
The pooled variance is a weighted average of the two sample variances, with
weights based on each sample size. This gives us a better estimate of the true
population standard deviation since it is based on a larger sample of
\(n_1 + n_2\) observations (more data).
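As a quick illustration of the pooled estimator, the sketch below applies the formula to the tomato summary statistics used elsewhere in these notes (n1 = 9, s1 = 0.63, n2 = 10, s2 = 0.43). The function name `pooled_variance` is made up for this example.

```python
import math

def pooled_variance(n1, s1, n2, s2):
    """Weighted average of the sample variances, weights n1 - 1 and n2 - 1."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

n1, s1 = 9, 0.63
n2, s2 = 10, 0.43

s_p_sq = pooled_variance(n1, s1, n2, s2)
s_p = math.sqrt(s_p_sq)

print(f"s_p^2 = {s_p_sq:.4f}, s_p = {s_p:.2f}")
```

The pooled standard deviation always lands between the two sample standard deviations, closer to the one from the larger sample.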
Test Statistics
Suppose that an SRS of size \(n_1\) is drawn from a normal population with
unknown mean \(\mu_1\) and that an independent SRS of size \(n_2\) is drawn
from another normal population with unknown mean \(\mu_2\). Suppose
further that we know that the two populations have the SAME standard
deviation. Then, for testing \(H_0: \mu_1 = \mu_2\), the pooled two-sample t statistic is
\[ t_{n_1+n_2-2} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}} \]
The degrees of freedom are \(df = n_1 + n_2 - 2\) due to the larger pooled
sample.
The added assumption for the pooled t-test makes it more powerful
than the 2-sample t-test, which means it is easier to detect a false \(H_0\).
HT and CI
Hypothesis test procedures are the same. Only the test statistic with
\(df = n_1 + n_2 - 2\) and p-value are different.
A level \(1 - \alpha\) confidence interval for \(\mu_1 - \mu_2\) is
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]
where \(t_{\alpha/2,\,n_1+n_2-2}\) corresponds to the desired confidence level and
\(df = n_1 + n_2 - 2\).
Again, all of these calculations will be done with the computer, NEVER
by hand.
Tomato Example Revisited: We return to the tomato example above,
but now assume the two populations have the same standard deviation, i.e.,
assume that \(\sigma_1 = \sigma_2\).
Sample Sizes: \(n_1 = 9\), \(n_2 = 10\)
\(\bar{x}_1 = 1.87\), \(\bar{x}_2 = 1.49\)
\(s_1 = 0.63\), \(s_2 = 0.43\)
Note: In practice, we should NOT assume equal variances unless told
so or the values of sample variance are very close to each other. The
rule of thumb says one should be no more than twice the size of the other.
\(df = n_1 + n_2 - 2 = 9 + 10 - 2 = 17\) (note the df is larger than before).
\(H_0: \mu_1 = \mu_2\)
\(H_A: \mu_1 \neq \mu_2\)
Calculate the test statistic (from the computer).
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{1/n_1 + 1/n_2}} = \frac{(1.87 - 1.49) - 0}{0.53 \sqrt{1/9 + 1/10}} = 1.550 \]
Since the sample standard deviations were close, this value isn't much
different from the 2-sample t = 1.519.
The p-value, however, comes from a different row of the t table: df = 17
vs. df = 8.
p-value = \(2P(T \geq |t|) = 2P(T \geq 1.550)\) = 2(between 0.05 and 0.10)
= between 0.10 and 0.20
In this case, though, the p-value range is the same. From the computer,
the true p-values are 0.1395 and 0.1527. With a similar test statistic, the
larger df of the pooled t-test gives a smaller p-value, so it is easier to reject.
α = 0.10 < p-value < 0.20, so we fail to reject \(H_0\).
State the conclusion: the same as before.
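The pooled test for the tomato data can be sketched end to end and set beside the conservative test. The p-values are approximated numerically with hypothetical helpers (`t_pdf`, `t_upper_tail`); the conservative test here uses df = 8 as in the slides, which is not necessarily the df a software package would use.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3        # symmetry: P(T > 0) = 0.5
    return upper if t >= 0 else 1.0 - upper

n1, x_bar1, s1 = 9, 1.87, 0.63
n2, x_bar2, s2 = 10, 1.49, 0.43

# Pooled test: one common standard deviation, df = n1 + n2 - 2.
s_p = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t_pooled = (x_bar1 - x_bar2) / (s_p * math.sqrt(1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2
p_pooled = 2.0 * t_upper_tail(abs(t_pooled), df_pooled)

# Conservative test: separate standard deviations, df = min(n1 - 1, n2 - 1).
t_cons = (x_bar1 - x_bar2) / math.sqrt(s1**2 / n1 + s2**2 / n2)
df_cons = min(n1 - 1, n2 - 1)
p_cons = 2.0 * t_upper_tail(abs(t_cons), df_cons)

print(f"pooled      : t = {t_pooled:.3f}, df = {df_pooled}, p-value = {p_pooled:.4f}")
print(f"conservative: t = {t_cons:.3f}, df = {df_cons}, p-value = {p_cons:.4f}")
```

For these data both p-values fall in the same table range, so the decision at α = 0.10 is the same under either test.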
Comparison of the Two Sample Tests
Power vs. Conservatism
With every added assumption, we get more power.
The z-test is more powerful than the t-test since we assume the
variances are known.
The pooled t-test is more powerful than the 2-sample t-test since we
assume that the unknown variances are equal.
The larger the df, the more powerful the test. Notice that the z is the
last row of the t table. It has the largest possible sample, the whole
population.
To be conservative means to use a smaller df and so we get larger
p-values and wider confidence intervals. If we reject with one df, we
would also reject with a larger df (more data) assuming that \(H_0\) is false.
The paired t-test is even more powerful than the pooled t-test, but not
because of its degrees of freedom.
In the paired t-test procedure, we reduce the total variance by
eliminating the difference in the individuals.
There are (at least) two sources of variability in any two sample
situation: the difference due to the different population means and the
difference due to the different individuals.
The smaller the variance, the easier it is to see a difference in means.
Remember s or σ is always in the denominator. The smaller the
denominator, the larger the test statistic. The larger the test statistic,
the smaller the p-value and the more likely we are to reject.
Example: We want to know if retaking the SAT will improve your score.
There is the difference in the test scores for each person (paired
difference). It is unlikely you will make exactly the same score, but the
question is whether everyone improved.
There is also a difference in people's scores. Some people will do well
both times and some will not do as well either time. (Did the person
sitting next to you get the same score?)
The paired t-test gets rid of the difference in individuals by only
comparing within an individual.
But remember, this test requires paired, dependent data. Often we
cannot do this due to time constraints, etc.
