
Chapter 8 Student’s t, χ² and F Distributions
Chapter Goals
• We have learned about confidence intervals and hypothesis
testing.
• Consider inference about µ when σ is unknown, using the
classical procedure.
• Consider inference about variance and standard
deviation.
• Consider inference concerning the ratio of
variances using two independent samples.

8.1 Inference About Mean µ (σ unknown)
• Inferences about µ are based on the sample mean x̄.
• If the sample size is large or the sampled population
is normal, then $z^* = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ has a
standard normal distribution.
• If σ is unknown, use s as a point estimate for σ.
• Estimated standard error of the mean: $s/\sqrt{n}$.
• Student’s t-statistic:
$t = (\bar{x} - \mu)/(s/\sqrt{n})$

Assumption: Samples are taken from normal populations.

Properties of the t-Distribution (df > 2):


1. t is distributed with a mean of 0.
2. t is distributed symmetrically about its mean.
3. t is distributed so as to form a family of distributions, a
separate distribution for each different number of degrees
of freedom (df ≥ 1)
4. The t-distribution approaches the normal distribution as the
number of degrees of freedom increases.
5. t is distributed with a variance greater than 1, but as the
degrees of freedom increase, the variance approaches 1.
6. t is distributed so as to be less peaked at the mean and
thicker at the tails than the normal distribution.
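
Property 5 can be made precise with a standard result that the slides themselves do not state: the variance of Student’s t exists only for df > 2, which is presumably why the heading above restricts the list to df > 2.

```latex
\operatorname{Var}(t) = \frac{df}{df - 2} > 1 \quad (df > 2),
\qquad
\lim_{df \to \infty} \frac{df}{df - 2} = 1
```

For instance, df = 5 gives 5/3 ≈ 1.67 and df = 15 gives 15/13 ≈ 1.15, so the variance shrinks toward 1 as df grows.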

[Figure: Student’s t-distributions. Curves shown: normal
distribution; Student’s t, df = 15; Student’s t, df = 5.
Horizontal axis: t, centered at 0.]

Degrees of Freedom, df:


A parameter that identifies each different distribution of
Student’s t-distribution. For the methods presented in this
chapter, the value of df will be the sample size minus 1,
df = n − 1.

[Figure: t-distribution showing t(df, α), the point on the
t axis with area α in the tail to its right.]

Example: Find the value of t(12, 0.025).

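[Figure: t-distribution with area 0.025 in each tail; the tails
begin at −t(12, 0.025) = −2.18 and t(12, 0.025) = 2.18.]

Portion of Table 6 (amount of α in one tail): the row df = 12
and the column α = 0.025 intersect at 2.18, so
t(12, 0.025) = 2.18.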

The assumption for inferences about mean µ when σ is
unknown:
The sampled population is normally distributed.

Confidence Interval Procedure:


1. The procedure for constructing confidence intervals is
similar to that used when σ is known.
2. Use t in place of z. Use s in place of σ.
3. The formula for the 1 − α confidence interval for µ is

$\bar{x} - t(df, \alpha/2)\,\frac{s}{\sqrt{n}}$ to $\bar{x} + t(df, \alpha/2)\,\frac{s}{\sqrt{n}}$, where df = n − 1

Example: A study is conducted to learn how long it takes the
typical taxpayer to complete their federal income tax return.
A random sample of 17 income tax filers showed a mean time
(in hours) of 7.8 and a standard deviation of 2.3. Find a 95%
confidence interval for the true mean time required to
complete a federal income tax return. Assume the time to
complete the return is normally distributed.

Solution:
1. Parameter of Interest: the mean time required to complete
a federal income tax return.
2. Confidence Interval Criteria:
a. Assumptions: Sampled population assumed normal, σ
unknown.
b. Test statistic: t will be used.
c. Confidence level: 1 − α = 0.95
3. The Sample Evidence:
n = 17, x̄ = 7.8, and s = 2.3
4. The Confidence Interval:
a. Confidence coefficient: t(df, α/2) = t(16, 0.025) = 2.12
b. Maximum error:
$E = t(16, 0.025) \cdot \frac{s}{\sqrt{n}} = (2.12) \cdot \frac{2.3}{\sqrt{17}} = (2.12)(0.5578) = 1.18$
c. Confidence limits:
x̄ − E to x̄ + E
7.8 − 1.18 to 7.8 + 1.18
6.62 to 8.98
5. The Results:
6.62 to 8.98 is the 95% confidence interval for µ.
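
The arithmetic of this example can be checked with a short Python sketch. The function name `t_confidence_interval` is an illustrative choice, not something from the slides, and the critical value 2.12 is simply typed in from the t table rather than computed.

```python
import math

def t_confidence_interval(mean, s, n, t_crit):
    """Return (E, lower, upper) for a confidence interval for the mean.

    t_crit is t(df, alpha/2) with df = n - 1, read from a t table.
    """
    standard_error = s / math.sqrt(n)    # estimated standard error of the mean
    max_error = t_crit * standard_error  # maximum error E
    return max_error, mean - max_error, mean + max_error

# Tax-return example: n = 17, sample mean 7.8, s = 2.3, t(16, 0.025) = 2.12
E, lower, upper = t_confidence_interval(7.8, 2.3, 17, 2.12)
print(f"E = {E:.2f}; interval: {lower:.2f} to {upper:.2f}")
# prints "E = 1.18; interval: 6.62 to 8.98"
```

Passing the table value as a parameter keeps the sketch dependent only on the standard library; a statistics package could supply the critical value instead.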

Hypothesis-Testing Procedure:
1. The t-statistic is used to complete a hypothesis
test about a population mean µ.
2. The test statistic:
$t^* = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ with df = n − 1
3. The calculated t* is the number of estimated
standard errors that x̄ lies from the hypothesized
mean µ.
4. Use the classical approach.

Example: A random sample of 25 students registering for
classes showed the mean waiting time in the registration line
was 22.6 minutes and the standard deviation was 8.0 minutes.
Is there any evidence to support the student newspaper’s
claim that registration time takes longer than 20 minutes?
Use α = 0.05 and assume waiting time is approximately
normal.

Solution:
1. The Set-up:
a. Population parameter of concern: the mean waiting time
spent in the registration line.
b. State the null and alternative hypotheses:
H0: µ = 20 (≤) (no longer than)
Ha: µ > 20 (longer than)

2. The Hypothesis Test Criteria:


a. Check the assumptions: The sampled population is
approximately normal.
b. Test statistic: t* with df = n − 1 = 24
c. Level of significance: α = 0.05
3. The Sample Evidence:
a. Sample information: n = 25, x̄ = 22.6, and s = 8
b. Calculate the value of the test statistic:
$t^* = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{22.6 - 20}{8/\sqrt{25}} = \frac{2.6}{1.6} = 1.625$

4. The Probability Distribution:
a. The critical value: t(24, 0.05) = 1.71
b. t* is not in the critical region.
5. The Results:
a. Decision: Fail to reject H0.
b. Conclusion: There is insufficient evidence to show the
mean waiting time is greater than 20 minutes.
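
As a check on the registration-line example, the classical decision rule can be sketched in Python. The helper name `t_statistic` is hypothetical, and the critical value 1.71 is taken from the table as quoted above rather than computed.

```python
import math

def t_statistic(mean, mu0, s, n):
    """Number of estimated standard errors between the sample mean and mu0."""
    return (mean - mu0) / (s / math.sqrt(n))

# Registration example: n = 25, sample mean 22.6, s = 8, H0: mu = 20
t_star = t_statistic(22.6, 20, 8.0, 25)
t_crit = 1.71  # t(24, 0.05), read from the t table

# Right-tailed test: reject H0 only when t* falls beyond the critical value
reject_h0 = t_star > t_crit
print(f"t* = {t_star:.3f}, reject H0: {reject_h0}")
# prints "t* = 1.625, reject H0: False"
```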

8.2 Inferences About Variance and
Standard Deviation

• Problems often arise that require us to make inferences
about variability.
• We usually use the variance to make inferences about
variability.
• Inferences about the variance of a normally distributed
population use the chi-square, χ², distribution.

Background:
1. The chi-square distributions are a family of probability
distributions.
2. Each distribution is identified by a parameter called the
number of degrees of freedom.

Properties of the Chi-Square Distribution:


1. χ² is nonnegative in value; it is zero or positively valued.
2. χ² is not symmetrical; it is skewed to the right.
3. χ² is distributed so as to form a family of distributions, a
separate distribution for each different number of degrees
of freedom.

[Figure: Various chi-square distributions. Curves shown:
χ², df = 6; χ², df = 10; χ², df = 20. Horizontal axis: χ²,
starting at 0.]

Critical Values for Chi-Square Distribution

1. χ²(df, α): the symbol used to identify the critical value of
a chi-square distribution with df degrees of freedom. It
denotes the point on the measurement axis that has α of the
area to its right.
2. Since the chi-square distribution is not symmetrical, the
critical values associated with right and left tails are given
separately in Table 8.

[Figure: χ² distribution showing χ²(df, α), the point on the
χ² axis with area α to its right.]

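Example: Find the value of χ²(12, 0.99).

[Figure: χ² distribution with area 0.01 to the left of
χ²(12, 0.99) = 3.57 and area 0.99 to its right.]

Portion of Table 8: the row df = 12 and the column for area to
the right 0.99 (area in left-hand tail 0.010) intersect at
3.57, so χ²(12, 0.99) = 3.57.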

The assumption for inferences about the variance σ² or
standard deviation σ:
The sampled population is normally distributed.

Hypothesis-Testing Procedure:
1. The statistical procedures for standard deviation are very
sensitive to nonnormal distributions (skewness, in
particular). This makes it difficult to decide if a significant
result is due to sample evidence or a violation of
assumptions.
2. The test statistic: $\chi^{2*} = \frac{(n - 1)s^2}{\sigma^2}$

If the random sample is drawn from a normal population
with known variance σ², then χ²* has a chi-square
distribution with n − 1 degrees of freedom.

Example: A machine used to fill 5-gallon buckets of driveway
sealer has a standard deviation of 2.5 ounces. A random sample
of 24 buckets showed a standard deviation of 2.9 ounces. Is
there any evidence to suggest an increase in variability at the
0.05 level of significance? Assume the amount of driveway
sealer in a bucket is normally distributed.

Solution (The Classical Approach):


1. The Set-up:
a. Population parameter of concern: the variance σ² for the
amount of driveway sealer in a 5-gallon bucket.
b. State the null and alternative hypotheses:
H0: σ² = 6.25 (≤) (variance is not larger than 6.25)
Ha: σ² > 6.25 (variance is larger than 6.25)

2. The Hypothesis Test Criteria:
a. Check the assumptions: The sampled population is
assumed to be normally distributed.
b. Test statistic: χ²* with df = n − 1 = 24 − 1 = 23
c. Level of significance: α = 0.05
3. The Sample Evidence:
a. Sample information: n = 24, s² = (2.9)² = 8.41
b. Calculate the value of the test statistic:

$\chi^{2*} = \frac{(n - 1)s^2}{\sigma^2} = \frac{(24 - 1)(8.41)}{6.25} = 30.95$


4. The Probability Distribution:


a. The critical value:
χ²(23, 0.05) = 35.2
b. χ²* is not in the critical region.

5. The Results:
a. Decision: Fail to reject H0.
b. Conclusion: There is insufficient evidence to show the
variability has increased.
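
The driveway-sealer test can be reproduced with a few lines of Python. This is a minimal sketch: the function name `chi_square_statistic` is made up for illustration, and the critical value 35.2 is copied from the table lookup above.

```python
def chi_square_statistic(s, sigma0, n):
    """Chi-square test statistic (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * s ** 2 / sigma0 ** 2

# Sealer example: n = 24, sample s = 2.9, hypothesized sigma = 2.5
chi_star = chi_square_statistic(2.9, 2.5, 24)
chi_crit = 35.2  # chi-square(23, 0.05), read from the table

# Right-tailed test for an increase in variability
reject_h0 = chi_star > chi_crit
print(f"chi-square* = {chi_star:.2f}, reject H0: {reject_h0}")
# prints "chi-square* = 30.95, reject H0: False"
```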

8.3 Inferences concerning the ratio of
variances using two independent
samples
• Compare the standard deviations of two
populations.
• Sampling distributions dealing with sample
standard deviations (or variances) are very
sensitive to slight departures from the
assumptions.
• Consider the hypothesis test for the equality
of standard deviations (or variances) for two
normal populations.

Background:
1. The hypothesis test procedure uses the ratio of
variances.
2. Inferences about the ratio of variances for two
normally distributed populations use the F
distribution.
3. The F distribution is a family of probability
distributions.
4. Each F distribution is identified by two numbers
of degrees of freedom, one for each of the two
samples involved.

Properties of the F Distribution:
1. F is nonnegative in value; it is zero or positively valued.
2. F is nonsymmetrical; it is skewed to the right.
3. F is distributed so as to form a family of distributions;
there is a separate distribution for each pair of numbers of
degrees of freedom.

Note:
1. For inferences discussed in this section, the number of
degrees of freedom for each sample is df1 = n1 − 1 and
df2 = n2 − 1.
2. Each different combination of degrees of freedom results
in a different F distribution.


[Figure: F distributions. Curves shown: df1 = 3, df2 = 5;
df1 = 20, df2 = 30; df1 = 10, df2 = 15. Horizontal axis: F,
starting at 0.]
Critical values for the F distribution are identified
using three values:
1. dfn: degrees of freedom associated with the sample
whose variance is in the numerator of the
calculated F.
2. dfd: degrees of freedom associated with the sample
whose variance is in the denominator.
3. α: area under the distribution curve to the right of
the critical value.
4. Notation: F(dfn, dfd, α)

[Figure: F distribution showing F(dfn, dfd, α), the point on
the F axis with area α to its right.]

Example: Find the value of F(4, 14, 0.01).

Solution:
Use Table 9c (α = 0.01). Find the intersection of column df =
4 (for numerator) and row df = 14 (for denominator). Read
the value in the body of the table: F(4, 14, 0.01) = 5.04

Portion of Table 9c, α = 0.01: the column for numerator df = 4
and the row for denominator df = 14 intersect at 5.04.

Note:
1. The degrees of freedom associated with the
numerator and with the denominator must be kept
in the correct order.
2. For example: F(4, 14, 0.01) ≠ F(14, 4, 0.01)
3. Interchanging the degrees of freedom numbers
will result in different F values.
4. Computers and calculators may be used to find the
cumulative probability for a specified F value. If
the area in the right tail is needed, subtract the
calculated probability from 1.
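
To illustrate the last point, the sketch below finds a cumulative probability for an F value and subtracts it from 1. It is a rough standard-library illustration, not the method of any particular calculator or package: the names `f_pdf` and `f_cdf` are invented here, the integration uses Simpson's rule, and the density shortcut at zero assumes dfn > 2.

```python
import math

def f_pdf(x, dfn, dfd):
    """Density of the F distribution at x (sketch; assumes dfn > 2)."""
    if x <= 0:
        return 0.0  # for dfn > 2 the density is 0 at the origin
    log_beta = (math.lgamma(dfn / 2) + math.lgamma(dfd / 2)
                - math.lgamma((dfn + dfd) / 2))
    log_pdf = ((dfn / 2) * math.log(dfn) + (dfd / 2) * math.log(dfd)
               + (dfn / 2 - 1) * math.log(x)
               - ((dfn + dfd) / 2) * math.log(dfn * x + dfd)
               - log_beta)
    return math.exp(log_pdf)

def f_cdf(value, dfn, dfd, steps=2000):
    """Cumulative probability P(F <= value) by Simpson's rule (steps even)."""
    h = value / steps
    total = f_pdf(0.0, dfn, dfd) + f_pdf(value, dfn, dfd)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, dfn, dfd)
    return total * h / 3

# Area to the right of 5.04 with dfn = 4, dfd = 14: close to 0.01
right_tail = 1 - f_cdf(5.04, 4, 14)
# Swapping the degrees of freedom gives a different right-tail area
right_tail_swapped = 1 - f_cdf(5.04, 14, 4)
print(f"{right_tail:.4f} {right_tail_swapped:.4f}")
```

The first area agrees with the table entry F(4, 14, 0.01) = 5.04, while the second is noticeably larger, which is why the order of the degrees of freedom must be kept.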

The assumptions for inferences about the ratio of two
variances: The samples are randomly selected from normally
distributed populations, and the two samples are selected in
an independent manner.

Hypothesis Tests:
If the null hypothesis is that there is no difference in
variability, the test statistic is a ratio of sample variances:
$F^* = \frac{s_1^2}{s_2^2}$
If the null hypothesis is true, F* will have an F distribution
with dfn = n1 − 1 (numerator) and dfd = n2 − 1 (denominator).


Note:
1. The tables of critical values for the F distribution give only
the right-hand critical values. Adjust the numerator-denominator
order so that all the activity is in the right-hand tail.
2. One-tailed tests: arrange the null and alternative hypotheses
so that the alternative is always “greater than.” F* is
computed using the same order as specified in the null
hypothesis.
3. Two-tailed tests: when computing F*, always use the
sample with the larger variance for the numerator. This
will make F* larger than 1 and place it in the right-hand
tail of the distribution.

Example: A recent study was conducted to determine whether
or not there was equal variability in male and female systolic
blood pressures. Independent random samples of 18 men and
16 women showed $s_m = 8.15$ and $s_w = 9.92$. Is there any
evidence to suggest the variances are unequal? Use the
classical approach with α = 0.05.

Solution:
1. The Set-up:
a. Population parameter of concern: The ratio of variances.
b. The null and alternative hypotheses:
$H_0: \frac{\sigma_m^2}{\sigma_w^2} = 1$ (or $\sigma_m^2 = \sigma_w^2$)
$H_a: \frac{\sigma_m^2}{\sigma_w^2} \neq 1$ (or $\sigma_m^2 \neq \sigma_w^2$)

2. The Hypothesis Test Criteria:


a. Assumptions: Independent random samples, and assume
sampled populations are normally distributed.
b. Test statistic:

$F^* = \frac{s_w^2}{s_m^2}$ with $dfn = n_w - 1 = 16 - 1 = 15$ and $dfd = n_m - 1 = 18 - 1 = 17$

This is a two-tailed test. Therefore the larger sample
variance is in the numerator.
c. Level of significance: α = 0.05

3. The Sample Evidence:
a. Sample information:
$n_m = 18$, $s_m^2 = (8.15)^2 = 66.4225$
$n_w = 16$, $s_w^2 = (9.92)^2 = 98.4064$

b. Calculate the value of the test statistic:

$F^* = \frac{s_w^2}{s_m^2} = \frac{98.4064}{66.4225} = 1.4815$

4. The Probability Distribution:


a. Critical value: F(dfn, dfd, α/2) = F(15, 17, 0.025) = 2.72
b. F* is not in the critical region.


5. The Results:
a. Decision: Fail to reject H0.
b. Conclusion: At the 0.05 level of significance, there is no
evidence to suggest a difference in variability for men’s
and women’s systolic blood pressure.
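
The two-tailed rule of placing the larger sample variance in the numerator can be sketched as follows. The function `two_tailed_f` is a hypothetical helper, and the critical value 2.72 is the table value quoted in the solution.

```python
def two_tailed_f(s_a, n_a, s_b, n_b):
    """Return (F*, dfn, dfd) with the larger sample variance on top."""
    var_a, var_b = s_a ** 2, s_b ** 2
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Blood-pressure example: men s = 8.15 (n = 18), women s = 9.92 (n = 16)
f_star, dfn, dfd = two_tailed_f(8.15, 18, 9.92, 16)
f_crit = 2.72  # F(15, 17, 0.025): alpha/2 in the right tail

reject_h0 = f_star > f_crit
print(f"F* = {f_star:.4f}, dfn = {dfn}, dfd = {dfd}, reject H0: {reject_h0}")
# prints "F* = 1.4815, dfn = 15, dfd = 17, reject H0: False"
```

Returning the degrees of freedom together with F* keeps them in the same order as the ratio, so the table lookup cannot be mismatched.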

Example: A soft-drink bottling company was to make a
decision about the equality of the variances of amounts of
fill between its present machine and a modern high-speed
outfit. Does the sample information below present
sufficient evidence to reject the null hypothesis (the
manufacturer’s claim) that the modern high-speed bottle-
filling machine fills bottles with no more variance than the
company’s present machine? Assume the amount of fill is
normally distributed for both machines, and complete the
test using α = 0.01.

Sample                            n     s²
Present machine (p)               22    0.0008
Modern high-speed machine (m)     25    0.0018


Solution:
Step 1. $H_0: \frac{\sigma_m^2}{\sigma_p^2} \le 1$,
$H_a: \frac{\sigma_m^2}{\sigma_p^2} > 1$.

Step 2. The F-distribution will be used with α = 0.01.

Step 3. $F^* = \frac{s_m^2}{s_p^2} = \frac{0.0018}{0.0008} = 2.25$.
Step 4. dfn = 25 − 1 = 24,
dfd = 22 − 1 = 21.
Therefore, F(dfn, dfd, α) = F(24, 21, 0.01) = 2.80.
Now, F* is not in the critical region.

Step 5. We fail to reject H0. At the 0.01 level of
significance, the samples do not present sufficient
evidence to indicate an increase in variance.
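
For a one-tailed test the order of the ratio is fixed by the hypotheses rather than by which sample variance is larger. A minimal Python check of the bottling example, with variable names chosen only for illustration and the critical value 2.80 copied from Step 4:

```python
# One-tailed F test: Ha says the modern machine has the larger variance,
# so its sample variance goes in the numerator regardless of the data.
var_modern, n_modern = 0.0018, 25
var_present, n_present = 0.0008, 22

f_star = var_modern / var_present
dfn, dfd = n_modern - 1, n_present - 1
f_crit = 2.80  # F(24, 21, 0.01), read from the table

reject_h0 = f_star > f_crit
print(f"F* = {f_star:.2f}, dfn = {dfn}, dfd = {dfd}, reject H0: {reject_h0}")
# prints "F* = 2.25, dfn = 24, dfd = 21, reject H0: False"
```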
