
Chapter 7 Introduction to Statistical Inferences

Chapter Goals
• Learn the basic concepts of estimation and
hypothesis testing.
• Consider questions about a population mean using
two methods that assume the population standard
deviation is known.
• Consider: what value or interval of values can we
use to estimate a population mean?
• Consider: is there evidence to suggest the
hypothesized mean is incorrect?

7.1 The Nature of Estimation


Point Estimate for a Parameter:
The value of the corresponding statistic.
Example:
x̄ = 14.7 is a point estimate (single number value) for the
mean µ of the sampled population.
Problem:
How good is the point estimate? Is it high? Or low? Would
another sample yield the same result?
Note:
The quality of an estimation procedure is enhanced if the
sample statistic is both less variable and unbiased.

Interval Estimate:
An interval bounded by two values and used to estimate the
value of a population parameter. The values that bound this
interval are statistics calculated from the sample that is being
used as the basis for the estimation.

Level of Confidence 1 − α:
The probability that the sample to be selected yields an
interval that includes the parameter being estimated.

Confidence Interval:
An interval estimate with a specified level of confidence.

7.2 Estimation of Mean µ (σ Known)


The assumption for estimating the mean µ using a known σ:
The sampling distribution of x̄ has a normal distribution.

Assumption satisfied by:


1. Knowing that the sampled population is normally
distributed, or
2. Using a large enough random sample (CLT).
Note: The CLT may be applied to smaller samples (for
example n = 15) when there is evidence to suggest a
unimodal distribution that is approximately symmetric. If
there is evidence of skewness, the sample size needs to be
much larger.

A (large sample) 1 − α confidence interval for µ is found by

$$\bar{x} - z(\alpha/2)\,\frac{\sigma}{\sqrt{n}} \quad \text{to} \quad \bar{x} + z(\alpha/2)\,\frac{\sigma}{\sqrt{n}}$$

Note:
1. x̄ is the point estimate and the center point of the
confidence interval.
2. z(α/2): confidence coefficient, the number of multiples of
the standard error needed to construct an interval estimate
of the correct width to have a level of confidence 1 − α.

[Figure: standard normal curve with central area 1 − α between −z(α/2) and z(α/2), and area α/2 in each tail.]

3. $\sigma/\sqrt{n}$: standard error of the mean.
The standard deviation of the distribution of x̄.

4. $z(\alpha/2)\,\sigma/\sqrt{n}$: maximum error of estimate E.
One-half the width of the confidence interval (the product
of the confidence coefficient and the standard error).

5. $\bar{x} - z(\alpha/2)\,\sigma/\sqrt{n}$: lower confidence limit (LCL).
$\bar{x} + z(\alpha/2)\,\sigma/\sqrt{n}$: upper confidence limit (UCL).

The Confidence Interval: A Five-Step Model:
1. Describe the population parameter of concern.
2. Specify the confidence interval criteria.
a. Check the assumptions.
b. Identify the probability distribution and the formula to
be used.
c. Determine the level of confidence, 1 − α.
3. Collect and present sample information.
4. Determine the confidence interval.
a. Determine the confidence coefficient.
b. Find the maximum error of estimate.
c. Find the lower and upper confidence limits.
5. State the confidence interval.

Example: A random sample of the test scores of 100
applicants for clerk-typist positions at a large insurance
company showed a mean score of 72.6. Determine a 99%
confidence interval for the mean score of all applicants at the
insurance company. Assume the standard deviation of test
scores is 10.5.

Solution:
1. Parameter of concern: the mean test score, µ, of all
applicants at the insurance company.
2. Confidence interval criteria.
a. Assumptions: The distribution of the variable, test score,
is not known. However, the sample size is large enough
(n = 100) so that the CLT applies.
b. Probability distribution: standard normal variable z with
σ = 10.5.
c. The level of confidence: 99%, or 1 − α = 0.99.
3. Sample information.
Given: n = 100 and x̄ = 72.6.
4. The confidence interval.
a. Confidence coefficient: z(α/2) = z(0.005) = 2.58
b. Maximum error:
$$E = z(\alpha/2)\,\frac{\sigma}{\sqrt{n}} = (2.58)\left(\frac{10.5}{\sqrt{100}}\right) = 2.709$$
c. The lower and upper limits:
72.6 − 2.709 = 69.891 to 72.6 + 2.709 = 75.309
5. Confidence interval: With 99% confidence we can say,
“The mean test score is between 69.9 and 75.3.”
69.9 to 75.3 is a 99% confidence interval for the true
mean test score.
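
The arithmetic in this example can be checked with a short Python sketch. The function name z_confidence_interval and its parameters are illustrative and not part of the text; the sketch assumes σ is known and uses the standard library's NormalDist to find z(α/2). Because it uses the unrounded value z(0.005) ≈ 2.576 rather than the table value 2.58, E comes out near 2.705 instead of 2.709, but the interval still rounds to 69.9 to 75.3.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Return (LCL, UCL, E) for a 1 - alpha interval for mu, sigma known."""
    alpha = 1 - confidence
    # Confidence coefficient z(alpha/2): area alpha/2 lies to its right.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / sqrt(n)
    max_error = z * standard_error  # maximum error of estimate E
    return x_bar - max_error, x_bar + max_error, max_error


# Figures from the clerk-typist example: n = 100, x-bar = 72.6, sigma = 10.5.
lcl, ucl, E = z_confidence_interval(x_bar=72.6, sigma=10.5, n=100, confidence=0.99)
print(f"E = {E:.3f}; interval: {lcl:.1f} to {ucl:.1f}")
```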

Note: The confidence is in the process.


95% confidence means: if we repeat the experiment many times
and construct a confidence interval from each sample, then about
95% of those intervals will contain the true mean value µ.
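
One way to see that the confidence is in the process is to simulate it. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with made-up values µ = 50 and σ = 10, the sample size is 25, and the function name coverage_rate is hypothetical. It builds many 95% intervals and reports the proportion that contain µ, which should land near 0.95 without equalling it exactly.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, confidence, trials, seed=0):
    """Proportion of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    max_error = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample from the assumed normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - max_error <= mu <= x_bar + max_error:
            hits += 1
    return hits / trials


rate = coverage_rate(mu=50.0, sigma=10.0, n=25, confidence=0.95, trials=2000)
print(f"Proportion of intervals containing mu: {rate:.3f}")
```

Any single interval either contains µ or does not; the 95% describes how often the procedure succeeds over many samples.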

Sample Size:
Problem: Find the sample size necessary in order to obtain a
specified maximum error and level of confidence (assume the
standard deviation is known).
$$E = z(\alpha/2)\,\frac{\sigma}{\sqrt{n}}$$
Solve this expression for n:
$$n = \left[\frac{z(\alpha/2)\cdot\sigma}{E}\right]^2$$

Example: Find the sample size necessary to estimate a
population mean to within .5 with 95% confidence if the
standard deviation is 6.2.

Solution:

$$n = \left[\frac{z(\alpha/2)\cdot\sigma}{E}\right]^2 = \left[\frac{(1.96)(6.2)}{0.5}\right]^2 = [24.304]^2 = 590.684$$

Therefore, n = 591.

Note: When solving for sample size n, always round up to the
next larger integer.
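
The sample size formula and the round-up rule can be sketched in a few lines of Python. The function name required_sample_size is illustrative, and z(α/2) is taken from the standard library's NormalDist rather than from a table, so intermediate values differ slightly from the hand calculation (about 590.66 instead of 590.684) while the rounded answer is the same.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, max_error, confidence):
    """Smallest n with z(alpha/2) * sigma / sqrt(n) <= max_error, sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # ceil implements "always round up to the next larger integer".
    return ceil((z * sigma / max_error) ** 2)


# Figures from the example: sigma = 6.2, E = 0.5, 95% confidence.
n = required_sample_size(sigma=6.2, max_error=0.5, confidence=0.95)
print(n)
```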

7.3 The Nature of Hypothesis Testing


Hypothesis:
A statement that something is true.

Statistical Hypothesis Test:


A process by which a decision is made between two opposing
hypotheses. The two opposing hypotheses are formulated so
that each hypothesis is the negation of the other. (That way
one of them is always true, and the other one is always false.)
Then one hypothesis is tested in hopes that it can be shown to
be a very improbable occurrence thereby implying the other
hypothesis is the likely truth.

There are two hypotheses involved in making a decision.

Null Hypothesis, H0:


The hypothesis to be tested. Assumed to be true. Usually a
statement that a population parameter has a specific value.
The “starting point” for the investigation.

Alternative Hypothesis, Ha:


A statement about the same population parameter that is used
in the null hypothesis. Generally this is a statement that
specifies the population parameter has a value different, in
some way, from the value given in the null hypothesis. The
rejection of the null hypothesis will imply the likely truth of
this alternative hypothesis.


Example: Suppose you are investigating the effects of a new
pain reliever. You hope the new drug relieves minor muscle
aches and pains longer than the leading pain reliever. State
the null and alternative hypotheses.

Solution:
H0: The new pain reliever is no better than the leading pain
reliever.

Ha: The new pain reliever lasts longer than the leading pain
reliever.

Example: You are investigating the presence of radon in
homes being built in a new development. If the mean level of
radon is greater than 4, a warning will be sent to all
homeowners in the development. State the null and alternative
hypotheses.

Solution:
H0: The mean level of radon for homes in the development is
4 (or less).

Ha: The mean level of radon for homes in the development is
greater than 4.


Level of Significance α:
The probability of rejecting the null hypothesis when the
null hypothesis is in fact true.

Test Statistic:
A random variable whose value is calculated from the sample
data and is used in making the decision fail to reject H0 or
reject H0.

Note:
1. The value of the test statistic is used in conjunction with a
decision rule to determine fail to reject H0 or reject H0.
2. The decision rule is established prior to collecting the data
and specifies how you will reach the decision.
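
The role of α and of a decision rule fixed in advance can be illustrated by simulation. The sketch below is an assumption-laden illustration and not part of the text: it uses the two-tailed z test with known σ that Section 7.4 develops, draws every sample from a normal population in which H0 is true, and uses made-up settings (µ = 56, σ = 7, n = 47, α = 0.05). The function name rejection_rate_under_h0 is hypothetical. The proportion of samples that lead to "reject H0" should come out close to α.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate_under_h0(mu0, sigma, n, alpha, trials, seed=1):
    """Proportion of samples that reject H0: mu = mu0 when H0 is true."""
    rng = random.Random(seed)
    # Decision rule, set before any data are drawn: reject if |z*| >= z(alpha/2).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The population mean really is mu0, so every rejection is an error.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z_star = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z_star) >= critical:
            rejections += 1
    return rejections / trials


rate = rejection_rate_under_h0(mu0=56.0, sigma=7.0, n=47, alpha=0.05, trials=4000)
print(f"Proportion of samples rejecting a true H0: {rate:.3f}")
```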

The Conclusion:
a. If the decision is reject H0, then the conclusion should be
worded something like, “There is sufficient evidence at
the α level of significance to show that . . . (the meaning
of the alternative hypothesis).”

b. If the decision is fail to reject H0, then the conclusion
should be worded something like, “There is not sufficient
evidence at the α level of significance to show that . . .
(the meaning of the alternative hypothesis).”

Note:
1. The decision is about H0.
2. The conclusion is a statement about Ha.
3. There is always the chance of making an error.

7.4 Hypothesis Test of Mean µ (σ Known): A Classical Approach

The assumption for hypothesis tests about mean µ
using a known σ:
The sampling distribution of x̄ has a normal
distribution.

Hypothesis test:
1. A well-organized, step-by-step procedure used to
make a decision.
2. The classical approach is the hypothesis test
process that has enjoyed popularity for many years.

The Classical Hypothesis Test: A Five-Step Procedure:
1. The Set-Up:
a. Describe the population parameter of concern.
b. State the null hypothesis (H0) and the alternative hypothesis (Ha).
2. The Hypothesis Test Criteria:
a. Check the assumptions.
b. Identify the probability distribution and the test statistic to be used.
c. Determine the level of significance, α.
3. The Sample Evidence:
a. Collect the sample information.
b. Calculate the value of the test statistic.
4. The Probability Distribution:
a. Determine the critical region(s) and critical value(s).
b. Determine whether or not the calculated test statistic is in the
critical region.
5. The Results:
a. State the decision about H0.
b. State the conclusion about Ha.
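
Steps 3 through 5a of this procedure can be sketched in Python. This is a minimal illustration, not a prescribed implementation: the function name classical_z_test and the tail argument are hypothetical, σ is assumed known, and critical values come from the standard library's NormalDist rather than a table, so they appear as about 2.576 and 2.326 instead of 2.58 and 2.33. The two calls use the figures from the water-pressure and homework examples of this section.

```python
from math import sqrt
from statistics import NormalDist


def classical_z_test(x_bar, mu0, sigma, n, alpha, tail="two"):
    """Return (z_star, critical_value, decision) for a z test with sigma known."""
    z_star = (x_bar - mu0) / (sigma / sqrt(n))  # calculated test statistic
    std_normal = NormalDist()
    if tail == "two":      # Ha: mu != mu0, alpha split between both tails
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_star) >= critical
    elif tail == "right":  # Ha: mu > mu0
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z_star >= critical
    elif tail == "left":   # Ha: mu < mu0
        critical = std_normal.inv_cdf(alpha)
        reject = z_star <= critical
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    decision = "reject H0" if reject else "fail to reject H0"
    return z_star, critical, decision


# Water pressure: H0: mu = 56 vs Ha: mu != 56, alpha = 0.01.
z1, c1, d1 = classical_z_test(57.1, 56, 7, 47, 0.01, tail="two")
# Homework: H0: mu = 30 (<=) vs Ha: mu > 30, alpha = 0.01.
z2, c2, d2 = classical_z_test(36.8, 30, 7.5, 36, 0.01, tail="right")
print(f"z* = {z1:.3f}, critical = ±{c1:.3f}: {d1}")
print(f"z* = {z2:.2f}, critical = {c2:.3f}: {d2}")
```

The function stops at the decision about H0; wording the conclusion about Ha (step 5b) still depends on the context of the problem.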

Example: The mean water pressure in the main water pipe
from a town well should be kept at 56 psi. Anything less and
several homes will have an insufficient supply, and anything
greater could burst the pipe. Suppose the water pressure is
checked at 47 random times. The sample mean is 57.1.
(Assume σ = 7.) Is there any evidence to suggest the mean
water pressure is different from 56? Use α = 0.01.

Solution:
1. The Set-Up:
a. Describe the parameter of concern:
The mean water pressure in the main pipe.
b. State the null and alternative hypotheses.
H0: µ = 56
Ha: µ ≠ 56
2. The Hypothesis Test Criteria:
a. Check the assumptions:
A sample of n = 47 is large enough for the CLT to apply.
b. Identify the test statistic.
The test statistic is z*.
c. Determine the level of significance: α = 0.01 (given)
3. The Sample Evidence:
a. The sample information: x̄ = 57.1, n = 47
b. Calculate the value of the test statistic:
$$z^* = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{57.1 - 56}{7/\sqrt{47}} = 1.077$$


4. The Probability Distribution:


a. Determine the critical regions and the critical values.

[Figure: standard normal curve with critical regions of area 0.005 in each tail; critical values z = −2.58 and z = 2.58.]

b. Determine whether or not the calculated test statistic is
in the critical region.
The calculated value of z, z* = 1.077, is in the
noncritical region.
5. The Results:
a. State the decision about H0.
Fail to reject H0.
b. State the conclusion about Ha.
There is not sufficient evidence at the 0.01 level of
significance to suggest the mean water pressure is
different from 56.

Example: An elementary school principal claims students
receive no more than 30 minutes of homework each night.
A random sample of 36 students showed a sample mean of
36.8 minutes spent doing homework (assume σ = 7.5). Is
there any evidence to suggest the mean time spent on
homework is greater than 30 minutes? Use α = 0.01.


Solution:
1. The parameter of concern: µ, the mean time spent doing
homework each night.
H0: µ = 30 (≤)
Ha: µ > 30
2. The Hypothesis Test Criteria:
a. The sample size is n = 36, so the CLT applies.
b. The test statistic is z*.
c. The level of significance is given: α = 0.01.
3. The Sample Evidence:
x̄ = 36.8, n = 36
$$z^* = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{36.8 - 30}{7.5/\sqrt{36}} = 5.44$$
4. The Probability Distribution:

[Figure: standard normal curve with a critical region of area 0.01 in the right tail; critical value z = 2.33.]

The calculated value of z, z* = 5.44, is in the critical
region.


5. The Results:
Decision: Reject H0.
Conclusion: There is sufficient evidence at the 0.01 level
of significance to conclude the mean time spent on
homework by the elementary students is more than 30
minutes.

Note: Suppose we took repeated samples of size 36.
What would you expect to happen?
