
Chapter 4

Inferences Based on a Single Sample: Confidence Intervals and Tests of Hypothesis (9 hours)
Inferences Based on a Single Sample:
Confidence Intervals
Learning Objectives
1. Estimate a population parameter (mean, proportion, or variance) based on a large sample selected from the population.
2. Use the sampling distribution of a statistic to form a confidence interval for the population parameter.
3. Show how to select the proper sample size for estimating a population parameter.
Example

Suppose you’re interested in the average amount of money that students in this class (the population) have on them. How would you find out?
Statistical Methods
[Diagram: Statistical Methods divide into Descriptive Statistics and Inferential Statistics; Inferential Statistics divide into Estimation and Hypothesis Testing.]
Estimation Methods
[Diagram: Estimation divides into Point Estimation and Interval Estimation.]
Target Parameter
• The unknown population parameter (e.g., mean or proportion) that we are interested in estimating is called the target parameter.
• The type of data (quantitative or qualitative) collected is indicative of the target parameter. With quantitative data, you are likely to be estimating the mean or variance of the data. With qualitative data with two outcomes (success or failure), the binomial proportion of successes is likely to be the parameter of interest.
Determining the Target Parameter

Parameter   Key Words or Phrase                      Type of Data
µ           Mean; average                            Quantitative
p           Proportion; percentage; fraction; rate   Qualitative
Point Estimator
A point estimator of a population parameter is a
rule or formula that tells us how to use the sample
data to calculate a single number that can be used
as an estimate of the target parameter.
Point Estimation
• Provides a single value based on observations from one sample
• Gives no information about how close the value is to the unknown population parameter
• Example: Sample mean x̄ = 3 is the point estimate of the unknown population mean
Interval Estimator

An interval estimator (or confidence interval) is a formula that tells us how to use the sample data to calculate an interval that estimates the target parameter.
Interval Estimation
• Provides a range of values
• Based on observations from one sample
• Gives information about closeness to unknown population parameter
  – Stated in terms of probability
  – Knowing exact closeness requires knowing unknown population parameter
• Example: Unknown population mean lies between 50 and 70 with 95% confidence
Estimation Process
[Figure: A random sample is drawn from a population whose mean, µ, is unknown. The sample mean is x̄ = 50, leading to the statement “I am 95% confident that µ is between 40 and 60.”]
Key Elements of Interval Estimation
[Figure: A confidence interval centered on the sample statistic (point estimate) and bounded by the lower and upper confidence limits.]
A confidence interval provides a range of plausible values for the population parameter.
Example-Overdue

Data set: OVRDUE


Confidence Interval
The Central Limit Theorem:
• The sampling distribution of the sample mean is approximately normal for large samples.
The interval estimator:
$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm \dfrac{1.96\,\sigma}{\sqrt{n}}$
Confidence Interval
If sample measurements yield a value of x̄ that falls between the two lines on either side of µ, then the interval $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ will contain µ.
Confidence Coefficient
• The confidence coefficient is the probability that a randomly selected confidence interval encloses the population parameter; that is, the relative frequency with which similarly constructed intervals enclose the population parameter when the estimator is used repeatedly a very large number of times.
• The confidence level is the confidence coefficient expressed as a percentage.
95% Confidence Level
If our confidence level is 95%, then in the long run,
95% of our confidence intervals will contain µ and
5% will not.
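The long-run reading of "95% confidence" can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 50, standard deviation 10), the sample size of 36, and the number of trials are assumptions chosen for the demonstration, not values from these slides.

```python
import random
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, trials, conf_level=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z-value with area alpha/2 to its right
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one random sample from a normal population (hypothetical values).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate(mu=50, sigma=10, n=36, trials=2000)
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to .95; any single interval either contains µ or it does not.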
To choose a different confidence coefficient, we increase or decrease the area (call it α) assigned to the tails. If we place α/2 in each tail and z_{α/2} is the z-value, the confidence interval with coefficient (1 – α) is
$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$.
Large-Sample 100(1 – α)% Confidence Interval for µ
$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right)$
where z_{α/2} is the z-value with an area α/2 to its right in the standard normal distribution. The parameter σ is the standard deviation of the sampled population, and n is the sample size.
Note: When σ is unknown and n is large (n ≥ 30), the confidence interval is approximately equal to
$\bar{x} \pm z_{\alpha/2}\left(\dfrac{s}{\sqrt{n}}\right)$
where s is the sample standard deviation.
Required Conditions

1. A random sample is selected from the target population.
2. The sample size n is large (i.e., n ≥ 30). Due to the Central Limit Theorem, this condition guarantees that the sampling distribution of x̄ is approximately normal. Also, for large n, s will be a good estimator of σ.
Thinking Challenge
You’re a Q/C inspector for Gallo. The σ for 2-liter bottles is .05 liters. A random sample of 100 bottles showed x̄ = 1.99 liters. What is the 90% confidence interval estimate of the true mean amount in 2-liter bottles?
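Confidence Interval Solution*
$\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
$1.99 - 1.645\dfrac{.05}{\sqrt{100}} \le \mu \le 1.99 + 1.645\dfrac{.05}{\sqrt{100}}$
$1.982 \le \mu \le 1.998$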
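The same large-sample calculation can be sketched in Python. The function name `z_interval` is a hypothetical helper introduced here; it obtains the z-value from the standard library rather than from a printed table, so the result agrees with the table-based answer only after rounding.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf_level):
    """Large-sample confidence interval for a mean: xbar +/- z * sigma / sqrt(n)."""
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z-value with area alpha/2 to its right
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Bottle-filling example: sigma = .05, n = 100, sample mean 1.99, 90% confidence.
lower, upper = z_interval(xbar=1.99, sigma=0.05, n=100, conf_level=0.90)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```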
Problem
• Unoccupied seats on flights cause airlines to lose revenue. Suppose a large airline wants to estimate its average number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected, and the number of unoccupied seats is noted for each of the sampled flights. (The data are saved in the NOSHOW file.)
Example
Small Sample, σ Unknown
Instead of using the standard normal statistic
$z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
use the t-statistic
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
in which the sample standard deviation, s, replaces the population standard deviation, σ.
Student’s t-Statistic
The t-statistic has a sampling distribution very much like
that of the z-statistic: mound-shaped, symmetric, with
mean 0.

The primary difference between the sampling distributions of t and z is that the t-statistic is more variable than the z-statistic.
Degrees of Freedom

The actual amount of variability in the sampling distribution of t depends on the sample size n. A convenient way of expressing this dependence is to say that the t-statistic has (n – 1) degrees of freedom (df).
Student’s t Distribution

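[Figure: The standard normal (z) curve compared with t curves for df = 13 and df = 5, all centered at 0. The t distribution is bell-shaped and symmetric, with ‘fatter’ tails than the standard normal.]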
t - Table
t-value
If we want the t-value with an area of .025 to its right and 4 df, we look in the table under the column t_{.025} for the entry in the row corresponding to 4 df. This entry is t_{.025} = 2.776. The corresponding standard normal z-score is z_{.025} = 1.96.
Small-Sample Confidence Interval for µ
$\bar{x} \pm t_{\alpha/2}\left(\dfrac{s}{\sqrt{n}}\right)$
where t_{α/2} is based on (n – 1) degrees of freedom.
Required Conditions

1. A random sample is selected from the target population.
2. The population has a relative frequency distribution that is approximately normal.
Estimation Example
Mean (σ Unknown)
A random sample of n = 25 has x̄ = 50 and s = 8. Set up a 95% confidence interval estimate for µ.
$\bar{x} - t_{\alpha/2}\dfrac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\dfrac{s}{\sqrt{n}}$
$50 - 2.064\dfrac{8}{\sqrt{25}} \le \mu \le 50 + 2.064\dfrac{8}{\sqrt{25}}$
$46.70 \le \mu \le 53.30$
Thinking Challenge
You’re a time study analyst in manufacturing. You’ve recorded the following task times (min.): 3.6, 4.2, 4.0, 3.5, 3.8, 3.1. What is the 90% confidence interval estimate of the population mean task time?
Confidence Interval Solution*
• x̄ = 3.7
• s = .38987
• n = 6, df = n – 1 = 6 – 1 = 5
• t_{.05} = 2.015
$3.7 - 2.015\dfrac{.38987}{\sqrt{6}} \le \mu \le 3.7 + 2.015\dfrac{.38987}{\sqrt{6}}$
$3.379 \le \mu \le 4.021$
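As a check on the arithmetic, the interval can be recomputed directly from the six task times. This is a minimal sketch using only the Python standard library; the critical value t_{.05} = 2.015 (5 df) is entered by hand from a t-table because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Task times (minutes) from the time-study example.
times = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]

n = len(times)
xbar = mean(times)
s = stdev(times)   # sample standard deviation (divides by n - 1)
t_crit = 2.015     # t_.05 with n - 1 = 5 df, read from a t-table

half_width = t_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"xbar = {xbar:.2f}, s = {s:.5f}")
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

Computed this way, s is about .390 and the 90% interval runs from roughly 3.38 to 4.02 minutes.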
Problem
• Facial structure of CEOs. In Psychological Science (Vol. 22, 2011), researchers reported that a chief executive officer’s facial structure can be used to predict a firm’s financial performance. The study involved measuring the facial width-to-height ratio (WHR) for each in a sample of 55 CEOs at publicly traded Fortune 500 firms. These WHR values (determined by a computer analyzing a photo of the CEO’s face) had a mean of x̄ = 1.96 and a standard deviation of s = .15.
  a. Find and interpret a 95% confidence interval for µ, the mean facial WHR for all CEOs at publicly traded Fortune 500 firms.
  b. The researchers found that CEOs with wider faces (relative to height) tended to be associated with firms that had greater financial performance. They based their inference on an equation that uses facial WHR to predict financial performance. Suppose an analyst wants to predict the financial performance of a Fortune 500 firm based on the value of the true mean facial WHR of CEOs. The analyst wants to use the value of µ = 2.2. Do you recommend he use this value?
Large-Sample Confidence
Interval for a Population
Proportion
Problem
• A food-products company conducted a market study by randomly sampling and interviewing 1,000 consumers to determine which brand of breakfast cereal they prefer. Suppose 313 consumers were found to prefer the company’s brand. How would you estimate the true fraction of all consumers who prefer the company’s cereal brand?
Sampling Distribution of p̂
1. The mean of the sampling distribution of p̂ is p; that is, p̂ is an unbiased estimator of p.
2. The standard deviation of the sampling distribution of p̂ is $\sqrt{pq/n}$; that is, $\sigma_{\hat{p}} = \sqrt{pq/n}$, where q = 1 – p.
3. For large samples, the sampling distribution of p̂ is approximately normal. A sample size is considered large if both np̂ ≥ 15 and nq̂ ≥ 15.
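Large-Sample Confidence Interval for p
$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
where $\hat{p} = x/n$ and $\hat{q} = 1 - \hat{p}$.
Note: When n is large, p̂ can approximate the value of p in the formula for $\sigma_{\hat{p}}$.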
Conditions Required for a Valid
Large-Sample Confidence
Interval for p

1. A random sample is selected from the target population.
2. The sample size n is large. (This condition will be satisfied if both np̂ ≥ 15 and nq̂ ≥ 15. Note that np̂ and nq̂ are simply the number of successes and number of failures, respectively, in the sample.)
Estimation Example
Proportion
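A random sample of 400 graduates showed 32 went to graduate school. Set up a 95% confidence interval estimate for p.
$\hat{p} = \dfrac{32}{400} = 0.08$
$\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
$.08 - 1.96\sqrt{\dfrac{(.08)(.92)}{400}} \le p \le .08 + 1.96\sqrt{\dfrac{(.08)(.92)}{400}}$
$.053 \le p \le .107$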
Thinking Challenge
You’re a production manager for a newspaper. You want to find the % defective. Of 200 newspapers, 35 had defects. What is the 90% confidence interval estimate of the population proportion defective?
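Confidence Interval Solution*
$\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
$.175 - 1.645\sqrt{\dfrac{.175(.825)}{200}} \le p \le .175 + 1.645\sqrt{\dfrac{.175(.825)}{200}}$
$.1308 \le p \le .2192$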
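Both proportion examples follow the same recipe, which can be sketched as a small function. The name `proportion_interval` is a hypothetical helper; it takes the table value z_{α/2} as an argument and refuses to run when the sample has fewer than 15 successes or 15 failures, mirroring the large-sample condition stated earlier.

```python
from math import sqrt


def proportion_interval(x, n, z):
    """Large-sample CI for a proportion; z is the table value z_{alpha/2}."""
    # Large-sample check: at least 15 successes and 15 failures in the sample.
    if x < 15 or n - x < 15:
        raise ValueError("sample too small for the large-sample interval")
    p_hat = x / n
    q_hat = 1 - p_hat
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


grad = proportion_interval(x=32, n=400, z=1.96)    # graduates, 95% confidence
news = proportion_interval(x=35, n=200, z=1.645)   # newspapers, 90% confidence
print(f"Graduates:  {grad[0]:.3f} <= p <= {grad[1]:.3f}")
print(f"Newspapers: {news[0]:.4f} <= p <= {news[1]:.4f}")
```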
Adjusted (1 – α)100% Confidence Interval for a Population Proportion, p
$\tilde{p} \pm z_{\alpha/2}\sqrt{\dfrac{\tilde{p}(1 - \tilde{p})}{n + 4}}$
where $\tilde{p} = \dfrac{x + 2}{n + 4}$ is the adjusted sample proportion of observations with the characteristic of interest, x is the number of successes in the sample, and n is the sample size.
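A short sketch of the adjusted interval follows. The data (3 successes in 200 trials) are hypothetical, chosen only because they fail the 15-successes condition for the ordinary large-sample interval; the function name `adjusted_proportion_interval` is likewise an assumption made for illustration.

```python
from math import sqrt


def adjusted_proportion_interval(x, n, z):
    """Adjusted CI for p: add 2 successes and 4 trials before computing."""
    p_tilde = (x + 2) / (n + 4)                       # adjusted sample proportion
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - half_width, p_tilde + half_width


# Hypothetical data: x = 3 successes in n = 200 trials, 95% confidence.
lower, upper = adjusted_proportion_interval(x=3, n=200, z=1.96)
print(f"{lower:.4f} <= p <= {upper:.4f}")
```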
Determining the Sample Size
Sampling Error
In general, we express the reliability associated with a confidence interval for the population mean µ by specifying the sampling error within which we want to estimate µ with 100(1 – α)% confidence. The sampling error (denoted SE), then, is equal to the half-width of the confidence interval.
Sample Size Determination for 100(1 – α)% Confidence Interval for µ
In order to estimate µ with a sampling error (SE) and with 100(1 – α)% confidence, the required sample size is found as follows:
$z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right) = SE$
The solution for n is given by the equation
$n = \dfrac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$
Sample Size Example
What sample size is needed to be 90% confident the mean is within ±5? A pilot study suggested that the standard deviation is 45.
$n = \dfrac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \dfrac{(1.645)^2(45)^2}{5^2} = 219.2 \approx 220$
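The sample-size formula for a mean, including the round-up step, can be sketched as follows. The function name `sample_size_for_mean` is a hypothetical helper; z is the table value z_{α/2}.

```python
from math import ceil


def sample_size_for_mean(z, sigma, se):
    """Smallest whole n with z * sigma / sqrt(n) <= se (always rounds up)."""
    return ceil((z * sigma / se) ** 2)


# 90% confidence, sigma = 45 from the pilot study, sampling error 5.
n = sample_size_for_mean(z=1.645, sigma=45, se=5)
print(n)
```

Rounding up rather than to the nearest integer keeps the achieved sampling error at or below the target.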
Sample Size Determination for 100(1 – α)% Confidence Interval for p
In order to estimate p with a sampling error SE and with 100(1 – α)% confidence, the required sample size is found by solving the following equation for n:
$z_{\alpha/2}\sqrt{\dfrac{pq}{n}} = SE$
The solution for n can be written as follows:
$n = \dfrac{(z_{\alpha/2})^2\,(pq)}{(SE)^2}$
Note: Always round n up to the nearest integer value.
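Sample Size Example
What sample size is needed to estimate p with 90% confidence when the interval width is to be .03? With no prior estimate of p, use the conservative value p = q = .5.
$SE = \dfrac{\text{width}}{2} = \dfrac{.03}{2} = .015$
$n = \dfrac{(z_{\alpha/2})^2\,(pq)}{(SE)^2} = \dfrac{(1.645)^2(.5)(.5)}{(.015)^2} = 3006.69 \approx 3007$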
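A sketch of the same calculation for a proportion is given below. The function name `sample_size_for_proportion` and its default of p = .5 are assumptions made for illustration; p = .5 maximizes pq and therefore gives the largest (most conservative) sample size. The value p = .2 in the second call is hypothetical.

```python
from math import ceil


def sample_size_for_proportion(z, se, p=0.5):
    """Smallest whole n with z * sqrt(p * (1 - p) / n) <= se (rounds up)."""
    return ceil(z ** 2 * p * (1 - p) / se ** 2)


# 90% confidence, sampling error .015 (half of the .03 width), p = q = .5.
n_conservative = sample_size_for_proportion(z=1.645, se=0.015)
# With a hypothetical prior estimate p = .2, fewer observations are needed.
n_with_prior = sample_size_for_proportion(z=1.645, se=0.015, p=0.2)
print(n_conservative, n_with_prior)
```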
Thinking Challenge
You work in Human Resources at Merrill Lynch. You plan to survey employees to find their average medical expenses. You want to be 95% confident that the sample mean is within ±$50. A pilot study showed that σ was about $400. What sample size do you use?
Sample Size Solution*
$n = \dfrac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \dfrac{(1.96)^2(400)^2}{(50)^2} = 245.86 \approx 246$
Confidence Interval for a
Population Variance
Conditions Required for a Valid Confidence Interval for σ²
1. A random sample is selected from the target population.
2. The population of interest has a relative frequency distribution that is approximately normal.
Thinking Challenge
You’re a marketing manager for a 5K race. You
take a random sample of the times of 292 runners
from the last race, with mean of 28.5 minutes and
standard deviation of 8.3 minutes. What is the
95% confidence interval estimate of the
population variance?
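Confidence Interval Solution*
df = 292 – 1 = 291 (use 300 df), so $\chi^2_{.025} = 349.874$ and $\chi^2_{.975} = 253.912$.
$\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
$\dfrac{(292-1)(8.3)^2}{349.874} \le \sigma^2 \le \dfrac{(292-1)(8.3)^2}{253.912}$
$57.30 \le \sigma^2 \le 78.95$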
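The variance interval can be sketched in a few lines. Because the Python standard library has no chi-square distribution, the two critical values (349.874 and 253.912, for 300 df) are passed in as table look-ups; the function and parameter names are hypothetical.

```python
def variance_interval(n, s, chi2_upper_tail, chi2_lower_tail):
    """CI for sigma^2 using chi-square table values.

    chi2_upper_tail: value with area alpha/2 to its right (the larger one).
    chi2_lower_tail: value with area 1 - alpha/2 to its right (the smaller one).
    """
    numerator = (n - 1) * s ** 2
    # Dividing by the larger critical value gives the lower limit.
    return numerator / chi2_upper_tail, numerator / chi2_lower_tail


# 5K race example: n = 292, s = 8.3, table values for 300 df.
lower, upper = variance_interval(n=292, s=8.3,
                                 chi2_upper_tail=349.874,
                                 chi2_lower_tail=253.912)
print(f"{lower:.2f} <= sigma^2 <= {upper:.2f}")
```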
Key Ideas
Population Parameters, Estimators, and Standard Errors

Parameter (θ)    Estimator (θ̂)    Standard Error of Estimator ($\sigma_{\hat{\theta}}$)    Estimated Std Error ($\hat{\sigma}_{\hat{\theta}}$)
Mean, µ          x̄                $\sigma/\sqrt{n}$                                         $s/\sqrt{n}$
Proportion, p    p̂                $\sqrt{pq/n}$                                             $\sqrt{\hat{p}\hat{q}/n}$
Key Ideas
Population Parameters, Estimators, and
Standard Errors
Confidence Interval: An interval that encloses an unknown population parameter with a certain level of confidence (1 – α)
Confidence Coefficient: The probability (1 – α) that a randomly selected confidence interval encloses the true value of the population parameter.
Key Ideas
Key Words for Identifying the Target Parameter
µ – Mean, Average
p – Proportion, Fraction, Percentage, Rate, Probability
σ² – Variance
Key Ideas
Commonly Used z-Values for a Large-Sample Confidence Interval
90% CI: α = .10, z_{.05} = 1.645
95% CI: α = .05, z_{.025} = 1.96
98% CI: α = .02, z_{.01} = 2.326
99% CI: α = .01, z_{.005} = 2.575
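These table values can be reproduced with the standard library's normal distribution. The helper name `z_value` is hypothetical. The computed 99% value is about 2.576; the 2.575 shown in many printed tables is a slightly coarser rounding of the same number.

```python
from statistics import NormalDist


def z_value(conf_level):
    """z-value with area alpha/2 to its right, where alpha = 1 - conf_level."""
    alpha = 1 - conf_level
    return NormalDist().inv_cdf(1 - alpha / 2)


for conf_level in (0.90, 0.95, 0.98, 0.99):
    print(f"{conf_level:.0%} CI: z = {z_value(conf_level):.3f}")
```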
Key Ideas
Determining the Sample Size n
Estimating µ: $n = (z_{\alpha/2})^2(\sigma^2)/(SE)^2$
Estimating p: $n = (z_{\alpha/2})^2(pq)/(SE)^2$
Key Ideas
Finite Population Correction Factor

Required when n/N > .05


Key Ideas
Illustrating the Notion of “95% Confidence”
END OF CHAPTER 4
