
Inferential Statistics:

Interval Estimation
Exc 1
 What conditions are required by the central limit
theorem before a confidence interval of the population
mean may be created?
A. The underlying population must be normally
distributed.
B. The underlying population must be normally
distributed if the sample size is 30 or more.
C. The underlying population need not be normally
distributed if the sample size is 30 or more.
D. The underlying population need not be normally
distributed if the population standard deviation is
known.
Exc 2
How does the variance of the sample mean compare to
the variance of the population?
A. It is smaller and therefore suggests that averages have
less variation than individual observations.
B. It is larger and therefore suggests that averages have
less variation than individual observations.
C. It is smaller and therefore suggests that averages have
more variation than individual observations.
D. It is larger and therefore suggests that averages have

more variation than individual observations.


Exc 3
If a population is known to be normally distributed, what
can be said of the sampling distribution of the sample
mean drawn from this population?
A. For any sample size n, the sampling distribution of the
sample mean is normally distributed.
B. For a sample size , the sampling distribution of the
sample mean is normally distributed.
C. For a sample size , the sampling distribution of the

sample mean is normally distributed.


D. For a sample size , the sampling distribution of the

sample mean is normally distributed.


Exc 4
The central limit theorem says that as the sample
size, n, from a given population gets large
enough, the sampling distribution of the mean can
be approximated by
A. The binomial distribution

B. The normal distribution

C. The Poisson distribution

D. Different distributions depending on the given

population
Exc 5
The finite population multiplier is used to
A. Reduce bias in the sample mean

B. Reduce the bias in the sample variance

C. Make the computation of the variance easier

D. Reduce sampling error


Exc 6
If we sample without replacement,
A. It is important to consider the size of the

sample relative to the size of the population


B. A larger sample relative to the size of the

population is preferred because it will reduce


sampling error
C. The sample size is unimportant

D. Use a smaller sample


True/False (If False, Explain Why)

1. The central limit theorem can be used to determine the
distribution of the sample mean except distributions for
population proportions.
2. To use the central limit theorem, we need to have a
large enough sample size.
3. Results from a census will always be more accurate
than results from sample.
4. The variance of the sample mean will decrease as the
sample size increases.
True/False (If False, Explain Why)

1. Sampling error cannot be eliminated.
2. In order to compute the mean and the standard
deviation of a sample, we must know the distribution of
the sample.
3. The standard deviation of the sample mean becomes
larger as the sample size increases.
4. Nonsampling errors can be eliminated by taking a
census.
5. If we sample with replacement, we do not need to
consider the size of the sample relative to the size of
the population.
6. Sampling error is a type of systematic error.
Briefly answer all the questions below.

1. At the beginning of every ten years, the
Indonesian government conducts a census.
Why does it take a census? What are the
advantages of a census over a sample?
2. What are the desirable properties of a point
estimator? Explain.
3. How does the variance of the sample mean
compare to the variance of the population?
Briefly answer all the questions below.

5. What is the importance of the Central Limit
Theorem in inferential statistics?
6. If a population is known to be normally
distributed, what can be said of the sampling
distribution of the sample mean drawn from
this population?
7. What is the relationship between sample size
and standard error?
8. What is the relationship between sample size
and power of the test?
Briefly answer all the questions below.

9. What is the proportion of the observations that fall
outside of the interval [µ − σ, µ + σ] in the standard
normal distribution?
10. What is stratified random sampling?
11. What is cluster random sampling?
12. What is the advantage of cluster random sampling?
13. What is nonresponse bias in sampling?
14. What is selection bias in sampling?
15. What is the importance of the Central Limit Theorem in
inferential statistics?
16. Bias can occur in sampling. Bias refers to…………..
Estimating Population Values
Confidence Intervals

Content of this chapter

 Confidence Intervals for the Population Mean, μ
   when Population Standard Deviation is Known (standard error σ/√n)
   when Population Standard Deviation is Unknown (standard error s/√n)
 Determining the Required Sample Size
 Confidence Intervals for the Population Proportion, p
Point and Interval Estimates

 A point estimate is a single number,
 a confidence interval provides additional
information about variability

[Figure: the point estimate lies between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]
Point Estimates

We can estimate a Population Parameter with a Sample Statistic (a Point Estimate):

              Population Parameter    Sample Statistic (Point Estimate)
Mean          μ                       x̄
Proportion    p                       p̂
Confidence Intervals

 How much uncertainty is associated with a
point estimate of a population parameter?

 An interval estimate provides more
information about a population characteristic
than does a point estimate

 Such interval estimates are called confidence
intervals
Confidence Interval Estimate

 An interval gives a range of values:


 Takes into consideration variation in sample
statistics from sample to sample
 Based on observation from 1 sample
 Gives information about closeness to
unknown population parameters
 Stated in terms of level of confidence
 Never 100% sure
Estimation Process

[Figure: a random sample is drawn from a population whose mean μ is unknown; the sample mean is x̄ = 50, which leads to the statement "I am 95% confident that μ is between 40 & 60."]
General Formula

 The general formula for all
confidence intervals is:

Point Estimate ± (Critical Value)(Standard Error)


Confidence Level

 Confidence Level
 Confidence that the interval
will contain the unknown
population parameter
 A percentage (less than 100%)
Confidence Level, (1-α)
(continued)
 Suppose confidence level = 95%
 Also written (1 - α) = .95
 A relative frequency interpretation:
 In the long run, 95% of all the confidence
intervals that can be constructed will contain the
unknown true parameter
 A specific interval either will contain or will
not contain the true parameter
 No probability involved in a specific interval
Confidence Intervals

[Diagram: confidence intervals are built either for the population mean (σ known or σ unknown) or for the population proportion.]
Confidence Interval for μ
(σ Known)
 Assumptions
 Population standard deviation σ is known

 Population is normally distributed

 If population is not normal, use large sample

 Confidence interval estimate

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
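Finding the Critical Value
 Consider a 95% confidence interval: $1 - \alpha = .95$, so $\alpha/2 = .025$ in each tail and $z_{\alpha/2} = \pm 1.96$

[Figure: standard normal curve with central area 1 − α = .95 and area α/2 = .025 in each tail. In z units the cutoffs are −1.96 and 1.96 around 0; in x units they correspond to the lower confidence limit and the upper confidence limit around the point estimate.]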
Common Levels of Confidence

 Commonly used confidence levels are
90%, 95%, and 99%

Confidence Level    Confidence Coefficient, 1 − α    z value, z α/2
80%                 .80                              1.28
90%                 .90                              1.645
95%                 .95                              1.96
98%                 .98                              2.33
99%                 .99                              2.57
99.8%               .998                             3.08
99.9%               .999                             3.27
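
The z values in a table like this can also be computed directly. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is hypothetical and introduced only for illustration. Exact quantiles can differ slightly from printed table values, which are often rounded or read from a coarser table (for example, about 2.576 rather than 2.57 at the 99% level).

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence_level
    # An upper-tail area of alpha/2 corresponds to cumulative probability 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.2f}  z = {z_critical(level):.3f}")
```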
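Interval and Level of Confidence

[Figure: sampling distribution of the mean, centered at $\mu_{\bar{x}} = \mu$, with area 1 − α in the middle and α/2 in each tail; intervals are drawn around different sample means x̄1, x̄2, ...]
 Intervals extend from $\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
 100(1−α)% of intervals constructed contain μ; 100α% do not.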
Confidence Intervals
Margin of Error

 Margin of Error (e): the amount added and
subtracted to the point estimate to form the
confidence interval

Example: Margin of error for estimating μ, σ known:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \qquad e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Factors Affecting Margin of Error

$$e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

 Data variation, σ: e increases as σ increases
 Sample size, n: e decreases as n increases
 Level of confidence, 1 − α: e increases if 1 − α increases
Example

 A sample of 11 circuits from a large normal
population has a mean resistance of 2.20
ohms. We know from past testing that the
population standard deviation is .35 ohms.

 Determine a 95% confidence interval for the
true mean resistance of the population.
Example
(continued)
 A sample of 11 circuits from a large normal
population has a mean resistance of 2.20
ohms. We know from past testing that the
population standard deviation is .35 ohms.
 Solution:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left( .35 / \sqrt{11} \right) = 2.20 \pm .2068$$

1.9932 ............... 2.4068
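
The same calculation can be sketched in Python as a check on the arithmetic. The function name `z_interval` is hypothetical, and the critical value is computed from the standard normal distribution rather than read from the table, so the last digits may differ slightly from a hand calculation with 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence_level=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)  # margin of error e
    return xbar - margin, xbar + margin


# Circuit example: n = 11, sample mean 2.20 ohms, sigma = .35 ohms
lower, upper = z_interval(xbar=2.20, sigma=0.35, n=11)
print(f"{lower:.4f} to {upper:.4f}")
```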
Interpretation

 We are 95% confident that the true mean
resistance is between 1.9932 and 2.4068 ohms
 Although the true mean may or may not be in this
interval, 95% of intervals formed in this manner
will contain the true mean

 An incorrect interpretation is that there is 95% probability that this
interval contains the true population mean.
(This interval either does or does not contain the true mean; there is
no probability for a single interval)
Confidence Interval for a Mean, σ Known -
Example

The American Management Association surveys middle
managers in the retail industry and wants to estimate their
mean annual income. A random sample of 49 managers
reveals a sample mean of $45,420. The standard deviation
of this population is $2,050.
 What is the best point estimate of the population mean?
 What is a reasonable range of values for the population
mean?
 What do these results mean?
Confidence Interval for a Mean, σ Known –
Example

The American Management Association surveys middle
managers in the retail industry and wants to estimate their
mean annual income. A random sample of 49 managers
reveals a sample mean of $45,420. The standard deviation
of this population is $2,050.

What is the best point estimate of the population mean?

 Our best estimate of the unknown population mean is
the corresponding sample statistic.

 The sample mean of $45,420 is the point estimate
of the unknown population mean.
Confidence Interval for a Mean, σ
Known - Example
The American Management Association surveys middle managers in the
retail industry and wants to estimate their mean annual income. A random
sample of 49 managers reveals a sample mean of $45,420. The standard
deviation of this population is $2,050.
 What is a reasonable range of values for the population mean?

 Suppose the association decides to use the 95 percent level of
confidence.
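
As a worked sketch of this step, assuming the σ-known formula and the table value z = 1.96 for 95 percent confidence:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 45{,}420 \pm 1.96 \cdot \frac{2{,}050}{\sqrt{49}} = 45{,}420 \pm 574$$

Under those assumptions the confidence limits are about 44,846 and 45,994 dollars.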
Confidence Interval for a Mean –
Interpretation
What is the interpretation of the
confidence limits $44,846
and $45,994?

If we select many samples of 49 managers, and for each
sample we compute the mean and then construct a
95 percent confidence interval, we could expect
about 95 percent of these confidence intervals to
contain the population mean. Conversely, about 5
percent of the intervals would not contain the
population mean annual income, µ.
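
This long-run interpretation can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the true mean `mu = 45_000` is a hypothetical value (the real µ is unknown), the population is assumed normal with σ = 2,050, and the function name `coverage_rate` is made up for this example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def coverage_rate(mu=45_000.0, sigma=2_050.0, n=49,
                  confidence_level=0.95, trials=2_000, seed=0):
    """Fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of n incomes and build its interval.
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land near 0.95, while any single interval either contains µ or does not.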
Confidence Interval for μ
(σ Unknown)

 If the population standard deviation σ is
unknown, we can substitute the sample
standard deviation, s
 This introduces extra uncertainty, since s
is variable from sample to sample
 So we use the t distribution instead of the
normal distribution
Confidence Interval for μ
(σ Unknown)
(continued)
 Assumptions
 Population standard deviation is unknown
 Population is normally distributed
 If population is not normal, use large sample
 Use Student’s t Distribution
 Confidence Interval Estimate

$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
Student’s t Distribution

 The t is a family of distributions


 The t value depends on degrees of
freedom (d.f.)
 Number of observations that are free to vary after
sample mean has been calculated

d.f. = n - 1
Degrees of Freedom (df)
Idea: Number of observations that are free to vary
after sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0

Let x1 = 7
Let x2 = 8
What is x3?
If the mean of these three values is 8.0,
then x3 must be 9
(i.e., x3 is not free to vary)
Here, n = 3, so degrees of freedom = n - 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary
for a given mean)
Student’s t Distribution
Note: t → z as n increases

[Figure: density curves of the standard normal (t with df = ∞), t (df = 13), and t (df = 5), all centered at 0 on the t axis.]
t-distributions are bell-shaped and symmetric, but
have ‘fatter’ tails than the normal
Student’s t Table

Upper Tail Area

df     .25      .10      .05
1      1.000    3.078    6.314
2      0.817    1.886    2.920
3      0.765    1.638    2.353

The body of the table contains t values, not probabilities

Let: n = 3, so df = n - 1 = 2
α = .10, so α/2 = .05
[Figure: t distribution with upper tail area α/2 = .05 to the right of t = 2.920]
t distribution values
With comparison to the z value

Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    z
.80                 1.372          1.325          1.310          1.28
.90                 1.812          1.725          1.697          1.64
.95                 2.228          2.086          2.042          1.96
.99                 3.169          2.845          2.750          2.57

Note: t → z as n increases
Example
A random sample of n = 25 has x̄ = 50 and
s = 8. Form a 95% confidence interval for μ

 d.f. = n – 1 = 24, so $t_{\alpha/2,\,n-1} = t_{.025,24} = 2.0639$

The confidence interval is

$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}}$$

46.698 …………….. 53.302
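
A short sketch of this computation in Python follows. Python's standard library has no t quantile function, so the critical value 2.0639 is taken from the slide and passed in as a plain number; the function name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(xbar, s, n, t_critical):
    """Confidence interval for the mean when sigma is unknown.

    t_critical must be looked up for n - 1 degrees of freedom.
    """
    margin = t_critical * s / sqrt(n)
    return xbar - margin, xbar + margin


# n = 25, so d.f. = 24 and t_{.025,24} = 2.0639 (value from the t table)
lower, upper = t_interval(xbar=50, s=8, n=25, t_critical=2.0639)
print(f"{lower:.3f} to {upper:.3f}")
```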
Approximation for Large Samples

 Since t approaches z as the sample size
increases, an approximation is sometimes
used when n ≥ 30:

Technically correct: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
Approximation for large n: $\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
Determining Sample Size
 The required sample size can be found to
reach a desired margin of error (e) and
level of confidence (1 - α)

 Required sample size, σ known:

$$n = \frac{z_{\alpha/2}^2 \, \sigma^2}{e^2} = \left( \frac{z_{\alpha/2} \, \sigma}{e} \right)^2$$
Required Sample Size Example
If σ = 45, what sample size is needed to be
90% confident of being correct within ± 5?

$$n = \left( \frac{z_{\alpha/2} \, \sigma}{e} \right)^2 = \left( \frac{1.645(45)}{5} \right)^2 = 219.19$$

So the required sample size is n = 220

(Always round up)
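
The formula and the round-up rule can be sketched as a small Python helper. The name `required_sample_size` is hypothetical, and the z value is passed in from the table rather than computed.

```python
from math import ceil


def required_sample_size(z, sigma, e):
    """Smallest n whose margin of error z * sigma / sqrt(n) is at most e."""
    # ceil implements "always round up": rounding down would exceed e.
    return ceil((z * sigma / e) ** 2)


# 90% confidence (z = 1.645), sigma = 45, margin of error e = 5
n = required_sample_size(z=1.645, sigma=45, e=5)
print(n)
```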


If σ is unknown

 If unknown, σ can be estimated when
using the required sample size formula
 Use a value for σ that is expected to be
at least as large as the true σ
 Select a pilot sample and estimate σ with
the sample standard deviation, s
Confidence Interval Estimates for the Mean –
Example

The manager of the Inlet Square Mall, near Ft.
Myers, Florida, wants to estimate the mean
amount spent per shopping visit by customers. A
sample of 20 customers reveals the following
amounts spent.

Based on a 95% confidence interval, do
customers spend $50 on average? Do they
spend $60 on average?

Confidence Interval Estimates for
the Mean – Example

Exercise
 In the 150 samples of farmed salmon, the mean
concentration of mirex was 0.0913 ppm with a
standard deviation of 0.0495 ppm. A 95%
confidence interval for the mean mirex
concentration was found to be: (0.0833,
0.0993).
 Question: How large a sample would be
needed to produce a 95% confidence interval
with a margin of error of 0.004?
Solution
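
One possible worked solution, sketched under two assumptions that are not stated in the exercise: the sample standard deviation s = 0.0495 is used as the estimate of σ, and z = 1.96 is used for 95% confidence.

$$n = \left( \frac{z_{\alpha/2} \, \sigma}{e} \right)^2 \approx \left( \frac{1.96 (0.0495)}{0.004} \right)^2 \approx 588.3$$

Rounding up, a sample of about n = 589 would be needed under these assumptions.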
Confidence Intervals for the
Population Proportion, p

 An interval estimate for the population
proportion ( p ) can be calculated by
adding an allowance for uncertainty to
the sample proportion ( p̂ )
Confidence Intervals for the
Population Proportion, p
(continued)
 Recall that the distribution of the sample
proportion is approximately normal if the
sample size is large, with standard deviation

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

 We will estimate this with sample data:

$$s_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Confidence interval endpoints
 Upper and lower confidence limits for the
population proportion are calculated with the
formula

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

 where
 z is the standard normal value for the level of confidence desired
 p̂ is the sample proportion
 n is the sample size
Example

 A random sample of 100 people
shows that 25 are left-handed.
 Form a 95% confidence interval for
the true proportion of left-handers
Example
(continued)
 A random sample of 100 people shows
that 25 are left-handed. Form a 95%
confidence interval for the true proportion
of left-handers.
1. p̂ = 25/100 = .25

2. $s_{\hat{p}} = \sqrt{\hat{p}(1 - \hat{p})/n} = \sqrt{.25(.75)/100} = .0433$

3. .25 ± 1.96 (.0433)

0.1651 . . . . . 0.3349
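
These three steps can be sketched in Python as follows. The function name `proportion_interval` is hypothetical, and the normal approximation is assumed to be adequate for this sample size.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence_level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n                        # step 1: sample proportion
    std_error = sqrt(p_hat * (1 - p_hat) / n)    # step 2: estimated standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * std_error                       # step 3: margin of error
    return p_hat - margin, p_hat + margin


# 25 left-handers in a random sample of 100 people
lower, upper = proportion_interval(successes=25, n=100)
print(f"{lower:.4f} to {upper:.4f}")
```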
Interpretation

 We are 95% confident that the true
percentage of left-handers in the population
is between
16.51% and 33.49%.

 Although this range may or may not contain
the true proportion, 95% of intervals formed
from samples of size 100 in this manner will
contain the true proportion.
Changing the sample size

 Increases in the sample size reduce
the width of the confidence interval.

Example:
 If the sample size in the above example is
doubled to 200, and if 50 are left-handed in the
sample, then the interval is still centered at .25,
but the width shrinks to
.19 …… .31
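
The arithmetic behind the narrower interval, as a brief worked sketch with z = 1.96:

$$.25 \pm 1.96 \sqrt{\frac{.25(.75)}{200}} \approx .25 \pm 1.96(.0306) \approx .25 \pm .06$$

Doubling n multiplies the margin of error by $1/\sqrt{2} \approx .707$, so the margin drops from about .0849 to about .0600.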
Finding the Required Sample Size
for proportion problems

Define the margin of error:

$$e = z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}}$$

Solve for n:

$$n = \frac{z_{\alpha/2}^2 \, p(1 - p)}{e^2}$$

p can be estimated with a pilot sample, if
necessary (or conservatively use p = .50)
What sample size...?

 How large a sample would be necessary
to estimate the true proportion defective in
a large population within 3%, with 95%
confidence?
(Assume a pilot sample yields p̂ = .12)
What sample size...?
(continued)

Solution:
For 95% confidence, use z = 1.96
e = .03
p̂ = .12, so use this to estimate p

$$n = \frac{z_{\alpha/2}^2 \, p(1 - p)}{e^2} = \frac{(1.96)^2 (.12)(1 - .12)}{(.03)^2} = 450.74$$

So use n = 451
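
A minimal Python sketch of this calculation follows, including the conservative choice p = .50 for comparison. The function name `required_sample_size_proportion` is hypothetical.

```python
from math import ceil


def required_sample_size_proportion(z, e, p=0.5):
    """Sample size for estimating a proportion within margin of error e.

    p defaults to .50, the conservative choice that maximizes p * (1 - p).
    """
    return ceil(z ** 2 * p * (1 - p) / e ** 2)


n_pilot = required_sample_size_proportion(z=1.96, e=0.03, p=0.12)
n_conservative = required_sample_size_proportion(z=1.96, e=0.03)
print(n_pilot, n_conservative)
```

The conservative value is noticeably larger, which is the price of not relying on a pilot estimate.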
Exercise
 A school district is trying to determine its students’
reaction to a proposed dress code.
 To do so, the school selected a random sample of 50
students and questioned them. If 20 were in favor of the
proposal, then
(a) Estimate the proportion of all students who are in favor.
(b) Estimate the standard error of the estimate.
Solution
(a) The estimate of the proportion of all students who are
in favor of the dress code is 20/50 = 0.40.
(b) The standard error of the estimate is √(p(1 − p)/50),
where p is the actual proportion of the entire population
that is in favor. Using the estimate for p of 0.4, we can
estimate this standard error by √(0.4(1 − 0.4)/50) = 0.0693.
Exercise
 The Chamber of Commerce of a mid-sized city has supported a
proposal to change the zoning laws for a new part of town. The
new regulations would allow for mixed commercial and residential
development. The vote on the measure is scheduled for three
weeks from today, and the president of the Chamber of Commerce
is concerned that they may not have the majority of votes that they
will need to pass the measure. She commissions a survey that
asks likely voters if they plan to vote for the measure. Of the 516
people selected at random from likely voters, 289 said they would
likely vote for the measure.
 a. Find a 95% confidence interval for the true proportion of voters
who will vote for the measure.
 b. What would you report to the president of the Chamber of
Commerce?
Solution
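
One possible worked solution, sketched with the normal-approximation interval for a proportion and z = 1.96:

$$\hat{p} = \frac{289}{516} \approx .560, \qquad \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \approx .560 \pm 1.96 \sqrt{\frac{.560(.440)}{516}} \approx .560 \pm .043$$

a. The 95% confidence interval is roughly .517 to .603.
b. The whole interval lies above .50, so the survey suggests the measure has majority support among likely voters, although the lower limit is only slightly above one half.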
Sample Size for Estimating
Population Mean – Example 1
A student in public administration wants to
determine the mean amount members of city
councils in large cities earn per month. She
would like to estimate the mean with a 95%
confidence interval and a margin of error of
less than $100. The student found a report by
the Department of Labor that estimated the
standard deviation to be $1,000. What is the
required sample size?

Given in the problem:

 E, the maximum allowable error, is $100,
 The value of z for a 95 percent level of
confidence is 1.96,
 The estimate of the standard deviation is
$1,000.
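
Substituting these three values into the sample size formula for the mean gives, as a worked sketch:

$$n = \left( \frac{z \, \sigma}{E} \right)^2 = \left( \frac{1.96 (1{,}000)}{100} \right)^2 = (19.6)^2 = 384.16$$

Rounding up, the required sample size is n = 385.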
Sample Size for Estimating
Population Mean – Example 2
A consumer group would like to estimate the mean monthly
electricity charge for a single family house in July within $5
using a 99 percent level of confidence. Based on similar
studies, the standard deviation is estimated to be $20.00. How
large of a sample is required?
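
A hedged sketch of how this could be computed in Python. The function name `sample_size_for_mean` is hypothetical, and the critical value is computed from the standard normal distribution (about 2.576 for 99 percent) instead of being read from a table, so a hand calculation with a rounded z may differ slightly before rounding up.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence_level):
    """Required n so that the margin of error for the mean is at most e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z * sigma / e) ** 2)  # always round up


# Within $5 at the 99 percent level, with sigma estimated as $20
n = sample_size_for_mean(sigma=20.0, e=5.0, confidence_level=0.99)
print(n)
```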
Review 1
A 90% confidence interval is constructed for the
population mean. If a 95% confidence interval had been
constructed instead (everything else remaining the same),
the width of the interval would have been ________ and
the probability of making an error would have been
_________.
A. Wider; bigger
B. Wider; smaller
C. Narrower; bigger

D. Narrower; smaller
Review 2
A point estimate is
A. a single value that is used to estimate the

population parameter.
B. a range of values used to estimate the

population parameter.
C. always unbiased.

D. always efficient.
Review 3
Other things being equal, the width of a 90 %
confidence interval will be
A. wider than a 95 % confidence interval.

B. narrower than a 95 % confidence interval.

C. may be wider or narrower depending on the

population variance.
D. the same width as a 95 % confidence interval.
Review 4
When constructing interval estimates for the population
mean we should use the
A. z distribution when the population variance is unknown
and the t distribution when the population variance is
known.
B. z distribution when the population variance is known
and the t distribution when the population variance is
unknown and is estimated with a small sample.
C. z distribution whether the population variance is
known or not
D. F distribution.
Briefly answer all the questions below.

1. What is the connection between sample size and
standard error?
2. What is the connection between sample size and the
width of the confidence interval?
3. What is the connection between significance level and
the width of the confidence interval?
4. What is the connection between significance level and
confidence level?
5. What is the connection between sample size and
power of the test?
Briefly answer all the questions below.

6. A 90% confidence interval is constructed for the
population mean. If a 95% confidence interval had
been constructed instead (everything else remaining
the same), the width of the interval would have been
________ and the probability of making an error would
have been _________.
7. What conditions are required by the central limit
theorem before a confidence interval of the population
mean may be created?
8. When the required sample size calculated by using a
formula is not a whole number, what is the best choice
for the required sample size?
