
9.1 Confidence Intervals
Winter 2021
Mon: 4:30-6:20pm
Wed: 5:00-6:50pm

Professor: Clarence Au
Date: Apr 5th, 2021
Learning Objectives
10.1  Construct, interpret, and solve problems involving confidence
intervals for proportions.
10.2  Identify the relationship between sample size and a confidence
interval.
10.3  Compute sample sizes for confidence intervals for proportions.
10.4  State and check the assumptions for confidence intervals.

Textbook
Ch 11 p. 337 – 349

Practice Questions
Exercises Ch 11 #1, 3, 5, 7, 9, 11, 21, 23, 27, 29, 33, 41, 47, 55
Proportion - Example

Sudbury is planning on passing a new bylaw and is interested in what
proportion of citizens plan on voting. 516 people were selected
randomly from the population, and 289 said they would vote. What
values are given in this example?

The sample proportion p̂ = 289/516 and the sample size n = 516 are given.

https://student.desmos.com/activitybuilder/student-greeting/6015df355e2b940b5e464e46
Proportions – Point Estimate

• When samples are random, every sample will give a different
estimate for p.
• The sample proportion, p̂, is our best estimate of the true
population proportion.
• It is called a point estimate.
• When we do not know the true population proportion p, the
sample proportion p̂ serves as our best estimate, even though it
will not be perfect.
Proportions – Sampling Distribution

• From the Central Limit Theorem, the sampling distribution of p̂ is:
1. Centered at p, the true (often UNKNOWN) population proportion
2. Spread with standard deviation SD(p̂) = sqrt(p x q / n), where q = 1 - p
3. Approximately normal in shape when the sample is large enough
• When the true proportion is unknown we use the standard
error, SE(p̂) = sqrt(p̂ x q̂ / n), as an estimate of the standard deviation.

[Figure: sampling distribution of p̂, centered at p = ?]
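The three properties above can be checked with a small simulation. This is only a sketch: the value p.true = 0.56 is a hypothetical "true" proportion chosen for illustration (in practice p is unknown), and the seed and the number of simulated samples are arbitrary.

```r
# Hypothetical true proportion (an assumption for illustration only)
p.true <- 0.56
n <- 516

set.seed(1)  # arbitrary seed so the run is reproducible

# Simulate 10000 random samples of size n and compute p.hat for each
p.hats <- rbinom(10000, size = n, prob = p.true) / n

mean(p.hats)                       # centre: should be close to p.true
sd(p.hats)                         # spread: should be close to the value below
sqrt(p.true * (1 - p.true) / n)    # theoretical standard deviation of p.hat
hist(p.hats)                       # shape: roughly bell-shaped
```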
Proportion - Example

Sudbury is planning on passing a new bylaw and is interested in what
proportion of citizens plan on voting. 516 people were selected randomly from
the population, and 289 said they would vote. Sketch the sampling distribution
model for the sample proportions of citizens that plan on voting. Calculate the
sample proportion and standard error using R.
p̂ = 289/516 = 0.56
SE(p̂) = sqrt(0.56 x (1 - 0.56) / 516)
= 0.02185

[Figure: sampling distribution of p̂, centered at p = ?]

R CODE
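n <- 516                       # sample size
p.hat <- 289/n                 # sample proportion
q.hat <- 1 - p.hat             # complement of the sample proportion
SE <- sqrt(p.hat*q.hat/n)      # standard error of p.hat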

Problem: when you don’t know p, you can’t know how far away p̂ is from it.
A Range of Proportions

• We can change our point of view to focus on p̂.
• Using the 68%-95%-99.7% Rule we can find a range of values
that we expect p to be within
• About 68% of the time p will fall within 1 standard error of p̂
• About 95% of the time p will fall within 2 standard errors of p̂
• About 99.7% of the time p will fall within 3 standard errors of p̂
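The percentages in the 68%-95%-99.7% Rule are areas under the standard Normal curve. As a quick sketch, they can be recovered in R with pnorm (the variable names here are illustrative):

```r
# Area under the standard Normal curve within k standard deviations of the centre
k <- 1:3
coverage <- pnorm(k) - pnorm(-k)
round(coverage, 4)   # approximately 0.6827 0.9545 0.9973
```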
Proportion - Example

Using the sampling distribution of the sample proportions, show the
spread of values expected for the population proportion using the
68-95-99.7 rule. The multiplier on SE (1, 2, or 3) becomes z*.

68%
p.hat - 1 x SE = 0.56 - 1 x 0.02185 = 0.538
p.hat + 1 x SE = 0.56 + 1 x 0.02185 = 0.582
95%
p.hat - 2 x SE = 0.56 - 2 x 0.02185 = 0.516
p.hat + 2 x SE = 0.56 + 2 x 0.02185 = 0.604
99.7%
p.hat - 3 x SE = 0.56 - 3 x 0.02185 = 0.494
p.hat + 3 x SE = 0.56 + 3 x 0.02185 = 0.626

R CODE
#68% of samples should have p between
p.hat-1*SE
p.hat+1*SE
#95% of samples should have p between
p.hat-2*SE
p.hat+2*SE
#99.7% of samples should have p between
p.hat-3*SE
p.hat+3*SE
General Form of Confidence Intervals

• Instead of reporting the single point estimate, we can make an
interval of the form

point estimate ± margin of error

that we are reasonably sure will contain the true proportion.
• Confidence intervals help us determine how close the point
estimate is to the population parameter.
Level of Confidence

• The level of confidence corresponds to an area under the curve.
• For any confidence level, the number of SEs we must stretch out
on either side of p̂ is called the critical value.
• Critical values are based on the Normal model so we denote
them z*.
• In general a confidence interval for the proportion takes the form
p̂ ± z* x SE(p̂)
Finding the Critical Value

1. Write the confidence level (given as a %) as a decimal number
   Example: 95% = 0.95
2. Divide the confidence level in half
   - This is necessary because we are looking for the middle percentage of
     observations around the mean
   Example: 0.95 / 2 = 0.475
3. Add 0.5 to get the area less than the critical value
   Example: 0.5 + 0.475 = 0.975
4. Find that area (or the value closest to it) in the BODY of the
   POSITIVE z-score table
5. Locate the corresponding z-score in the POSITIVE z-score table

[Figure: Normal curve for a 95% confidence level. Area below the mean: P = 50% or 0.5. Area between the mean and z*: P = 95%/2 = 47.5%. Area in the upper tail: P = 5%/2 = 2.5%.]
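The table lookup in steps 4 and 5 can be replaced by qnorm in R. The sketch below follows the same steps for a 95% confidence level; the variable names other than z.crit are illustrative.

```r
conf.level <- 0.95              # step 1: confidence level as a decimal
half <- conf.level / 2          # step 2: divide the confidence level in half
area.below <- half + 0.5        # step 3: area less than the critical value
z.crit <- qnorm(area.below)     # steps 4-5: z-score with that area below it
z.crit                          # about 1.96
```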
Common Levels of Confidence

Confidence   Confidence Level   Area to Look up              z*
Level        (Decimal)          in the Table

90%          0.90               0.4500 + 0.5000 = 0.9500     1.645
95%          0.95               0.4750 + 0.5000 = 0.9750     1.96
99%          0.99               0.4950 + 0.5000 = 0.9950     2.575
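As a cross-check on the table, the same critical values can be computed for several confidence levels at once. This is a sketch using qnorm; the vector name conf.levels is illustrative.

```r
conf.levels <- c(0.90, 0.95, 0.99)

# Area to look up (confidence level / 2 + 0.5), then the matching z-score
z.crit <- qnorm(conf.levels / 2 + 0.5)
round(z.crit, 3)
```

For 99%, R returns a value slightly above the 2.575 in the table. A printed z-table places the area 0.995 between the entries for 2.57 and 2.58, which is where 2.575 comes from; the difference is small in practice.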
Proportion - Example

Sudbury is planning on passing a new bylaw and is interested in
what proportion of citizens plan on voting. 516 people were selected
randomly from the population, and 289 said they would vote. Find a
precise 95% confidence interval for the proportion of voters.
p̂ = 0.56, q̂ = 0.44, z* (based on 95%) = 1.96
SE = 0.02185
p̂ ± z* x SE(p̂)
= 0.56 ± z* x sqrt(0.56 x 0.44 / 516)
= 0.56 ± z* x 0.02185
= 0.56 ± 1.96 (0.02185)
= 0.56 ± 0.042826 = (0.5172, 0.6028), i.e. (lower, upper)

R CODE
z.crit<-qnorm(0.95/2+0.5, 0, 1, lower.tail=TRUE)
p.hat-z.crit*SE
p.hat+z.crit*SE
Interpreting Confidence Intervals

Not all confidence intervals will contain the true population
proportion.
If we created 100 confidence intervals, we could expect the
population proportion to be within a certain number of the intervals.
That number corresponds to the level of confidence.
- For a 95% confidence interval:
95% of samples of this size will produce confidence intervals
that capture the true population proportion.
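This "95% of samples" reading can be illustrated by simulation. The sketch below assumes a hypothetical true proportion p.true = 0.56 (unknown in practice), builds a 95% interval from each of many simulated samples, and counts how often the interval captures p.true. The seed and number of repetitions are arbitrary.

```r
p.true <- 0.56               # hypothetical true proportion (assumption)
n <- 516
z.crit <- qnorm(0.95/2 + 0.5)

set.seed(1)                  # arbitrary seed for reproducibility
p.hat <- rbinom(10000, size = n, prob = p.true) / n
SE <- sqrt(p.hat * (1 - p.hat) / n)

# TRUE when the interval from a given sample contains p.true
captured <- (p.hat - z.crit*SE <= p.true) & (p.true <= p.hat + z.crit*SE)
mean(captured)               # proportion of intervals that capture p.true
```

The result should land close to 0.95, though not exactly on it, because the Normal model is an approximation.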
Proportion - Example

Sudbury is planning on passing a new bylaw and is interested in
what proportion of citizens plan on voting. 516 people were
selected randomly from the population, and 289 said they would
vote. Interpret the 95% confidence interval for the proportion of
voters.

We are 95% confident that the true population proportion of citizens
who plan on voting is between 0.5172 and 0.6028. About 95 out of 100
samples of this size would produce an interval that captures it.
Assumptions and Conditions

1. Independence Assumption: the sampled values must be
independent of each other
2. Randomization Condition: data must be representative of
the population and randomly selected
3. 10% Condition: the sample size, n, must be no larger than
10% of the population
4. Success/Failure Condition: np̂ ≥ 10 and nq̂ ≥ 10
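Conditions 3 and 4 are numeric and can be checked in code; conditions 1 and 2 depend on how the data were collected and cannot. The helper below is a hypothetical sketch, not part of the course materials, and the population size of 160000 is a placeholder assumption rather than a figure from the example.

```r
# Hypothetical helper: checks the 10% and Success/Failure conditions
check.conditions <- function(successes, n, population.size) {
  failures <- n - successes          # n * q.hat
  c(ten.percent     = n <= 0.10 * population.size,
    success.failure = successes >= 10 && failures >= 10)
}

# 289 of 516 sampled; population.size is an assumed placeholder value
result <- check.conditions(289, 516, population.size = 160000)
result
```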
Proportion - Example

Sudbury is planning on passing a new bylaw and is interested in
what proportion of citizens plan on voting. 516 people were
selected randomly from the population, and 289 said they would
vote. Are the confidence interval assumptions and conditions met?

1) and 2) Independence and randomization: met, as the 516 people were
selected randomly
3) 516 is less than 10% of the Sudbury population
4) np̂ = 516 x 0.56 = 289 > 10
nq̂ = 516 x 0.44 = 227 > 10
Example

In a random sample of 1050 adult Canadians there are 556 people that were
willing to pay more for products with social and environmental benefits.
1. What is a 99% confidence interval for the proportion?
p̂ = 556/1050, q̂ = 1 - 556/1050, z* (based on 99%) = 2.575
SE = sqrt(p̂ x q̂ / 1050) = 0.015
p̂ ± z* x SE = (0.4898, 0.5692)
2. How can we interpret the 99% confidence interval?
We are 99% confident that the true population proportion is between
0.4898 and 0.5692.
R CODE
n<-1050
p.hat<-556/n
q.hat<-1-p.hat
SE<-sqrt(p.hat*q.hat/n)

z.crit<-qnorm(0.99/2+0.5, 0, 1, lower.tail=TRUE)
p.hat-z.crit*SE
p.hat+z.crit*SE
Margin of Error

The margin of error controls the width of the interval

1. The higher the level of confidence, the wider the interval becomes
High confidence, low precision
2. The lower the level of confidence, the narrower the interval becomes
Low confidence, high precision

Every confidence interval is a balance between certainty and precision.
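To see this trade-off numerically, the sketch below computes the margin of error for the Sudbury sample (289 of 516) at several confidence levels. The choice of levels is illustrative.

```r
n <- 516
p.hat <- 289/n
SE <- sqrt(p.hat * (1 - p.hat) / n)

conf.levels <- c(0.90, 0.95, 0.98, 0.99)
ME <- qnorm(conf.levels/2 + 0.5) * SE    # margin of error = z* x SE

# Higher confidence level -> larger z* -> wider interval
data.frame(conf.level = conf.levels, ME = round(ME, 4))
```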
Margin of Error

Consider the 95% confidence interval created for the proportion of voters.
1. If we wanted to be 98% confident, would our confidence interval need to be
wider or narrower?
- Wider: a higher confidence level needs a larger z*.

2. If we wanted to reduce the margin of error would our level of confidence be higher
or lower?
- Lower: with the same sample, a narrower interval needs a smaller z*, so the
confidence level decreases.

3. If we had sampled more people would the margin of error be larger or smaller?
- Smaller: as n increases, the standard error decreases and the interval narrows.
Margin of Error

• The margin of error depends on the number of people sampled
and NOT on the size of the population
• It is the size of the sample that will control the width of the
confidence interval because the sample size n is proportional to
the reciprocal of the squared ME.
• To get an interval that is more precise without giving up
confidence we can choose a larger sample size!
• The sample size should be decided when the study is being
designed so that you can obtain your desired level of
confidence.
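Because n is proportional to 1/ME^2, cutting the margin of error in half takes four times the sample. A minimal sketch, assuming p̂ = 0.56 and 95% confidence as in the Sudbury example:

```r
p.hat <- 0.56
z.crit <- qnorm(0.95/2 + 0.5)

n <- c(516, 4 * 516)                        # original sample and 4x the sample
ME <- z.crit * sqrt(p.hat * (1 - p.hat) / n)

ME                                          # margin of error for each n
ME[2] / ME[1]                               # ratio of the two margins: one half
```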
Margin of Error

To determine the required sample size for a particular margin of error
and confidence level:

n = (z*/ME)^2 x p̂ x q̂

where n is the sample size (round up to the nearest whole number)
z* is the critical value
p̂ is the sample proportion
q̂ is its complement (1 - p̂)
ME is the margin of error (usually given as ±)
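The calculation can be wrapped in a small function. The name sample.size and its arguments are hypothetical, chosen for this sketch; it solves ME = z* x sqrt(p̂ x q̂ / n) for n and rounds up.

```r
# Hypothetical helper: required sample size for a proportion
sample.size <- function(p.hat, ME, conf.level = 0.95) {
  z.crit <- qnorm(conf.level/2 + 0.5)             # critical value
  ceiling((z.crit / ME)^2 * p.hat * (1 - p.hat))  # always round up
}

# Estimate within +/- 5% at 95% confidence, using p.hat = 289/516
sample.size(289/516, ME = 0.05)   # 379
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.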
Proportion - Example

Suppose Sudbury wants to estimate, to within ±5%, the proportion of
voters who will vote in the NEXT election with 95% confidence. Their
previous year’s sample information indicated that 289 out of 516
people would vote. How large a sample do they need for the NEXT
election?
ME = ±5% = 0.05
p̂ = 289/516 = 0.56, q̂ = 0.44
95% confidence: z.crit = 1.96
n = (1.96/0.05)^2 x 0.56 x 0.44 = 378.6, so round up to n = 379

R CODE
p.hat<-289/516
q.hat<-1-p.hat
ME <- 0.05
z.crit<-qnorm(0.95/2+0.5, 0, 1, lower.tail=TRUE)
n<-(z.crit/ME)^2*p.hat*q.hat
ceiling(n)   # round up to the nearest whole number
Determining the Sample Size - Unknown

• Often you have an estimate of the population proportion based on
experience or a previous study.
• If not, use p̂ = 0.5, the worst-case scenario
• This estimate gives the largest value for the standard error and
will determine the largest sample necessary.
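Why 0.5 is the worst case: the standard error depends on p̂ x q̂, and that product is largest when p̂ = 0.5. A quick sketch over a grid of illustrative values:

```r
p <- (1:9) / 10            # candidate proportions 0.1, 0.2, ..., 0.9
pq <- p * (1 - p)          # the product that appears inside the standard error

data.frame(p = p, pq = pq)
p[which.max(pq)]           # 0.5 gives the largest product (0.25)
```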
Example

Suppose a company wants to offer a new service and wants to
estimate, to within 3%, the proportion of customers who are likely
to purchase this new service with 95% confidence. How large a
sample do they need?

No prior estimate is available, so use the worst case:
p.hat = 0.5, q.hat = 0.5, ME = ±3% = 0.03
95% confidence: z.crit = 1.96
n = (1.96/0.03)^2 x 0.5 x 0.5 = 1067.1, so round up to n = 1068
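The same result can be reproduced in R, following the pattern of the earlier R CODE blocks. This is a sketch; the name n.required is illustrative.

```r
p.hat <- 0.5                   # worst-case proportion (no prior estimate)
q.hat <- 1 - p.hat
ME <- 0.03                     # desired margin of error
z.crit <- qnorm(0.95/2 + 0.5)  # critical value for 95% confidence

n <- (z.crit / ME)^2 * p.hat * q.hat
n.required <- ceiling(n)       # round up to the nearest whole number
n.required                     # 1068
```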
