Confidence Intervals
Confidence intervals are a statistics concept that many people find difficult to
understand, yet we encounter them every day in polling data, drug test results, and
marketing surveys, to name only a few examples. It is important to understand the
meaning of these intervals and how they are used to influence and make decisions.
Imagine the large circle below represents the population, and each dot represents a
single value for that population. When we sample a number of values (“n” values) from
the population, we get a sample mean (or, alternatively, a sample proportion). Each
individual sample provides a point estimate, or single-number estimate, of the
population parameter. Since the individual values in a population vary, the sample
mean depends on which n values happen to be drawn, so it may or may not be close to
the population mean.
[Figure: a population (mean = μ, std dev = σ) with three samples drawn from it:
Sample 1 (mean = x̄1, std dev = σx1), Sample 2 (mean = x̄2, std dev = σx2), and
Sample 3 (mean = x̄3, std dev = σx3).]
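The idea that different samples from the same population give different sample means can be sketched in a few lines of Python. The population here (10,000 values with a mean near 100 and a standard deviation near 15) and the sample size of 25 are hypothetical choices made only for illustration; they do not come from the original text.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values centered near 100 (an assumption).
population = [rng.gauss(100, 15) for _ in range(10_000)]
mu = mean(population)  # the fixed population mean

def sample_mean(n):
    """Draw n values without replacement and return their mean (a point estimate)."""
    return mean(rng.sample(population, n))

# Three separate samples of n = 25 give three different point estimates.
sample_means = [sample_mean(25) for _ in range(3)]

print(f"population mean: {mu:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f}")
```

Each run of `sample_mean` plays the role of one of the samples in the figure: the population mean stays fixed while the point estimate moves.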
We can use the sample distribution from a study to construct a confidence interval.
We express statistical certainty by giving a confidence level for the interval. Usually we
talk about a 90% confidence interval (CI), a 95% CI, a 98% CI, or a 99% CI. The
confidence level indicates how likely the confidence interval is to contain the
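population parameter. What that means is, if you were to repeat the study the way it
was done many times, about that percentage of the resulting intervals would contain
the population parameter.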
A 90% confidence level means that 90% of the time the interval we construct around a
sample mean or proportion will include (or capture) the true population parameter. It
does NOT mean that there is a 90% probability that the population parameter lies within
one particular interval we have already calculated. This is critical to understand.
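The "90% of the time" reading can be checked with a small simulation. This is a minimal sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 50 and standard deviation of 10, and the standard deviation is treated as known so that a z critical value can be used. None of these numbers come from the original text.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu=50.0, sigma=10.0, n=30, level=0.90, trials=2000, seed=1):
    """Fraction of simulated intervals that capture the fixed population mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z_star * sigma / n ** 0.5          # sigma assumed known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # The parameter mu never moves; only the interval changes per sample.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"share of intervals that captured mu: {rate:.3f}")
```

The printed share should land close to 0.90. Any single interval either contains the population mean or it does not; the 90% describes the procedure over many repetitions.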
We can imagine the concept of confidence levels and intervals like a game of
horseshoes. The goal (post) is the population parameter. It is FIXED. What changes is
our horseshoe toss – how close we get to the post. Each horseshoe toss is like using
one sampling distribution and one confidence level to construct an interval that may or
may not include the population parameter. A higher confidence level is like making a
wider horseshoe, which makes it more likely that you capture the post (or true value) in
a toss. So it makes sense that a higher confidence level would result in a larger RANGE
of values in the confidence interval.
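The link between a higher confidence level and a wider interval shows up directly in the critical values. The sketch below uses z critical values from Python's standard library for simplicity; the helper name `z_critical` is introduced here for illustration only.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided critical z value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)

levels = [0.90, 0.95, 0.98, 0.99]
z_values = {level: z_critical(level) for level in levels}

for level, z in z_values.items():
    print(f"{level:.0%} confidence -> z* = {z:.3f}")
```

Because the critical value multiplies the same spread estimate, a larger critical value gives a proportionally wider interval: the "wider horseshoe" of the analogy.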
[Figure: the population parameter (μ or p) shown as a fixed point, with confidence
intervals constructed from sample 1 and sample 2; the half-width (HW) of an interval
is labeled.]
The point estimate comes from the sample distribution – either a sample mean or
sample proportion. The critical value is determined by the confidence level. Each
confidence level corresponds to a particular critical t-score (if we’re estimating the
population mean) or a critical z-score (if we’re estimating the population
proportion). The quantity after the ± sign is called the half-width (HW) because it is
exactly one half of the width of the confidence interval. When the half-width is added
to and subtracted from the point estimate, we get the upper and lower limits of the
confidence interval.
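As a summary of the description above, the general structure of a confidence interval can be written as follows. The term "standard error" (the estimated spread of the point estimate) does not appear in the original text and is used here as an assumed label for the factor that the critical value multiplies.

```latex
\[
\text{confidence interval}
  = \text{point estimate}
    \pm \underbrace{(\text{critical value}) \times (\text{standard error})}_{\text{half-width (HW)}}
\]
```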
Example 1: You are trying to estimate the proportion of VCC students who own iPhones.
A random sample of 100 students reveals that 35 of them own an iPhone. Estimate the
percentage of all VCC students who have iPhones with 98% confidence.
Solution: This question involves a sample PROPORTION. First we have to make sure
the sample size is less than 5% of the population in order to use the binomial
distribution, since sampling is done without replacement.
$100 / 0.05 = 2{,}000$
Since the student population is greater than 2,000, our sample is less than 5% of the
population and we can use the binomial distribution.
Also, $n\hat{p}$ and $n\hat{q}$ (where $\hat{q} = 1 - \hat{p}$) must be ≥ 10 in order to
approximate the binomial distribution with the normal distribution:
$n\hat{p} = 100 \times 0.35 = 35$
$n\hat{q} = 100 \times 0.65 = 65$
They are both greater than 10, so we can proceed.
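Using the table, a 98% confidence level gives a z* of 2.326. With $\hat{p} = 0.35$ and
$\hat{q} = 1 - \hat{p} = 0.65$:
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}\,\hat{q}}{n}} = 0.35 \pm 2.326 \sqrt{\frac{(0.35)(0.65)}{100}} = 0.35 \pm 0.111$$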
The confidence interval (0.24, 0.46) captures the true proportion of all VCC students
who own an iPhone, with 98% confidence.
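The arithmetic of Example 1 can be reproduced with a short Python sketch. The function name `proportion_ci` is hypothetical, and the critical value is computed from the standard normal distribution rather than read from a table, which gives the same 2.326 to three decimals.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, level):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Same check as in the worked solution: both counts must be at least 10.
    if n * p_hat < 10 or n * q_hat < 10:
        raise ValueError("normal approximation conditions not met")
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_star * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width

low, high = proportion_ci(35, 100, 0.98)
print(f"({low:.2f}, {high:.2f})")  # prints (0.24, 0.46)
```

The sketch does not check the 5% population condition, since the population size is not given in the example.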
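We use the sampling distribution of $\bar{x}$ to build a confidence interval for the
population mean, μ. After checking that the sample values are approximately normally
distributed, we can proceed to calculating the confidence interval:
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$
where $s$ is the sample standard deviation and $t^*$ is the critical t-score with $n - 1$
degrees of freedom.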
Example 2: The average spending for a gym membership by a random sample of 150
students at a university is $125, with a standard deviation of $30. Construct a 95%
confidence interval estimate for the average expense of all university students on gym
membership. You may assume that the membership costs are normally distributed.
The confidence interval is ($120.16, $129.84) or the range from $120.16 to $129.84.
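The original shows only the final interval for Example 2, so the following is a sketch of one way to arrive at it, not necessarily the working the author used. It assumes a critical t-score of about 1.976 for 95% confidence with 149 degrees of freedom; that table value is an assumption introduced here, chosen because it reproduces the stated interval. The function name `mean_ci` is hypothetical.

```python
from math import sqrt

def mean_ci(x_bar, s, n, t_star):
    """Confidence interval for a mean: x_bar +/- t_star * s / sqrt(n)."""
    half_width = t_star * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Assumed table value: two-sided 95% critical t-score for n - 1 = 149 df.
T_STAR_95_DF149 = 1.976

low, high = mean_ci(x_bar=125, s=30, n=150, t_star=T_STAR_95_DF149)
print(f"(${low:.2f}, ${high:.2f})")  # prints ($120.16, $129.84)
```

With a sample this large, the z critical value of 1.960 would give a slightly narrower interval, roughly ($120.20, $129.80), so the stated answer is consistent with using the t-score.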
Exercises
1. At 95% confidence level, half-width is 0.077 (or 7.7 percentage points). At 99%
confidence level, half-width is 0.101 (or 10.1 percentage points).
2. $45,000 ± 1,924.48
3. 74.3 ± 4.1 % of all customers, or (70.2%, 78.4%)
4. ($950,939.91, $999,060.09)
5. 48.5 ± 1.2 hours per week, or (47.3, 49.7)
6. 271 people
7. 362 residents