Where do we start?
A good place to start would be to take a simple random sample (SRS) of n lengths of wire rope and
record the weight each can hold before it breaks. Compute the sample average of the breaking weights,
and let this be our estimate of the true mean breaking weight for wire rope made under your
manufacturing process. (This is a point estimate.)
A better idea is to give an interval estimate for the true population mean.
Definitions
The point estimate of a parameter is the value of the statistic that estimates the value of
the parameter. (e.g. $\bar{x}$ is a point estimate for $\mu$)
An interval estimate of a parameter is an interval of numbers used to estimate the value
of a population parameter.
The level of confidence is the probability that represents the percentage of intervals that
will contain the parameter if a large number of repeated samples are obtained.
Denoted $(1-\alpha)100\%$
If $\alpha = 0.05$ then $(1-\alpha)100\% = 95\%$
We call $\alpha$ the significance level; note $0 < \alpha < 1$, usually small values (0.1, 0.05, 0.01).
A $(1-\alpha)100\%$ confidence interval for a parameter (or function of one or more parameters)
is a data-based interval of numbers thought likely to contain the parameter and possessing a
stated probability-based confidence or reliability.
What is the (approximate) probability that the sample average breaking weight of our rope
will be within 1.645 of its standard deviations of the true average breaking weight?
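One way to check this question numerically is sketched below. It assumes the sample average is (at least approximately) normally distributed, so the question reduces to finding $P(-1.645 < Z < 1.645)$ for a standard normal Z. The variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution Z ~ N(0, 1)
Z = NormalDist(mu=0.0, sigma=1.0)

# P(-1.645 < Z < 1.645): the chance that the sample average lands within
# 1.645 of its standard deviations of the true average
prob_within = Z.cdf(1.645) - Z.cdf(-1.645)
print(round(prob_within, 3))
```

The result is approximately 0.90, which is why 1.645 shows up later as the multiplier for 90% confidence.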
General Setup
Start by assuming X follows a normal distribution. For $X_i \overset{iid}{\sim} N(\mu, \sigma^2)$, we know
$\bar{X}_n \sim N(\mu, \sigma^2/n)$ exactly.
Example Interval
We want to find an interval that has a probability of 0.7.
The z-scores -1.04 and 1.04 cut off 0.15 in each tail of the standard normal distribution.
$P(-1.04 < Z < 1.04) = 0.7$, thus $P\left(\mu - 1.04\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.04\frac{\sigma}{\sqrt{n}}\right) = 0.7$
Draw a picture to represent this:
We can manipulate the inequality in the above probability statement so that it is centered around
$\mu$.
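One way to carry out that manipulation is sketched here: subtract $\mu$ from every part, then subtract $\bar{X}$, then multiply through by -1 (which flips the inequalities).

```latex
\begin{align*}
0.7 &= P\left(\mu - 1.04\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.04\frac{\sigma}{\sqrt{n}}\right) \\
    &= P\left(-1.04\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.04\frac{\sigma}{\sqrt{n}}\right) \\
    &= P\left(-\bar{X} - 1.04\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{X} + 1.04\frac{\sigma}{\sqrt{n}}\right) \\
    &= P\left(\bar{X} - 1.04\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.04\frac{\sigma}{\sqrt{n}}\right)
\end{align*}
```

The random quantity is still $\bar{X}$; the last line just describes the same event in terms of an interval built around the sample mean.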
General Formula
How do we develop a generic formula for a Large-n CI for $\mu$ if $\sigma^2$ is known?
Definition: we define the notation $z_{1-\alpha/2}$ as the z-value (from a N(0,1) distribution) that has a
probability of $(1-\alpha/2)$ less than it. i.e. $P[Z < z_{1-\alpha/2}] = 1 - \frac{\alpha}{2}$ and $P[Z < -z_{1-\alpha/2}] = \frac{\alpha}{2}$
We want $(1-\alpha)100\%$ confidence level.
Set up: $P[-z_{1-\alpha/2} < Z < z_{1-\alpha/2}] = 1 - \alpha$
Large-n CI for $\mu$ with known variance $\sigma^2$: If $X_1, X_2, \ldots, X_n$ are iid with $E[X_i] = \mu$ and
$Var(X_i) = \sigma^2$, where $\sigma^2$ is known, then for large enough n (i.e. $n \ge 25$),
$\bar{X}_n \pm z_{1-\alpha/2} \frac{\sigma}{\sqrt{n}}$
is a $(1-\alpha)100\%$ confidence interval for $\mu$.
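The formula above can be sketched as a short function. The function name `z_interval` and the numbers passed to it are hypothetical, chosen only to illustrate the calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf_level=0.95):
    """Large-n CI for the mean when the population sd (sigma) is known."""
    alpha = 1.0 - conf_level
    # z_{1 - alpha/2}: the z-value with probability 1 - alpha/2 below it
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical numbers for illustration only
lower, upper = z_interval(xbar=50.0, sigma=4.0, n=64, conf_level=0.95)
print(f"({lower:.2f}, {upper:.2f})")
```

With these made-up inputs the margin is about 1.96 * 4 / 8 = 0.98, so the interval is roughly (49.02, 50.98).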
Question: What affects the location and width of a CI for $\mu$?
As $n$ increases, the width of the CI decreases.
As $\sigma$ increases, the width of the CI increases.
As $(1-\alpha)$ increases, the width of the CI increases.
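These three effects follow from the width of the interval, $2 z_{1-\alpha/2}\,\sigma/\sqrt{n}$, while the location is set by the sample mean alone. A small sketch comparing widths is below; the function name `ci_width` and all inputs are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, conf_level):
    """Width of the large-n CI: 2 * z_{1 - alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    return 2.0 * z * sigma / sqrt(n)

# Hypothetical baseline, then change one ingredient at a time
base = ci_width(sigma=2.0, n=25, conf_level=0.95)
more_data = ci_width(sigma=2.0, n=100, conf_level=0.95)        # larger n
more_spread = ci_width(sigma=4.0, n=25, conf_level=0.95)       # larger sigma
more_confidence = ci_width(sigma=2.0, n=25, conf_level=0.99)   # larger 1 - alpha

print(base, more_data, more_spread, more_confidence)
```

Quadrupling n halves the width, because n enters through its square root.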
There are confidence levels for CIs that are more common. Here is a table of those values along
with the associated z-value.
Confidence          80%    90%     95%    99%
$z_{1-\alpha/2}$    1.28   1.645   1.96   2.58
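The tabled z-values can be reproduced from the standard normal distribution. The sketch below uses Python's standard library; the dictionary name `z_values` is illustrative.

```python
from statistics import NormalDist

z_values = {}
for conf_level in (0.80, 0.90, 0.95, 0.99):
    alpha = 1.0 - conf_level
    # z-value with probability 1 - alpha/2 below it
    z_values[conf_level] = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    print(f"{conf_level:.0%}  z = {z_values[conf_level]:.3f}")
```

The computed values agree with the table after rounding.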
We started by assuming that X follows the normal distribution. What if X follows some other
distribution, or we don't know what distribution it follows?
For large n ($n \ge 25$), by the CLT, $\bar{X}_n \sim N(\mu, \sigma^2/n)$ approximately.
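A quick simulation can make the CLT statement concrete. The sketch below draws repeated samples from a skewed distribution (an Exponential with mean 1 and variance 1, chosen here only as an example) and checks that the sample means have mean close to $\mu$ and variance close to $\sigma^2/n$. The seed, sample size, and number of repetitions are arbitrary choices.

```python
import random
from statistics import mean, variance

random.seed(0)  # fixed seed so the sketch is repeatable

n = 30        # observations per sample
reps = 5000   # number of repeated samples

# Exponential(rate=1) has mean 1 and variance 1, and is strongly skewed
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

print("mean of sample means:    ", mean(sample_means))
print("variance of sample means:", variance(sample_means))
print("sigma^2 / n:             ", 1.0 / n)
```

A histogram of `sample_means` would also look roughly bell-shaped, even though the individual observations do not.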
Question: What does it mean to be $(1-\alpha)100\%$ confident?
Answer: If we repeat this methodology many, many times, then about $(1-\alpha)100\%$ of the confidence
intervals will contain the true parameter (or function of parameters).
However, once the confidence interval is constructed, we CAN NOT SAY that there is a
PROBABILITY OR CHANCE that the true value of the population parameter (or function
of parameters) lies inside the confidence interval.
$P[a < \mu < b] = 1 - \alpha$; but here, a is a constant, b is a constant and $\mu$ is a constant, so this
statement DOESN'T MAKE ANY SENSE!!
Once the confidence interval for a mean has been constructed, the true value for the population
mean either IS in the interval or IS NOT in the interval. So we can only say that we are
CONFIDENT the interval contains the truth.
$(1-\alpha)100\%$ confidence refers to the method, NOT the particular result.
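The idea that confidence describes the method can be illustrated by simulation: build many intervals from fresh samples and count how often they capture the truth. The sketch below assumes normal data with made-up "true" values for $\mu$ and $\sigma$; the seed and number of repetitions are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is repeatable

mu, sigma, n = 10.0, 0.5, 30   # hypothetical "true" values
conf_level = 0.95
z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
margin = z * sigma / sqrt(n)

reps = 2000
hits = 0
for _ in range(reps):
    # Draw a new sample and build its confidence interval
    xbar = mean([random.gauss(mu, sigma) for _ in range(n)])
    if xbar - margin < mu < xbar + margin:
        hits += 1

coverage = hits / reps
print(coverage)
```

The proportion of intervals that contain $\mu$ should come out near 0.95. Any single interval, though, either contains $\mu$ or it does not.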
When we calculate a single confidence interval, the correct interpretation is:
We are $(1-\alpha)100\%$ confident that the true population mean is between a and b. (Of course,
put it into context.)
Note the difference in the two statements:
What does it mean to be $(1-\alpha)100\%$ confident?
This refers to the methodology.
How can we interpret a single confidence interval?
This needs to be present after EVERY confidence interval that you ever compute.
Example 1
Suppose we know that the standard deviation for the weight of a bag of M&Ms (labeled 10 oz.)
is 0.5 ounces. If we randomly sampled 30 bags, which resulted in a sample mean weight of 10.4
ounces, what is our 95% confidence interval for the true mean weight of all 10 oz. bags of M&Ms?
How do we interpret our confidence interval?
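A worked sketch of one way to answer, using the large-n formula with known $\sigma = 0.5$, $n = 30$, $\bar{x} = 10.4$ and $z_{1-\alpha/2} = 1.96$ for 95% confidence:

```latex
\begin{align*}
\bar{x} \pm z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}
  &= 10.4 \pm 1.96 \cdot \frac{0.5}{\sqrt{30}} \\
  &\approx 10.4 \pm 0.179 \\
  &= (10.221,\ 10.579)
\end{align*}
```

Interpretation: we are 95% confident that the true mean weight of all 10 oz. bags of M&Ms is between about 10.22 and 10.58 ounces.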
Find how many bags we need to sample to create a 95% CI that has a width of 0.1 ounces.
In general, to find the required sample size for a given confidence level:
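One way to get the sample size: the width of the interval is $w = 2 z_{1-\alpha/2}\,\sigma/\sqrt{n}$, so solving for n gives $n = (2 z_{1-\alpha/2}\,\sigma / w)^2$, rounded up to the next whole number. The sketch below applies this to the M&M question; the function name `required_n` is illustrative.

```python
from math import ceil
from statistics import NormalDist

def required_n(sigma, width, conf_level):
    """Smallest n whose large-n CI (known sigma) is no wider than `width`."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    # Solve width = 2 * z * sigma / sqrt(n) for n, then round up
    return ceil((2.0 * z * sigma / width) ** 2)

n_needed = required_n(sigma=0.5, width=0.1, conf_level=0.95)
print(n_needed)
```

For the M&M example this gives 385 bags (the unrounded value is about 384.1, and we always round up so the interval is not wider than requested).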
Example 2
Ten simple random samples of 30 observations were drawn from a Normal(0,1) distribution. With
each sample, a 90% confidence interval was constructed. Here are the 10 intervals that were
constructed.
Lower    Upper
-0.26    0.34
-0.26    0.34
-0.12    0.48
-0.39    0.21
-0.42    0.19
-0.23    0.37
-0.33    0.27
-0.13    0.47
-0.27    0.33
-0.23    0.38
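Since the samples came from a Normal(0,1) distribution, the true mean is 0, and we can count how many of the ten intervals contain it. With 90% confidence we would expect about 9 of 10 in the long run, but any particular set of ten can differ. A minimal sketch of the count:

```python
# The ten 90% confidence intervals listed above, as (lower, upper) pairs
intervals = [
    (-0.26, 0.34), (-0.26, 0.34), (-0.12, 0.48), (-0.39, 0.21), (-0.42, 0.19),
    (-0.23, 0.37), (-0.33, 0.27), (-0.13, 0.47), (-0.27, 0.33), (-0.23, 0.38),
]

true_mean = 0.0  # the samples were drawn from Normal(0, 1)

# True for each interval that contains the true mean
covered = [lower < true_mean < upper for lower, upper in intervals]
num_covered = sum(covered)
print(num_covered, "of", len(intervals), "intervals contain the true mean")
```

Here all ten intervals happen to contain 0, which is consistent with a method that succeeds about 90% of the time.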
Example 1 (cont)
Suppose the sample variance of the 10 oz. M&M bags is 1.0 square ounces. Find a 90% confidence interval
for the true mean weight of all 10 oz. bags of M&Ms.
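A sketch of one possible approach, under stated assumptions: the sample size $n = 30$ and sample mean $\bar{x} = 10.4$ carry over from Example 1, and because n is large the sample standard deviation $s = \sqrt{1.0} = 1.0$ is substituted for $\sigma$ in the large-n formula, with $z_{1-\alpha/2} = 1.645$ for 90% confidence.

```latex
\begin{align*}
\bar{x} \pm z_{1-\alpha/2}\frac{s}{\sqrt{n}}
  &= 10.4 \pm 1.645 \cdot \frac{1.0}{\sqrt{30}} \\
  &\approx 10.4 \pm 0.300 \\
  &= (10.100,\ 10.700)
\end{align*}
```

Under these assumptions, we are 90% confident that the true mean weight of all 10 oz. bags of M&Ms is between about 10.10 and 10.70 ounces. If the course intends a t-based interval here instead, the result would be slightly wider.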