Carrie Madden
Probability Distribution
Definition
The probability distribution of a random variable X tells us what values X
can take and how to assign probabilities to those values.
Motivation
What would happen if we repeatedly took samples of the same size, n,
from the population and calculated the sample mean, x̄, each time?
Keep doing this and after you have a sufficiently large number of sample
means, plot their values.
Example
The label on a can of Pepsi says the can contains 355 ml. Suppose that in
fact, the fill volumes of cans of Pepsi follow a normal distribution with
mean 355 ml and standard deviation 2 ml.
set.seed(1)
x<-rnorm(1,355,2)
for(i in 1:1000)
{
x<-c(x,rnorm(1, 355, 2))
}
mean(x)
## [1] 354.979
sd(x)
## [1] 2.070066
Say, we take one sample of 100 cans and calculate the mean fill level x̄ .
We then take a second sample of 100 cans and calculate the mean of that
sample. We repeat this for a large number of samples of size 100, then
make a histogram of all the x̄ ’s we made.
This is what the R code would look like:
set.seed(1)
xbar<-mean(rnorm(100,355,2))
for(i in 1:1000)
{
xbar<-c(xbar,mean(rnorm(100, 355, 2)))
}
[Figure: histogram of the simulated sample means (xbars), centred near µ = 355]
mean(xbar)
## [1] 354.9955
sd(xbar)
## [1] 0.1933137
Sampling Distribution
Note
The mean of the sampling distribution of X̄ is equal to the mean of the
population distribution of X , but the standard deviation is lower. This is
due to the fact that averages are less variable than individual observations.
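The note above can be checked numerically. The following is a minimal sketch, assuming the Pepsi model from the earlier example (normal, mean 355 ml, standard deviation 2 ml) and samples of size 100. The names `se_theory` and `xbars` are chosen here for illustration, and `replicate` is used in place of the loop shown earlier.

```r
set.seed(1)
mu <- 355; sigma <- 2; n <- 100

# Theoretical standard deviation of the sample mean: sigma / sqrt(n)
se_theory <- sigma / sqrt(n)

# Simulate 1000 sample means, each computed from a sample of size n
xbars <- replicate(1000, mean(rnorm(n, mu, sigma)))

se_theory    # 0.2
mean(xbars)  # should be close to mu
sd(xbars)    # should be close to se_theory, well below sigma
```

The simulated standard deviation of the sample means should land near σ/√n rather than near σ, which is the sense in which averages are less variable than individual observations.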
Recall: The label on a can of Pepsi says the can contains 355 ml.
Suppose that in fact, the fill volumes of cans of Pepsi follow a normal
distribution with mean 355 ml and standard deviation 2 ml.
Question
If you buy a six-pack of Pepsi, what is the probability that the mean
volume of Pepsi in the cans is greater than 356 ml?
Solution
$$P(\bar{X} > 356) = P\left(Z > \frac{356 - 355}{2/\sqrt{6}}\right) = P(Z > 1.22) = 1 - P(Z < 1.22) = 1 - 0.8888 = 0.1112$$
Instead of using our tables, we can use R to find the probability as well.
Since we are finding P(x̄ > 356) (the area to the right), we use 1 - pnorm(...)
in R. pnorm is the cumulative distribution function of the normal
distribution. This is for your information; the function simply gives a more
accurate value than the rounded values we see in the tables.
1-pnorm(356,355,2/sqrt(6))
## [1] 0.1103357
If the pulse rates of healthy adult women follow a normal distribution with
mean 74 beats per minute and standard deviation 12 beats per minute,
what is the probability that a simple random sample of 25 women has an
average pulse rate between 68 and 80 beats per minute?
So what we have here is X ∼ N(74, 12). In words, our random variable X
(pulse rate) follows a normal distribution with mean µ = 74 and
standard deviation σ = 12.
We are looking for P(68 ≤ x̄ ≤ 80)
$$\bar{X} \sim N\left(74, \frac{12}{\sqrt{25}}\right)$$
$$P(68 < \bar{X} < 80) = P\left(\frac{68 - 74}{12/\sqrt{25}} < Z < \frac{80 - 74}{12/\sqrt{25}}\right) = P(-2.50 < Z < 2.50)$$
$$= P(Z < 2.50) - P(Z < -2.50) = 0.9938 - 0.0062 = 0.9876$$
## [1] 0.9875807
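The R call that produced the value above is not shown. A minimal sketch that reproduces it, assuming the same `pnorm(value, mean, sd)` pattern used in the Pepsi example (the names `se` and `p` are illustrative):

```r
# Standard deviation of the sample mean for n = 25 women
se <- 12 / sqrt(25)

# P(68 < Xbar < 80) = P(Xbar < 80) - P(Xbar < 68)
p <- pnorm(80, mean = 74, sd = se) - pnorm(68, mean = 74, sd = se)
p
```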
Lightbulb Example
[Figure: histogram titled “Lifetime of Lightbulbs”]
xbar<-mean(rexp(2,rate=(1/500)))
for(i in 1:1000)
{
xbar<-c(xbar,mean(rexp(2,rate=(1/500))))
}
mean(xbar)
## [1] 500.8593
sd(xbar)
## [1] 366.4374
xbar<-mean(rexp(5,rate=(1/500)))
for(i in 1:1000)
{
xbar<-c(xbar,mean(rexp(5,rate=(1/500))))
}
mean(xbar)
## [1] 496.9203
sd(xbar)
## [1] 232.9599
xbar<-mean(rexp(50,rate=(1/500)))
for(i in 1:1000)
{
xbar<-c(xbar,mean(rexp(50,rate=(1/500))))
}
You can see what is happening to the shape of the distribution as the
sample size increases.
We expect the mean to be 500 and the standard deviation to be $500/\sqrt{50} = 70.71$.
mean(xbar)
## [1] 500.1796
sd(xbar)
## [1] 72.23384
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$
Lightbulb Example
Recall the example for the lifetimes of lightbulbs: lifetimes followed
a right-skewed distribution with µ = 500 and σ = 500.
Example
What is the probability that a random sample of 40 light bulbs has a
mean lifetime greater than 600 hours?
Even though the distribution of lifetimes is not normal, the sample size
(here n = 40) is large enough that we can use the Central Limit Theorem
to calculate this probability, since the distribution of X̄ is approximately
normal.
Solution
The probability that the mean lifetime of the 40 bulbs exceeds 600 hours is
approximately:
$$P(\bar{X} > 600) \approx P\left(Z > \frac{600 - 500}{500/\sqrt{40}}\right) = P(Z > 1.26) = 1 - P(Z < 1.26) = 1 - 0.8962 = 0.1038$$
1-pnorm(600,500,500/sqrt(40))
## [1] 0.1029516
In summary:
Statistical Inference
We get data from a sample, but we are often not satisfied with
information just about the sample itself. We would like to use this sample
data to infer something about the population of interest.
Statistical inference provides methods for drawing conclusions about a
population from sample data.
Statistical Inference
We can never be sure that our sample data fairly represent the population.
In order to quantify this uncertainty, in statistical inference we use the
language of probability.
Like probability, the foundation of inference lies on predictable long-run
behaviour.
By taking a “good” sample (i.e., an SRS), we can draw conclusions with a high
probability of being correct.
Statistical Inference
Suppose a random variable X follows a normal distribution with mean µ
and standard deviation σ. We take a simple random sample of n
individuals from the population. Recall that a level C confidence interval
for the population mean µ is calculated as
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$
where:
x̄ is the estimate
z* is the critical value such that
$$P(-z^* \le Z \le z^*) = C$$
Statistical Inference
The confidence interval is the middle area. For example, a 95% confidence
interval would have the shaded area as 0.95, with 0.025 area in either tail.
This area falls between z = −1.96 and z = 1.96.
[Figure: standard normal density curve, y against x from −4 to 4, with the middle 95% shaded]
Carrie Madden, STAT 2000 – Unit 1
Unit 1 – Inference for the Mean of a Single Population
Statistical Inference
For interest only, this is the code used to produce the picture on the
previous slide:
x=seq(-4,4,length=200)
y=dnorm(x,mean=0,sd=1)
plot(x,y,type="l")
x=seq(-1.96,1.96,length=100)
y=dnorm(x,mean=0,sd=1)
polygon(c(-1.96,x,1.96),c(0,y,0),col="blue")
Example
The sentence times for criminals convicted of a particular crime are known
to follow a normal distribution with standard deviation σ = 25.8 months.
The sentences (in months) of a random sample of ten criminals convicted
of this crime are shown below:
136 102 84 150 115
98 125 176 120 74
Calculate a 95% confidence interval for the true mean sentence time for all
criminals convicted of this crime.
Solution
A 95% confidence interval for the true mean sentence time for all criminals
convicted of this crime is
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 118.0 \pm 1.96\left(\frac{25.8}{\sqrt{10}}\right) = 118.0 \pm 16.0 = (102.0, 134.0)$$
where z ∗ = 1.96 is the upper 0.025 critical value from the standard normal
distribution, i.e.,
P (−1.96 ≤ Z ≤ 1.96) = 0.95
Critical Values
Critical values, z ∗ for some common confidence levels (90%, 95%, 99%,
etc.) can be found in the last row of Table 3. We interpret the confidence
interval as follows:
Interpretation of Confidence Interval
If we took repeated samples of ten criminals and calculated the interval in a
similar manner, then 95% of such intervals would contain the true mean
sentence time for all criminals convicted of this crime.
## [1] 1.959964
#or
## [1] -1.959964
We use the qnorm function. Since the middle area is 0.95, a probability of
0.025 is left in either tail; that is the reason we divide 0.05 by 2. Setting
the lower-tail argument to false returns the upper (positive) critical value,
while leaving it at its default returns the negative one.
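The calls that produced the two critical values are not shown. A sketch consistent with the description above, where `conf`, `alpha`, `zstar_upper` and `zstar_lower` are names introduced here for illustration:

```r
conf  <- 0.95
alpha <- 1 - conf   # total area in the two tails

# Upper critical value: area alpha/2 to the right
zstar_upper <- qnorm(alpha / 2, lower.tail = FALSE)

# Lower critical value: area alpha/2 to the left (default lower.tail = TRUE)
zstar_lower <- qnorm(alpha / 2)

zstar_upper
zstar_lower
```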
Confidence Intervals in R
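x<-c(136, 102, 84, 150, 115, 98, 125, 176, 120, 74)
xbar <- mean(x)
n <- length(x)
sigma <- 25.8
#from the question this is the population standard deviation
#95% CI: the area to the left of z* is 0.95+0.025=0.975
zstar <- qnorm(0.975)
moe <- zstar*(sigma/sqrt(n))
left <- xbar-moe
right <- xbar+moe
left
## [1] 102.0093
right
## [1] 133.9907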
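The long-run interpretation of the confidence interval can be illustrated by simulation. This is a sketch under stated assumptions: the true mean `mu <- 118` is a made-up value chosen only so that samples can be generated, and `covered` is an illustrative name.

```r
set.seed(1)
mu <- 118       # hypothetical "true" mean, assumed for the simulation only
sigma <- 25.8
n <- 10
zstar <- qnorm(0.975)
moe <- zstar * sigma / sqrt(n)

# For each of 10000 simulated samples, record whether the interval contains mu
covered <- replicate(10000, {
  xbar <- mean(rnorm(n, mu, sigma))
  (xbar - moe <= mu) && (mu <= xbar + moe)
})

mean(covered)   # proportion of intervals containing mu; expected near 0.95
```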
Practice Question
The manager at a grocery store would like to estimate the true mean
amount of money spent by customers in the express lane. She selects a
simple random sample of 50 receipts and calculates a 97% confidence
interval for µ to be ($15.50, $20.25). The confidence interval can be
interpreted as, in the long run,
A 97% of similarly constructed intervals would contain the population
mean.
B 97% of similarly constructed intervals would contain the sample mean.
C 97% of all customers in the express lane spend between $15.50 and
$20.25.
D 97% of samples of 50 customers will have means between $15.50 and
$20.25.
E 97% of customers who are spending between $15.50 and $20.25 use
the express lane.
Example
Suppose we wish to find a 92% confidence interval for µ:
We find the value z ∗ such that
P (−z ∗ ≤ Z ≤ z ∗ ) = 0.92
Therefore,
P (Z ≤ z ∗ ) = 0.92 + 0.04 = 0.96
From Table 2, this corresponds to the critical value z* = 1.75. Our 92%
confidence interval is therefore:
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 118.0 \pm 1.75\left(\frac{25.8}{\sqrt{10}}\right) = 118.0 \pm 14.3 = (103.7, 132.3)$$
## [1] 1.750686
#or
## [1] -1.750686
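The two values above are presumably the output of `qnorm` calls that were not captured. A sketch that reproduces them, with `conf` and `zstar` as illustrative names:

```r
conf <- 0.92

# Area to the left of z* is 0.92 + 0.04 = 0.96
zstar <- qnorm(conf + (1 - conf) / 2)
zstar

# The negative critical value leaves 0.04 in the lower tail
qnorm((1 - conf) / 2)
```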
Confidence Intervals in R
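x<-c(136, 102, 84, 150, 115, 98, 125, 176, 120, 74)
xbar <- mean(x)
n <- length(x)
sigma <- 25.8
#from the question this is the population standard deviation
#92% CI: the area to the left of z* is 0.92+0.04=0.96
zstar <- qnorm(0.96)
moe <- zstar*(sigma/sqrt(n))
left <- xbar-moe
right <- xbar+moe
left
## [1] 103.7167
right
## [1] 132.2833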
Practice Question
Notice that as the confidence level increases, so too does the margin of
error (and hence the length of the confidence interval).
If we increase the confidence level, we must sacrifice our precision of
estimation. If we want to be more sure that our interval contains µ, we
have to expand the interval!
We would ideally like to use a high confidence level and obtain a narrow
confidence interval, but we have seen that there is a trade-off between the
confidence level and the margin of error.
How can we reduce the length of the interval without lowering the
confidence level? One way is to increase the sample size. For example, with
n = 40 instead of n = 10 (and the same x̄ = 118.0):
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 118.0 \pm 1.96\left(\frac{25.8}{\sqrt{40}}\right) = 118.0 \pm 8.0 = (110.0, 126.0)$$
R practice
To practice on your own, go back through and find the R code to produce
this confidence interval.
Note
You will not have to start by entering the vector, since we have increased
the observations to 40, and x̄ = 118 (still the same)
n = 10: (102.0, 134.0)
n = 40: (110.0, 126.0)
Notice that a higher sample size results in a lower margin of error (and
hence a narrower confidence interval). In fact, we see that taking a sample
size that is four times greater results in a margin of error only half as large.
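The effect of the sample size on the margin of error can be sketched in R. This assumes the sentence-time example (σ = 25.8, x̄ = 118) with 95% confidence; the helper names `moe` and `ci` are hypothetical and not part of the original slides.

```r
sigma <- 25.8
xbar  <- 118
zstar <- qnorm(0.975)

# Margin of error and confidence interval as functions of the sample size
moe <- function(n) zstar * sigma / sqrt(n)
ci  <- function(n) c(left = xbar - moe(n), right = xbar + moe(n))

ci(10)
ci(40)
moe(40) / moe(10)   # quadrupling n halves the margin of error
```

Because n enters only through √n, the ratio of the two margins of error depends on nothing but the ratio of the sample sizes.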
Practice Question
Example
Suppose it is known that the nicotine content of a certain brand of
cigarettes follows a normal distribution with standard deviation 0.1 mg.
We would like to take a sample of cigarettes large enough to estimate the
true mean nicotine content to within 0.04 mg with 98% confidence. How
many cigarettes do we need to sample in order to achieve this?
Solution
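We know that the margin of error is
$$m = z^* \frac{\sigma}{\sqrt{n}}$$
Therefore,
$$n = \left(\frac{z^* \sigma}{m}\right)^2 = \left(\frac{2.326(0.1)}{0.04}\right)^2 = 33.81 \approx 34$$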
Example – R Code
zstar<-qnorm(0.99) #98% confidence leaves 0.01 in each tail
sigma<-0.1
moe<-0.04
n<-(((zstar*sigma)/moe)^2)
n
## [1] 33.82434
Example
Now suppose that we decide that a margin of error of 0.04 mg is too large
and we would like to estimate the true mean nicotine content to within
0.02 mg with 98% confidence (i.e., cut the margin of error in half).
Solution
We require a sample size of
$$n = \left(\frac{z^* \sigma}{m}\right)^2 = \left(\frac{2.326(0.1)}{0.02}\right)^2 = 135.26 \approx 136$$
For practice, try to reproduce the R code for the above example and see if
you get the same answer.
Notice that when we cut the margin of error in half, we require four times
the sample size.
Note
In general, if we want to reduce the margin of error by a factor of k, we
need a sample that is k² times as large.
If we want to reduce the margin of error to one third of its original value,
we need nine times as many individuals in our sample, etc.
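The sample size calculation can be wrapped in a small helper. This is a sketch only: `sample_size` is a hypothetical function name, and it assumes the result is always rounded up, as in the hand calculations above.

```r
# Required n to estimate mu to within margin m at confidence level conf,
# assuming a known population standard deviation sigma
sample_size <- function(conf, sigma, m) {
  zstar <- qnorm(1 - (1 - conf) / 2)
  ceiling((zstar * sigma / m)^2)   # always round up
}

sample_size(0.98, 0.1, 0.04)   # nicotine example
sample_size(0.98, 0.1, 0.02)   # margin of error cut in half
```

Before rounding, halving m multiplies the required n by exactly four; after rounding up, the ratio is only approximately four.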
Practice Question
We would like to estimate the true mean number of hours adults sleep at
night. Suppose that sleep time is known to follow a normal distribution
with standard deviation 1.5 hours. What sample size is required in order to
estimate the true mean to within 0.5 hours with 96% confidence?
A 19
B 24
C 38
D 47
E 53
Practice Question
A real estate agent would like to estimate the true mean value of all
houses in Winnipeg. She calculates that, in order to estimate the true
mean to within $10,000 with 95% confidence, she needs to select a sample
of 90 houses in Winnipeg. What sample size would be required to estimate
the true mean value of all houses in Winnipeg to within $5,000 with 95%
confidence?
A 45
B 127
C 180
D 360
E 8100
Some Cautions
Our formula for the confidence interval holds only if the data were
collected using an SRS. There is no correct way to do proper
inference using data collected haphazardly. Good formulas cannot
rescue us from poor sampling methods.
Since the sample mean is strongly influenced by outliers, so too is the
confidence interval.
We are using the true population standard deviation σ in our
calculations. In practice, this is not a realistic assumption. We will
see a proper method for constructing confidence intervals when we
only have the sample standard deviation s. We make this
unreasonable assumption now to establish the framework for building
confidence intervals.
The margin of error covers only random sampling error. It does not
reflect any degree of undercoverage, nonresponse, or other forms of
bias.
Hypothesis Testing
In other words, we have some claim about the value of some population
parameter and we would like to determine whether there is evidence to
support this claim. We accomplish this by looking at sample data and
seeing if they are representative of the claim. We cannot prove that a
parameter has any particular value, so we try to reach our conclusions with
a high probability of being correct. We have some new vocabulary in
hypothesis testing, but the idea is a simple one:
“An outcome that would rarely occur if an assumption were true is good
evidence that the assumption is not true.”
Speeding Example
Example
The parent council at an elementary school appeals to the municipal
government to install a red light camera at a nearby intersection. The
council claims that the average speed of motorists at the intersection is
greater than the posted speed limit of 60 km/h. Suppose it is known that
speeds of vehicles at the intersection follow a normal distribution with
standard deviation of 15 km/h. A city worker is sent to measure the
speeds of a random sample of 50 motorists at the intersection. The
sample mean speed of the vehicles is 66 km/h. Is this enough evidence to
conclude that the true mean speed of all drivers at the intersection is
greater than 60 km/h? That is, should a red light camera be installed?
We cannot just say that 66 > 60, so “yes, the mean speed at the
intersection is above the speed limit”. We must ask:
If the true mean speed of motorists at the intersection really was 60 km/h,
how likely would it be to observe a sample mean as extreme as 66 km/h?
If the probability is low, then we can conclude that the mean speed really
is higher than the limit. If the probability is not sufficiently low, we have
no conclusive evidence to support the claim.
The first step is to assume that the true mean speed really is equal to 60
km/h; that is, we assume
µ = µ0 = 60
Now if this is true, what is the probability of observing a sample
mean at least as high as 66 km/h? We have the tools to find this
probability!
$$P(\bar{X} \ge 66) = P\left(Z \ge \frac{66 - 60}{15/\sqrt{50}}\right) = P(Z \ge 2.83) = 1 - P(Z < 2.83) = 1 - 0.9977 = 0.0023$$
xbar<-66
mu<-60
sigma<-15
n<-50
z<-((xbar-mu)/(sigma/sqrt(n)))
z
## [1] 2.828427
#P-value
1 - pnorm(z)
## [1] 0.002338867
All the previous code does is set the variables to the fixed values from the
problem, calculate the test statistic z, and then “call” z so that the test
statistic is printed. The P-value is then found with the pnorm function;
since we are looking for the area to the right, we subtract the result
from 1.
We conclude that, since this probability is so low, the true mean speed of
vehicles at the intersection really is above the posted limit. It is
possible, but very unlikely, that the true mean is as low as 60.
The probability of observing such a high mean sample speed given the
assumption that µ = 60 is small enough that we are willing to believe the
council’s claim. Based on these findings, the municipal government
decides to install a camera at the intersection.
Hypothesis Testing
For this reason, we should look at the distribution of the data prior to
conducting a test (especially if the sample size is low) to see if the
assumption of normality appears to be reasonable. As mentioned,
hypothesis tests have their own distinct vocabulary.
Because we are interested in the value of a parameter for the whole
population, we always express our statements of interest in terms of
population parameters.
Null Hypothesis
Alternative Hypothesis
The statement making the claim which we are trying to support is called
the alternative hypothesis, denoted Ha , which will always be expressed as
an inequality.
The null and alternative hypotheses are precise statements of what claims
we are testing. They are both always given in terms of the population
parameter µ. The hypotheses can be stated in either words or in symbols.
P-value
Note that we assume the null hypothesis is true and calculate the
probability of observing a value of the sample mean at least as extreme as
the one observed.
This probability is called the P-value of the test. The lower the P-value,
the less likely it would have been to observe a value of x̄ at least as
extreme as the one observed if H0 were true. In other words, the lower the
P-value, the stronger our evidence against the null hypothesis.
Level of Significance
How low must the P-value be before we are willing to reject the null
hypothesis in favour of the alternative claim? For example, if the P-value
is 0.08, do we consider this to be convincing enough evidence to say the
null hypothesis is false?
Prior to the test, we must choose a level of significance α, to which we
will compare the P-value.
As such, we can think of α as the maximum P-value for which the null
hypothesis will be rejected.
The most common values of α are 0.1, 0.05 and 0.01.
We will now conduct the formal hypothesis test for our speeding example.
1 Level of significance
Let α = 0.05
(We will be willing to conclude in favour of the council only if the
P-value is less than or equal to 0.05)
2 Hypotheses
H0 : µ = 60
Ha : µ > 60
3 Decision Rule
Reject H0 if the P-value ≤ α = 0.05.
(The decision rule is a precise statement of what must happen in order
for us to reject the null hypothesis.)
4 Test Statistic
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{66 - 60}{15/\sqrt{50}} = 2.83$$
(The test statistic is a measure of the compatibility between the null
hypothesis and our data.)
Note that this is calculated assuming the null hypothesis is true.
5 P-value
$$\text{P-value} = P(\bar{X} \ge 66 \mid \mu = 60) = P(Z \ge 2.83) = 1 - P(Z < 2.83) = 1 - 0.9977 = 0.0023$$
Since the P-value = 0.0023 < α = 0.05, we reject the null hypothesis in
favour of the alternative at the 5% level of significance.
6 Conclusion
There is sufficient statistical evidence to conclude that the true mean
speed of motorists at the intersection is greater than the posted limit
of 60 km/h.
Statistical Significance
Practice Question
Hypothesis Testing
Note that in the above case, we were interested in testing the claim that
the mean speed of vehicles at the intersection was greater than 60.
Let us now consider the case where we are interested in whether the
population mean for some variable is less than some specified value.
Example
The true mean drying time for a certain type of paint under specified test
conditions is known to be 75 minutes. Chemists have proposed a new
additive designed to decrease the mean drying time. Suppose it is known
that drying times follow a normal distribution with a standard deviation of
9 minutes. They test the new paint on a random sample of 36 specimens,
and find an average drying time of 73 minutes. Is this strong evidence to
suggest that the true mean drying time with the additive is less than 75
minutes?
H0 : µ = 75
Ha : µ < 75
3 Decision rule
Reject H0 if the P-value ≤ α = 0.01.
4 Test statistic
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{73 - 75}{9/\sqrt{36}} = -1.33$$
Since the P-value = 0.0918 > α = 0.01, we fail to reject the null
hypothesis.
6 Conclusion
We have insufficient statistical evidence that the true mean drying
time is lower with the additive.
pnorm(-1.33)
## [1] 0.09175914
Hypothesis Testing
Practice Question
It is known that the tar contents of cigarettes of a particular brand follow
a normal distribution with standard deviation 0.3 mg. The mean tar
content is supposed to be 14.1 mg per cigarette. However, changes in the
composition of the tobacco and changes in the processing methods
sometimes cause the mean tar content to shift. We take a random sample
of five cigarettes every hour and calculate x̄ = 14.4. We would like to
conduct a hypothesis test of H0 : µ = 14.1 vs. Ha : µ > 14.1. We find a
P-value of 0.0127. At a 5% level of significance, we should:
A fail to reject Ho because the P-value is 0.0253, which is less than 0.05.
B reject Ho because the P-value is 0.0127, which is less than 0.05.
C fail to reject Ho because the P-value is 0.0127, which is less than 0.05.
D reject Ho because the P-value is 0.0253, which is less than 0.05.
E fail to reject Ho because the P-value is 0.9873, which is greater than
0.05.
P-Values
The P-value for a test of H0 : µ = µ0, with observed test statistic z, against
Ha : µ > µ0 is P(Z > z)
Ha : µ < µ0 is P(Z < z)
Ha : µ ≠ µ0 is 2P(Z > |z|)
These P-values are exact if the population is normal and approximate for
large sample size n in other cases.
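The three cases can be collected in one helper. This is a sketch, not part of the original slides: `z_pvalue` and its `alternative` argument are hypothetical names. The first two calls use the test statistics from the speeding and paint examples; z = 1.81 in the third call is simply an illustrative value.

```r
# P-value of a z test for each form of the alternative hypothesis
z_pvalue <- function(z, alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    greater   = pnorm(z, lower.tail = FALSE),          # P(Z > z)
    less      = pnorm(z),                              # P(Z < z)
    two.sided = 2 * pnorm(abs(z), lower.tail = FALSE)  # 2 P(Z > |z|)
  )
}

z_pvalue(2.83, "greater")
z_pvalue(-1.33, "less")
z_pvalue(1.81, "two.sided")
```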
Example
The Mackenzie Valley Bottling Company distributes root beer in bottles
labeled 500 ml. They routinely inspect samples of 10 bottles prior to
making a large shipment, hoping to detect if the true mean volume in the
shipment differs from 500 ml. If the bottles are under-filled, the company
could be sued for false advertising. If the bottles are overfilled, the
company is spending more money than they need to. Suppose it is known
that fill volumes for the bottles of root beer follow a normal distribution
with standard deviation 3.5 ml. One random sample of 10 bottles results
in a sample average volume of 502 ml. Does this provide convincing
evidence that the true mean fill volume for the shipment differs from the
advertised amount of 500 ml?
Let α = 0.05.
2 Hypotheses
H0 : µ = 500
Ha : µ ≠ 500
3 Decision rule
Reject H0 if P-value ≤ α = 0.05.
4 Test statistic
$$z = \frac{502 - 500}{3.5/\sqrt{10}} = 1.81$$
Solution
5 P-value
The P-value is
$$2P(Z \ge 1.81) = 2(1 - 0.9649) = 0.0702$$
Interpretation: If the true mean fill volume of all bottles was 500 ml,
the probability of observing a sample mean at least as extreme as 502
ml would be 0.0702.
Since the P-value = 0.0702 > α = 0.05, we fail to reject Ho .
6 Conclusion
We have insufficient evidence that the true mean volume of all bottles
in the shipment differs from 500 ml.
2*pnorm(-1.81)
## [1] 0.07029579
##or
2*(1-pnorm(1.81))
## [1] 0.07029579
Practice Question
After all the evidence has been presented, the jury deliberates and
determines a P-value of 0.35. They therefore conclude that:
A there is sufficient evidence that the defendant is innocent.
B there is insufficient evidence that the defendant is innocent.
C there is sufficient evidence that the defendant is guilty.
D there is insufficient evidence that the defendant is guilty.
E the probability that the defendant is innocent is only 35%.
Consider again the root beer example. We will conduct the hypothesis test
again, this time using the confidence interval method.
1 Level of significance
Let α = 0.05.
2 Hypotheses
H0 : µ = 500
Ha : µ ≠ 500
3 Decision rule
Reject H0 if µ0 = 500 is not in the 95% confidence interval for µ.
Solution
4 Confidence interval
Here is the R code to make the confidence interval for the above example:
xbar<-502
n<-10
sigma<-3.5
#This is for a 95%CI, leaving 0.025 in either tail.
#Hence the area to the left of 1.96 is 0.95+0.025=0.975
zstar<- qnorm(0.975)
moe<-zstar*(sigma/sqrt(n))
left<- xbar-moe
right<- xbar+moe
left
## [1] 499.8307
right
## [1] 504.1693
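The decision rule from step 3 can be applied directly to this interval. A minimal sketch, where `mu0`, `ci` and `reject` are names introduced here for illustration:

```r
xbar <- 502; n <- 10; sigma <- 3.5
mu0 <- 500                      # value of mu under H0

zstar <- qnorm(0.975)
moe <- zstar * sigma / sqrt(n)
ci <- c(xbar - moe, xbar + moe)

# Reject H0 only if mu0 falls outside the 95% confidence interval
reject <- (mu0 < ci[1]) || (mu0 > ci[2])
reject   # FALSE: 500 lies inside the interval, so we fail to reject H0
```

This agrees with the P-value approach used earlier for the same example, where the P-value of 0.0702 exceeded α = 0.05.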
Practice Question
The manager at a grocery store would like to estimate the true mean
amount of money spent by customers in the express lane. She selects a
simple random sample of 50 receipts and calculates a 98% confidence
interval for µ to be ($15.50, $20.25). Suppose we wish to conduct a
hypothesis test to determine whether there is evidence that the true mean
amount spent by customers in the express lane differs from $20. Which of
the following statements is true?
A At a significance level of α = 0.01, we have sufficient evidence that
µ ≠ 20.
B At a significance level of α = 0.01, we conclude that µ = 20.
C At a significance level of α = 0.02, we have sufficient evidence that
µ ≠ 20.
D At a significance level of α = 0.02, we have insufficient evidence that
µ ≠ 20.
E At a significance level of α = 0.04, we have insufficient evidence that
µ ≠ 20.
“How high does z have to be in order for us to reject the null hypothesis?”
[Figure: standard normal curve with the upper-tail area 0.05 shaded to the right of z*]
P(Z ≥ z*) = 0.05
qnorm(0.05)
## [1] -1.644854
qnorm(0.95)
## [1] 1.644854
We now revisit the speeding vehicle example and conduct the test using
the critical value approach.
Solution
1 Level of significance
Let α = 0.05.
2 Hypotheses
H0 : µ = 60
Ha : µ > 60
3 Decision Rule
Reject H0 if z ≥ z ∗ = 1.645.
Solution
4 Test statistic
$$z = \frac{66 - 60}{15/\sqrt{50}} = 2.83$$
Note that there are only five steps in conducting a hypothesis test using
the critical value approach, since the calculation of a P-value is no longer
necessary. In the case of a left-sided test, we reject the null hypothesis if
z ≤ −z ∗ , where −z ∗ is the value of Z such that
P (Z ≤ −z ∗ ) = α
Let α = 0.01.
2 Hypotheses
H0 : µ = 75
Ha : µ < 75
3 Decision rule
Reject H0 if z ≤ −z* = −2.326.
qnorm(0.01)
## [1] -2.326348
Solution
4 Test statistic
$$z = \frac{73 - 75}{9/\sqrt{36}} = -1.33$$
If we use the critical value approach for a two-sided test, we reject the null
hypothesis if |z| ≥ z*, where z* is the value of Z such that
$$P(Z \ge z^*) = \frac{\alpha}{2}$$
In other words, we reject Ho if z ≤ −z* or z ≥ z*.
Let α = 0.05.
2 Hypotheses
H0 : µ = 500
Ha : µ ≠ 500
3 Decision rule
Reject H0 if |z| ≥ z ∗ = 1.96. In other words, if z ≤ −1.96 or
z ≥ 1.96.
qnorm(.05/2)
## [1] -1.959964
Solution
4 Test statistic
$$z = \frac{502 - 500}{3.5/\sqrt{10}} = 1.81$$
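The comparison of the test statistic with the critical value can be sketched in R as follows. The variable names are illustrative; the values are those of the root beer example.

```r
xbar <- 502; mu0 <- 500; sigma <- 3.5; n <- 10
alpha <- 0.05

z <- (xbar - mu0) / (sigma / sqrt(n))   # test statistic
zstar <- qnorm(1 - alpha / 2)           # two-sided critical value

z
zstar
abs(z) >= zstar   # FALSE: |z| is below the critical value, so we fail to reject H0
```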
Practice Question
Decisions in Inference
Decision based on sample    Ho true             Ha true
Reject Ho                   Type I Error        Correct Decision
Fail to reject Ho           Correct Decision    Type II Error
Type I Error
We say we have made a Type I error if we reject H0 when it is true.
We will denote the probability of making a Type I Error as α:
$$P(\text{reject } H_0 \mid H_0 \text{ true}) = P(\text{Type I Error}) = \alpha$$
Note however that this is just our regular level of significance we have
been using all along.
[Figure: sampling distribution of X̄ centred at µo, with the tail area α shaded]
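Type II Error
We say we have made a Type II error if we fail to reject H0 when Ha is
true. We denote the probability of making a Type II Error as β.
The probability of Type II Error has not been considered previously when
we have set up our tests of significance. We would, however, like for β to
be low when conducting a hypothesis test.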
Power
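The power of a hypothesis test is the probability of correctly rejecting H0
when Ha is true:
$$\text{Power} = P(\text{reject } H_0 \mid H_a \text{ true}) = 1 - P(\text{fail to reject } H_0 \mid H_a \text{ true}) = 1 - \beta$$
As we can see from this expression, if we can calculate the power,
then we can easily find β:
$$\beta = 1 - \text{Power}$$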
Example
A pharmaceutical company is producing prescription pills for the alleviation
of migraine headaches. The pills are supposed to contain 25 mg of the
active ingredient. It is very important that the amount of this ingredient
does not exceed 25 mg, as it is very powerful and could be dangerous if
taken in excessive amounts. Thousands of pills are produced for sale each
day, and samples of 5 pills are taken frequently to measure the amount of
the ingredient. A hypothesis test is conducted at α = 0.01 to test whether
the mean amount exceeds 25 mg. Suppose it is known that the amount of the
active ingredient per pill follows a normal distribution with standard
deviation 1.2 mg. The analysts would like to detect a true mean of 27 mg
with high probability, as high amounts of the ingredient are dangerous.
Example
What is the power of the test of
H0 : µ = 25
Ha : µ = 27
when α = 0.01?
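We reject H0 if
$$Z \ge z^*$$
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge z^*$$
$$\bar{X} \ge \mu_0 + z^* \frac{\sigma}{\sqrt{n}}$$
$$\bar{X} \ge 25 + 2.326\left(\frac{1.2}{\sqrt{5}}\right)$$
$$\bar{X} \ge 26.248$$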
n <- 5
sigma <- 1.2
sm<-(sigma/sqrt(n))
alpha <- 0.01
mu0<-25
q = qnorm((1-alpha), mean=mu0, sd=sm) #1-alpha; area to the right of q is alpha
q
## [1] 26.24845
mua<-27
power<-pnorm(q, mean = mua, sd=sm, lower.tail = FALSE)
power
## [1] 0.919308
Example (Cont’d)
Question: What is the power of the test of
H0 : µ = 25
Ha : µ = 28
when α = 0.01.
Example (Cont’d)
1 Step 1 is the same as before, as the rejection rule depends only on the
value µ0 stated in the null hypothesis.
2 The power is
mua<-28
power<-pnorm(q, mean = mua, sd=sm, lower.tail = FALSE)
power
## [1] 0.9994504
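The two power calculations follow the same two steps, so they can be wrapped in one function. This is a sketch under the assumptions of this example (right-sided test, known σ); `power_right` is a hypothetical helper name.

```r
# Power of a right-sided z test of H0: mu = mu0 against the specific value mua
power_right <- function(mu0, mua, sigma, n, alpha) {
  se <- sigma / sqrt(n)
  # Step 1: rejection region under H0, reject when xbar >= cutoff
  cutoff <- qnorm(1 - alpha, mean = mu0, sd = se)
  # Step 2: probability of landing in that region when the true mean is mua
  pnorm(cutoff, mean = mua, sd = se, lower.tail = FALSE)
}

power_right(25, 27, 1.2, 5, 0.01)
power_right(25, 28, 1.2, 5, 0.01)
```

Only `mua` changes between the two calls, which reflects the point made in step 1: the cutoff depends on µ0 and α, not on the alternative.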
Power
Practice Question
Suppose we are testing the null hypothesis that the value of some
population mean is 32 versus the alternative that the population mean is
greater than 32. We use a significance level of 0.05 and the standard
deviation of the population is known to be 8. Based on a sample size of
16, the critical (rejection) region of the test is:
A x̄ > 28.71
B x̄ > 33.97
C x̄ > 1.645
D x̄ > 35.29
E x̄ > 36.44
Solution
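1 We reject H0 if
$$Z \le -z^*$$
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le -z^*$$
$$\bar{X} \le \mu_0 - z^* \frac{\sigma}{\sqrt{n}}$$
$$\bar{X} \le 5 - 1.645\left(\frac{0.4}{\sqrt{12}}\right) = 4.81$$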
Solution
2 Find the power of the test.
n <- 12
sigma <- 0.4
sm<-(sigma/sqrt(n))
alpha <- 0.05
mu0<-5
q = qnorm((alpha), mean=mu0, sd=sm)
q
## [1] 4.810069
mua<-4.5
power<-pnorm(q, mean = mua, sd=sm, lower.tail = TRUE)
power
## [1] 0.9963765
Relationships
Question:
Assume that, instead of conducting the above test with α = 0.05, we were
to conduct it using α = 0.01.
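1 We reject H0 if
$$Z \le -z^*$$
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le -z^*$$
$$\bar{X} \le \mu_0 - z^* \frac{\sigma}{\sqrt{n}}$$
$$\bar{X} \le 5 - 2.326\left(\frac{0.4}{\sqrt{12}}\right) = 4.73$$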
n <- 12
sigma <- 0.4
sm<-(sigma/sqrt(n))
alpha <- 0.01 #Changed alpha
mu0<-5
q = qnorm((alpha), mean=mu0, sd=sm)
q
## [1] 4.731376
mua<-4.5
power<-pnorm(q, mean = mua, sd=sm, lower.tail = TRUE)
power
## [1] 0.9774531
You will see small discrepancies between these values and hand
calculations due to rounding.
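The relationship between α and power for this left-sided test can be tabulated. A sketch, assuming the values from the example above (µ0 = 5, alternative mean 4.5, σ = 0.4, n = 12); `power_left` is a hypothetical helper, and α = 0.10 is an extra level added here only for illustration.

```r
# Power of a left-sided z test of H0: mu = mu0 against the specific value mua
power_left <- function(mu0, mua, sigma, n, alpha) {
  se <- sigma / sqrt(n)
  cutoff <- qnorm(alpha, mean = mu0, sd = se)   # reject when xbar <= cutoff
  pnorm(cutoff, mean = mua, sd = se)
}

alphas <- c(0.10, 0.05, 0.01)
powers <- sapply(alphas, function(a) power_left(5, 4.5, 0.4, 12, a))

# Smaller alpha gives lower power and therefore a larger beta
data.frame(alpha = alphas, power = powers, beta = 1 - powers)
```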
Relationships (Cont’d)
Practice Question
A box of cookies is supposed to weigh 250 grams. There is some variation
in weight from box to box. The weights of cookies are normally distributed
with an unknown mean and a known standard deviation of 3 grams. A
consumer agency takes a random sample of 8 boxes of cookies, and wishes
to conduct a 5% significance test to test the hypothesis:
H0 : µ = 250
Ha : µ < 250
What is the power of the test if the true mean is actually 245?
A 0.8493
B 0.8749
C 0.9374
D 0.9500
E 0.9989
Practice Question
Suppose we test
H0 : µ = 10
Ha : µ ≠ 10
Example
A hypothesis test of
H0 : µ = 1.5
Ha : µ ≠ 1.5
H0 : µ = 1.5
Ha : µ = 1.51
Solution
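1 We reject H0 if
$$Z \le -z^* \quad \text{or} \quad Z \ge z^*$$
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le -z^* \quad \text{or} \quad \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge z^*$$
$$\bar{X} \le \mu_0 - z^* \frac{\sigma}{\sqrt{n}} \quad \text{or} \quad \bar{X} \ge \mu_0 + z^* \frac{\sigma}{\sqrt{n}}$$
$$\bar{X} \le 1.5 - 1.96\left(\frac{0.04}{\sqrt{10}}\right) \quad \text{or} \quad \bar{X} \ge 1.5 + 1.96\left(\frac{0.04}{\sqrt{10}}\right)$$
$$\bar{X} \le 1.4752 \quad \text{or} \quad \bar{X} \ge 1.5248$$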
n <- 10
sigma <- 0.04
sm<-(sigma/sqrt(n))
alpha <- 0.05
mu0<-1.5
ql = qnorm((alpha/2), mean=mu0, sd=sm) #alpha/2; area in the left tail
qr = qnorm((1-(alpha/2)), mean=mu0, sd=sm) #alpha/2; area in the right tail
ql
## [1] 1.475208
qr
## [1] 1.524792
Solution
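2 The power is
mua<-1.51
powerl<-pnorm(ql, mean = mua, sd=sm, lower.tail = TRUE)
powerl
## [1] 0.002974916
powerr<-pnorm(qr, mean = mua, sd=sm, lower.tail = FALSE)
powerr
## [1] 0.1211223
power = powerl+powerr
power
## [1] 0.1240973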
The power is so low because our test is based on a small sample size.
Let’s see what happens when we conduct the same test with a larger
sample size, say n = 100.
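1 We reject H0 if
$$Z \le -z^* \quad \text{or} \quad Z \ge z^*$$
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le -z^* \quad \text{or} \quad \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge z^*$$
$$\bar{X} \le \mu_0 - z^* \frac{\sigma}{\sqrt{n}} \quad \text{or} \quad \bar{X} \ge \mu_0 + z^* \frac{\sigma}{\sqrt{n}}$$
$$\bar{X} \le 1.5 - 1.96 \frac{0.04}{\sqrt{100}} \quad \text{or} \quad \bar{X} \ge 1.5 + 1.96 \frac{0.04}{\sqrt{100}}$$
$$\bar{X} \le 1.49216 \quad \text{or} \quad \bar{X} \ge 1.50784$$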
2 The power is
Go back through the previous R code for this example. Change n and see
if you can get the correct answers.
Summary:
The further away the value of the alternative mean is from µ0 , the
higher the power.
For a fixed sample size n, reducing the probability of a Type I Error α
will result in an increase of the probability of a Type II Error β (and
thus a decrease in power), and vice-versa.
For a fixed level of significance α, an increase in the sample size n will
result in a decrease in β (and thus an increase in power).
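The three relationships in this summary can be explored numerically with a small helper for the two-sided test. This is a sketch only; the name power_two_sided, its arguments, and the extra alternative mean 1.53 are assumptions chosen for illustration.

```r
# Power of a two-sided z test of H0: mu = mu0 vs Ha: mu != mu0.
# power_two_sided is a hypothetical helper name used for illustration.
power_two_sided <- function(mu0, mua, sigma, n, alpha = 0.05) {
  sm <- sigma / sqrt(n)                            # standard deviation of X-bar
  ql <- qnorm(alpha / 2, mean = mu0, sd = sm)      # lower rejection cutoff
  qr <- qnorm(1 - alpha / 2, mean = mu0, sd = sm)  # upper rejection cutoff
  pnorm(ql, mean = mua, sd = sm) +
    pnorm(qr, mean = mua, sd = sm, lower.tail = FALSE)
}

p_n10     <- power_two_sided(1.5, 1.51, 0.04, n = 10)                # baseline
p_far     <- power_two_sided(1.5, 1.53, 0.04, n = 10)                # mean further from mu0
p_alpha01 <- power_two_sided(1.5, 1.51, 0.04, n = 10, alpha = 0.01)  # smaller alpha
p_n100    <- power_two_sided(1.5, 1.51, 0.04, n = 100)               # larger sample
c(baseline = p_n10, further = p_far, alpha_0.01 = p_alpha01, n_100 = p_n100)
```

Comparing each value with the baseline shows the direction of each effect described in the summary.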
Practice Question
In which of the following situations is a Type II Error more serious than a
Type I Error? Suppose you have to decide whether or not to:
(I) slow down to the speed limit
A I only
B II only
C both I and II
D neither
Power
a centre line.
two other horizontal lines called the control limits, set at values
such that, if the process is in control, nearly all points will lie
between them.
the top line is known as the upper control limit (UCL) and the
other is the lower control limit (LCL).
If at least one point plots beyond the control limits, the process is
said to be “out-of-control.”
If the points behave in a systematic or nonrandom manner (e.g. a run
of 9), then the process could be out-of-control.
A company manufactures a 1cm carbon steel bolt. The bolts are assigned
a hardness rating which follows a normal distribution with a mean of 85
and standard deviation of 2.8. Five bolts were selected every hour for 15
hours and the average hardness for each sample was calculated. The
averages are shown below:
[Figure: time-series plot of the hourly sample mean hardness against Time, hours 1 to 15.]
We are using “ts” which stands for time series because this is data
gathered over time (in a run order).
hardnessseries<- ts(hardness)
hardnessseries
plot.ts(hardnessseries, ylim=c(80,90))
abline(h=85)
abline(h=88.757, lty=2)
abline(h=81.243, lty=2)
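The dashed lines above sit three standard deviations of the sample mean away from the centre line. The sketch below reproduces them from the values in the bolt example; the three-standard-deviation rule is an assumption here, chosen because it matches the limits 88.757 and 81.243 used in the plot.

```r
# Control limits for an x-bar chart when mu and sigma are known.
# Assumes limits at mu +/- 3 * sigma / sqrt(n).
mu <- 85        # process mean hardness
sigma <- 2.8    # process standard deviation
n <- 5          # bolts sampled each hour
se <- sigma / sqrt(n)   # standard deviation of the sample mean
ucl <- mu + 3 * se      # upper control limit
lcl <- mu - 3 * se      # lower control limit
round(c(LCL = lcl, centre = mu, UCL = ucl), 3)
```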
Hour x̄ Hour x̄
1 19.2 6 18.4
2 18.6 7 20.6
3 19.6 8 19.5
4 19.9 9 21.4
5 20.7 10 19.8
Is this process in control? Construct a control chart and plot the control
limits.
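#hourly sample means from the table above
diameter<-c(19.2,18.6,19.6,19.9,20.7,18.4,20.6,19.5,21.4,19.8)
plot.ts(diameter, ylim=c(18,22))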
abline(h=19.5)
abline(h=21.45, lty=2)
abline(h=17.55, lty=2)
From the control chart above, we can see that no points fall above the
UCL or below the LCL, so this process is in control.
Until now, we have used the unrealistic assumption that we know the
population standard deviation σ of our variable of interest, in order to
more easily explain the reasoning behind our methods.
Now that the framework is in place, we are ready to make the transition to
the more realistic situation of unknown population standard deviation.
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
for each sample (this is theoretical, as there are infinitely many possible
samples), then we will get the standard normal curve.
t-distribution
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
follows a t distribution.
t-distribution
for each sample, then we will get a probability density curve that we call
the t distribution.
The quantity in the denominator of t is called the standard error of the
sample mean. The standard error of a random variable is its estimated
standard deviation.
t-distribution
The t statistic
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
T vs. Z
Since the forms of t and z are quite similar, we expect the shape of the t
distribution to be similar to that of the standard normal curve.
In fact, this is the case. The t distribution is symmetric about its mean,
which is zero, but the spread for this distribution is slightly greater than
for the standard normal curve. This is the case because estimating σ by s
introduces more variation.
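The extra spread can be checked by simulation. The sketch below is an illustration under assumed settings (samples of size n = 5 from a standard normal population, 10000 repetitions); none of these values come from the examples in this unit. It computes both statistics for every sample and compares their upper 0.025 quantiles.

```r
set.seed(1)
n <- 5          # assumed sample size
mu <- 0         # assumed population mean
sigma <- 1      # assumed population standard deviation
reps <- 10000   # number of simulated samples

zstat <- numeric(reps)
tstat <- numeric(reps)
for (i in 1:reps) {
  x <- rnorm(n, mu, sigma)
  zstat[i] <- (mean(x) - mu) / (sigma / sqrt(n))  # uses the known sigma
  tstat[i] <- (mean(x) - mu) / (sd(x) / sqrt(n))  # estimates sigma by s
}

q_z <- quantile(zstat, 0.975)   # should be near qnorm(0.975)
q_t <- quantile(tstat, 0.975)   # should be near qt(0.975, n - 1)
c(z = unname(q_z), t = unname(q_t),
  normal = qnorm(0.975), t_theory = qt(0.975, n - 1))
```

The simulated t quantile lies further out than the simulated z quantile, which is the heavier tail described above.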
T vs. Z
Comparing the T and Z Distribution
[Figure: Comparison of t-distributions. Density curves against t-value (from −4 to 4) for t distributions with df = 1, 5, 10 and 25, together with the standard normal curve (dashed).]
Cont’d
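# Assumed setup (not shown in the source): a grid of t-values, the four
# degrees of freedom from the legend, and five illustrative colours
# (one per t curve, plus black for the normal curve).
x <- seq(-4, 4, length.out = 200)
df <- c(1, 5, 10, 25)
colour <- c("red", "orange", "darkgreen", "blue", "black")
# Plots the normal curve; type = "l" draws a line (not points)
# and lty = 2 makes the line dashed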
plot(x, dnorm(x), type = "l", lty = 2,
xlab = "t-value",
main = "Comparison of t-distributions", col = "black")
# This will add the t-distribution for the four df’s
#dt is density for t with $n-1$ df.
#So it looks at the vectors as a table of values.
for(i in 1:4){
lines(x, dt(x, df[i]), col=colour[i])
}
#This code adds the legend
legend("topright", c("df = 1", "df = 5", "df = 10",
"df = 25", "normal"),
col = colour, title = "t-distributions", lty = c(1,1,1,1,2))
T vs. Z (Cont’d)
The t distributions have less area near the centre and more in the
tails than does the standard normal distribution.
As the degrees of freedom increase, the t distribution approaches the
standard normal distribution.
The critical values for selected t distributions are given in Table 3, for
various upper tail probabilities (located at the top).
R functions for the t-distribution:
dt() – calculates the density (won’t often use here)
pt() – calculates the area under the density curve
qt() – gives us our critical value from a given probability
rt() – generates random values from the t-distribution for given
parameters.
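As a quick illustration of the four functions listed above, the sketch below uses 6 degrees of freedom; that value and the variable names are assumptions chosen for the example.

```r
df <- 6                                          # assumed degrees of freedom
crit <- qt(0.975, df)                            # upper 0.025 critical value
upper_tail <- pt(crit, df, lower.tail = FALSE)   # area to the right of crit
dens0 <- dt(0, df)                               # height of the density at 0
set.seed(1)
draws <- rt(5, df)                               # five random t values
round(crit, 3)
upper_tail
```

Note that qt() and pt() are inverses of each other: feeding the critical value back into pt() returns the tail area we started from.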
Example
The chief of a local police department would like to estimate the true
mean response time for all emergency calls in the city. A random sample
of seven emergency calls is selected, and the police response times (in
minutes) are shown below:
7 4 11 8 7 12 9
We would like to construct a 95% confidence interval for the true mean
response time µ of all emergency response calls in the city. We will assume
that response times follow a normal distribution.
Solution
We find the upper 0.025 critical value from the t distribution with
n − 1 = 6 degrees of freedom to be t ∗ = 2.447. Our 95% confidence
interval is
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 8.29 \pm 2.447 \frac{2.69}{\sqrt{7}} = 8.29 \pm 2.49 = (5.80, 10.78)$$
Interpretation
If we repeatedly measured samples of seven response times in this city and
constructed an interval in a similar manner, then 95% of such intervals
would contain the true mean response time µ.
times<-c(7,4,11,8,7,12,9)
xbar<-mean(times)
s<-sd(times) #sample standard deviation
n<-7
stderror<-s/sqrt(n)
cvl<-qt(0.025,(n-1)) #negative critical value
cvu<-qt(0.975,(n-1)) #positive critical value
lower<-xbar+(cvl*stderror)
upper<-xbar+(cvu*stderror)
lower
## [1] 5.797536
upper
## [1] 10.77389
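The same interval can be obtained in one step, as a check on the calculation above. This sketch uses the built-in t.test() function and extracts only its confidence interval; the variable name ci is an assumption for the example.

```r
times <- c(7, 4, 11, 8, 7, 12, 9)
# Two-sided one-sample t procedure; conf.int holds the confidence interval
ci <- t.test(times, conf.level = 0.95)$conf.int
ci
```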
The number of goals scored by each of the five NHL Pacific Division
teams for the 2009/10 regular season are shown below:
Team Goals
Anaheim Ducks 238
Dallas Stars 237
L.A. Kings 241
Phoenix Coyotes 225
S.J. Sharks 264
We calculate x̄ = 241.
To get the sample standard deviation, we take the positive square root of
the variance:
$$s = \sqrt{s^2} = \sqrt{202.5} = 14.23$$
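These summary values can be checked in R from the table of goals. The variable names below are assumptions chosen for this sketch.

```r
goals <- c(238, 237, 241, 225, 264)  # goals for the five teams in the table
xbar <- mean(goals)                  # sample mean
s2 <- var(goals)                     # sample variance (divides by n - 1)
s <- sd(goals)                       # sample standard deviation
c(mean = xbar, variance = s2, sd = round(s, 2))
```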
Practice Question
A medical researcher measured the pulse rates (in beats per minute) of a
random sample of 10 adult females. The mean pulse rate of the females
in the sample was 72.2 and the standard deviation was 5.9. Assuming
pulse rates follow a normal distribution, a 96% confidence interval for the
true mean pulse rate of all adult females is given by:
1 $72.2 \pm 2.359 \frac{5.9}{\sqrt{10}}$
2 $72.2 \pm 2.398 \frac{5.9}{\sqrt{9}}$
3 $72.2 \pm 2.398 \frac{5.9}{\sqrt{10}}$
4 $72.2 \pm 2.054 \frac{5.9}{\sqrt{10}}$
5 $72.2 \pm 2.359 \frac{5.9}{\sqrt{9}}$
Example
Consider the response times example. The police chief would like to know
if new measures need to be adopted in order to improve response times.
Specifically, he would like to know if the true mean response time has
increased since the previous year, when the mean time was known to be
6.5 minutes. We will conduct a hypothesis test to determine if the mean
response time has increased since last year. Test at the 10% level of
significance.
Solution
1 Let α = 0.10
H0 : µ = 6.5
Ha : µ > 6.5
Solution
5 The P-value is P(T(6) ≥ 1.76). From Table 3, we see that
P(T(6) ≥ 1.440) = 0.10 and P(T(6) ≥ 1.943) = 0.05
Since 1.440 < t = 1.76 < 1.943, it follows that our P-value is
between 0.05 and 0.10. For any P-value between 0.05 and 0.10, we
would reject the null hypothesis.
6 We have sufficient evidence to conclude that the department’s true
mean response time has increased since last year. As such, action will
be taken to improve response times.
The interpretation of the P-value is the same for t tests as it is for z tests.
Interpretation
If the true mean response time was 6.5 minutes, the probability of getting
a sample mean at least as high as 8.29 minutes would be between 0.05
and 0.10.
H0 : µ = 6.5
Ha : µ > 6.5
alpha=0.10
cv<-qt(0.90, 6)
#we use 0.90 since this is an upper-tailed test.
#df = 6 = 7-1
cv
## [1] 1.439756
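The test statistic and P-value for this example can also be computed step by step. The sketch below assumes the response-time data from the earlier example; the names tstat, pval and reject are illustrative.

```r
times <- c(7, 4, 11, 8, 7, 12, 9)
mu0 <- 6.5                     # mean under H0
n <- length(times)
# t statistic: (x-bar - mu0) divided by the standard error s / sqrt(n)
tstat <- (mean(times) - mu0) / (sd(times) / sqrt(n))
cv <- qt(0.90, n - 1)          # critical value for alpha = 0.10, upper tail
pval <- pt(tstat, n - 1, lower.tail = FALSE)   # P(T(6) >= tstat)
reject <- tstat >= cv          # critical value method
round(c(t = tstat, cv = cv, p = pval), 4)
reject
```

Both methods lead to the same decision: the statistic exceeds the critical value exactly when the P-value is below α.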
T-test in R
times<-c(7,4,11,8,7,12,9)
t.test(times, mu=6.5, alternative = 'greater')
##
## One Sample t-test
##
## data: times
## t = 1.7561, df = 6, p-value = 0.0648
## alternative hypothesis: true mean is greater than 6.5
## 95 percent confidence interval:
## 6.309763 Inf
## sample estimates:
## mean of x
## 8.285714
The output shows the test statistic, the degrees of freedom and
the p-value associated with the test.
Example
A fast food restaurant claims that the average waiting time in their drive
through is less than a minute. We record the drive through waiting times
for a random sample of 30 customers. The sample average time is 55
seconds and the sample standard deviation is 15 seconds. Waiting times
are known to follow a normal distribution. We will conduct a hypothesis
test to examine the significance of the restaurant’s claim.
Solution
1 Let α = 0.01.
H0 : µ = 60
Ha : µ < 60
Solution
5 The P-value is P(T(29) ≤ −1.83) = P(T(29) ≥ 1.83) by the
symmetry of the t distributions. We see from Table 3 that
P(T(29) ≥ 1.699) = 0.05 and P(T(29) ≥ 2.045) = 0.025
Since 1.699 < 1.83 < 2.045, it follows that the P-value is between
0.025 and 0.05. For any P-value between 0.025 and 0.05, we fail to
reject the null hypothesis at α = 0.01.
6 We have insufficient evidence to conclude that the true mean waiting
time in the drive thru is less than one minute.
H0 : µ = 60
Ha : µ < 60
Solution
Since t = −1.83 > −t ∗ = −2.462, we fail to reject H0 .
5 We have insufficient evidence to conclude that the true mean waiting
time in the drive thru is less than one minute.
#test statistic:
t<-(55-60)/(15/sqrt(30))
t
## [1] -1.825742
#pvalue method:
pt(t, 29)
## [1] 0.03910166
Note that when we did this by hand, we knew only that the p-value was
between 0.025 and 0.05. With R, we see that the exact p-value is 0.0391.
alpha=0.01
cv<-qt(0.01,29)
cv
## [1] -2.462021
Practice Question
A major car manufacturer wants to test a new engine to determine if it
meets new air pollution standards. Safety regulations require that the
mean emission of all engines of this type must be less than 20 parts per
million (ppm) of carbon. If it is not less than 20, they will have to redesign
parts of the engine. A random sample of 25 engines is tested and the
emission level of each is determined. The sample mean is calculated to be
19.6 ppm and the sample standard deviation is 2.0 ppm. It is known that
emission levels are normally distributed with standard deviation 1.6 ppm.
We would like to test whether there is evidence that the engine meets the
new pollution standards. The hypotheses for the appropriate test of
significance are:
A H0 : µ = 20 vs. Ha : µ > 20
B H0 : µ = 19.6 vs. Ha : µ < 19.6
C H0 : µ = 20 vs. Ha : µ < 20
D H0 : µ = 19.6 vs. Ha : µ > 19.6
E H0 : X̄ = 20 vs. Ha : X̄ < 20
Practice Question
A major car manufacturer wants to test a new engine to determine if it
meets new air pollution standards. Safety regulations require that the
mean emission of all engines of this type must be less than 20 parts per
million (ppm) of carbon. If it is not less than 20, they will have to redesign
parts of the engine. A random sample of 25 engines is tested and the
emission level of each is determined. The sample mean is calculated to be
19.6 ppm and the sample standard deviation is 2.0 ppm. It is known that
emission levels are normally distributed with standard deviation 1.6 ppm.
We would like to test whether there is evidence that the engine meets the
new pollution standards. The test statistic for the appropriate test of
significance is:
A t = −1.25
B z = −1.00
C t = −0.40
D z = −1.25
E t = −1.00
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 30.9 \pm 2.064 \frac{1.8}{\sqrt{25}} = 30.9 \pm 0.74 = (30.16, 31.64)$$
Note that we used the upper critical value from the t distribution with
24 degrees of freedom.
Interpretation
If we repeatedly took samples of 25 fill pressures and calculated an interval
in a similar manner, then 95% of such intervals would contain the true
mean fill pressure for this machine.
xbar<-30.9
s<-1.8
n<-25
se<-(s/sqrt(n))
#95% CI
cv<-qt(0.975, 24)
cv
## [1] 2.063899
lower<-(xbar-(cv*se))
upper<-(xbar+(cv*se))
lower
## [1] 30.157
upper
## [1] 31.643
H0 : µ = 30
Ha : µ ≠ 30
5 The P-value is 2P(T(24) ≥ 2.50). We see from Table 3 that
P(T(24) ≥ 2.492) = 0.01 and P(T(24) ≥ 2.797) = 0.005
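Since 2.492 < 2.50 < 2.797, the upper-tail area is between 0.005 and 0.01, so the two-sided P-value is between 0.01 and 0.02. The exact value can be computed in R. This is a sketch using the summary statistics from the fill pressure example; the names tstat and pval are illustrative.

```r
xbar <- 30.9   # sample mean
s <- 1.8       # sample standard deviation
n <- 25        # sample size
mu0 <- 30      # mean under H0
tstat <- (xbar - mu0) / (s / sqrt(n))
# Two-sided test: double the area beyond |t| in one tail
pval <- 2 * pt(abs(tstat), df = n - 1, lower.tail = FALSE)
tstat
pval
```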
Practice Question
H0 : µ = 30
Ha : µ ≠ 30
P-Values
H0 : µ = µ0
Ha : µ > µ0
P-Values
H0 : µ = µ0
Ha : µ < µ0