Author: T. Farrar
Faculty of Applied Sciences
Department of Mathematics and Physics
Contents
1 Introduction
2 Sampling Distributions
1 Introduction
Textbooks
• The following books were used in preparing this module: (Wackerly et al.,
2002), (Navidi, 2006), (Wonnacott and Wonnacott, 1990), (Moore, 2000),
(Keller, 2012)
• In this module, the main focus is on confidence intervals and hypothesis testing,
although we won’t get into these topics right away
• Behind these methods is the basic need to make decisions based on data
• Suppose we have matric learners who are doing two different after-school maths
study programmes. The first group scores an average mark of 53% on their
exam and the second group scores an average of 56%. Do we have enough
evidence to conclude that the second after-school maths study programme is
more effective than the first? Or is this just random variation, and if we were to
repeat the programmes next year, maybe the first programme would produce
better results than the second? The descriptive statistics on their own are not
enough to answer this question. We need hypothesis testing.
• When doing research in almost any discipline, from medicine to ecology to
marketing to finance to chemistry to behavioural science, hypothesis testing is
the standard means of demonstrating a claim to be true. Hence, a person who
understands hypothesis testing understands how to produce knowledge in the
modern world
What is a statistician?
• There is a difference between ‘doing statistics’ and ‘being a statistician’
• Doing statistics is challenging on its own; you must learn many different meth-
ods and formulas
• Being a statistician means you must be able to see the big picture and answer
questions like:
◦ How can a real-world problem be expressed as a quantitative research
question?
◦ Where and how can we obtain data that will enable us to answer this
research question?
◦ What statistical method(s) would be most appropriate for answering this
research question with this data?
◦ What are the assumptions that must hold in order for this statistical
method to be valid? Do these assumptions hold?
◦ How should the statistical results be expressed scientifically? (It is im-
portant not to state results as facts but always recognize the uncertainties
that exist.)
◦ How should the statistical findings be communicated to an audience that
is not familiar with statistical methods?
• In addition to these, the statistician must think about issues such as ethics.
Statistics can be used to produce knowledge and demonstrate truth, at least
to a high degree of probability. But statistics can also be used to deceive and
mislead. A good statistician uses statistics only for honest purposes and has a
responsibility to point out when statistics are being used dishonestly by others.
• In short, a statistician is a problem-solver. She or he certainly doesn’t know
everything, but knows how to acquire knowledge using data
2 Sampling Distributions
The first part of this section is a revision of material already covered in Statistics
1A.
◦ The probability distribution that defines Y is the binomial distribu-
tion, because what we have described is a binomial experiment
• Now here is the big new insight: a statistic is also a random variable!
• This sounds strange at first: after all, once we have flipped the coin n times
we know the value of the statistic p̂, so how is it random? But remember,
that is true of any random variable: we know the actual outcome after the
experiment has been done. But not before! Before we flip any coins, we don’t
know the value of p̂ and various outcomes are possible; hence it is random
• In fact, it is easy to see mathematically that p̂ is a random variable, because
its formula, p̂ = Y /n, shows that it is a function of a random variable, Y . And
any function of a random variable must also be a random variable.
• Hence, a statistic is a random variable. Indeed, another way to define a statistic
is this:
◦ A statistic is a function of the observable random variables in a sample
and known constants.
◦ For example, p̂ is a function of a random variable Y and a known constant
n
Sampling Distributions
• Every random variable behaves according to a probability distribution
• Hence, because a statistic is a random variable, a statistic has its own proba-
bility distribution
• The probability distribution of a statistic drawn from a sample is called a
sampling distribution
• The focus of this chapter is on describing the sampling distribution of two
commonly used statistics: the sample mean and the sample proportion
• If you are interested in seeing a proof of this theorem, see (Wackerly et al.,
2002), pages 331-332.
• It also follows from this theorem that Z = (Ȳ − µȲ)/σȲ = (Ȳ − µ)/(σ/√n)
has a standard normal distribution.
• Let us revisit the human pregnancy example. Let us assume that the length
of a human pregnancy is normally distributed with a mean of 266 days and a
standard deviation of 16 days. But suppose we don’t know this mean, and we
want to estimate it by collecting data from a random sample of mothers
• This makes sense: if we collect more data, we would expect to have a more
precise estimate of the average length of a pregnancy
• What we can see is that if we were to take a sample of just one mother, and
another researcher were to do the same, and a third researcher were to do the
same, and so forth, then when we all compared our results they would be very
spread out: they would have a large variance. One researcher might estimate
the average pregnancy length to be 240 days, and another, 290 days
• The amount of time university lecturers devote to their jobs per week is nor-
mally distributed with a mean of 52 hours and a standard deviation of 6 hours.
It is assumed that all lecturers behave independently.
1. What is the probability that a lecturer works for more than 60 hours per
week?
Let Y1 be the number of hours worked per week by this lecturer. (Note
that we could equivalently define Ȳ as the sample mean of this sample
of n = 1 observation; in this case we could use the sampling distribution
approach and get the same answer.)
Pr (Y1 > 60) = Pr ((Y1 − µ)/σ > (60 − µ)/σ)
= Pr (Z > (60 − 52)/6)
= Pr (Z > 1.33)
= 1 − Pr (Z < 1.33) = 1 − 0.9082 = 0.0918
2. What is the probability that the mean amount of work per week for four
randomly selected lecturers is more than 60 hours?
Let Y1 , Y2 , Y3 , Y4 be the number of hours worked per week by these four
respective lecturers. Then, according to the sampling distribution theorem,
Ȳ is a normally distributed random variable with a mean of µ = 52 and a
standard deviation of σ/√n = 6/√4 = 3.
Pr (Ȳ > 60) = Pr ((Ȳ − µ)/(σ/√n) > (60 − µ)/(σ/√n))
= Pr (Z > (60 − 52)/(6/√4))
= Pr (Z > 2.67)
= 1 − Pr (Z < 2.67) = 1 − 0.9962 = 0.0038
We can see that the probability is much smaller in this case. Does this
agree with the graph above in terms of the effect of increasing sample
size on the spread of the sampling distribution?
3. What is the probability that if four lecturers are randomly selected, all
four work for more than 60 hours?
Because we have assumed that all lecturers are independent, we can use
the multiplication rule for independent events, which says that Pr (A ∩ B) =
Pr (A) Pr (B) if events A and B are independent. In this case we have
four events: Y1 > 60, Y2 > 60, Y3 > 60 and Y4 > 60. Of course, the mul-
tiplication rule for independent events can be extended to any number of
independent events. Thus:
Pr (Y1 > 60 ∩ Y2 > 60 ∩ Y3 > 60 ∩ Y4 > 60)
= Pr (Y1 > 60) Pr (Y2 > 60) Pr (Y3 > 60) Pr (Y4 > 60)
= [Pr (Y1 > 60)]⁴ (since the four random variables are identically distributed)
= 0.0918⁴ = 0.000071
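These three probabilities can be checked numerically. A sketch using Python's standard-library `statistics.NormalDist` (small differences from the table-based answers arise only from rounding z to two decimals):

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution
mu, sigma = 52, 6

# 1. Pr(Y1 > 60) for a single lecturer
p_one = 1 - Z.cdf((60 - mu) / sigma)

# 2. Pr(Ybar > 60) for the mean of n = 4 lecturers: sd of Ybar is sigma/sqrt(n)
p_mean = 1 - Z.cdf((60 - mu) / (sigma / sqrt(4)))

# 3. Pr(all four lecturers work > 60 hours) = Pr(Y1 > 60)^4 by independence
p_all = p_one ** 4

print(round(p_one, 4), round(p_mean, 4), round(p_all, 6))
```

The table-based answers (0.0918, 0.0038, 0.000071) agree to about three decimal places.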
Pr (Ȳ < 199) = Pr ((Ȳ − µ)/(σ/√n) < (199 − µ)/(σ/√n))
= Pr (Z < (199 − 201.9)/(5.8/√32))
= Pr (Z < −2.83)
= 1 − Pr (Z < 2.83) = 1 − 0.9977 = 0.0023
2. Suppose your random sample of 32 cans of tuna produced a mean weight
that is less than 199 grams. Comment on the statement made by the
manufacturer.
If the distribution of the net weight stated by the manufacturer is true,
then we had only a 0.0023 probability of achieving a mean weight of less
than 199 grams in our sample. Since we did achieve such a weight, either
something extremely improbable has happened, or the manufacturer has
given us an incorrect probability distribution. The latter is more likely:
probably either the mean weight is less than 201.9 grams or the standard
deviation is greater than 5.8 grams.
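The probability this argument rests on can be reproduced with Python's standard-library `statistics.NormalDist` (a sketch, using the manufacturer's stated µ = 201.9 and σ = 5.8):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 201.9, 5.8, 32        # manufacturer's claimed distribution, sample size
z = (199 - mu) / (sigma / sqrt(n))   # standardize the sample mean
p = NormalDist().cdf(z)              # Pr(Ybar < 199) under the claim
print(round(z, 2), round(p, 4))
```

This agrees with the table-based answer of 0.0023.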
Pr (Ȳ > 50) = Pr ((Ȳ − µ)/(σ/√n) > (50 − µ)/(σ/√n)) = 0.9
Pr (Z > (50 − 58)/(13/√n)) = 0.9
Let z = (50 − 58)/(13/√n)
Pr (Z > z) = 0.9
Pr (Z < −z) = 0.9
−z ≈ 1.28
z ≈ −1.28
(50 − 58)/(13/√n) ≈ −1.28
√n ≈ −1.28(13)/(−8)
√n ≈ 2.08
n ≈ 4.33
The calculation gives n ≈ 4.33; but since she can only take an integer number
of learners, we must round up to 5. The teacher should take at least 5 learners
to the competition.
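The same sample-size calculation can be sketched in Python; `NormalDist().inv_cdf` plays the role of the z table (it returns 1.2816 rather than the rounded 1.28, so the intermediate value of n differs slightly from 4.33):

```python
from math import ceil
from statistics import NormalDist

mu, sigma, target = 58, 13, 50
z90 = NormalDist().inv_cdf(0.9)        # Pr(Z < z90) = 0.9, about 1.2816
sqrt_n = z90 * sigma / (mu - target)   # from (50 - 58)/(13/sqrt(n)) = -z90
n_required = ceil(sqrt_n ** 2)         # round up to a whole number of learners
print(sqrt_n ** 2, n_required)
```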
Sampling Distribution of a Sample Mean: Exercises
1. Find the probability that one selected subcomponent is shorter than 114
cm
2. Find the probability that if five subcomponents are randomly selected,
their mean length is less than 114 cm
3. Find the probability that if five subcomponents are randomly selected,
all five have a mean length of less than 114 cm
• (Challenging) The time it takes for a statistics lecturer to mark a test is nor-
mally distributed with a mean of 4.8 minutes and a standard deviation of 1.3
minutes. There are 60 students in the lecturer’s class. What is the probability
that he needs more than 5 hours to mark all the tests? (The 60 tests in this
year’s class can be considered a random sample of the many thousands of tests
the lecturer has marked and will mark.)
• It means that for a sample of ‘i.i.d.’ random variables from any distribution,
not only the normal distribution, one can perform a simple transformation
on the sample mean to get an approximately standard normally distributed
random variable
• Let us return to the idea of the normal approximation to the binomial distri-
bution in order to illustrate the central limit theorem in practice
• Suppose that Y1 , Y2 , . . . , Ym are independent binomially distributed random
variables each with p = 0.1 and n = 10.
• A rule of thumb is that the Central Limit Theorem may be used safely as long
as n > 30
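A quick simulation illustrates the theorem for the binomial setting above (a sketch; the per-sample size n = 30 and the number of replications are my own arbitrary choices):

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)

def binomial(trials, p):
    """One Binomial(trials, p) draw as a sum of Bernoulli trials."""
    return sum(random.random() < p for _ in range(trials))

# Each observation is Binomial(10, 0.1), so E(Y) = 1.0 and Var(Y) = 0.9
n = 30        # observations per sample mean (rule of thumb: n > 30 is safe)
reps = 500    # number of simulated sample means
means = [mean(binomial(10, 0.1) for _ in range(n)) for _ in range(reps)]

# By the CLT the sample means should centre on 1.0 with sd close to sqrt(0.9/30)
print(mean(means), stdev(means), sqrt(0.9 / n))
```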
Central Limit Theorem Example
• Example questions involving the CLT are similar to those from the sampling
distribution of the mean of a normally distributed sample. The main difference
is that the CLT gives only approximate probabilities whereas the sampling
distribution of the mean of a normally distributed sample is exact. So when
using the CLT we must always put ≈ instead of =.
• Example: The fracture strength of tempered glass averages 14 (measured in
thousands of pounds per square inch) and has standard deviation 2.
1. What is the probability that the average fracture strength of 100 ran-
domly selected pieces of this glass exceeds 14.5?
We do not know in this case that the fracture strength is normally distributed.
Hence we need to use the Central Limit Theorem to say that
U100 = (Ȳ − µ)/(σ/√n) approximately follows a standard normal distribution.
Pr (Ȳ > 14.5) = Pr ((Ȳ − µ)/(σ/√n) > (14.5 − µ)/(σ/√n))
≈ Pr (Z > (14.5 − 14)/(2/√100))
= Pr (Z > 2.5)
= 1 − Pr (Z < 2.5) = 1 − 0.9938 = 0.0062
2. Find an approximate interval that includes, with probability 0.95, the
average fracture strength of 100 randomly selected pieces of this glass.
Let Z = (Ȳ − µ)/(σ/√n). Then Pr (−1.96 < Z < 1.96) ≈ 0.95, which gives
Pr (µ − 1.96 σ/√n < Ȳ < µ + 1.96 σ/√n) ≈ 0.95. With µ = 14, σ = 2 and
n = 100, the endpoints are 14 ± 1.96(2/√100) = 14 ± 0.392.
Hence, an approximate interval that includes, with probability 0.95, the
average fracture strength of 100 randomly selected pieces of glass is
(13.608, 14.392).
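Both parts of this example can be verified numerically; a sketch using the standard-library `statistics.NormalDist`:

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
mu, sigma, n = 14, 2, 100
se = sigma / sqrt(n)            # standard error of the sample mean

# Part 1: Pr(Ybar > 14.5), approximated via the CLT
p = 1 - Z.cdf((14.5 - mu) / se)

# Part 2: central interval containing Ybar with probability 0.95
z = Z.inv_cdf(0.975)            # about 1.96
lo, hi = mu - z * se, mu + z * se
print(round(p, 4), round(lo, 3), round(hi, 3))
```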
• A proportion is like a percent but it is expressed as a decimal between 0 and
1 rather than on a scale from 0 to 100
• When dealing with the expected value and variance of a random variable, the
following rules apply (which we will not take the trouble to prove, although it
is not too difficult)
◦ E (aY + b) = aE (Y ) + b
◦ Var (aY + b) = a2 Var (Y )
• These rules help us in the case of p̂ because p̂ = Y /n is Y multiplied by a
constant, and we know the expected value and variance of Y from the binomial
distribution
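Applied to p̂ = Y /n (i.e. a = 1/n and b = 0), these two rules give the mean and variance of p̂ directly; a short derivation, using E(Y ) = np and Var(Y ) = np(1 − p) from the binomial distribution:

```latex
E(\hat{p}) = E\!\left(\tfrac{Y}{n}\right) = \tfrac{1}{n}E(Y) = \tfrac{np}{n} = p
\qquad
\mathrm{Var}(\hat{p}) = \mathrm{Var}\!\left(\tfrac{Y}{n}\right)
= \tfrac{1}{n^{2}}\mathrm{Var}(Y) = \tfrac{np(1-p)}{n^{2}} = \tfrac{p(1-p)}{n}
```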
• Now that we know the expected value and variance of p̂, we can use the Central
Limit Theorem to derive the approximate sampling distribution of p̂
• By the CLT, we have that (p̂ − µp̂)/σp̂ = (p̂ − p)/√(p(1 − p)/n) approaches a
standard normal distribution as n becomes large
• As with the normal approximation to the binomial distribution, this approxi-
mation is satisfactory as long as np ≥ 5 and n(1 − p) ≥ 5
• A university bookstore claims that 50% of its customers are satisfied with the
service and prices.
1. Given that this claim is true, what is the probability that in a random
sample of 600 customers less than 45% are satisfied?
2. Suppose that a random sample of 600 customers is actually taken, and
270 customers say they are satisfied. What does this tell you about the
bookstore’s claim? Explain.
2. p̂ = 270/600 = 0.45. The percent of satisfied customers in the sample is
45%. We have shown that, if the true percent of satisfied customers in the
population were 50%, there is only a 0.71% chance of getting such a low
rate of satisfied customers in the sample. Because this is so unlikely to
have occurred, it seems better for us to assume that the bookstore’s claim
is false. We conclude that the percent of customers who are satisfied is
not 50%, but something lower.
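The 0.71% figure used in this argument comes from the normal approximation to the sampling distribution of p̂; a sketch of the computation in Python:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.5, 600                            # claimed proportion, sample size
se = sqrt(p * (1 - p) / n)                 # standard error of p-hat under the claim
prob = NormalDist().cdf((0.45 - p) / se)   # Pr(p-hat < 0.45)
print(round(prob, 4))
```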
• Using laws of expected value and variance it is possible to derive the expected
value and variance of X̄ − Ȳ :
µX̄−Ȳ = µ1 − µ2
σ²X̄−Ȳ = σ1²/n1 + σ2²/n2
• That is, Z = [(X̄ − Ȳ ) − (µ1 − µ2 )] / √(σ1²/n1 + σ2²/n2) is a standard
normal random variable
• If Populations 1 and 2 are not normally distributed, then this quantity will
still be approximately normally distributed for large sample sizes (n1 ≥ 30 and
n2 ≥ 30).
• A factory worker’s productivity is normally distributed. One worker produces
an average of 75 units per day with a standard deviation of 20. Another worker
produces at an average rate of 65 units per day with a standard deviation of
21. What is the probability that during one week (5 working days), worker 1
will outproduce worker 2? (Hint: think of the five days as a random sample
from the population of all the days that the worker has worked and will work
at the factory.)
• Estimation is a concept you have probably heard about since primary school
• The need for estimation arises precisely because the value of a population
parameter is almost always unknown
• For instance, based on research we might propose that the proportion of South
Africans above age 12 who smoke cigarettes is 0.15. In this case, ‘0.15’ is the
estimate. Or, we might propose based on our sample data that the mean
income of South Africans aged 15 to 65 is R1800 per month. In this case,
‘R1800 per month’ is the estimate.
• We can distinguish between two kinds of estimates: a point estimate and an
interval estimate
• You are probably much more familiar with using point estimators. However,
there are two major advantages to using interval estimators:
Examples of estimators
• A sample proportion p̂ = Y /n is an estimator of the population proportion p
• A sample variance S² = (1/(n − 1)) Σ (Yi − Ȳ )² is an estimator of the
population variance σ²
• The criteria that we can use are related to the expected value and variance
of an estimator
• If the expected value of the estimator equals the actual parameter value, i.e.
if E(θ̂) = θ, then θ̂ is said to be an unbiased estimator of θ. Otherwise, θ̂
is said to be a biased estimator of θ.
• In practice, it can be proven mathematically that, with Ȳ and p̂ as defined
above, E(Ȳ ) = µ, E(p̂) = p and E(S²) = σ², meaning that Ȳ is an unbiased
estimator of µ, p̂ is an unbiased estimator of p and S² is an unbiased estimator
of σ²
• Provided that θ̂ is also unbiased, this means we will get a result close to the
population parameter almost every time
• If we are comparing two unbiased estimators θ̂1 and θ̂2 , the estimator with a
smaller variance is said to be more efficient
• The variance of an estimator always depends on the sample size n, and should
decrease as n increases
2. It will be relatively narrow (if the endpoints are extremely far apart, the
interval will not be very useful)
• Since the interval estimator depends on random variables, the endpoints are
random and we cannot guarantee that the parameter θ will lie between them.
What we can do is generate an interval that has a certain probability of con-
taining the parameter θ
• The quantile and standard error are taken from the sampling distribution of
the point estimator
• The product of the quantile and the standard error is known as the margin
of error (sometimes written e)
• Important Note: ‘Standard error’ here covers only random sampling error,
i.e. the difference between the sample statistic and population parameter that
occurs due to random sampling. There are other kinds of statistical error
which, if present, could make our estimation procedures invalid. These kinds
of errors will be discussed in detail in Statistics 2A
• Alternatively the central limit theorem can be used, allowing us to use standard
normal quantiles even when the point estimator is not normally distributed.
In such cases the confidence interval is only approximate
• In the statement ‘(1 − α)100% confidence interval’, α is called the type I error rate.
We will discuss type I error in more detail later in the course, but for now you
can think of α simply as the probability that the confidence interval does not
contain the parameter
• Hence, if we want a 95% confidence interval, 95% = 0.95(100%) = (1 −
0.05)(100%) so α = 0.05. Similarly, if we want a 99% confidence interval
then α = 0.01.
• The reason why the quantile is written as qα/2 in the formula above is that
we usually divide the error equally at both ends of the interval, as seen in the
diagram below
• The area (probability) to the left of the lower endpoint is α/2 and the area
(probability) to the right of the upper endpoint is α/2, so the overall area
(probability) outside the interval is α, which means the area (probability)
inside the interval is 1 − α
• But what interval estimator should we use?
• Let L be the lower endpoint of our confidence interval and let U be the upper
endpoint.
• Our task is to find a formula for L and U using the sampling distribution of
Ȳ such that the probability that µ lies between L and U equals 1 − α, i.e.
Pr (L < µ < U ) = 1 − α
Pr (−z < Z < z) = 1 − α
Let zα/2 be the value of z that satisfies this equation
Clearly this is valid since the normal distribution is symmetrical
Pr (−zα/2 < Z < zα/2) = 1 − α
Pr (−zα/2 < (Ȳ − µ)/(σ/√n) < zα/2) = 1 − α
This follows from the sampling distribution
Pr (−zα/2 σ/√n < Ȳ − µ < zα/2 σ/√n) = 1 − α
Pr (−zα/2 σ/√n − Ȳ < −µ < zα/2 σ/√n − Ȳ ) = 1 − α
Pr (zα/2 σ/√n + Ȳ > µ > −zα/2 σ/√n + Ȳ ) = 1 − α
Here we have multiplied the inequality by −1, so we must change < to >
Pr (Ȳ − zα/2 σ/√n < µ < Ȳ + zα/2 σ/√n) = 1 − α
Here we just rearranged the inequality to express it again in < terms
Hence L = Ȳ − zα/2 σ/√n and U = Ȳ + zα/2 σ/√n
1. Find a 95% confidence interval for the mean lifetime of batteries produced
by this method
U = 153
Ȳ + zα/2 σ/√n = 153
150 + zα/2 (25/√100) = 153
2.5 zα/2 = 3
zα/2 = 1.2
Pr (Z > 1.2) = 1 − Pr (Z < 1.2) = 1 − 0.8849 = 0.1151
Thus α/2 = 0.1151
α = 0.2302
1 − α = 1 − 0.2302 = 0.7698
We can state with about 77% confidence that the mean lifetime is between
147 and 153 hours.
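The reverse calculation, from an interval endpoint back to a confidence level, can also be checked; a sketch in Python for the interval (147, 153):

```python
from math import sqrt
from statistics import NormalDist

ybar, sigma, n, upper = 150, 25, 100, 153
z = (upper - ybar) / (sigma / sqrt(n))     # quantile implied by the endpoint
conf = 1 - 2 * (1 - NormalDist().cdf(z))   # 1 - alpha, with alpha/2 in each tail
print(round(z, 2), round(conf, 4))
```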
15.8 8.9 7.4 10.6 11.5 15.7 16.5 15.0 13.8 10.2
19.0 26.1 25.8 41.4 34.4 32.5 25.3 26.5 28.2 22.1
σȲ = σ/√n = 2.8/√20 = 0.626
2. Determine a 95% confidence interval for µ, the mean fuel efficiency for
this vehicle when travelling 100 km/h
α = 1 − 0.95 = 0.05, σ = 2.8, n = 20
Ȳ = (1/n) Σ Yi = (1/20)(15.8 + 8.9 + 7.4 + · · · + 22.1) = 12.255
Ȳ ± zα/2 σ/√n
12.255 ± z0.025 (2.8/√20)
z0.025 ≈ 1.96
12.255 ± (1.96)(2.8/√20)
L = 11.028, U = 13.482
We can say with 95% confidence that the mean fuel efficiency for this
vehicle when travelling 100 km/h is between 11.028 km/L and 13.482
km/L
3. Is your confidence interval exact or approximate? Why?
The confidence interval is approximate, because we were not given that
the fuel efficiency readings are normally distributed; thus the confidence
interval is based on the Central Limit Theorem
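The fuel-efficiency interval can be reproduced numerically (a sketch; using the exact quantile 1.95996 rather than the rounded 1.96 changes the endpoints only in the third decimal):

```python
from math import sqrt
from statistics import NormalDist

ybar, sigma, n = 12.255, 2.8, 20
z = NormalDist().inv_cdf(0.975)    # z_{0.025}, about 1.96
margin = z * sigma / sqrt(n)
lo, hi = ybar - margin, ybar + margin
print(round(lo, 3), round(hi, 3))
```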
1. Find a 90% confidence interval for the mean breaking strength of all the
fibers in the shipment
2. Find a 98% confidence interval for the mean breaking strength of all the
fibers in the shipment
3. Are these confidence intervals exact or approximate? Explain.
4. What is the confidence level of the interval (27.5, 30.5)?
• (Challenging) Suppose you were told that the 90% confidence interval for the
mean µ based on some known σ is (329.87, 356.46). However, you want a 95%
confidence interval. With only the information provided here determine the
95% confidence interval.
Confidence Interval for the Mean when Standard Deviation is Unknown
• The interval estimator we have used so far is applicable as long as the popu-
lation standard deviation σ is known. Both the exact sampling distribution of
Ȳ and the Central Limit Theorem rest on this assumption
• While (Ȳ − µ)/(σ/√n) follows a standard normal distribution under the usual
assumptions (see Sampling Distributions), (Ȳ − µ)/(S/√n) does not follow a
standard normal distribution
• Provided the sample size n is very large, we can still rely on the same interval
estimator formula, with σ replaced with S, and it will give us a reasonably
good approximation
• A t distribution is a continuous probability distribution whose shape resembles
that of the normal distribution:
• When working by hand, one uses a t distribution table to find the necessary
quantiles
• The t distribution has one parameter, the Greek letter ν, which is referred to
as the ‘degrees of freedom’
• The interval estimator for the mean of a normally distributed population with
unknown variance is as follows:
Ȳ ± tα/2,n−1 S/√n
Confidence Interval for the Mean when Standard Deviation is Unknown:
Example
• Estimates of the earth’s biomass (the total amount of vegetation held by the
earth’s forests) are important in determining the amount of unabsorbed carbon
dioxide that we can expect to remain in the earth’s atmosphere. Suppose that
a sample of 61 one-square-metre plots randomly chosen in North America’s
northern forests produced a mean biomass of 4.2 kg per m2 and a standard
deviation of 1.5 kg per m2 . Give a 95% confidence interval for the average
biomass for North America’s northern forests.
Ȳ ± tα/2,n−1 S/√n
4.2 ± t0.025,61−1 (1.5/√61)
t0.025,60 = 2.000 (from table)
4.2 ± 2.000(1.5/√61)
L = 3.816, U = 4.584
We can say with 95% confidence that the average biomass for North America’s
northern forests is between 3.816 kg per m2 and 4.584 kg per m2 .
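The arithmetic of this interval is easy to check in Python; the standard library has no t distribution, so the table value t0.025,60 = 2.000 is hard-coded below (a library such as scipy.stats could supply it instead):

```python
from math import sqrt

ybar, s, n = 4.2, 1.5, 61
t = 2.000                    # t_{0.025, 60} taken from the t table
margin = t * s / sqrt(n)
lo, hi = ybar - margin, ybar + margin
print(round(lo, 3), round(hi, 3))
```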
• A courier service wants to estimate the average delivery time for its local
deliveries. A random sample of times (in minutes) for 12 deliveries to an
address across town was recorded. These data are shown below. Assuming
the data are normally distributed, give a 98% confidence interval for the mean
delivery time.
• There are other methods that can be used in such cases, such as bootstrapping
or nonparametric methods. We will not discuss these other methods in this
module
• Here are some simulation results that illustrate the validity of methods in dif-
ferent situations (the uniform and exponential distributions are two examples
of non-normal distributions). The desired confidence level in each case is 95%
• These results demonstrate that the true confidence level for these methods is
exactly or close to 95% in most cases
• Recall that this approximation is satisfactory as long as np ≥ 5 and n(1−p) ≥ 5
• This is similar to replacing σ with S in the interval estimator for the mean,
and it thus requires us to use the t distribution instead of the Z distribution:
T = (p̂ − p)/√(p̂(1 − p̂)/n)
• The formula for an approximate (1 − α)100% confidence interval for p can thus
be expressed as p̂ ± tα/2,n−1 √(p̂(1 − p̂)/n)
• Note: some textbooks will advise using this approximation with a normal z
quantile instead of the t quantile; this is not very accurate however!
• Even the t approximation will only be effective for relatively large n. We can
use the same rule of thumb as for the normal approximation: the t confidence
interval can be used if np ≥ 5 and n(1 − p) ≥ 5
• The following table gives the simulated probability that p falls inside a 95%
confidence interval calculated using the t distribution method and the Wilson
method for different values of p and n:
• We can see that, in general, the Wilson method is better. And notice how bad
the t method is when p and n are both small! For instance, if p = 0.01 and
n = 50, our so-called ‘95% confidence interval’ actually contains p with only
39% probability, not 95%!
Simulated Confidence Level for p using Wilson and t methods
based on 10 million simulations
p n Wilson method t method
0.01 10 0.9044 0.0956
0.01 20 0.9832 0.1821
0.01 30 0.9639 0.2603
0.01 40 0.9392 0.3309
0.01 50 0.9107 0.3949
0.1 10 0.9297 0.6497
0.1 20 0.9568 0.8763
0.1 30 0.9742 0.9498
0.1 40 0.9433 0.9145
0.1 50 0.9703 0.879
0.3 10 0.9244 0.9611
0.3 20 0.9753 0.9475
0.3 30 0.9299 0.953
0.3 40 0.9443 0.9301
0.3 50 0.9567 0.9476
0.5 10 0.9785 0.8908
0.5 20 0.9586 0.9586
0.5 30 0.9572 0.9572
0.5 40 0.9615 0.9615
0.5 50 0.9351 0.9351
• Hence we conclude with 98% confidence that p is between 0.6974 and 0.9075,
i.e. between 70% and 91% of CPUT students support the new transport
service.
• Using the Wilson score interval approach:
[1/(1 + z²α/2/n)] [ p̂ + z²α/2/(2n) ± zα/2 √( p̂(1 − p̂)/n + z²α/2/(4n²) ) ]
= (0.6819, 0.8850)
• Hence we conclude with 98% confidence that p is between 0.6819 and 0.8850,
i.e. between 68% and 89% of CPUT students support the new transport
service.
Confidence Interval for a Proportion: Example 2
• A toxicologist wants to estimate the proportion of rats that develop a certain
disease in a laboratory after exposure to a certain drug. A random sample
of 150 rats is exposed to the drug, and 42 of these later test positive for the
disease. Using the Wilson score interval method, give a 95% confidence interval
for the proportion of rats exposed to the drug that develop the disease.
[1/(1 + z²α/2/n)] [ p̂ + z²α/2/(2n) ± zα/2 √( p̂(1 − p̂)/n + z²α/2/(4n²) ) ]
= (0.2143, 0.3567)
• Hence we conclude with 95% confidence that p is between 0.2143 and 0.3567,
i.e. between 21% and 36% of rats exposed to the drug develop the disease
Confidence Interval for a Proportion: Exercises
1. A pizza chain tests out a new marketing strategy by sending out a promotional
offer by SMS to a random sample of 25 past customers in their database. Six
customers take advantage of the promotional offer. Give a 90% confidence
interval for the proportion of all customers in the database who would take
advantage of this promotional offer. Use the t distribution method.
2. An ecologist estimates by studying a random sample of 15 ecosystems in the
Western Cape that 60% of ecosystems are under environmental threat. Us-
ing the Wilson score interval method, give a 95% confidence interval for the
proportion of ecosystems in the Western Cape that are under environmental
threat.
3.3 Confidence Intervals for Comparing Two Parameters
Confidence Interval for Difference in Means of Two Normal Populations
with Known Variances
• A point estimator for the difference would be simply X̄ − Ȳ , but what about
an interval estimator?
• Recall:
µX̄−Ȳ = E(X̄ − Ȳ ) = µ1 − µ2
σ²X̄−Ȳ = Var(X̄ − Ȳ ) = σ1²/n1 + σ2²/n2
Z = [(X̄ − Ȳ ) − (µ1 − µ2 )] / √(σ1²/n1 + σ2²/n2)
• The derivation of the confidence interval is almost the same as for a single
mean:
Pr (−z < Z < z) = 1 − α
Let zα/2 be the value of z that satisfies this equation
Clearly this is valid since the normal distribution is symmetrical
Pr (−zα/2 < Z < zα/2) = 1 − α
Pr (−zα/2 < [(X̄ − Ȳ ) − (µ1 − µ2 )] / √(σ1²/n1 + σ2²/n2) < zα/2) = 1 − α
This follows from the sampling distribution
Pr (−zα/2 √(σ1²/n1 + σ2²/n2) < (X̄ − Ȳ ) − (µ1 − µ2 ) < zα/2 √(σ1²/n1 + σ2²/n2)) = 1 − α
Pr (−zα/2 √(σ1²/n1 + σ2²/n2) − (X̄ − Ȳ ) < −(µ1 − µ2 ) < zα/2 √(σ1²/n1 + σ2²/n2) − (X̄ − Ȳ )) = 1 − α
Pr (zα/2 √(σ1²/n1 + σ2²/n2) + (X̄ − Ȳ ) > µ1 − µ2 > −zα/2 √(σ1²/n1 + σ2²/n2) + (X̄ − Ȳ )) = 1 − α
Here we have multiplied the inequality by −1, so we must change < to >
Pr ((X̄ − Ȳ ) − zα/2 √(σ1²/n1 + σ2²/n2) < µ1 − µ2 < (X̄ − Ȳ ) + zα/2 √(σ1²/n1 + σ2²/n2)) = 1 − α
• Thus our exact (1 − α)100% confidence interval for µ1 − µ2 is as follows (the
full derivation is not shown but the method is the same as in other cases):
X̄ − Ȳ ± tα/2,n1+n2−2 √(Sp² (1/n1 + 1/n2))
• If σ1² ≠ σ2² then we simply use S1² to estimate σ1² and S2² to estimate σ2²
• However, determining the degrees of freedom for our t distribution in this case
is much more complicated and we can only approximate it
• Seasonal ranges (in hectares) for crocodiles were monitored on a lake in Malawi
by biologists. Five crocodiles monitored in the spring showed ranges of 8.0,
12.1, 8.1, 18.2, and 31.7. Four different crocodiles monitored in the summer
showed ranges of 102.0, 81.7, 54.7 and 50.7. Give a 95% confidence interval
for the difference between mean spring and summer ranges. You may assume
that crocodile range is normally distributed and that the variances of crocodile
range in spring and summer are equal.
Let X1 , X2 , X3 , X4 , X5 be the spring crocodile ranges and let Y1 , Y2 , Y3 , Y4 be
the summer crocodile ranges.
X̄ − Ȳ ± tα/2,n1+n2−2 √(Sp² (1/n1 + 1/n2))
X̄ = (1/5)(8.0 + 12.1 + 8.1 + 18.2 + 31.7) = 15.62
Ȳ = (1/4)(102.0 + 81.7 + 54.7 + 50.7) = 72.275
Sp² = [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2)
S1² = (1/(n1 − 1)) Σ (Xi − X̄ )² = 98.057
S2² = (1/(n2 − 1)) Σ (Yi − Ȳ )² = 582.2558
Sp² = [(5 − 1)(98.057) + (4 − 1)(582.2558)] / (5 + 4 − 2) = 305.5708
15.62 − 72.275 ± t0.025,7 √(305.5708 (1/5 + 1/4))
t0.025,7 = 2.365
L = −56.655 − 2.365(11.72633) = −84.388
U = −56.655 + 2.365(11.72633) = −28.922
Thus we can conclude with 95% confidence that the difference in mean spring
and summer ranges is between −84.388 hectares and −28.922 hectares (or,
between 28.922 hectares and 84.388 hectares, if we just want to express the
answer in absolute terms)
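The pooled-variance computation is tedious by hand and easy to verify in Python (t0.025,7 = 2.365 is hard-coded from the t table, since the standard library has no t distribution):

```python
from math import sqrt
from statistics import mean, variance

spring = [8.0, 12.1, 8.1, 18.2, 31.7]
summer = [102.0, 81.7, 54.7, 50.7]
n1, n2 = len(spring), len(summer)

# statistics.variance is the sample variance (divides by n - 1)
sp2 = ((n1 - 1) * variance(spring) + (n2 - 1) * variance(summer)) / (n1 + n2 - 2)
se = sqrt(sp2 * (1 / n1 + 1 / n2))

t = 2.365                # t_{0.025, 7} taken from the t table
diff = mean(spring) - mean(summer)
lo, hi = diff - t * se, diff + t * se
print(round(sp2, 4), round(lo, 3), round(hi, 3))
```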
• Every month a clothing store conducts an inventory and calculates losses from
theft. The store would like to reduce these losses and is considering two
methods. The first is to hire a security guard, and the second is to install
cameras. To help decide which method to choose, the manager hired a security
guard for six months. During the next six-month period, the store installed
cameras but had no security guard. The monthly losses were recorded and
are listed below. Provide a 95% confidence interval for the difference in mean monthly
theft losses under the two methods, and use it to infer whether one method is
better than the other. You may assume that the standard deviation of monthly
theft losses is R800 regardless of which theft prevention method is used. Let
X1 , X2 , . . . , X6 be the theft losses in the months when a security guard was on
duty, and let Y1 , Y2 , . . . , Y6 be the theft losses in the months when there was a
camera.
Monthly losses due to theft
Security guard R3550 R2840 R4010 R3980 R4770 R2540
Cameras R4860 R3030 R2700 R3860 R4110 R4350
X̄ − Ȳ ± zα/2 √(σ1²/n1 + σ2²/n2)
X̄ = (1/6)(3550 + 2840 + · · · + 2540) = 3615
Ȳ = (1/6)(4860 + 3030 + · · · + 4350) = 3818.3333
z0.025 = 1.96
3615 − 3818.3333 ± 1.96 √(800²/6 + 800²/6)
= (−1108.619, 701.9519)
We are 95% confident that the difference in mean monthly theft losses using
the security guard method as compared to the camera method is between
−R1109 and R702. Because this interval includes 0, we cannot be 95% confident
that one method is better than the other: there may be no mean difference.
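Again the arithmetic can be verified in Python (a sketch; z0.025 = 1.96 is taken from the z table, as in the text):

```python
from math import sqrt
from statistics import mean

guard = [3550, 2840, 4010, 3980, 4770, 2540]
cameras = [4860, 3030, 2700, 3860, 4110, 4350]
sigma = 800                                 # assumed common standard deviation

diff = mean(guard) - mean(cameras)
se = sqrt(sigma ** 2 / len(guard) + sigma ** 2 / len(cameras))
z = 1.96                                    # z_{0.025} from the z table
lo, hi = diff - z * se, diff + z * se
print(round(lo, 4), round(hi, 4))
```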
• We can introduce this result now. Suppose we have two populations each
having an infinite number of objects which can be classified as either ‘successes’
or ‘failures’. Let p1 be the proportion of successes in the first population and
let p2 be the proportion of successes in the second population. If we take
a random sample of n1 independent objects from the first population and a
random sample of n2 independent objects from the second population, and
we call the number of successes in the first sample X and the number of
successes in the second sample Y , then our point estimators for p1 and p2 will
respectively be p̂1 = X/n1 and p̂2 = Y /n2 .
• Our point estimator for the difference p1 − p2 will then be p̂1 − p̂2
◦ We could use the central limit theorem to obtain an approximately
standard normal statistic; however, we face the same problem we had in
the confidence interval for a single proportion p: we don’t know the values
of p1 and p2 and thus can’t calculate the variance of the estimator. Thus,
we replace p1 and p2 in the formula with their estimators:
◦ σ̂²p̂1 −p̂2 = Var (p̂1 − p̂2 ) = p̂1 (1 − p̂1 )/n1 + p̂2 (1 − p̂2 )/n2
◦ This results in a t distribution statistic instead of a z distribution statistic:
T = [p̂1 − p̂2 − (p1 − p2 )] / √(p̂1 (1 − p̂1 )/n1 + p̂2 (1 − p̂2 )/n2 )
has approximately a t distribution with n1 + n2 − 2 degrees of freedom,
provided n1 and n2 are fairly large
◦ From this we can derive the following (1 − α)100% confidence interval for
p1 − p2 :
p̂1 − p̂2 ± tα/2,n1 +n2 −2 √(p̂1 (1 − p̂1 )/n1 + p̂2 (1 − p̂2 )/n2 )
Confidence Interval for Difference in Proportions: Example
• A firm has classified its customers in two ways: (1) according to whether the
account is overdue and (2) whether the account is new (less than 12 months)
or old. To acquire information about which customers are paying on time and
which are overdue, a random sample of 292 customer accounts was drawn.
Each was categorized as either a new account or an old one, and whether the
customer has paid or is overdue. The results are summarized below. Let p1 be
the proportion of new accounts that are overdue and let p2 be the proportion
of old accounts that are overdue. Provide a 90% confidence interval for p1 −p2 .
p̂1 − p̂2 ± tα/2,n1 +n2 −2 √(p̂1 (1 − p̂1 )/n1 + p̂2 (1 − p̂2 )/n2 )

p̂1 = 12/83
p̂2 = 49/209
t0.05,83+209−2 = t0.05,290 ≈ t0.05,100 = 1.660

12/83 − 49/209 ± 1.660 √[(12/83)(1 − 12/83)/83 + (49/209)(1 − 49/209)/209]
= (−0.1703, −0.0094)
Thus we can be 90% confident that the proportion of new accounts that are
overdue is less than the proportion of old accounts that are overdue by between
0.0094 and 0.1703.
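• A quick Python sketch reproducing this interval (the critical value 1.660 is taken from the t table as above; names are illustrative):

```python
from math import sqrt

# Overdue-accounts example: 12 of 83 new accounts and 49 of 209 old accounts overdue
p1_hat, n1 = 12 / 83, 83
p2_hat, n2 = 49 / 209, 209
t_crit = 1.660   # t_{0.05,100} from the table, approximating 290 degrees of freedom

se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
lower = p1_hat - p2_hat - t_crit * se
upper = p1_hat - p2_hat + t_crit * se
print(round(lower, 4), round(upper, 4))   # approximately -0.1703 -0.0094
```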
2. Several specimens of coal were sampled from each of two mines, and the heat
capacity (in kilocalories per kg) was measured for each specimen. The results
are below. Obtain a 99% confidence interval for the difference in mean heat
capacity between coal from mine 1 and coal from mine 2. It is assumed that
the variance of heat capacity of coal is the same in the two mines.
3. Six months ago, a survey was undertaken to determine the degree of support
for a national political leader. Of a sample of 1100 people, 56% indicated
that they support this politician. This month, another survey of 800 people
estimated that 46% now support the leader. Estimate with 95% confidence
the decrease in percentage support for this politician over the past six months.
Interval Estimators for Population Variance
• In this module, we have not discussed the sampling distribution of the sample
variance S 2 , nor have we discussed confidence intervals for the population
variance σ 2
• Two major tasks in statistics are estimation and inference. In the previous
chapter we focused on estimation, in which we were making quantitative state-
ments about parameters (either using a point estimate or an interval estimate)
Hypothesis Testing
• The method we will be using for statistical inference is called hypothesis testing
• There are also nonparametric hypotheses, but we will see these mostly in
second year
◦ There are two claims (or hypotheses) being made.
◦ The prosecution claims that the defendant is guilty.
◦ The defense claims that the defendant is not guilty.
◦ Both sides present and discuss evidence that they believe supports their
claim.
◦ The judge or jury then weighs the evidence and makes a decision.
◦ In most modern justice systems, there is a presumption of ‘Innocent until
proven guilty’. This means it is presumed that the defendant is innocent
unless the prosecution provides sufficient evidence to prove otherwise.
• Before we can understand how hypothesis testing works there is some termi-
nology we need to know, which we can relate back to the example of a criminal
trial.
1. Null Hypothesis, H0 . This is the default claim that we will presume to be true
unless the evidence (data) proves otherwise (so ‘Innocent’ is the null hypothesis
in a criminal trial)
2. Alternative Hypothesis, HA . This is the claim that we will conclude to be
true if there is sufficient evidence against the null hypothesis (so ‘Guilty’ is
the alternative hypothesis in a criminal trial)
3. Test Statistic. Based on our data we will need to come up with a test statistic
which gives us a means of testing whether the data is consistent with the null
hypothesis being true.
4. Rejection Rule (sometimes called Rejection Region). The rejection rule tells
us the set of values of the test statistic which would cause us to reject the null
hypothesis.
5. Decision. By calculating the observed value of the test statistic and applying
the rejection rule we will reach a decision. The decision will either be ‘Reject
the null hypothesis’ or ‘Fail to reject the null hypothesis’. If we reject the null
hypothesis, we will conclude that the alternative hypothesis is true. If we fail
to reject the null hypothesis, this does NOT mean we have proven the null
hypothesis is true. Rather, it means there is not enough evidence to disprove
it. Suppose in a criminal trial that there is not enough evidence to prove the
defendant’s guilt. This does not actually prove that the defendant is innocent;
it only means the defendant cannot be found guilty based on the evidence
presented. For instance, if enough evidence was presented to show that there
is a 55% chance that the defendant is guilty, he or she would be found ‘Not
Guilty’.
Type I and Type II Error in a Hypothesis Test
• There are two types of error that can occur in a hypothesis test, as shown in
the following table
                          Decision of Test
                     Reject H0         Fail to reject H0
Reality  H0 True     Type I Error      Correct
         H0 False    Correct           Type II Error
• A Type I Error occurs when we reject a null hypothesis that is really true
• A Type II Error occurs when we fail to reject a null hypothesis that is really
false
• The probabilities of these two errors, α = Pr(Type I Error) and β = Pr(Type II
Error), are inversely related: if we reduce one, the other increases
• Consider what would happen if a judge simply found every defendant innocent
regardless of the evidence
• Alternatively, consider what would happen if a judge simply found every de-
fendant guilty regardless of the evidence
• Which of these two errors is more serious?
• It depends on the context, but often a Type I Error is considered more serious.
Consider the criminal trial, for example. Is it a more serious mistake to let a
guilty person go free (Type II Error), or to send an innocent person to prison
(Type I Error)? The justice system in most countries today is built on the
principle that it is a more serious mistake to send an innocent person to prison
• To fix α means we choose the largest value of α that we are willing to tolerate
• This value of α that we choose is called the significance level of the test
• If a coin is fair, then the probability of heads and the probability of tails will
be equal: p = 0.5
• If the coin is not fair, then the probability of heads and the probability of tails
will not be equal: p ≠ 0.5
• Hence, if we want to potentially prove that a coin is unfair, we can test the
null hypothesis p = 0.5 against the alternative hypothesis p ≠ 0.5
1. State the null and alternative hypotheses. This is best done using math-
ematical notation rather than just words. Hence, instead of saying ‘The
coin is fair’ as our null hypothesis, we say p = 0.5
2. State the significance level, α. The most commonly used values of α are
0.05 and 0.01.
3. State the test statistic that will be used in the hypothesis test. A test
statistic must have the following characteristics:
◦ It must be a statistic we can calculate from the data available
◦ We must know the null distribution of the test statistic. The null
distribution is the sampling distribution of the test statistic under
the null hypothesis, i.e. assuming that the null hypothesis is true
4. Use the null distribution to determine the rejection rule. The rejection
rule is determined using the null distribution together with α. There are
two possible approaches to determining the rejection rule: the critical
value approach and the p-value approach. They will both result in
the same answer. This will be explained further below.
• The rejection rule will change depending on which alternative hypothesis is
used. For now let us use the alternative hypothesis HA : µ ≠ µ0 ; we will come
back to the other cases
• We know that our best point estimator for µ is Ȳ (it is unbiased and consistent)
• The all-important question is, how much less or how much greater than µ0
should Ȳ be before we reject the claim that µ = µ0 ?
• To see this, consider the following diagram and ask yourself, ‘In which situation
would I reject the null hypothesis H0 : µ = µ0 ?’
• It seems obvious that we should reject the null hypothesis in the situation at
the top, because the estimate Ȳ is very far from the null value µ0 ; by contrast,
we should not reject the null hypothesis in the situation at the bottom, because
the estimate Ȳ is close to the null value µ0
• The two situations in between are perhaps more ambiguous; how does one
decide how far away from µ0 the estimate Ȳ must be before it is far enough to
reject H0 ? This already provides us with a reason for a statistical hypothesis
test
• But there is another problem: we have not taken into account the sampling
distribution of Ȳ
• Suppose that for these four situations the sampling distributions of Ȳ are as
illustrated below. How would this affect our view of H0 ?
◦ But since we have assumed the population is normally distributed with
standard deviation σ, the only way the sampling distribution could be
incorrect is if E(Ȳ ) ≠ µ0 , which means µ ≠ µ0 . The null hypothesis
must be rejected!
• By similar reasoning, we cannot reject H0 in the case of the top graph, even
though Ȳ is much further from µ0 . Why not?
◦ The standard deviation of the sampling distribution is very large in this
case, so the distribution is wide. It is quite possible that we could have
gotten a sample mean value as large as what we did get, if this is really
the sampling distribution of Ȳ
◦ Thus it is reasonable to think µ0 could actually be the mean of the sam-
pling distribution
◦ Hence we cannot reject H0 ; it may be true
• What conclusions would you draw regarding the two graphs in the middle?
• We should now be able to understand why a common form of the test statistic
is
(θ̂ − θ0 )/σθ̂
• Here, θ̂ is the point estimator, θ0 is the null value of the parameter (the value
if the null hypothesis is true), and σθ̂ is the standard deviation of the point
estimator
• The numerator of this fraction tells us how far apart the estimator and the
null value are in absolute terms (as in the first picture above)
• The denominator of this fraction expresses this distance in units of standard
deviations of the estimator, giving us an idea of how far apart the estimator
and the null value are in probability terms (as in the second picture above with
the distribution curves drawn)
Hypothesis Test for the Population Mean when Standard Deviation is
Known
• Let us use this logic to construct a hypothesis test in the situation we have
just described
• Our assumptions are that the population is normally distributed, Y1 , Y2 , . . . , Yn
is a sample of i.i.d. random variables from the population, and the population
standard deviation σ is known
• Our hypotheses are as follows:
H0 : µ = µ0
HA : µ ≠ µ0
• Let α be our significance level (the probability of a type I error that we are
willing to allow)
• Consider the test statistic Z = (Ȳ − µ0 )/(σ/√n)
• What can we say about this test statistic?
• Under H0 (that is, if H0 is true), Ȳ follows a normal distribution with mean
µ0 and standard deviation σ/√n; thus, under H0 , Z follows a standard normal
distribution:
• Suppose we calculate Ȳ and then Z from our data and we get Z = 3
• We can work out the probability of getting such an extreme Z value if H0 were
true:
Pr (Z > 3) + Pr (Z < −3) = [1 − Pr (Z < 3)] + [1 − Pr (Z < 3)]
= 2 [1 − Pr (Z < 3)]
= 2 [1 − 0.9987] = 0.0026
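• This two-tailed probability can be checked with the Python standard library (the tiny difference from 0.0026 is just Z-table rounding):

```python
from statistics import NormalDist

z_observed = 3
# Two-tailed probability of a value at least this extreme under H0
p_value = 2 * (1 - NormalDist().cdf(z_observed))
print(round(p_value, 4))   # about 0.0027 (the table value 0.9987 gives 0.0026)
```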
Making a Decision about a Null Hypothesis using the Significance Level
• There are two approaches to making this decision: the critical value approach
and the p-value approach
• The critical value approach can usually be carried out by hand (using a
statistical table), whereas the p-value approach requires a computer in most
cases
• However, most statistical software packages (e.g. SAS) use only the p-value
approach, so if you want to understand the output of these software packages
you must understand this approach
The Critical Value Approach
• The critical value is defined as the value of the test statistic for which the
probability of getting a more extreme value would be α
• In the case of our two-tailed test of H0 : µ = µ0 vs. HA : µ ≠ µ0 , we are asking
for what value of z the following expression is true:
Pr (|Z| > z) = α
In other words: Pr (Z < −z) + Pr (Z > z) = α
[1 − Pr (Z < z)] + [1 − Pr (Z < z)] = α
2 [1 − Pr (Z < z)] = α
1 − Pr (Z < z) = α/2
• The reason why we use |Z| rather than Z is that we would want to reject
H0 if Z is large and positive (Ȳ is much greater than µ0 ) or if Z is large and
negative (Ȳ is much less than µ0 ).
• Because the normal distribution is symmetrical, we divide α into two equal
pieces of α/2 (as in the graph above), just like we did with confidence intervals
• We call the z value that satisfies this probability statement zα/2 which is the
critical value for this hypothesis test
• If |Z| > zα/2 then we will reject H0 , otherwise we will fail to reject H0
• That is to say, we use the critical value to draw a rejection region of size
α, so that any Z statistic value more extreme than the critical value zα/2 will
lead us to reject the null hypothesis
The p-Value Approach
• The p-value approach describes the rejection rule in terms of probabilities
rather than in terms of values of the test statistic (e.g. z values)
• The p-value of a hypothesis test can be defined as ‘The probability of observing
a value of the test statistic as extreme or more extreme than what was actually
observed, given that the null hypothesis is true’
• In the case of the test for the mean that we are working with, the p-value
would be expressed as follows:
Pr (|Z| > |Zobserved | | H0 is true)
The Seven Steps of a Hypothesis Test
3. State the test statistic and its null distribution. For example, ‘Under H0 ,
Z = (Ȳ − µ0 )/(σ/√n) has a standard normal distribution’
4. State the rejection rule. For example, ‘Reject H0 if...’
7. State your conclusion in practical terms using words, not mathematical nota-
tion or jargon.
Time spent doing homework (hours) by a sample of 12 students in the course:
31 40 26 30 36 38
29 40 38 30 35 38
2. Repeat steps 4, 5 and 6 of the hypothesis test procedure, this time using
the p-value approach.
1. Hypotheses:
H0 : µ = 36
HA : µ ≠ 36
2. Significance Level: α = 0.05
3. Test Statistic: Under H0 , Z = (Ȳ − µ0 )/(σ/√n) has a standard normal distribution
4. Rejection Rule: We will reject H0 if |Zobserved | > zα/2 = z0.025 = 1.96 (this
value we have calculated before in Chapter 4)
5. Calculation:
Ȳ = (1/12)(31 + 40 + 26 + · · · + 38) = 34.25
Zobserved = (Ȳ − µ0 )/(σ/√n) = (34.25 − 36)/(8/√12) = −0.758
6. Decision: |Zobserved | = 0.758 < 1.96 therefore we fail to reject H0 at the
0.05 significance level
7. We conclude there is no evidence that the average amount of time spent
doing homework by students in this course is different from 36 hours. (In
other words, it is reasonable to assume that the average amount of time
spent doing homework by students in this course is 36 hours)
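• The calculation in steps 5 and 6 can be sketched in Python (standard library only; data and values from the example above):

```python
from statistics import NormalDist, mean

hours = [31, 40, 26, 30, 36, 38, 29, 40, 38, 30, 35, 38]
mu0, sigma, alpha = 36, 8, 0.05      # hypothesised mean and known std. dev.

z_obs = (mean(hours) - mu0) / (sigma / len(hours) ** 0.5)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)         # about 1.96
print(round(z_obs, 3), abs(z_obs) > z_crit)   # -0.758 False -> fail to reject H0
```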
Hypothesis Test for the Population Mean when Standard Deviation is
Known: Example 2
1. Hypotheses:
H0 : µ = 12
HA : µ ≠ 12

Pr (Z > z) = 0.01
1 − Pr (Z < z) = 0.01
Pr (Z < z) = 0.99
z = 2.325 (between 2.32 and 2.33)
Using a computer we can be more exact: 2.326
5. Calculation:
Zobserved = (Ȳ − µ0 )/(σ/√n) = (11.95 − 12)/(0.19/√100) = −2.632
5. Calculation: We have already calculated that Zobserved = −2.632; now we
need to determine the p-value
p-value = Pr (|Z| > |Zobserved | | H0 is true)
= Pr (Z > 2.632 | H0 is true) + Pr (Z < −2.632 | H0 is true)
Recall: if H0 is true, Z has a standard normal distribution
= Pr (Z > 2.63) + Pr (Z < −2.63) (to be calculated from the Z table)
= 2 [1 − Pr (Z < 2.63)]
= 2(1 − 0.9957)
= 0.0086
6. Decision: p-value = 0.0086 < 0.02, thus we reject H0 at the 0.02
significance level
7. Conclusion: same as before
1. Hypotheses:
H0 : µ = 128
HA : µ > 128
Note how the hypotheses have changed: we are only interested in whether
we can prove that the average blood pressure of this company’s young
executives exceeds the national average. Hence we ignore the possibility
that the average blood pressure is less than the national average. Accordingly,
our alternative hypothesis uses a > sign rather than a ≠ sign. This
is called a ‘right-tailed test’. The following graph shows the distribution
of the test statistic in this case, along with the rejection region:
4. Rejection Rule: We will reject H0 if Zobserved > zα = z0.05 = 1.645 (note
we no longer have absolute value since we are only interested in positive
values of Z)
Pr (Z > z) = 0.05
1 − Pr (Z < z) = 0.05
Pr (Z < z) = 0.95
z = 1.645 ( between 1.64 and 1.65)
5. Calculation:
Zobserved = (Ȳ − µ0 )/(σ/√n) = (129.93 − 128)/(15/√72) = 1.09
6. Decision: Zobserved = 1.09 < 1.645, therefore we fail to reject H0 at the
0.05 significance level
• Note: if using the p-value approach, we would find Pr (Z > Zobserved ) = Pr (Z > 1.09)
which is 0.138. Because the p-value > α we fail to reject H0 .
• A salesperson at a call centre who is selling insurance policies must sell at least
60 policies per year in order to be profitable to the company. Tony has been
working at the company for ten years and the number of policies he has sold
each year are given below. If we treat his past ten years as a random sample
of all the years he might do this job, do we have evidence at the 0.01 level that
Tony is an underperforming salesperson (meaning that on average he fails to
sell 60 policies per year)? Assume it is known that the standard deviation of
the number of policies sold annually by any salesperson is 8.
54 67 47 43 61
50 55 41 56 64

1. Hypotheses:
H0 : µ = 60
HA : µ < 60
This time, because we are specifically interested in whether Tony is
underperforming, we have a ‘left-tailed test’. Our alternative hypothesis
uses a < sign. The following graph shows the distribution of the test
statistic in this case, along with the rejection region:
2. Significance Level: α = 0.01
3. Test Statistic: Under H0 , Z = (Ȳ − µ0 )/(σ/√n) has a standard normal distribution
4. Rejection Rule: We will reject H0 if Zobserved < −zα = −z0.01 = −2.326
(note we no longer have absolute value since we are only interested in
negative values of Z)
Pr (Z > z) = 0.01
1 − Pr (Z < z) = 0.01
Pr (Z < z) = 0.99
z = 2.325 ( between 2.32 and 2.33)
Using a computer we can be more exact: 2.326
5. Calculation:
Ȳ = (1/10)(54 + 67 + 47 + 43 + 61 + 50 + 55 + 41 + 56 + 64) = 53.8
Zobserved = (Ȳ − µ0 )/(σ/√n) = (53.8 − 60)/(8/√10) = −2.450
6. Decision: Zobserved = −2.450 < −2.326 therefore we reject H0 at the 0.01
significance level
7. We conclude there is evidence that Tony is underperforming since his
annual average number of policies sold is less than 60.
• Note: if using the p-value approach, we would find Pr (Z < Zobserved ) = Pr (Z < −2.45)
which is 0.007. Because the p-value < α we reject H0 .
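• A Python sketch of the calculation and p-value for this left-tailed test (standard library only):

```python
from statistics import NormalDist, mean

policies = [54, 67, 47, 43, 61, 50, 55, 41, 56, 64]   # Tony's annual sales
mu0, sigma = 60, 8                                    # null value and known std. dev.

z_obs = (mean(policies) - mu0) / (sigma / len(policies) ** 0.5)
p_value = NormalDist().cdf(z_obs)     # left-tailed: Pr(Z < z_obs)
print(round(z_obs, 3), round(p_value, 3))   # about -2.451 and 0.007
```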
• The power function is useful for deciding the sample size, i.e. how much data
we need, before we collect our data
• Determine the power of this hypothesis test if the sample size is n = 10 and
the actual population mean is µ = 5
1 − β = Pr (Reject H0 | µ = 5)
= Pr (Z > zα | µ = 5)
= Pr ((Ȳ − µ0 )/(σ/√n) > zα | µ = 5)
Under H0 , this statistic has a standard normal distribution
But under HA , it does not!
Under HA , Ȳ has a normal distribution with mean µ = 5 and s.d. σ/√n
= Pr ([Ȳ − µA − (µ0 − µA )]/(σ/√n) > zα | µ = 5)
= Pr ((Ȳ − µA )/(σ/√n) − (µ0 − µA )/(σ/√n) > zα | µ = 5)
= Pr ((Ȳ − µA )/(σ/√n) > zα + (µ0 − µA )/(σ/√n) | µ = 5)
= Pr (Z > zα + (µ0 − µA )/(σ/√n))
(since we now have a correct sampling distribution on the left)
= Pr (Z > z0.05 + (4 − 5)/(1/√10))
= Pr (Z > 1.645 + (4 − 5)/(1/√10))
= Pr (Z > 1.645 − 3.162) = Pr (Z > −1.517)
= 0.935
• This means that with a sample size of 10 we have a 93.5% chance of rejecting
H0 and demonstrating that the mean (which is actually 5) is greater than 4,
at 0.05 significance level
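• The power calculation can be verified with a short Python sketch (standard library only; µ0 , µA , σ and n as in the example):

```python
from statistics import NormalDist

mu0, muA = 4, 5          # null value and assumed true mean
sigma, n, alpha = 1, 10, 0.05

z_alpha = NormalDist().inv_cdf(1 - alpha)     # about 1.645
shift = (mu0 - muA) / (sigma / n ** 0.5)      # about -3.162
power = 1 - NormalDist().cdf(z_alpha + shift) # Pr(Z > z_alpha + shift)
print(round(power, 3))   # about 0.935
```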
• The graph below shows the power function of this test with α = 0.05 for
different values of µA and n
• From the graph it is clear that as sample size increases, power increases
• It is also clear that as the true population mean µA moves further away from
the null hypothesis value µ0 (which is 4 in this case), power increases
5 Hypothesis Tests about One Population
Review
• We now look at two other kinds of hypothesis test concerning a single popu-
lation: a hypothesis test for the population mean when standard deviation is
unknown, and a hypothesis test for the population proportion
• It is often unrealistic to assume the value of σ, the population standard devi-
ation, is known.
• Just as we did in the previous chapter when producing a confidence interval for
the mean in such a case, we estimate the standard deviation using the sample.
• But this means our test statistic will no longer be normally distributed but t
distributed
• Our test statistic changes from Z = (Ȳ − µ)/(σ/√n) to t = (Ȳ − µ)/(S/√n),
which has a t distribution with n − 1 degrees of freedom
• We can get t distribution critical values from our t distribution table but we
cannot get t distribution p-values from the table. These cannot be done by
hand; we need a computer to get them
• The following are the monthly rand amounts for landline telephone service
for a random sample of eight households in your community: 550, 510, 470,
620, 480, 560, 600, 570. The telephone company claims in an advertisement
that customers in this community are paying an average of R490 per month for
their landline telephone service. Conduct a hypothesis test at 0.05 significance
level to determine whether this claim is accurate.
1. Hypotheses:
H0 : µ = 490
HA : µ ≠ 490
Because we did not specify whether we want to show the mean monthly
cost is less than or greater than R490, we use a two-tailed test.
2. Significance Level: α = 0.05
3. Test Statistic: Under H0 , t = (Ȳ − µ0 )/(S/√n) has a t distribution with
n − 1 = 7 degrees of freedom
4. Rejection Rule: We will reject H0 if |tobserved | > tα/2,n−1 = t0.025,7 = 2.365
5. Calculation:
Ȳ = (1/8)(550 + 510 + 470 + 620 + 480 + 560 + 600 + 570) = 545
S = √[(1/(n − 1)) Σ (Yi − Ȳ )²] = √[(1/7)(20600)] = 54.2481
tobserved = (Ȳ − µ0 )/(S/√n) = (545 − 490)/(54.2481/√8) = 2.868
6. Decision: |tobserved | = 2.868 > 2.365, therefore we reject H0 at the 0.05
significance level
7. We conclude there is evidence that the mean monthly cost of landline
telephone service in this community is not R490 as the company claims.
• With reference to the previous example, suppose that we were interested specif-
ically in whether the average cost for landline telephone service in this com-
munity is greater than the R490 that the company claims it is. How would
the hypothesis test change in this case?
1. Hypotheses:
H0 : µ = 490
HA : µ > 490
4. Rejection Rule: We will reject H0 if tobserved > tα,n−1 = t0.05,7 = 1.895
5. Calculation: tobserved = 2.868 (As before)
6. Decision: tobserved = 2.868 > 1.895 therefore we reject H0 at the 0.05
significance level
7. We conclude there is evidence that the mean monthly cost of landline
telephone services in this community is greater than what the company
claims it is.
• Note: if using the p-value approach, we would find Pr (t > tobserved ) = Pr (t > 2.868).
It is not possible to find this value by hand or from the t distribution table.
But using a computer we can determine it to be 0.012. Since it is less than
0.05 this confirms that we were correct to reject the null hypothesis.
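• The calculation step can be sketched in Python (standard library only); the resulting statistic can be compared with the table critical values used above (2.365 two-tailed, 1.895 one-tailed):

```python
from statistics import mean, stdev

bills = [550, 510, 470, 620, 480, 560, 600, 570]   # monthly rand amounts
mu0 = 490                                          # company's claimed mean
n = len(bills)

s = stdev(bills)                                   # sample std. dev., about 54.25
t_obs = (mean(bills) - mu0) / (s / n ** 0.5)
print(round(t_obs, 3))   # about 2.868
```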
• Now our hypotheses are about p rather than µ since the unknown parameter
about which we are making inferences is the proportion, not the mean
• So our null hypothesis will in general take the form p = p0 , while the alternative
hypothesis will be p ≠ p0 , p < p0 or p > p0 (depending on whether it is a
two-tailed, left-tailed or right-tailed test)
• We use our usual seven-step hypothesis testing procedure, but the test statistic
will change to Z = (p̂ − p0 )/√(p0 (1 − p0 )/n), which under the null hypothesis
has a standard normal distribution (approximately) if n is sufficiently large
• Notice that unlike confidence intervals for the proportion, we do not need to
introduce the t distribution because our formula has the exact standard
deviation of p̂ rather than an estimator. √(p0 (1 − p0 )/n) is the exact standard
deviation of p̂ if the null hypothesis is true (which means the population
parameter p has a value of p0 ).
• A coin is to be used to determine which team will defend which goal in the
first half of a soccer game. However, the captain of one team claims that the
coin is not fair (i.e. the probability of heads is not 50%). The referee does an
experiment by flipping the coin 60 times, and gets heads 37 times and tails
23 times. Is the captain justified in claiming that the coin is unfair? Use a
hypothesis test at the 5% significance level to answer this question.
1. Hypotheses:
H0 : p = 0.5
HA : p ≠ 0.5

p̂ = 37/60 = 0.6167
Zobserved = (p̂ − p0 )/√(p0 (1 − p0 )/n)
= (0.6167 − 0.5)/√(0.5(1 − 0.5)/60)
= 1.808
6. Decision: |Zobserved | = 1.808 < z0.025 = 1.96, therefore we fail to reject H0
at the 0.05 significance level
7. We conclude there is insufficient evidence to claim that the coin is unfair.
• Using the p-value approach:
4. Rejection Rule: We will reject H0 if p-value < 0.05
5. Calculation: We have already calculated that Zobserved = 1.808; now we
need to determine the p-value
p-value = Pr (|Z| > |Zobserved | | H0 is true)
= Pr (Z > 1.808 | H0 is true) + Pr (Z < −1.808 | H0 is true)
Recall: if H0 is true, Z has a standard normal distribution
= Pr (Z > 1.81) + Pr (Z < −1.81) to be calculated from Z table
= 2 [1 − Pr (Z < 1.81)]
= 2(1 − 0.9649)
= 0.0702
6. Decision: p-value= 0.0702 > 0.05 thus we fail to reject H0 at the 0.05
significance level
7. Conclusion: same as before
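• Both the test statistic and the two-tailed p-value can be reproduced in Python (standard library only; small differences from the table values are rounding):

```python
from math import sqrt
from statistics import NormalDist

n, heads = 60, 37
p0 = 0.5
p_hat = heads / n

z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)     # exact null std. dev. of p_hat
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))   # two-tailed p-value
print(round(z_obs, 3), round(p_value, 3))   # about 1.808 and 0.071
```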
• Dogs are big and expensive. Rats are small and cheap. Can rats be trained to
replace dogs in sniffing out illegal drugs? One measure of the performance of
a drug-sniffing animal is the number of times in 80 trials that it can correctly
distinguish a cup with cocaine residue in it from within a group of other cups.
Suppose it is known that dogs are in general successful 98% of the time in
this test. A rat undergoes the 80 trials and successfully sniffs out the cup
with cocaine residue 73 times. The scientist conducting the experiment claims
that the rat is as good as a drug sniffing dog (i.e. it has the same probability
of success). Can you prove that the rat is less effective than dogs? Use a
hypothesis test with 0.02 significance level.
• A pastor claims that his sermons are less than 30 minutes long three-quarters
of the time. Unknown to him, a member of the church congregation times
his sermons at a random sample of 40 Sunday services. Of these 40 sermons,
26 are less than 30 minutes long. Is the pastor’s claim regarding his sermon
length reasonable? Use a hypothesis test with 0.05 significance level.
Population   Pop. Mean   Pop. St. Dev.   Sample                  Sample Mean   Sample St. Dev.
1            µ1          σ1              X1 , X2 , . . . , Xn1   X̄             S1
2            µ2          σ2              Y1 , Y2 , . . . , Yn2   Ȳ             S2
• To determine the effect of fuel grade on fuel efficiency, 80 new cars of the same
make, with identical engines, were each driven for 1000 km. Forty of the cars
ran on regular fuel and the other 40 received premium grade fuel. The cars
with the regular fuel averaged 11.6 km/L and the cars with the premium fuel
averaged 11.9 km/L. It is known that the population standard deviation of
fuel efficiency is 0.5 for cars running on regular fuel and 0.9 for cars running
on premium fuel. Is there a difference in fuel efficiency between the two grades
of fuel? Use 1% significance level.
1. Hypotheses:
H0 : µ1 = µ2
HA : µ1 ≠ µ2
2. Significance Level: α = 0.01
3. Test Statistic: Under H0 , Z = (X̄ − Ȳ − 0)/√(σ1²/n1 + σ2²/n2 ) has a
standard normal distribution
4. Rejection Rule: We will reject H0 if |Zobserved | > zα/2 = z0.005 = 2.575
5. Calculation:
Zobserved = (11.6 − 11.9 − 0)/√(0.5²/40 + 0.9²/40) = −1.843
6. Decision: |Zobserved | = 1.843 < 2.575, therefore we fail to reject H0 at the
0.01 significance level (equivalently, the p-value is 0.0658 > 0.01)
7. We conclude there is not sufficient evidence of a difference in fuel efficiency
between the two grades of fuel.
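• A Python sketch of this two-sample z test (standard library only; the exact p-value differs slightly from the table value 0.0658 because of rounding):

```python
from math import sqrt
from statistics import NormalDist

xbar, ybar = 11.6, 11.9    # sample mean km/L for regular and premium fuel
s1, s2 = 0.5, 0.9          # known population standard deviations
n1 = n2 = 40

z_obs = (xbar - ybar) / sqrt(s1**2 / n1 + s2**2 / n2)
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))   # two-tailed p-value
print(round(z_obs, 3), round(p_value, 4))   # about -1.843 and 0.0653
```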
• Hence we assume that σ1² = σ2² = σ² although we don’t know the value of σ²
• We introduced a pooled estimator for the population variance:
Sp² = [(n1 − 1)S1² + (n2 − 1)S2²]/(n1 + n2 − 2)
• We can invoke the same estimator to derive a test statistic for testing the null
hypothesis µ1 − µ2 = ∆0 against some alternative:
t = (X̄ − Ȳ − ∆0 )/(Sp √(1/n1 + 1/n2 ))
1. Hypotheses:
H0 : µ1 = µ2
HA : µ1 > µ2
3. Test Statistic: Under H0 , t = (X̄ − Ȳ − 0)/(Sp √(1/n1 + 1/n2 )) has a t
distribution with n1 + n2 − 2 degrees of freedom
4. Rejection Rule: We will reject H0 if tobserved > tα,n1 +n2 −2 = t0.05,68+33−2 =
t0.05,99 ≈ 1.660
5. Calculation:
X̄ = (1/68)(1.50 + 0.10 + 1.76 + · · · + 1.96 + 1.81) = 1.7256
Ȳ = (1/33)(0.82 + 0.89 + 1.31 + · · · + 0.89 + 0.09) = 0.8236
S1² = (1/(68 − 1)) Σ (Xi − X̄)² = 0.4087
S2² = (1/(33 − 1)) Σ (Yi − Ȳ )² = 0.2314
Sp = √{[(68 − 1)(0.4087) + (33 − 1)(0.2314)]/(68 + 33 − 2)} = √0.3514 = 0.5928
tobserved = (1.7256 − 0.8236 − 0)/(0.5928 √(1/68 + 1/33)) = 7.172
• Note that we cannot calculate a p-value by hand in this case because we are
using the t distribution
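• The pooled-variance statistic can be packaged as a small Python function (a sketch with made-up illustrative data, not the example's data, which is only partially shown in the text):

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y, delta0=0.0):
    """Pooled two-sample t statistic for H0: mu1 - mu2 = delta0."""
    n1, n2 = len(x), len(y)
    # Pooled estimate of the common variance
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y) - delta0) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))

# Tiny illustration (hypothetical numbers)
print(round(pooled_t([1, 2, 3], [2, 4, 6]), 3))   # about -1.549
```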
• If σ1² and σ2² are unknown and we cannot assume them to be equal, our test
statistic is
t = (X̄ − Ȳ − ∆0 )/√(S1²/n1 + S2²/n2 )
• Under H0 , this test statistic approximately follows a t distribution with ν
degrees of freedom, where
ν = (S1²/n1 + S2²/n2 )² / [S1⁴/(n1²(n1 − 1)) + S2⁴/(n2²(n2 − 1))]
• This is known as the Welch-Satterthwaite Method and is typically used by
statistical software when conducting a hypothesis test comparing means of
two normally distributed populations with unknown, unequal variances
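• The degrees-of-freedom formula can be written as a small Python function (a sketch; the check values are the sample variances from the plant-height example below):

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    num = (s1_sq / n1 + s2_sq / n2) ** 2
    den = s1_sq**2 / (n1**2 * (n1 - 1)) + s2_sq**2 / (n2**2 * (n2 - 1))
    return num / den

# Sample variances and sizes from the Maris/Stella plant example
print(round(welch_df(38.25794, 28, 41.99819, 24), 3))   # about 48.007
```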
• She measures the height (in cm) of a random sample of 28 Maris plants and
a random sample of 24 Stella plants; the data is given in the table below.
Conduct a hypothesis test at 1% significance level to determine whether there
is a difference in mean height between the two varieties.
1. Hypotheses:
H0 : µ1 = µ2
HA : µ1 ≠ µ2
3. Test Statistic: Under H0 , t = (X̄ − Ȳ − 0)/√(S1²/n1 + S2²/n2 ) has a t
distribution with ν degrees of freedom, where
X̄ = (1/28)(92 + 90 + 90 + · · · + 96 + 87) = 93.53571
Ȳ = (1/24)(98 + 97 + 102 + · · · + 102 + 101) = 101.5417
S1² = (1/(28 − 1)) Σ (Xi − X̄)² = 38.25794
S2² = (1/(24 − 1)) Σ (Yi − Ȳ )² = 41.99819
ν = (S1²/n1 + S2²/n2 )² / [S1⁴/(n1²(n1 − 1)) + S2⁴/(n2²(n2 − 1))]
= (38.25794/28 + 41.99819/24)² / [38.25794²/(28²(28 − 1)) + 41.99819²/(24²(24 − 1))]
= 48.007 ⇒ 48
6.2 Hypothesis Test for Comparing Means of Two Popula-
tions using Paired Samples
Hypothesis Test for Comparing Means of Two Normally Distributed Pop-
ulations: Paired Samples
• A paired t-test is a special case where we are comparing the means of two
normally distributed populations using related samples
• This means that the two samples have the same sample size and also that each
observation from the first sample has a ‘partner’ observation in the second
sample with which it can be directly compared
• The procedure in this case is to subtract each corresponding Yi value from its
Xi value to obtain the differences di = Xi − Yi
• Note that the paired t test rests on the assumption that the differences di are
normally distributed
1. Hypotheses:
H0 : µ1 = µ2
HA : µ1 < µ2
Xi Yi di
52 66 -14
47 52 -5
71 68 3
65 86 -21
55 58 -3
62 70 -8
39 51 -12
44 49 -5
74 82 -8
59 82 -23
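• A Python sketch of the paired t statistic from the differences above (the excerpt does not state the significance level; at α = 0.05 the left-tailed critical value from the t table is −t0.05,9 = −1.833, so H0 would be rejected):

```python
from statistics import mean, stdev

# Differences d_i = X_i - Y_i from the table above
d = [-14, -5, 3, -21, -3, -8, -12, -5, -8, -23]
n = len(d)

# One-sample t statistic applied to the differences (H0: mean difference = 0)
t_obs = mean(d) / (stdev(d) / n ** 0.5)
print(round(t_obs, 3))   # about -3.767
```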
• Fifty specimens of a new computer chip were tested for speed in a certain
application, along with 50 specimens of chips with the old design. The average
speed, in MHz, for the new chips was 495.6, and the average speed for the old
chips was 481.2. It is assumed that in the whole population of new chips, the
standard deviation of speed is 19.4, while in the whole population of old chips,
the standard deviation of speed is 14.3. Can you conclude that the mean speed
for the new chips is greater than that of the old chips? Conduct a hypothesis
test at 10% significance level to answer this, using the p-value approach.
• Two methods are being considered for a paint manufacturing process, in order
to increase production. In a random sample of 100 days, the mean daily
production using the first method was 625 tonnes and the standard deviation
was 40 tonnes. In a random sample of 64 days, the mean daily production
using the second method was 640 tonnes and the standard deviation was 50
tonnes. Assume the standard deviations of the two populations are equal. Do
we have evidence at the 5% significance level that the first method has slower
production than the second method?
• The manufacturers of a dietary supplement claim that people who take it will
lose weight. A random sample of nine women are weighed before taking the
supplement and again after 30 days of taking the supplement. Their before and
after weights (in kg) are shown in the table below. Test at the 5% significance
level whether the mean weight of women is less after taking the supplement
than before.

Weight Before   Weight After
72              68
56              49
93              87
84              85
66              63
69              72
74              67
58              53
60              58
• Just as we developed a confidence interval for the difference between two pro-
portions, we can also use hypothesis testing to make inferences about the
difference between proportions of two populations
• The table below presents survey data on whether consumers are 'label users' who pay attention to details on the label when buying a garment. Are men and women equally likely to be label users? Test the hypothesis that the proportion of women who are label users is the same as the proportion of men who are label users. Use α = 0.05.

Population    Sample Size    # of Label Users
1 (Women)     296            63
2 (Men)       251            27
1. Hypotheses:
H0 : p1 = p2
HA : p1 ≠ p2
5. Calculation:
$$\hat{p}_1 = 63/296 = 0.2128$$
$$\hat{p}_2 = 27/251 = 0.1076$$
$$\hat{p} = \frac{63 + 27}{296 + 251} = \frac{90}{547} = 0.1645$$
$$Z_{\text{observed}} = \frac{0.2128 - 0.1076}{\sqrt{0.1645(1 - 0.1645)\left(\frac{1}{296} + \frac{1}{251}\right)}} = 3.307$$
6. Decision: p-value= 0.001 < 0.05 thus we reject H0 at the 0.05 significance
level
7. Conclusion: same as before
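The pooled two-proportion z statistic and its p-value can be reproduced with the standard library; a minimal sketch (the tiny difference from 3.307 comes from carrying full precision rather than the rounded 0.2128 and 0.1076):

```python
import math

# Two-proportion z test with pooled proportion (label-user survey)
x1, n1 = 63, 296   # women
x2, n2 = 27, 251   # men
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
print(round(z, 2), round(p_value, 4))
```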
Hypothesis Test for Comparing Proportions of Two Populations: Exer-
cises
• It’s difficult to persuade consumers to abandon a product with which they
are familiar. One experiment gave consumers free samples of a new washing
powder and also of a standard washing powder. After some time, subjects
were asked which washing powder they prefer. Among the 48 customers who
normally use the standard product, 19 preferred the new product. Among the
56 customers who did not previously use the standard product, 29 preferred
the new product. Are current users of the standard washing powder less likely
than nonusers to prefer the new washing powder? Summarize the data and
conduct a hypothesis test at 0.01 significance level.
• Two extrusion machines that manufacture steel rods are being compared. In
a sample of 1000 rods taken from machine 1, 960 met specifications regarding
length and diameter. In a sample of 600 rods taken from machine 2, 582 met
the specifications. Are the two machines equally effective at producing rods
that meet the specifications? Conduct a hypothesis test at 0.05 significance
level to reach a conclusion.
• Although not as widely used as some other tests, one can conduct a hypothesis
test to infer whether the variance of a normally distributed population is equal
to some specified null value
• Thus, when doing a two-tailed test, we cannot just look up one critical value
in the table and consider positive and negative cases. We must look up two
critical values: one for the left tail and one for the right tail
Hypothesis Test for a Population Variance: Example
• A company produces machined engine parts that are supposed to have a di-
ameter variance no larger than 0.0002 (diameters measured in cm). A random
sample of 10 parts gave a sample variance of 0.0003. Test, at the 5% level, for
evidence that the variance exceeds 0.0002.
1. Hypotheses:
H0 : σ 2 = 0.0002
HA : σ 2 > 0.0002
2. Significance Level: α = 0.05
3. Test Statistic: Under H0, $T = \frac{(n-1)S^2}{\sigma_0^2}$ has a χ² distribution with n − 1 degrees of freedom
4. Rejection Rule: We will reject H0 if Tobserved > χ²_{α,n−1} = χ²_{0.05,9} = 16.919
5. Calculation:
$$T = \frac{(n-1)S^2}{\sigma_0^2} = \frac{(10-1)(0.0003)}{0.0002} = 13.5$$
6. Decision: Tobserved = 13.5 < 16.919 therefore we fail to reject H0 at the
0.05 significance level
7. Conclusion: There is insufficient evidence to conclude that the variance is larger than 0.0002.
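The calculation and decision above amount to a two-line check; a sketch, with the critical value taken from the χ² table as in the example:

```python
# Chi-squared test for a single variance (engine-part example)
n, s_sq, sigma0_sq = 10, 0.0003, 0.0002
T = (n - 1) * s_sq / sigma0_sq      # observed test statistic
critical = 16.919                    # chi2_{0.05, 9} from the table
reject = T > critical
print(round(T, 1), reject)  # 13.5 False -> fail to reject H0
```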
Hypothesis Test for Comparing Variances of Two Populations (Optional)
• Although marked optional, this topic should not be skipped, because this type of test will become very important in second-year Statistics; it is important to understand the basics now
Hypothesis Test for Comparing Variances of Two Populations: Exam-
ple
• The manager of a dairy is in the process of deciding which of two new carton-
filling machines to use. The most important attribute is the consistency of the
fills (i.e. the fills should have a small variance). She takes a random sample of
ten cartons filled by machine 1 and a random sample of eleven cartons filled
by machine 2 and measures the fill volume of each carton. The results are
displayed below. Can we infer that the second machine is more consistent
than the first (i.e. it has a smaller variance)? Use a hypothesis test with 5%
significance level.
1. Hypotheses:
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$$
$$H_A: \frac{\sigma_1^2}{\sigma_2^2} > 1$$
2. Significance Level: α = 0.05
3. Test Statistic: Under H0, $F = \frac{S_1^2}{S_2^2}$ has an F distribution with n1 − 1 numerator degrees of freedom and n2 − 1 denominator degrees of freedom
4. Rejection Rule: We will reject H0 if Fobserved > fα,n1 −1,n2 −1 = f0.05,9,10 =
3.020
◦ Note: there is a separate table for α = 0.05 and α = 0.01 because
we need the columns of the table to cover the numerator degrees of
freedom and the rows of the table to cover the denominator degrees
of freedom
◦ Hence we go to the table entitled ‘Critical Values of the F -Distribution:
α = 0.05’, look up n1 −1 in the columns (Numerator Degrees of Free-
dom) and n2 − 1 in the rows (Denominator Degrees of Freedom)
5. Calculation:
$$\bar{X} = \frac{1}{10}(0.998 + 0.997 + \cdots + 1.000) = 1.0002$$
$$S_1^2 = \frac{1}{10-1}\left([0.998 - 1.0002]^2 + [0.997 - 1.0002]^2 + \cdots + [1.000 - 1.0002]^2\right) = 5.7333 \times 10^{-6}$$
$$\bar{Y} = \frac{1}{11}(1.003 + 1.004 + \cdots + 0.996) = 1.000818$$
$$S_2^2 = \frac{1}{11-1}\left([1.003 - 1.000818]^2 + [1.004 - 1.000818]^2 + \cdots + [0.996 - 1.000818]^2\right) = 1.1364 \times 10^{-5}$$
$$F_{\text{observed}} = \frac{S_1^2}{S_2^2} = \frac{5.7333 \times 10^{-6}}{1.1364 \times 10^{-5}} = 0.5045$$
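A quick numeric check of the F ratio against the tabulated critical value; since the observed ratio is well below 3.020, H0 would not be rejected:

```python
# F ratio for comparing two variances (carton-filling example)
s1_sq = 5.7333e-6    # machine 1 sample variance
s2_sq = 1.1364e-5    # machine 2 sample variance
F = s1_sq / s2_sq
critical = 3.020     # f_{0.05, 9, 10} from the F table
print(round(F, 4), F > critical)  # 0.5045 False
```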
Day 1: 5.0 4.8 5.1 5.1 4.8 5.1 4.8
4.8 5.0 5.2 4.9 4.9 5.0
Day 2: 5.8 4.7 4.7 4.9 5.1 4.9 5.4
5.3 5.3 4.8 5.7 5.1 5.7
Day 3: 6.3 4.7 5.1 5.9 5.1 5.9 4.7
6.0 5.3 4.9 5.7 5.3 5.6
• All the hypothesis tests rely on the assumption that the data consist of independent random samples
• If these assumptions are not met, our hypothesis test is not valid.
• In the computer lab we will learn some basic techniques for checking the va-
lidity of the ‘normality’ assumption, which can be done graphically or using
normality tests
• More sophisticated techniques for checking our test assumptions will wait until
second and third year subjects
• This section describes two tests used to analyze categorical data, i.e. data
measured on the nominal scale of measurement
• These tests are named after the British statistician Karl Pearson
• Both tests use the χ2 distribution, which we already encountered for testing
the variance of one population
• The difference is that instead of two possible outcomes for each trial (‘success’
and ‘failure’) we have k possible outcomes, which we can call outcome 1,
outcome 2, outcome 3, etc. up to outcome k
• Instead of just one random variable Y , in this case we define k random variables
O1 , O2 , O3 , . . . , Ok where Oj is the observed number of occurrences of outcome
j
• Notice that in the special case where k = 2, this reduces to a binomial exper-
iment (we could say if the ball lands in box 1 it is a ‘success’ and if the ball
lands in box 2 it is a ‘failure’)
• Suppose we have k boxes and n = 100 balls are tossed at them. Suppose
further that π1 = 0.1, i.e. for each toss there is a 10% chance that the ball
will land in box 1. How many balls would we expect to find in box 1 after 100
trials?
E (O1 ) = nπ1 = (100)(0.1) = 10
• In general, the number of balls expected to land in box j is E (Oj ) = nπj for
j = 1, 2, . . . , k
• Suppose we want to test the null hypothesis that π1 = π1*, π2 = π2*, . . . , πk = πk* against the alternative that these are not all the correct probabilities
• This could be called a ‘goodness of fit’ test because we are basically testing
whether the data fit a particular probability distribution
• Let us call these expected values under the null hypothesis Ej for j = 1, 2, . . . , k
• If we throw 100 balls at k boxes and the probability of a ball landing in box 1 is
10%, the expected number of balls in box 1 is 10 and we will be very surprised
if we find 40 balls in box 1; we will doubt whether our 10% probability is
correct
• Hence, a statistic that could measure how well the observed data fit the null-hypothesis expected values is $\sum_{j=1}^{k} (O_j - E_j)^2$
• The further apart the observed counts Oj are from the expected counts Ej, the larger this statistic will be; hence if $\sum_{j=1}^{k} (O_j - E_j)^2$ is very large we will reject H0
• But how large must the statistic be before we reject H0? We need a probability distribution for the statistic.
• The statistic actually used standardizes each term by its expected count: $\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$, which under H0 approximately follows a χ² distribution with k − 1 degrees of freedom
• Hence we will reject the null hypothesis (the null probability distribution) if χ²observed > χ²_{α,k−1}
• The key assumptions of the chi-squared goodness of fit test are as follows:
1. Expected frequencies (Ej ) should not be too small, otherwise the χ2 dis-
tribution is not a good approximation. A standard rule of thumb is to
avoid using this test if any Ej < 1 or if more than 20% of the categories
have Ej < 5
2. The ‘trials’ in the underlying multinomial experiment should be indepen-
dent (just as in a binomial experiment)
Chi-Squared Goodness of Fit Test: Example 1
• A group of rats, one by one, proceed down a ramp to one of three doors.
We wish to test the hypothesis that the rats have no preference as to which door they choose, which would mean that $\pi_1 = \pi_2 = \pi_3 = \frac{1}{3}$, where πj is the probability that a rat will choose door j for j = 1, 2, 3
• Suppose we send 90 rats down the ramp and observe that 23 rats choose Door
1, 36 rats choose Door 2, and 31 rats choose Door 3. At the 5% significance
level, test the null hypothesis that the rats have no preference for which door
they choose
1. Hypotheses:
$$H_0: \pi_1 = \pi_2 = \pi_3 = \frac{1}{3}$$
HA : This is not the probability distribution
Door     Observed Frequency Oj    Expected Frequency Ej    (Oj − Ej)²/Ej
1        23                       90 × (1/3) = 30          1.6333
2        36                       30                       1.2
3        31                       30                       0.0333
Total    90                       90                       χ²observed = 2.867
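The goodness-of-fit statistic in the table takes only a few lines to compute; a sketch using the standard library:

```python
# Chi-squared goodness-of-fit statistic for the three-door experiment
observed = [23, 36, 31]
n = sum(observed)                  # 90 rats in total
expected = [n / 3] * 3             # 30 per door under H0
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 3))  # 2.867
```

Comparing with the table value χ²_{0.05,2} = 5.991, the observed 2.867 is smaller, so H0 is not rejected at the 5% level.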
• Laptop computers have become relatively much less expensive since they first
entered the market in the late 1980s. Has this changed the profile of laptop
customers? A 1988 survey found that 69% of laptop customers were businesses, 21% were government agencies, 6% were educational institutions, and only 4% were for private use at home. A more recent survey of 150 buyers of
laptops from a particular vendor found that 76 were businesses, 25 were gov-
ernment agencies, 17 were educational institutions and 32 were for personal
home use. Do these data fit the 1988 distribution of laptop customers or has
the distribution changed? Use α = 0.1 level for this test.
1. Hypotheses:
H0 : π1 = 0.69, π2 = 0.21, π3 = 0.06, π4 = 0.04 (the 1988 distribution)
HA : This is not the probability distribution
Customer Type    Observed Frequency Oj    Expected Frequency Ej    (Oj − Ej)²/Ej
Business         76                       150(0.69) = 103.5        7.307
Government       25                       150(0.21) = 31.5         1.341
Education        17                       150(0.06) = 9            7.111
Home             32                       150(0.04) = 6            112.667
Total            150                      150                      χ²observed = 128.43
• This method is used to test for an association between two categorical (nomi-
nal) variables
• The data for such a test are assembled into a ‘contingency table’ (sometimes
called a cross-tabulation table) in which the rows represent categories of one
variable and the columns represent categories of a second variable
• For example, consider the following 2 × 5 contingency table showing the rela-
tionship between shift and day of the week for absenteeism. Each value in the
table represents the number of absences in one month at a factory, according
to the shift (day or evening) and day of the week (Monday to Friday)
Day of Week
Shift Mon Tues Wed Thurs Fri Total
Day 52 28 37 31 33 181
Evening 35 34 34 37 41 181
Total 87 62 71 68 74 362
• The table gives us an idea of whether there is an association between shift
and day of the week for absenteeism. For instance, maybe the day shift has
more absenteeism on Mondays and the evening shift has more absenteeism on
Fridays
• This helps us to visualize the possible relationship between shift and day of the
week, but it doesn’t give us an objective way of deciding whether a relationship
actually exists (as opposed to just random variation)
• The null hypothesis in this case is that the row frequencies are independent of
the column frequencies
• The alternative hypothesis is that the row frequencies and the column frequen-
cies have an association
◦ H0 : The two variables are independent
◦ HA : The two variables are dependent
• Once again we are going to test the null hypothesis by comparing the Observed
frequencies with the Expected frequencies; but the method for determining the
Expected frequencies is now different
◦ Let n be the total number of observations, i.e. the total frequency of all
the cells in the table
◦ Let ri be the number of observations in the ith row of the table, i.e. the
sum of row i
◦ Let cj be the number of observations in the jth column of the table, i.e.
the sum of column j
• The estimated probability that an observation falls in row i is equal to $\frac{r_i}{n}$; for example, in the table above, $\Pr(\text{Shift} = \text{Evening}) = \frac{181}{362} = 0.5$
• Similarly, the estimated probability that an observation falls in column j is equal to $\frac{c_j}{n}$; for example, $\Pr(\text{Weekday} = \text{Monday}) = \frac{87}{362} = 0.2403$
• What is the probability that an observation falls in row i and column j? (Let
us call this πij ). For example, what is Pr (Shift = Evening ∩ Weekday = Monday)?
• If the row variable and the column variable are independent (as the null hypothesis claims), then we can apply the multiplication rule for independent events: $\pi_{ij} = \frac{r_i}{n} \cdot \frac{c_j}{n}$
• For example, if Shift and Weekday are independent, then $\Pr(\text{Shift} = \text{Evening} \cap \text{Weekday} = \text{Monday}) = (0.5)(0.2403) = 0.1202$
• In fact, if the null hypothesis is true then the observed contingency table
frequencies Oij follow a multinomial distribution with k = rc categories and
n trials (in our example, 362 trials)
• The expected frequency in row i, column j under the null hypothesis is thus $E_{ij} = n\pi_{ij} = n \cdot \frac{r_i}{n} \cdot \frac{c_j}{n} = \frac{r_i c_j}{n}$
• If the observed frequencies tend to be far from these expected frequencies,
it will be evidence that the row variable and the column variable are not
independent since the probabilities calculated using the multiplication rule for
independent events do not fit the data
• Note that the assumptions of the chi-squared test for independence are ba-
sically the same as those of the chi-squared goodness of fit test: every cell in
the contingency table should have an expected frequency of at least 1 and at
least 80% of the cells should have an expected frequency of at least 5.
• Let us apply the test to our absenteeism example at the 5% significance level
1. Hypotheses:
H0 : Shift and Day of Week are independent
HA : Shift and Day of Week are dependent
Day of Week
Shift      Mon          Tues        Wed          Thurs       Fri         Total
Day        52 (43.5)    28 (31)     37 (35.5)    31 (34)     33 (37)     181
Evening    35 (43.5)    34 (31)     34 (35.5)    37 (34)     41 (37)     181
Total      87           62          71           68          74          362

(The expected frequencies Eij = ri cj / 362 are shown in parentheses; for example, (181)(87)/362 = 43.5.)
$$\chi^2_{\text{observed}} = \sum_{i=1}^{2}\sum_{j=1}^{5} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
$$= \frac{(52-43.5)^2}{43.5} + \frac{(28-31)^2}{31} + \frac{(37-35.5)^2}{35.5} + \frac{(31-34)^2}{34} + \frac{(33-37)^2}{37}$$
$$+ \frac{(35-43.5)^2}{43.5} + \frac{(34-31)^2}{31} + \frac{(34-35.5)^2}{35.5} + \frac{(37-34)^2}{34} + \frac{(41-37)^2}{37}$$
$$= 5.424$$
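The same statistic can be computed generically from the observed table, with the expected frequencies ri cj / n built on the fly; a sketch with the standard library:

```python
# Chi-squared test of independence for the shift / day-of-week table
table = [[52, 28, 37, 31, 33],    # day shift
         [35, 34, 34, 37, 41]]    # evening shift
row = [sum(r) for r in table]
col = [sum(c) for c in zip(*table)]
n = sum(row)
chi2 = sum((table[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
           for i in range(len(row)) for j in range(len(col)))
print(round(chi2, 3))  # 5.424
```

`scipy.stats.chi2_contingency` performs the same computation and also returns the p-value, using (2 − 1)(5 − 1) = 4 degrees of freedom.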
1. Hypotheses:
H0 : Participation in Programme and Behaviour after Release are independent
HA : Participation in Programme and Behaviour after Release are related
                               Re-offends    Does not re-offend    Total
Completes programme            3             57                    60
Does not complete programme    27            13                    40
Total                          30            70                    100
$$\chi^2_{\text{observed}} = \sum_{i=1}^{2}\sum_{j=1}^{2} \frac{\left(|O_{ij} - E_{ij}| - 0.5\right)^2}{E_{ij}}$$
$$= \frac{(|3-18| - 0.5)^2}{18} + \frac{(|57-42| - 0.5)^2}{42} + \frac{(|27-12| - 0.5)^2}{12} + \frac{(|13-28| - 0.5)^2}{28}$$
$$= 41.716$$
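The standard Yates continuity correction subtracts 0.5 from the absolute difference |Oij − Eij| before squaring; a sketch for the programme/re-offending table:

```python
# Yates-corrected chi-squared statistic for the 2x2 table
obs = [[3, 57], [27, 13]]
row = [sum(r) for r in obs]          # [60, 40]
col = [sum(c) for c in zip(*obs)]    # [30, 70]
n = sum(row)                         # 100
chi2 = sum((abs(obs[i][j] - row[i] * col[j] / n) - 0.5) ** 2
           / (row[i] * col[j] / n)
           for i in range(2) for j in range(2))
print(round(chi2, 3))  # 41.716
```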
What to do when expected cell frequencies are too small
Education Level
Employment Status    Primary    Some Secondary    Matric        Tertiary Qualification    Total
Employed             3 (6.5)    36 (37.7)         44 (40.95)    8 (5.85)                  91
Unemployed           7 (3.5)    22 (20.3)         19 (22.05)    1 (3.15)                  49
Total                10         58                63            9                         140
• We can see that two cells (25%) have an expected count less than 5; this
violates the rule of thumb
• What we can do in this case is to merge one or two of the small categories into
an adjacent, larger category
• For instance, we might combine ‘Primary’ and ‘Some Secondary’ into a single
category, ‘Primary or Some Secondary’
• Or, we might combine ‘Matric’ and ‘Tertiary Qualification’ into a single cate-
gory, ‘Matric or Tertiary Qualification’
• Or we might do both; if we do both, the new table will look like this:
Education Level
Employment Status    Primary or Some Secondary    Matric or Tertiary Qualification    Total
Employed             39 (44.2)                    52 (46.8)                           91
Unemployed           29 (23.8)                    20 (25.2)                           49
Total                68                           72                                  140
• Now we have no problem with our model assumptions and we can continue
with the test
• A six-sided die is rolled 600 times and the frequency of each number from 1 to 6 is observed (shown in the table below). Test whether this die is fair (i.e. all six numbers are equally likely to be rolled). Use 0.1 significance level.

Value    Observed
1        115
2        97
3        91
4        101
5        110
6        86
Total    600
• The Mendelian theory of genetics states that the number of a certain type of
peas falling into the classifications ‘round and yellow’, ‘wrinkled and yellow’,
‘round and green’ and ‘wrinkled and green’ should be in the ratio 9:3:3:1
(meaning that 9/16 of the peas should be in the first category, 3/16 in the
second, etc.) Suppose that 100 randomly sampled peas of this type revealed
56, 19, 17 and 8 in the respective categories. Do these data fit the model at
the 0.05 significance level?
• Specifications for the dimensions of a roller are 2.10-2.11 cm. Rollers that are
too thick can be reground, while those that are too thin must be scrapped.
Three machinists grind these rollers. Samples of rollers were collected from
each machine, and their diameters were measured. The results are as follows:
Can you conclude that the proportions of rollers in the three categories differ among the three machinists?
• Complete the hypothesis test for the education vs. employment status example
above after the categories were combined, using α = 0.05. Don’t forget the
Yates continuity correction since this is now a 2 × 2 table!
Statistical Techniques for Categorical Data
• The following table can help you determine which method to use to solve a
given problem
8 Introduction to Nonparametric Methods
Introduction: Parametric vs. Nonparametric Statistical Methods
• As a general rule, hypothesis tests which evaluate nominal or ordinal data are
nonparametric, while tests that evaluate interval or ratio data are parametric
• The advantage of parametric methods is that they are more powerful (when
their assumptions hold): they have a lower type II error than their correspond-
ing nonparametric method
• The disadvantage of parametric methods is that they may give incorrect con-
clusions if the model assumptions do not hold: the type I error and/or type
II error may be much higher than it is supposed to be; thus nonparametric
methods can be used in a wider set of circumstances than parametric methods
• In this brief chapter we will introduce just two nonparametric methods, though
there are many more
• (In fact, we have already learned two others, since the chi-squared tests covered
in the previous chapter are nonparametric or distribution-free)
• We will first look at a basic nonparametric test called the Sign Test which is
used to compare paired observations to see whether they tend to be equal
8.1 The Sign Test
Sign Test: Data
• The observations within a pair need not be independent; in fact, they should
not be independent, because if they were, the Mann-Whitney Test would be
a more powerful method to use
• There should thus be some natural basis for pairing the observations, such as
weight loss of a single person using two different diet types, or weight of a
single person at two different points in time
• Within each pair a comparison is made, and the pair is classified as “+” (‘plus’)
if Xi < Yi , as “-” (‘minus’) if Xi > Yi , or as “0” (‘tie’) if Xi = Yi
H0 : Pr(+) = Pr(−)
H1 : Pr(+) ≠ Pr(−)
• In words, the null hypothesis means that a ‘plus’ and a ‘minus’ are equally
likely to occur; the populations are equal in location
Sign Test: Test Statistic
• The test statistic is T which equals the number of ‘plus’ pairs; that is, T =
total number of +’s
• Under the null hypothesis, T follows a binomial distribution with p = 0.5 and
n =the number of non-tied pairs, i.e. n0 minus number of ties
Sign Test: Rejection Rule for Two-Tailed Test
• For n ≤ 20, we construct the critical region (rejection rule) using the binomial
distribution table
• We are only interested in the case where p = 1/2 (given in a table in the appendix)
• We look for a value of x in the table for which the probability is close to α/2
◦ We denote this value of x by t, and denote the probability value by α1
◦ We say that the significance level of our test is 2α1 (which will not usually
be exactly equal to α)
• We reject H0 if T ≤ t or T ≥ n − t
• If n ≥ 20 we can use a normal approximation: $t = \frac{1}{2}\left(n - z_{\alpha/2}\sqrt{n}\right)$
• We would again reject H0 if T ≤ t or T ≥ n − t, and this time our significance
level is (approximately) equal to α
Sign Test: Rejection Rule for One-Tailed Test
• In the lower-tailed case, we have the hypotheses
H0 : Pr (+) ≥ Pr (−)
H1 : Pr (+) < Pr (−)
• For n ≤ 20, again we use the binomial distribution table with p = 1/2
• We look for a value of x in the table for which the probability is close to α
and call it t; the probability is called α1
• We reject H0 if T ≤ t (with significance level α1 )
• If n ≥ 20 we use the approximation $t = \frac{1}{2}\left(n - z_{\alpha}\sqrt{n}\right)$
• In the upper-tailed case, we have the hypotheses
H0 : Pr (+) ≤ Pr (−)
H1 : Pr (+) > Pr (−)
• We again find t in the same way as in the lower-tailed case, but we reject H0
if T ≥ n − t
Sign Test: Example 1
• Twenty-two customers in a grocery store were asked to taste each of two types
of cheese (cheddar and gouda) and declare their preference. Seven customers
preferred cheddar, twelve preferred gouda, and three had no preference. Does
this indicate a significant difference in preference?
• We define an observation to be “+” if a customer preferred cheddar, “-” if the
customer preferred gouda, and “0” if there was no preference
1. Hypotheses:
H0 : Pr (+) = Pr (−)
H1 : Pr(+) ≠ Pr(−)
2. Significance level: α = 0.05 (approximately; with n = 19 non-tied pairs we rely on the normal approximation)
3. Test statistic: T = total number of +’s
4. Rejection Rule: $t = \frac{1}{2}\left(n - z_{\alpha/2}\sqrt{n}\right) = \frac{1}{2}\left(19 - 1.96\sqrt{19}\right) = 5.23$
◦ Hence n − t = 13.77
◦ Thus we reject H0 if T ≤ 5.23 or T ≥ 13.77
5. Calculation: Tobserved = 7 (the number of customers who preferred ched-
dar)
6. Decision: 5.23 < Tobserved < 13.77 thus we fail to reject H0 at 0.05 signif-
icance level
7. We conclude that we cannot claim a difference in preference between the
two types of cheese
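The normal-approximation critical values for this two-tailed sign test are easy to verify; a sketch with the standard library:

```python
import math

# Normal-approximation critical values for the two-tailed sign test
n, T_obs = 19, 7           # 22 tasters minus 3 ties; 7 preferred cheddar
z = 1.96                   # z_{alpha/2} for alpha = 0.05
t_low = (n - z * math.sqrt(n)) / 2
t_high = n - t_low
reject = T_obs <= t_low or T_obs >= t_high
print(round(t_low, 2), round(t_high, 2), reject)  # 5.23 13.77 False
```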
Sign Test: Example 2
• Six athletes went on a diet in an attempt to lose weight, with the following
results:
Name Abdul John Senzo Frank Lerato Simon
Weight Before 74 91 188 82 101 88
Weight After 65 86 83 78 103 81
2. Target Significance Level: α = 0.05 (see below)
3. Test statistic: T = total number of +’s
4. Rejection Rule: We have n0 = 6 and n = 6 since there are no ties
◦ From our binomial table, with n = 6 and p = 0.5, the closest we
can get to α without going over is α1 = 0.0156; this becomes our
actual significance level
◦ Thus we reject H0 if T ≤ 0, with a significance level of 0.0156
5. Calculation: Tobserved = 1 (the number of people whose weight was greater after than before)
6. Decision: Tobserved > 0 thus we fail to reject H0 at 0.0156 significance level
7. Conclusion: We cannot claim that a person’s weight while on the diet is
more likely to decrease than to increase; we have not proven that the diet
is effective
• Note that we reached this conclusion even though five out of six participants
did actually lose weight. Probably we should collect more data; the main
problem here is that we have low statistical power (high probability of a Type
II error) due to low sample size
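The exact binomial tail probabilities behind the rejection rule can be computed directly with `math.comb`; a small sketch for n = 6, p = 0.5:

```python
from math import comb

# Exact binomial lower-tail probabilities for the sign test, n = 6, p = 0.5
n = 6
p_le = [sum(comb(n, j) for j in range(k + 1)) / 2 ** n for k in range(n + 1)]
# p_le[0] = 0.015625 is the largest tail not exceeding alpha = 0.05,
# so the critical region is {T <= 0}; P(T <= 1) = 0.109375 is already too big
print(p_le[0], p_le[1])
```

This is why observing T = 1 (five of six participants lost weight) still fails to reach significance: the test has little power at such a small sample size.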
• The data used for a Mann-Whitney Test (sometimes called a Wilcoxon Test)
consist of two random samples
• Assign the overall ranks 1 to n1 + n2 to all the observations and let R(Xi ) and
R(Yj ) denote the rank assigned to Xi and Yj for all i and j
• If several sample values are exactly equal to each other (tied), assign to each
value the average of the ranks that would have been assigned to them had
there been no ties
2. There is independence within each sample as well as between the two samples
(the second part of this assumption differs from the Sign Test)
Mann-Whitney Test: Hypotheses
H0 : E(X) = E(Y)
H1 : E(X) ≠ E(Y)
• If there are no ties, or just a few ties, the sum of the ranks in the first sample can be used as a test statistic: $T = \sum_{i=1}^{n_1} R(X_i)$
• Here, $\sum_{i=1}^{N} R_i^2$ refers to the sum of the squares of all N of the ranks or average ranks actually used in the samples (after adjusting for ties)
• The upper quantiles for the distribution of T are given by the relation $w_{1-p} = n_1(n_1 + n_2 + 1) - w_p$ where wp is the lower quantile from the table
• When n1 and n2 are both greater than 20 and there are no ties, we can use the approximation $w_p \approx \frac{n_1(N+1)}{2} + z_p\sqrt{\frac{n_1 n_2 (N+1)}{12}}$ where zp is the standard normal quantile
• Thus we would reject H0 at the level of significance α if T < wα/2 or T > w1−α/2
(if using T ) or if |T1 | > zα/2 (if using T1 )
Mann-Whitney Test: Rejection Rule for One-Tailed Tests
• For the lower-tailed test
H0 : E(X) = E(Y)
H1 : E(X) < E(Y)
we reject H0 if T < wα, where the quantile is taken from the Mann-Whitney table in the appendix (if n1 and n2 are small) or from the normal approximation (if n1 and n2 are ≥ 20)
• For the upper-tailed test
H0 : E(X) = E(Y)
H1 : E(X) > E(Y)
we reject H0 if T > w1−α
• If there are many ties and we use T1, we reject H0 if T1 < zα (lower-tailed) or T1 > z1−α (upper-tailed)
• The matric class in a particular high school had 48 boys. 12 boys lived on farms and
the other 36 lived in urban areas. A test was devised to see if farm boys in general
were more physically fit than city boys. Each boy in the class was given a physical
fitness test in which a low score indicates poor physical condition. The scores of the
farm boys (Xi ) and the city boys (Yj ) are as follows:
• Although these two groups are not true random samples from the populations
of farm boys and city boys, it seems reasonable to assume that they would
resemble random samples from the populations of farm boys and city boys of
that age group. The independence assumption also seems reasonable.
• The hypotheses to be tested can be stated in words as follows:
H0 : Farm boys do not tend to be more fit, physically, than city boys
H1 : Farm boys tend to be more fit than city boys
• Mathematically, the null hypothesis could be stated as E (X) = E (Y ) and the
alternative hypothesis as E (X) > E (Y ) (so it is an upper tailed test)
• Note that we actually need to do the ranking before we begin our seven-step
hypothesis testing procedure because the test statistic depends on whether we
have ties
Mann-Whitney Test Example 1: Ranks
• We first rank the data as follows:
X      Y      Rank      X      Y      Rank      X      Y      Rank
       1.0    1                6.2    17               11.3   33
       1.8    2         6.3           18        11.4          34
       2.1    3                6.4    19               11.8   35
       2.4    4                6.7    20.5      12.5          36
       2.6    5                6.7    20.5             12.6   37
2.7           6         7.3           22               12.7   38
       3.2    7                7.6    23        12.9          39
       3.6    8                7.9    24               14.2   40
       4.0    9                8.3    25        14.8          41.5
4.2           10        9.0           26               14.8   41.5
       5.0    11               9.1    27               15.3   43
5.6           13               9.9    28               16.0   44
       5.6    13        10.6          30.5      16.1          45
       5.6    13               10.6   30.5             16.9   46
       5.9    15               10.6   30.5             17.7   47
       6.1    16               10.6   30.5             18.6   48
• Tied values in the table share an averaged rank. For instance, because the 12th, 13th and 14th values in order were all equal, they each receive a rank of $\frac{12 + 13 + 14}{3} = 13$
• Because we have a number of ties here, it is better to use T1 as our test statistic
Mann-Whitney Test Example 1: Hypothesis Test
1. Hypotheses:
H0 : E (X) = E (Y )
H1 : E (X) > E (Y )
4. Rejection Rule: We will reject H0 if T1 observed > z0.05 = 1.645
5. Calculations: we first need to calculate $\sum_{i=1}^{N} R_i^2$ (including the tied ranks):
$$\sum_{i=1}^{N} R_i^2 = 1^2 + 2^2 + 3^2 + \cdots + 48^2 = 38016$$
Next we need to calculate $T = \sum_{i=1}^{n_1} R(X_i) = 6 + 10 + 13 + 18 + 22 + 26 + 30.5 + 34 + 36 + 39 + 41.5 + 45 = 321$
$$T_1 = \frac{T - n_1\frac{N+1}{2}}{\sqrt{\frac{n_1 n_2}{N(N-1)}\sum_{i=1}^{N} R_i^2 - \frac{n_1 n_2 (N+1)^2}{4(N-1)}}} = \frac{321 - 12\left(\frac{48+1}{2}\right)}{\sqrt{\frac{(12)(36)}{48(48-1)}(38016) - \frac{(12)(36)(48+1)^2}{4(48-1)}}} = 0.6431$$
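The tie-corrected statistic T1 can be checked numerically from the ranks of the first sample; a sketch using the values from the example:

```python
import math

# Tie-corrected Mann-Whitney statistic T1 for the farm-boys example
x_ranks = [6, 10, 13, 18, 22, 26, 30.5, 34, 36, 39, 41.5, 45]
n1, n2 = 12, 36
N = n1 + n2
T = sum(x_ranks)                 # 321
sum_R_sq = 38016                 # sum of squared (tie-adjusted) ranks, as above
num = T - n1 * (N + 1) / 2
den = math.sqrt(n1 * n2 * sum_R_sq / (N * (N - 1))
                - n1 * n2 * (N + 1) ** 2 / (4 * (N - 1)))
T1 = num / den
print(round(T1, 4))  # 0.6431
```

For a full analysis, `scipy.stats.mannwhitneyu` implements the Mann-Whitney test (with tie handling) directly on the raw scores.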
6. Decision: T1observed = 0.6431 < 1.645 thus we fail to reject H0 at 0.05 signifi-
cance level
• A simple experiment was designed to see if flint in area A tended to have the
same degree of hardness as flint in area B. Four sample pieces of flint were
collected in area A and five pieces in area B. To determine which of two pieces
of flint was harder, the two pieces were rubbed against each other. The piece
sustaining less damage was judged the harder of the two. In this manner all
nine pieces of flint were ordered according to hardness. The rank of 1 was
assigned to the softest piece, rank 2 to the next softest, and so on
• The results are shown in the table below:
Origin of Piece Rank
A 1
A 2
A 3
B 4
A 5
B 6
B 7
B 8
B 9
1. Hypotheses:
H0 : E (X) = E (Y )
H1 : E(X) ≠ E(Y)
In words, the null hypothesis states that the flints from areas A and B are
of equal hardness, whereas the alternative states that they are not of equal
hardness
3. Test Statistic: Under H0 , T = sum of ranks of flints from area A follows the
Mann-Whitney distribution
7. Conclusion: We conclude that the flints from the two areas differ in degree of
hardness
9 Single-Factor Analysis of Variance
Comparing means of more than two populations
• We have learned how to use a two-sample test to compare the means of two
independent populations
• One way to compare the means of three or more populations would be to test each pair of means separately, but this is not only tedious; it also introduces a statistical problem: when we do multiple hypothesis tests at the same time, the overall type I error increases
• For instance, using the additive probability rule, if we have two hypothesis tests, we have
$$\Pr(\text{type I error in test 1} \cup \text{type I error in test 2}) = \Pr(\text{type I error in test 1}) + \Pr(\text{type I error in test 2}) - \Pr(\text{type I error in test 1} \cap \text{type I error in test 2})$$
• If the two tests are independent and each have a type I error probability of 0.05, this becomes
$$\Pr(\text{type I error in test 1} \cup \text{type I error in test 2}) = 0.05 + 0.05 - (0.05)(0.05) = 0.0975$$
• Thus if we have two independent hypothesis tests each with a type I error of 0.05, the probability of making a type I error in at least one of the two tests is nearly 0.1. If we have three hypothesis tests (or 36) the cumulative type I error will be even greater
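The 0.0975 figure generalizes: with m independent tests each at level α, the chance of at least one type I error is 1 − (1 − α)^m; a quick numeric check:

```python
# Familywise type I error for m independent tests at alpha = 0.05
alpha = 0.05
for m in (2, 3, 36):
    fwe = 1 - (1 - alpha) ** m   # P(at least one type I error)
    print(m, round(fwe, 4))
```

For m = 2 this reproduces 0.0975, and the probability grows rapidly as m increases.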
• Hence we want to know if there is a type of hypothesis test that will allow us
to compare means of three or more populations all at once
• It turns out that there is, and it is called ANOVA (Analysis of Variance)
Analysis of Variance (ANOVA)
• Suppose you have a independent populations and have drawn a random sample
from each; the sample sizes are n1 , n2 , n3 , . . ., na
• The hypotheses are H0 : µ1 = µ2 = · · · = µa versus HA : not all of the µi are equal
• In words, the null hypothesis says that the means of all a populations are the same; the alternative hypothesis says that at least two means differ
• It seems strange that we would analyse the variance in order to test for dif-
ferences between means
• The logic of ANOVA comes from breaking the total variance of the data into
two pieces
ANOVA Example
• In the above table, the rows are i = 1, 2, 3 and the columns are j = 1, 2, . . . , ni
where n1 = 6, n2 = 4, n3 = 5
• We will denote the observation in the ith row and the jth column as yij
• We will denote the sum of observations in the ith row as yi• and the sum of
all observations, the grand total, as y••
111
• A basic way to compare the profits across the three cities would be to calculate
the sample mean for each city, ȳi• :
ȳ1• = 74,2/6 = 12,3667
ȳ2• = 38,7/4 = 9,675
ȳ3• = 43,7/5 = 8,74
• This graph may suggest that mean profits are highest in Johannesburg, fol-
lowed by Cape Town and then Durban, but as statisticians we must ask the
question, ‘Are the differences in means between the cities statistically signifi-
cant or do they merely reflect random variation in the data?’
• Our method for answering this question involves comparing variation in profits
between cities to variation in profits within each city. Variation in profits
within each city would be understood as random. Thus, if the variation in
profits between cities is similar to variation in profits within each city, we will
conclude that the apparent difference in profits between cities can be explained
as random variation rather than actual differences in mean profits. However,
if the variation in profits between cities is much larger than the variation in
profits within each city, this would imply that there are actual, non-random
differences in mean profits between cities.
• The overall variance in the data we can call
M STotal = SSTotal /(N − 1)
112
where N = n1 + n2 + · · · + na is the total number of observations and
SSTotal = Σi Σj yij² − y••²/N
(the double sum runs over i = 1, . . . , a and j = 1, . . . , ni )
• Here, ‘MS’ stands for ‘Mean Sum of Squares’ while ‘SS’ stands for ‘Sum of
Squares’
• Notice that M STotal is basically the formula for the sample variance S² that you
learned in Statistics 1A.
• SSTotal can be broken into two pieces that we call SSTreatment and SSError .
SSTreatment measures the amount of variation between treatments or between
populations (in our example, between cities) while SSError measures the amount
of variation within treatments or within populations (in our example, within
each city)
• We have
M STreatment = SSTreatment /(a − 1), where
SSTreatment = Σi yi•²/ni − y••²/N
and
M SError = SSError /(N − a), where
SSError = SSTotal − SSTreatment
• Note that, when the number of observations from each population (or treat-
ment) is unequal, as in our example, this is called an unbalanced ANOVA.
If the number of observations from each population is equal, this is called a
balanced ANOVA and we can replace ni in the above formulas with n, the
number of observations drawn from each population
• Under H0 , the ratio M STreatment /M SError follows an F distribution with d.f.
a − 1 and N − a, whereas under H1 it tends to be larger. For this reason, we can
use this ratio as a test statistic. We reject the null hypothesis if the observed
value of the test statistic F = M STreatment /M SError is much larger than the
values that our F distribution with d.f. a − 1 and N − a is likely to produce
• If we reject the null hypothesis we will conclude that the means are not equal
across all populations
113
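• The sums-of-squares formulas above can be sketched as a small function (a hedged illustration in Python, outside the module's SAS material; the data below are made up):

```python
# Hedged sketch of the unbalanced one-way ANOVA sums of squares, following
# the computing formulas above. `groups` is a list of samples, one per population.
def anova_f(groups):
    a = len(groups)                                # number of populations
    N = sum(len(g) for g in groups)                # total number of observations
    grand_total = sum(sum(g) for g in groups)      # y..
    ss_total = sum(y**2 for g in groups for y in g) - grand_total**2 / N
    ss_treat = sum(sum(g)**2 / len(g) for g in groups) - grand_total**2 / N
    ss_error = ss_total - ss_treat                 # within-group variation
    ms_treat = ss_treat / (a - 1)
    ms_error = ss_error / (N - a)
    return ms_treat / ms_error, ss_treat, ss_error, ss_total

# Illustrative (made-up) data for three groups of unequal size:
F, ss_treat, ss_error, ss_total = anova_f(
    [[10.8, 11.4, 13.5], [9.1, 9.8], [8.2, 8.7, 9.0]])
```

The decomposition SSTotal = SSTreatment + SSError holds by construction, which gives a quick sanity check on any hand computation.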
• Let us implement our seven-step hypothesis testing procedure to perform an
ANOVA on the restaurant profits data at 5% significance level
1. Hypotheses:
H0 : µ1 = µ2 = µ3
HA : At least one µi ≠ µj
N = 6 + 4 + 5 = 15
y•• = 74,2 + 38,7 + 43,7 = 156,6
SSTotal = Σi Σj yij² − y••²/N
= 10,8² + 11,4² + 13,5² + · · · + 5,2² + 10,6² + 10,1² − 156,6²/15
= 1712,96 − 1634,904 = 78,056
SSTreatment = Σi yi•²/ni − y••²/N
= 74,2²/6 + 38,7²/4 + 43,7²/5 − 156,6²/15
= 1673,967 − 1634,904 = 39,06317
SSError = SSTotal − SSTreatment = 78,056 − 39,06317 = 38,99283
Fobserved = M STreatment /M SError = [SSTreatment /(a − 1)]/[SSError /(N − a)]
= [39,06317/2]/[38,99283/12] = 6,011
• Note: we cannot apply the p-value approach by hand for ANOVA because we
cannot calculate the p-values for the F distribution by hand. However, we can
use the p-value approach within SAS since SAS calculates the p-values for us.
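• As an illustration of what such software does (a hedged sketch in Python with scipy, not the module's SAS workflow), the upper-tail F probability for the observed statistic can be obtained directly:

```python
# p-value for the restaurant-profits ANOVA: observed F = 6.011 with
# a - 1 = 2 and N - a = 12 degrees of freedom.
from scipy.stats import f

p_value = f.sf(6.011, 2, 12)   # survival function = upper-tail probability
print(round(p_value, 4))
```

Since this p-value (about 0,0155) is below 0,05, the p-value approach agrees with rejecting H0 at the 5% significance level.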
114
ANOVA and Experimental Design
• One of the most common settings in which ANOVA is used is the design
of experiments. In our example above, we were analysing pre-existing data
taken from three different ‘populations’. In designed experiments, however,
we choose an ‘independent variable’ or ‘factor’, some categorical or discrete
variable whose effect on a continuous ‘dependent variable’ we want to deter-
mine. We choose two or more levels of this factor (treatments) and conduct an
experiment where we observe the value of the dependent variable repeatedly at
each value of the independent variable. We then use ANOVA to test whether
the value of the independent variable makes a difference to the outcome of the
experiment (the value of the dependent variable).
• Solution:
115
1. Hypotheses:
H0 : µ1 = µ2 = µ3
HA : At least one µi ≠ µj
• We can see that the mean stiffness is lowest for method 2: ȳ2• = 36, 6/6 = 6, 1
whereas ȳ1• = 46, 5/6 = 7, 75 and ȳ3• = 46, 2/6 = 7, 7. Since we found
that there are statistically significant differences between the mean stiffness
outcomes of the three methods, we can conclude that method 2 is more effective
than the other two methods.
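• For a designed experiment like this, statistical software runs the entire ANOVA in one step. A hedged sketch in Python (scipy, outside the module's SAS material; the data below are made up for three treatment levels):

```python
# Hedged sketch: one-way ANOVA on made-up data for three treatment methods,
# using scipy's f_oneway (which performs the same computation done by hand above).
from scipy.stats import f_oneway

method_1 = [7.8, 7.6, 7.9, 7.7, 7.8, 7.7]
method_2 = [6.0, 6.2, 6.1, 6.0, 6.2, 6.1]
method_3 = [7.6, 7.8, 7.7, 7.6, 7.8, 7.7]

stat, p = f_oneway(method_1, method_2, method_3)
print(stat, p)
```

Here the between-treatment variation dwarfs the within-treatment variation, so the p-value is far below any conventional significance level.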
116
Single-factor ANOVA Exercises
• The following table reports stress at 600% elongation for pieces of a certain
type of rubber tested at five different laboratories. At the 1% significance
level, test whether the mean stress differs among the five laboratories.
• The following table gives the weight gains of three groups of rats that were
involved in a scientific experiment. The first group of rats were given the
hormone thyroxin, the second group were given the hormone thiouracil and the
third group, a ‘control group’, were given no hormones. Test at 5% significance
level for a difference in mean weight gain between the three groups. Comment
on the possible effects of thiouracil on rat weight gain.
117
118
TABLE D
t distribution critical values
Upper-tail probability p
df .25 .20 .15 .10 .05 .025 .02 .01 .005 .0025 .001 .0005
1 1.000 1.376 1.963 3.078 6.314 12.71 15.89 31.82 63.66 127.3 318.3 636.6
2 0.816 1.061 1.386 1.886 2.920 4.303 4.849 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 3.482 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 2.999 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 2.757 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 2.612 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.517 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.449 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.398 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.359 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.328 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.303 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.282 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.264 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.249 2.602 2.947 3.286 3.733 4.073
16 0.690 0.865 1.071 1.337 1.746 2.120 2.235 2.583 2.921 3.252 3.686 4.015
17 0.689 0.863 1.069 1.333 1.740 2.110 2.224 2.567 2.898 3.222 3.646 3.965
18 0.688 0.862 1.067 1.330 1.734 2.101 2.214 2.552 2.878 3.197 3.611 3.922
19 0.688 0.861 1.066 1.328 1.729 2.093 2.205 2.539 2.861 3.174 3.579 3.883
20 0.687 0.860 1.064 1.325 1.725 2.086 2.197 2.528 2.845 3.153 3.552 3.850
21 0.686 0.859 1.063 1.323 1.721 2.080 2.189 2.518 2.831 3.135 3.527 3.819
22 0.686 0.858 1.061 1.321 1.717 2.074 2.183 2.508 2.819 3.119 3.505 3.792
23 0.685 0.858 1.060 1.319 1.714 2.069 2.177 2.500 2.807 3.104 3.485 3.768
24 0.685 0.857 1.059 1.318 1.711 2.064 2.172 2.492 2.797 3.091 3.467 3.745
25 0.684 0.856 1.058 1.316 1.708 2.060 2.167 2.485 2.787 3.078 3.450 3.725
26 0.684 0.856 1.058 1.315 1.706 2.056 2.162 2.479 2.779 3.067 3.435 3.707
27 0.684 0.855 1.057 1.314 1.703 2.052 2.158 2.473 2.771 3.057 3.421 3.690
28 0.683 0.855 1.056 1.313 1.701 2.048 2.154 2.467 2.763 3.047 3.408 3.674
29 0.683 0.854 1.055 1.311 1.699 2.045 2.150 2.462 2.756 3.038 3.396 3.659
30 0.683 0.854 1.055 1.310 1.697 2.042 2.147 2.457 2.750 3.030 3.385 3.646
40 0.681 0.851 1.050 1.303 1.684 2.021 2.123 2.423 2.704 2.971 3.307 3.551
50 0.679 0.849 1.047 1.299 1.676 2.009 2.109 2.403 2.678 2.937 3.261 3.496
60 0.679 0.848 1.045 1.296 1.671 2.000 2.099 2.390 2.660 2.915 3.232 3.460
80 0.678 0.846 1.043 1.292 1.664 1.990 2.088 2.374 2.639 2.887 3.195 3.416
100 0.677 0.845 1.042 1.290 1.660 1.984 2.081 2.364 2.626 2.871 3.174 3.390
1000 0.675 0.842 1.037 1.282 1.646 1.962 2.056 2.330 2.581 2.813 3.098 3.300
z∗ 0.674 0.841 1.036 1.282 1.645 1.960 2.054 2.326 2.576 2.807 3.091 3.291
50% 60% 70% 80% 90% 95% 96% 98% 99% 99.5% 99.8% 99.9%
Confidence level C
119
Chi-Square Distribution Table
df χ2.995 χ2.990 χ2.975 χ2.950 χ2.900 χ2.100 χ2.050 χ2.025 χ2.010 χ2.005
1 0.000 0.000 0.001 0.004 0.016 2.706 3.841 5.024 6.635 7.879
2 0.010 0.020 0.051 0.103 0.211 4.605 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 1.064 7.779 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 1.610 9.236 11.070 12.833 15.086 16.750
6 0.676 0.872 1.237 1.635 2.204 10.645 12.592 14.449 16.812 18.548
7 0.989 1.239 1.690 2.167 2.833 12.017 14.067 16.013 18.475 20.278
8 1.344 1.646 2.180 2.733 3.490 13.362 15.507 17.535 20.090 21.955
9 1.735 2.088 2.700 3.325 4.168 14.684 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 4.865 15.987 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 5.578 17.275 19.675 21.920 24.725 26.757
12 3.074 3.571 4.404 5.226 6.304 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.042 19.812 22.362 24.736 27.688 29.819
14 4.075 4.660 5.629 6.571 7.790 21.064 23.685 26.119 29.141 31.319
15 4.601 5.229 6.262 7.261 8.547 22.307 24.996 27.488 30.578 32.801
16 5.142 5.812 6.908 7.962 9.312 23.542 26.296 28.845 32.000 34.267
17 5.697 6.408 7.564 8.672 10.085 24.769 27.587 30.191 33.409 35.718
18 6.265 7.015 8.231 9.390 10.865 25.989 28.869 31.526 34.805 37.156
19 6.844 7.633 8.907 10.117 11.651 27.204 30.144 32.852 36.191 38.582
20 7.434 8.260 9.591 10.851 12.443 28.412 31.410 34.170 37.566 39.997
21 8.034 8.897 10.283 11.591 13.240 29.615 32.671 35.479 38.932 41.401
22 8.643 9.542 10.982 12.338 14.041 30.813 33.924 36.781 40.289 42.796
23 9.260 10.196 11.689 13.091 14.848 32.007 35.172 38.076 41.638 44.181
24 9.886 10.856 12.401 13.848 15.659 33.196 36.415 39.364 42.980 45.559
25 10.520 11.524 13.120 14.611 16.473 34.382 37.652 40.646 44.314 46.928
26 11.160 12.198 13.844 15.379 17.292 35.563 38.885 41.923 45.642 48.290
27 11.808 12.879 14.573 16.151 18.114 36.741 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 18.939 37.916 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 19.768 39.087 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 20.599 40.256 43.773 46.979 50.892 53.672
40 20.707 22.164 24.433 26.509 29.051 51.805 55.758 59.342 63.691 66.766
50 27.991 29.707 32.357 34.764 37.689 63.167 67.505 71.420 76.154 79.490
60 35.534 37.485 40.482 43.188 46.459 74.397 79.082 83.298 88.379 91.952
70 43.275 45.442 48.758 51.739 55.329 85.527 90.531 95.023 100.425 104.215
80 51.172 53.540 57.153 60.391 64.278 96.578 101.879 106.629 112.329 116.321
90 59.196 61.754 65.647 69.126 73.291 107.565 113.145 118.136 124.116 128.299
100 67.328 70.065 74.222 77.929 82.358 118.498 124.342 129.561 135.807 140.169
120
Upper Tail Critical Values of F Distribution for a=0.1
121
Upper Tail Critical Values of F Distribution for a=0.05
122
Upper Tail Critical Values of F Distribution for a=0.01
123
Binomial Cumulative Distribution Function for p=1/2
Table entries give Pr(X ≤ x) = Σ from k = 0 to x of C(n, k)(1/2)^k (1/2)^(n−k)
n
x 1 2 3 4 5 6 7 8 9 10
0 0.500 0.250 0.125 0.063 0.031 0.016 0.008 0.004 0.002 0.001
1 1.000 0.750 0.500 0.313 0.188 0.109 0.063 0.035 0.020 0.011
2 1.000 0.875 0.688 0.500 0.344 0.227 0.145 0.090 0.055
3 1.000 0.938 0.813 0.656 0.500 0.363 0.254 0.172
4 1.000 0.969 0.891 0.773 0.637 0.500 0.377
5 1.000 0.984 0.938 0.855 0.746 0.623
6 1.000 0.992 0.965 0.910 0.828
7 1.000 0.996 0.980 0.945
8 1.000 0.998 0.989
9 1.000 0.999
10 1.000
n
x 11 12 13 14 15 16 17 18 19 20
0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
1 0.006 0.003 0.002 0.001 0.000 0.000 0.000 0.000 0.000 0.000
2 0.033 0.019 0.011 0.006 0.004 0.002 0.001 0.001 0.000 0.000
3 0.113 0.073 0.046 0.029 0.018 0.011 0.006 0.004 0.002 0.001
4 0.274 0.194 0.133 0.090 0.059 0.038 0.025 0.015 0.010 0.006
5 0.500 0.387 0.291 0.212 0.151 0.105 0.072 0.048 0.032 0.021
6 0.726 0.613 0.500 0.395 0.304 0.227 0.166 0.119 0.084 0.058
7 0.887 0.806 0.709 0.605 0.500 0.402 0.315 0.240 0.180 0.132
8 0.967 0.927 0.867 0.788 0.696 0.598 0.500 0.407 0.324 0.252
9 0.994 0.981 0.954 0.910 0.849 0.773 0.685 0.593 0.500 0.412
10 1.000 0.997 0.989 0.971 0.941 0.895 0.834 0.760 0.676 0.588
11 1.000 1.000 0.998 0.994 0.982 0.962 0.928 0.881 0.820 0.748
12 1.000 1.000 0.999 0.996 0.989 0.975 0.952 0.916 0.868
13 1.000 1.000 1.000 0.998 0.994 0.985 0.968 0.942
14 1.000 1.000 1.000 0.999 0.996 0.990 0.979
15 1.000 1.000 1.000 0.999 0.998 0.994
16 1.000 1.000 1.000 1.000 0.999
17 1.000 1.000 1.000 1.000
18 1.000 1.000 1.000
19 1.000 1.000
20 1.000
124
Mann-Whitney lower quantiles wp
To get w1-p, use formula w1-p=n1(n1+n2+1)-wp
n1 prob n2=2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
2 0.001 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
2 0.005 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4
2 0.01 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 5 5
2 0.025 3 3 3 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6
2 0.05 3 3 3 4 4 4 5 5 5 5 6 6 6 7 7 7 8 8 8
2 0.1 3 3 4 5 5 5 6 6 7 7 8 8 8 9 9 10 10 11 11
3 0.001 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7
3 0.005 6 6 6 6 6 6 6 7 7 7 8 8 8 9 9 9 9 10 10
3 0.01 6 6 6 6 6 7 7 8 8 8 9 9 9 10 10 11 11 11 12
3 0.025 6 6 6 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15
3 0.05 6 6 7 8 9 9 10 10 11 12 12 13 14 14 15 16 16 17 18
3 0.1 6 7 8 9 10 11 12 12 13 14 15 16 17 17 18 19 20 21 22
4 0.001 10 10 10 10 10 10 10 10 11 11 11 12 12 12 13 13 14 14 14
4 0.005 10 10 10 10 11 11 12 12 13 13 14 14 15 16 16 17 17 18 19
4 0.01 10 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19 20 20 21
4 0.025 10 10 11 12 13 14 15 15 16 17 18 19 20 21 22 22 23 24 25
4 0.05 10 11 12 13 14 15 16 17 18 19 20 21 22 23 25 26 27 28 29
4 0.1 11 12 13 15 16 17 18 20 21 22 23 24 26 27 28 29 31 32 33
5 0.001 15 15 15 15 15 15 16 17 17 18 18 19 19 20 21 21 22 23 23
5 0.005 15 15 15 16 17 17 18 19 20 21 22 23 23 24 25 26 27 28 29
5 0.01 15 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
5 0.025 15 16 17 18 19 21 22 23 24 25 27 28 29 30 31 33 34 35 36
5 0.05 16 17 18 20 21 22 24 25 27 28 29 31 32 34 35 36 38 39 41
5 0.1 17 18 20 21 23 24 26 28 29 31 33 34 36 38 39 41 43 44 46
6 0.001 21 21 21 21 21 22 23 24 25 26 26 27 28 29 30 31 32 33 34
6 0.005 21 21 22 23 24 25 26 27 28 29 31 32 33 34 35 37 38 39 40
6 0.01 21 21 23 24 25 26 28 29 30 31 33 34 35 37 38 40 41 42 44
6 0.025 21 23 24 25 27 28 30 32 33 35 36 38 39 41 43 44 46 47 49
6 0.05 22 24 25 27 29 30 32 34 36 38 39 41 43 45 47 48 50 52 54
6 0.1 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 56 58 60
7 0.001 28 28 28 28 29 30 31 32 34 35 36 37 38 39 40 42 43 44 45
7 0.005 28 28 29 30 32 33 35 36 38 39 41 42 44 45 47 48 50 51 53
7 0.01 28 29 30 32 33 35 36 38 40 41 43 45 46 48 50 52 53 55 57
7 0.025 28 30 32 34 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63
7 0.05 29 31 33 35 37 40 42 44 46 48 50 53 55 57 59 62 64 66 68
7 0.1 30 33 35 37 40 42 45 47 50 52 55 57 60 62 65 67 70 72 75
8 0.001 36 36 36 37 38 39 41 42 43 45 46 48 49 51 52 54 55 57 58
8 0.005 36 36 38 39 41 43 44 46 48 50 52 54 55 57 59 61 63 65 67
8 0.01 36 37 39 41 43 44 46 48 50 52 54 57 59 61 63 65 67 69 71
8 0.025 37 39 41 43 45 47 50 52 54 56 59 61 63 66 68 71 73 75 78
8 0.05 38 40 42 45 47 50 52 55 57 60 63 65 68 70 73 76 78 81 84
8 0.1 39 42 44 47 50 53 56 59 61 64 67 70 73 76 79 82 85 88 91
9 0.001 45 45 45 47 48 49 51 53 54 56 58 60 61 63 65 67 69 71 72
9 0.005 45 46 47 49 51 53 55 57 59 62 64 66 68 70 73 75 77 79 82
9 0.01 45 47 49 51 53 55 57 60 62 64 67 69 72 74 77 79 82 84 86
9 0.025 46 48 50 53 56 58 61 63 66 69 72 74 77 80 83 85 88 91 94
9 0.05 47 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100
9 0.1 48 51 55 58 61 64 68 71 74 77 81 84 87 91 94 98 101 104 108
10 0.001 55 55 56 57 59 61 62 64 66 68 70 73 75 77 79 81 83 85 88
10 0.005 55 56 58 60 62 65 67 69 72 74 77 80 82 85 87 90 93 95 98
10 0.01 55 57 59 62 64 67 69 72 75 78 80 83 86 89 92 94 97 100 103
10 0.025 56 59 61 64 67 70 73 76 79 82 85 89 92 95 98 101 104 108 111
10 0.05 57 60 63 67 70 73 76 80 83 87 90 93 97 100 104 107 111 114 118
10 0.1 59 62 66 69 73 77 80 84 88 92 95 99 103 107 110 114 118 122 126
11 0.001 66 66 67 69 71 73 75 77 79 82 84 87 89 91 94 96 99 101 104
11 0.005 66 67 69 72 74 77 80 83 85 88 91 94 97 100 103 106 109 112 115
11 0.01 66 68 71 74 76 79 82 85 89 92 95 98 101 104 108 111 114 117 120
11 0.025 67 70 73 76 80 83 86 90 93 97 100 104 107 111 114 118 122 125 129
11 0.05 68 72 75 79 83 86 90 94 98 101 105 109 113 117 121 124 128 132 136
11 0.1 70 74 78 82 86 90 94 98 103 107 111 115 119 124 128 132 136 140 145
125
Mann-Whitney lower quantiles wp
To get w1-p, use formula w1-p=n1(n1+n2+1)-wp
n1 prob n2=2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
12 0.001 78 78 79 81 83 86 88 91 93 96 99 102 104 107 110 113 116 119 121
12 0.005 78 80 82 85 88 91 94 97 100 103 106 110 113 116 120 123 126 130 133
12 0.01 78 81 84 87 90 93 96 100 103 107 110 114 117 121 125 128 132 135 139
12 0.025 80 83 86 90 93 97 101 105 108 112 116 120 124 128 132 136 140 144 148
12 0.05 81 84 88 92 96 100 105 109 113 117 121 126 130 134 139 143 147 151 156
12 0.1 83 87 91 96 100 105 109 114 118 123 128 132 137 142 146 151 156 160 165
13 0.001 91 91 93 95 97 100 103 106 109 112 115 118 121 124 127 130 134 137 140
13 0.005 91 93 95 99 102 105 109 112 116 119 123 126 130 134 137 141 145 149 152
13 0.01 92 94 97 101 104 108 112 115 119 123 127 131 135 139 143 147 151 155 159
13 0.025 93 96 100 104 108 112 116 120 125 129 133 137 142 146 151 155 159 164 168
13 0.05 94 98 102 107 111 116 120 125 129 134 139 143 148 153 157 162 167 172 176
13 0.1 96 101 105 110 115 120 125 130 135 140 145 150 155 160 166 171 176 181 186
14 0.001 105 105 107 109 112 115 118 121 125 128 131 135 138 142 145 149 152 156 160
14 0.005 105 107 110 113 117 121 124 128 132 136 140 144 148 152 156 160 164 169 173
14 0.01 106 108 112 116 119 123 128 132 136 140 144 149 153 157 162 166 171 175 179
14 0.025 107 111 115 119 123 128 132 137 142 146 151 156 161 165 170 175 180 184 189
14 0.05 108 113 117 122 127 132 137 142 147 152 157 162 167 172 177 183 188 193 198
14 0.1 110 116 121 126 131 137 142 147 153 158 164 169 175 180 186 191 197 203 208
15 0.001 120 120 122 125 128 131 135 138 142 145 149 153 157 161 164 168 172 176 180
15 0.005 120 123 126 129 133 137 141 145 150 154 158 163 167 172 176 181 185 190 194
15 0.01 121 124 128 132 136 140 145 149 154 158 163 168 172 177 182 187 191 196 201
15 0.025 122 126 131 135 140 145 150 155 160 165 170 175 180 185 191 196 201 206 211
15 0.05 124 128 133 139 144 149 154 160 165 171 176 182 187 193 198 204 209 215 221
15 0.1 126 131 137 143 148 154 160 166 172 178 184 189 195 201 207 213 219 225 231
16 0.001 136 136 139 142 145 148 152 156 160 164 168 172 176 180 185 189 193 197 202
16 0.005 136 139 142 146 150 155 159 164 168 173 178 182 187 192 197 202 207 211 216
16 0.01 137 140 144 149 153 158 163 168 173 178 183 188 193 198 203 208 213 219 224
16 0.025 138 143 148 152 158 163 168 174 179 184 190 196 201 207 212 218 223 229 235
16 0.05 140 145 151 156 162 167 173 179 185 191 197 202 208 214 220 226 232 238 244
16 0.1 142 148 154 160 166 173 179 185 191 198 204 211 217 223 230 236 243 249 256
17 0.001 153 154 156 159 163 167 171 175 179 183 188 192 197 201 206 211 215 220 224
17 0.005 153 156 160 164 169 173 178 183 188 193 198 203 208 214 219 224 229 235 240
17 0.01 154 158 162 167 172 177 182 187 192 198 203 209 214 220 225 231 236 242 247
17 0.025 156 160 165 171 176 182 188 193 199 205 211 217 223 229 235 241 247 253 259
17 0.05 157 163 169 174 180 187 193 199 205 211 218 224 231 237 243 250 256 263 269
17 0.1 160 166 172 179 185 192 199 206 212 219 226 233 239 246 253 260 267 274 281
18 0.001 171 172 175 178 182 186 190 195 199 204 209 214 218 223 228 233 238 243 248
18 0.005 171 174 178 183 188 193 198 203 209 214 219 225 230 236 242 247 253 259 264
18 0.01 172 176 181 186 191 196 202 208 213 219 225 231 237 242 248 254 260 266 272
18 0.025 174 179 184 190 196 202 208 214 220 227 233 239 246 252 258 265 271 278 284
18 0.05 176 181 188 194 200 207 213 220 227 233 240 247 254 260 267 274 281 288 295
18 0.1 178 185 192 199 206 213 220 227 234 241 249 256 263 270 278 285 292 300 307
19 0.001 190 191 194 198 202 206 211 216 220 225 231 236 241 246 251 257 262 268 273
19 0.005 191 194 198 203 208 213 219 224 230 236 242 248 254 260 265 272 278 284 290
19 0.01 192 195 200 206 211 217 223 229 235 241 247 254 260 266 273 279 285 292 298
19 0.025 193 198 204 210 216 223 229 236 243 249 256 263 269 276 283 290 297 304 310
19 0.05 195 201 208 214 221 228 235 242 249 256 263 271 278 285 292 300 307 314 321
19 0.1 198 205 212 219 227 234 242 249 257 264 272 280 288 295 303 311 319 326 334
20 0.001 210 211 214 218 223 227 232 237 243 248 253 259 265 270 276 281 287 293 299
20 0.005 211 214 219 224 229 235 241 247 253 259 265 271 278 284 290 297 303 310 316
20 0.01 212 216 221 227 233 239 245 251 258 264 271 278 284 291 298 304 311 318 325
20 0.025 213 219 225 231 238 245 252 259 266 273 280 287 294 301 309 316 323 330 338
20 0.05 215 222 229 236 243 250 258 265 273 280 288 295 303 311 318 326 334 341 349
20 0.1 218 226 233 241 249 257 265 273 281 289 297 305 313 321 330 338 346 354 362
126
References
Keller, G. (2012), Managerial Statistics, 9th edn, South-Western Cengage Learning,
Victoria.
Moore, D. (2000), The Basic Practice of Statistics, 2nd edn, W.H. Freeman and
Company, New York.
Navidi, W. (2006), Statistics for Engineers and Scientists, McGraw-Hill, New York.
127