
Statistical errors

In statistical hypothesis testing, if the result of the test corresponds with reality, then a correct
decision has been made. However, if the result of the test does not correspond with reality, then
an error has occurred. There are two situations in which the decision is wrong:

• Type I error is the rejection of a true null hypothesis as the result of a test procedure (also known
as a "false positive" finding or conclusion).

• Type II error is the non-rejection of a false null hypothesis as the result of a test procedure (also
known as a "false negative" finding or conclusion).

Figure. Types of errors in hypothesis testing

We illustrate the idea of Type I and Type II errors by looking at hypothesis testing from the point
of view of a criminal trial. In any trial, the defendant is assumed to be innocent. Evidence must be
collected proving that the defendant is guilty beyond all reasonable doubt. Because we are seeking
evidence for guilt, it becomes the alternative hypothesis. Innocence is assumed, so it is the null
hypothesis.
• H0 : the defendant is innocent
• H1 : the defendant is guilty
In a trial, the jury obtains information (sample data). It then deliberates about the evidence (the
data analysis). Finally, it either convicts the defendant (rejects the null hypothesis) or declares the
defendant not guilty (fails to reject the null hypothesis).

Note that the defendant is never declared innocent. That is, the null hypothesis is never declared
true. The two correct decisions are to declare an innocent person not guilty or declare a guilty
person to be guilty. The two incorrect decisions are to convict an innocent person (a Type I error)
or to let a guilty person go free (a Type II error). It is helpful to think in this way when trying to
remember the difference between a Type I and a Type II error.
When we studied confidence intervals, we learned that we never know whether a confidence interval
contains the unknown parameter. We only know the likelihood that a confidence interval captures
the parameter. Similarly, we never know whether the conclusion of a hypothesis test is correct.
However, just as we place a level of confidence in the construction of a confidence interval, we can
assign probabilities to making Type I or Type II errors when testing hypotheses. The following
notation is commonplace:
• α = P(Type I error) = P(rejecting H0 when H0 is true)
• β = P(Type II error) = P(not rejecting H0 when H1 is true)
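
These definitions are easy to check by simulation. The notes do not use any software, but as a rough
Python sketch (every number here is an illustrative choice, not taken from the notes: H0 : p = 0.5
versus H1 : p > 0.5, n = 100, α = 0.05, and a true p of 0.6 for β), we can estimate both error rates
empirically:

# Monte Carlo sketch: estimate alpha and beta for a one-sided test about
# a proportion. All numbers are illustrative, not from the notes.
import numpy as np

rng = np.random.default_rng(0)
n, p0 = 100, 0.5
z_crit = 1.645                        # upper-tail cutoff for alpha = 0.05
se0 = np.sqrt(p0 * (1 - p0) / n)      # standard error computed under H0

def rejection_rate(true_p, trials=100_000):
    # Fraction of simulated samples whose z statistic reaches the cutoff.
    p_hat = rng.binomial(n, true_p, size=trials) / n
    z = (p_hat - p0) / se0
    return np.mean(z >= z_crit)

print("alpha ~", rejection_rate(p0))       # rejecting H0 when H0 is true
print("beta  ~", 1 - rejection_rate(0.6))  # not rejecting H0 when p = 0.6

The first estimate should land near the nominal 0.05 (not exactly, since p̂ is discrete), and the
second estimates β for that particular alternative.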

Probability of Type I error
The probability of making a Type I error, α, is chosen by the researcher before the sample data are
collected. This probability equals the significance level α of the hypothesis test.
The choice of the level of significance depends on the consequences of making a Type I error. If
the consequences are severe, the level of significance should be small (say, α = 0.01). However, if
the consequences are not severe, a higher level of significance can be chosen (say α = 0.05 or α =
0.10). Why is the level of significance not always set at α = 0.01? Reducing the probability of making
a Type I error increases the probability of making a Type II error, β. Using the court analogy, a jury
is instructed that the prosecution must provide proof of guilt “beyond all reasonable doubt.” This
implies that we are choosing to make α small so that the probability of convicting an innocent person
is very small. The consequence of the small α, however, is a large β, which means many guilty
defendants will go free. There is an inverse relation between α and β (as one goes up the other
goes down).
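
The trade-off can be seen numerically. Continuing the hypothetical setup above (H0 : p = 0.5, true
p = 0.6, n = 100; scipy is assumed available), this sketch computes β for several choices of α:

# Sketch of the alpha-beta trade-off for a one-sided proportion test.
# Hypothetical setup: H0: p = 0.5, true p = 0.6, n = 100.
import numpy as np
from scipy.stats import norm

n, p0, p_true = 100, 0.5, 0.6
se0 = np.sqrt(p0 * (1 - p0) / n)          # SE under H0: sets the cutoff
se1 = np.sqrt(p_true * (1 - p_true) / n)  # SE under the true proportion

for alpha in (0.10, 0.05, 0.01):
    cutoff = p0 + norm.ppf(1 - alpha) * se0  # reject H0 if p_hat >= cutoff
    beta = norm.cdf(cutoff, loc=p_true, scale=se1)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")

Tightening α from 0.10 to 0.01 raises the cutoff and pushes β from roughly 0.23 up to roughly 0.63
in this setup.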

Probability of Type II error
Given that H0 is false, a Type II error results from failing to reject H0. This probability is not constant but
depends on factors such as the actual value of the parameter, the significance level α and the
sample size n.

Let’s see how to find the probability of a Type II error in a hypothesis test about a population
proportion. Consider a hypothesis test of H0 : p = 1/3 against H1 : p > 1/3, using a significance level
of 0.05. Suppose the experiment uses a sample of size n = 116. The standard error for the test
statistic is SE0 = √(p0(1−p0)/n) = √((1/3)(1−1/3)/116) = 0.0438, where p0 = 1/3 is the value of p
under H0.
For H1 : p > 1/3, a test statistic of z = 1.645 has a p-value of 0.05, so, if z ≥ 1.645, the p-value is
≤ 0.05 and we reject H0. That is, we reject H0 when the sample proportion p̂ falls at least 1.645
standard errors above p0 = 1/3, i.e.,
p̂ ≥ 1/3 + 1.645 × SE0 = 1/3 + 1.645 × 0.0438 = 0.405
Therefore, we reject H0 when we get a sample proportion that is 0.405 or larger. The figure below
shows the sampling distribution of p̂ and this rejection region.

Figure. Rejection region of the hypothesis test for proportion
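
These two numbers are easy to reproduce. A minimal Python sketch (numpy assumed) computes SE0 and
the cutoff directly from the ingredients of the test:

# Reproduce the rejection region: H0: p = 1/3, H1: p > 1/3, n = 116,
# alpha = 0.05 (one-sided z cutoff of 1.645).
import numpy as np

n, p0, z_crit = 116, 1/3, 1.645
se0 = np.sqrt(p0 * (1 - p0) / n)
cutoff = p0 + z_crit * se0
print(f"SE0    = {se0:.4f}")     # 0.0438
print(f"cutoff = {cutoff:.4f}")  # about 0.405; reject H0 when p_hat >= this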

When H0 is false, a Type II error occurs when we fail to reject H0. From the figure above, we do
not reject H0 if p̂ < 0.405. If the true value of p is 0.50, then the true sampling distribution of p̂ is
centered at 0.50, and the probability of a Type II error is the probability that p̂ < 0.405. When p =
0.50, the sampling distribution of p̂ is approximately normal, with mean 0.50 and standard deviation
SE1 = √(p(1−p)/n) = √(0.5(1−0.5)/116) = 0.0464. (Note that this differs from SE0, which was
computed under H0.) So, to find the probability of a Type II error when p = 0.50, all we need to do
is find the area to the left of 0.405 under this normal distribution (as shown in the figure below).
This area is 0.02. In summary, when p = 0.50, the probability of making a Type II error and failing
to reject H0 : p = 1/3 is 0.02.

Figure. Calculation of P(Type II error)
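
The 0.02 figure can be reproduced the same way: find the left-tail area below the 0.405 cutoff under
a normal distribution centered at the true p = 0.50 (a sketch, with scipy assumed available):

# Reproduce P(Type II error): the area below the 0.405 cutoff when the
# true p is 0.50 and n = 116.
import numpy as np
from scipy.stats import norm

n, p_true, cutoff = 116, 0.50, 0.405
se1 = np.sqrt(p_true * (1 - p_true) / n)   # 0.0464
beta = norm.cdf(cutoff, loc=p_true, scale=se1)
print(f"SE1  = {se1:.4f}")
print(f"beta = {beta:.3f}")                # about 0.02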

The figure below shows the two sampling distributions used in the reasoning together. The normal
distribution with mean 1/3 was used to find the rejection region, based on what we expect for p̂
when H0 : p = 1/3 is true. The normal distribution with mean 0.50 was used to find the probability
that p̂ does not fall in the rejection region even though p = 0.50 (that is, a Type II error occurs).

Figure. Sampling distributions of sample proportion

For a fixed significance level α, P(Type II error) decreases (see the sketch below)
• as the parameter value moves farther into the H1 values and away from the H0 value
• as the sample size increases
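
Both effects can be tabulated with the running example’s null value p0 = 1/3 at α = 0.05. In the
sketch below (scipy assumed; the grid of true p values and sample sizes is an arbitrary illustration),
β shrinks across each row as the true p moves away from 1/3 and down each column as n grows:

# Tabulate P(Type II error) for H0: p = 1/3 at alpha = 0.05 (one-sided),
# over an illustrative grid of true proportions and sample sizes.
import numpy as np
from scipy.stats import norm

p0, z_crit = 1/3, 1.645

def beta(p_true, n):
    se0 = np.sqrt(p0 * (1 - p0) / n)
    cutoff = p0 + z_crit * se0                 # rejection cutoff under H0
    se1 = np.sqrt(p_true * (1 - p_true) / n)
    return norm.cdf(cutoff, loc=p_true, scale=se1)

for n in (50, 116, 300):
    row = "  ".join(f"p={p:.2f}: {beta(p, n):.3f}" for p in (0.40, 0.45, 0.50))
    print(f"n = {n:>3}  ->  {row}")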
Before conducting a study, researchers should find P(Type II error) for the size of effect they want
to be able to detect. If P(Type II error) is high, it may not be worth conducting the study unless
they can use a larger sample size to lower that probability. Researchers won’t know the true value
of the parameter, so they won’t know the actual P(Type II error). It may be large if n is small and
if the true parameter value is not far from the value in H0. This may be the reason a particular
significance test does not obtain a small p-value and fails to reject H0.
