Type I and Type II Errors
Analogy
A statistical test is like a court case:
H0 is the defendant; H1 is the prosecutor.
H0 is assumed innocent until proven guilty.
The data are the evidence; the statistician is the jury.
If the evidence is sufficiently strong, H0 is convicted (rejected).
But, as in court, there can be miscarriages of justice.
Risks
Whenever we carry out a significance test we risk making two kinds of mistake:
a Type I error: rejecting H0 when it is in fact true (its probability is the significance level, α);
a Type II error: failing to reject H0 when it is in fact false (its probability is denoted β).
Example:
You have been asked by a local health authority to test the hypothesis that more girls than boys
are being born in that area.
Hence you visit the maternity ward of a local hospital and record the genders of the next 16
babies born. If p is the proportion of girls born, we are testing H0: p = 0.5 against H1: p > 0.5.
Type I Error
You decide that you will reject H0 if there are at least 11 girls out of the 16 births.
Under H0, the probability of this happening by chance is α = P(X ≥ 11 | p = 0.5) ≈ 0.105.
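Under H0 the number of girls X is binomially distributed with n = 16 and p = 0.5, so the Type I error can be computed directly. A minimal sketch in Python (standard library only; the helper name `binom_tail` is just for illustration):

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# Probability of at least 11 girls in 16 births when H0 (p = 0.5) is true,
# i.e. the Type I error of the "reject if X >= 11" rule.
alpha = binom_tail(16, 11, 0.5)
print(round(alpha, 4))  # 0.1051
```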
Demonstrating this in a spreadsheet simulation:
We could randomly generate lots of samples of 16 children and collate the results, e.g.

Num of samples    15    50   100   200
Num rejected       1     4    12    20
% rejected        6.7    8    12    10
We can see that as the number of samples increases, the % of rejections in our simulation
approaches 10.5%, which is the figure expected according to theory (the binomial distribution).
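The spreadsheet simulation can be reproduced in a few lines of Python (a sketch; the helper name `simulate` is illustrative, the sample counts mirror the table above, and a large run is added to show convergence towards 10.5%):

```python
import random

def simulate(num_samples, n=16, threshold=11, p=0.5, seed=1):
    """Generate num_samples samples of n births under the given p and
    return the percentage in which H0 would be rejected (X >= threshold)."""
    rng = random.Random(seed)
    rejected = sum(
        sum(rng.random() < p for _ in range(n)) >= threshold
        for _ in range(num_samples)
    )
    return 100 * rejected / num_samples

for num in (15, 50, 100, 200, 100_000):
    print(num, simulate(num))
```

With small numbers of samples the percentage bounces around, just as in the table; the 100,000-sample run settles close to the theoretical 10.5%.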
We may want to reduce this Type I error, i.e. reduce the probability of wrongly rejecting H0.
Clearly this can be done by making the rejection criterion stricter,
e.g. reject H0 only if the number of girls ≥ 12.
Type II Error
Now, what if H0 is not true?
Suppose that the proportion of girls born in the area (population) as a whole is actually 0.6.
A Type II error now occurs if we fail to reject H0. With the original criterion (reject if X ≥ 11),
β = P(X ≤ 10 | p = 0.6) = 0.671; with the stricter criterion (reject if X ≥ 12),
β = P(X ≤ 11 | p = 0.6) = 0.833.
So making the rejection criterion stricter reduced α but increased β. We can see that this
continues to be true:
e.g. consider rejecting H0 when X ≥ 13:
α = P(X ≥ 13 | p = 0.5) = 16C13 (0.5)^13 (0.5)^3 + … + (0.5)^16 = 0.011
β = P(X ≤ 12 | p = 0.6) = 1 − (16C13 (0.6)^13 (0.4)^3 + … + (0.6)^16) = 0.935
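These two figures can be checked numerically; a sketch (standard library only, with an illustrative helper `p_at_least`):

```python
from math import comb

def p_at_least(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# Reject H0 when X >= 13, out of n = 16 births.
alpha = p_at_least(16, 13, 0.5)      # Type I error: reject when p really is 0.5
beta = 1 - p_at_least(16, 13, 0.6)   # Type II error: fail to reject when p = 0.6
print(round(alpha, 3), round(beta, 3))  # 0.011 0.935
```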
Decreasing the Type I error increases the Type II error and vice-versa
Suppose we now take a larger sample of 50 babies and reject H0 if X ≥ 30. Then
α = P(X ≥ 30 | p = 0.5) ≈ 0.101, roughly the same as before. Again assuming p is actually 0.6,
we can (use Excel to) calculate β = 1 − P(X ≥ 30 | p = 0.6) = 0.439.
So for roughly the same α (≈ 0.1), we have reduced β.
Increasing the sample size can decrease the Type II error without increasing the Type I error
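The larger-sample comparison can be verified the same way. A sketch, assuming (as the α ≈ 0.1 figure suggests) a sample of 50 births with rejection when X ≥ 30; `p_at_least` is an illustrative helper:

```python
from math import comb

def p_at_least(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# With 50 births, rejecting H0 when X >= 30 keeps alpha near 0.1 ...
alpha = p_at_least(50, 30, 0.5)
# ... while the Type II error at p = 0.6 drops well below the 0.671 seen for n = 16.
beta = 1 - p_at_least(50, 30, 0.6)
print(round(alpha, 3), round(beta, 3))
```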
Power
The power of a test is the probability that it correctly detects when H0 is false
We saw earlier that if we have a sample of 16 babies and we reject H0 if X≥11, and p actually = 0.6,
then β = 0.671. Hence under those conditions, the Power of the test = 1 – 0.671 = 0.329
We discovered above that increasing the sample size decreases β. Hence it follows that
increasing the sample size increases the power
The further the actual value moves away from 0.5, the higher the power; i.e.
the further H0 is from the truth, the more likely we are to detect that it is not true.
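This can be illustrated by computing the power of the original test (n = 16, reject if X ≥ 11) for a range of true values of p. A sketch, with an illustrative helper `power`:

```python
from math import comb

def power(n, k, p):
    """Power = P(reject H0) = P(X >= k) when the true proportion is p."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# Power rises as the true p moves further from the hypothesised 0.5;
# p = 0.6 reproduces the 0.329 computed above.
for p in (0.55, 0.6, 0.65, 0.7, 0.8):
    print(p, round(power(16, 11, p), 3))
```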
It follows that the closer your hypothesised value is to the true value, the more likely you are to
fail to reject it when you should. Hence in the conclusion to a statistical test you should never
really say 'Accept H0', because there is a good chance that your hypothesised value (point
estimate) isn't exactly right. It is better to give a confidence interval.