
Tracy Nadine Z.

Pagsanjan
MAED MATH 1B

How And When To Use The Different Types Of Post-Hoc Test

A post hoc test is used only after we find a statistically significant result and need
to determine where our differences come from. The term "post hoc" comes from the
Latin phrase meaning "after this event." Many post hoc tests have been devised, and
most of them will give us similar results. The most commonly used ones are the following:

1. Bonferroni Test
2. Bonferroni Procedure
3. Benjamini-Hochberg (BH) procedure
4. Duncan’s new multiple range test (MRT)
5. Dunn’s Multiple Comparison Test
6. Dunnett’s correction
7. Fisher’s Least Significant Difference (LSD)
8. Holm-Bonferroni Procedure
9. Newman-Keuls
10. Rodger’s Method
11. Scheffé’s Method
12. Tukey’s Honest Significant Difference

1. Bonferroni Test
A Bonferroni test is the simplest post hoc analysis: a series of t-tests
performed on each pair of groups. The number of comparisons quickly increases as the
number of groups grows, which inflates the Type I error rate. The Bonferroni test
therefore divides the significance level by the number of comparisons, restoring the
original familywise Type I error rate once all the tests are performed. After we have our
new significance level, we simply run independent-samples t-tests to look for a
difference between each pair of groups. This adjustment is sometimes called a
Bonferroni Correction. It is easy to do by hand when we compare obtained p-values to
the corrected α level, but it is more cumbersome when working with critical values, as
we do in our analyses.
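The pairwise t-tests with a Bonferroni-corrected α can be sketched with SciPy as follows; the three group samples below are made-up illustration values, not data from the document:

```python
from itertools import combinations
from scipy import stats

# Hypothetical data for three groups (illustration only)
groups = {
    "A": [4.1, 5.0, 4.8, 5.2, 4.6],
    "B": [6.3, 6.8, 7.1, 6.5, 6.9],
    "C": [4.4, 4.9, 5.1, 4.7, 5.0],
}

pairs = list(combinations(groups, 2))   # all pairwise comparisons
alpha_corrected = 0.05 / len(pairs)     # Bonferroni: divide alpha by the number of tests

for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    verdict = "significant" if p < alpha_corrected else "not significant"
    print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f} -> {verdict}")
```

With three groups there are three comparisons, so each t-test is judged against α = 0.05/3 ≈ 0.0167 rather than 0.05.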

2. Bonferroni Procedure (Bonferroni Correction)


This multiple-comparison post hoc adjustment can be used when running many
independent or dependent statistical tests. The problem with running many
simultaneous tests is that the probability of at least one significant result arising by
chance increases with each test run. This post hoc test sets the significance cut-off at
α/n. For example, if 20 tests are done simultaneously at α = 0.05, the corrected cut-off
is 0.05/20 = 0.0025. The Bonferroni correction does suffer from a loss of power,
largely because the Type II error rate is high for each individual test. In other words,
it overcorrects for Type I errors.
Bonferroni's method provides a pairwise comparison of the means. To determine
which means are significantly different, we must compare all pairs. To start, we must
select a value for alpha (α), the familywise significance level; we will select α = 0.05.
In Bonferroni's method, the idea is to divide this familywise error rate (0.05) among the
k tests, so each test is done at the α/k level. We will use the t distribution to determine
the pairwise confidence interval. To start, we need to calculate the pooled variance.

The Bonferroni method says a pairwise difference is significant if

|ȳi − ȳj| > t · sp · √(1/ni + 1/nj)

or, equivalently, if the confidence interval

(ȳi − ȳj) ± t · sp · √(1/ni + 1/nj)

does not contain 0, where t is the value from the t distribution for γ degrees of freedom
and α/2k confidence, sp is the pooled standard deviation, ȳ is the treatment mean, and n
is the sample size. The subscripts i and j represent two different treatments. As long as
the confidence interval does not contain 0, there is a significant difference between the
two means.
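As a minimal sketch of the interval above, the following computes a Bonferroni pairwise margin for two hypothetical treatment samples; for simplicity it pools the variance over just these two groups (the document pools across all treatments), and the data and family size k are assumptions:

```python
import math
from scipy import stats

# Two hypothetical treatment samples (illustration only)
y_i = [12.0, 11.5, 13.1, 12.4, 12.8]
y_j = [14.2, 13.9, 14.8, 14.5, 13.6]
k = 3           # assumed total number of pairwise comparisons in the family
alpha = 0.05    # familywise error rate

ni, nj = len(y_i), len(y_j)
gamma = ni + nj - 2                      # degrees of freedom for this pair
sp = math.sqrt(((ni - 1) * stats.tstd(y_i) ** 2 +
                (nj - 1) * stats.tstd(y_j) ** 2) / gamma)  # pooled SD

# t critical value at the alpha/(2k) level (the Bonferroni split of alpha)
t_crit = stats.t.ppf(1 - alpha / (2 * k), gamma)
diff = abs(sum(y_i) / ni - sum(y_j) / nj)
margin = t_crit * sp * math.sqrt(1 / ni + 1 / nj)

# Significant if the interval (diff - margin, diff + margin) excludes 0
print(f"difference = {diff:.2f}, margin = {margin:.2f}, "
      f"significant = {diff > margin}")
```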

3. Benjamini-Hochberg (BH) procedure


The Benjamini-Hochberg procedure is a useful method for controlling the rate of
false discoveries. If you run a large number of tests, one or more of them will
produce a significant result by chance alone. Controlling this rate addresses the issue
that small p-values (less than 5%) can occur by accident, causing valid null hypotheses
to be rejected. In other words, the B-H procedure helps you avoid Type I errors (false
positives). A p-value of 5% means that if the null hypothesis were true, there would be
only a 5% probability of obtaining a result at least as extreme as the one observed, so a
small p-value is taken as evidence against the null hypothesis. However, it is only a
probability: true null hypotheses are sometimes rejected simply because of the
randomness of outcomes.
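The B-H step-up rule can be sketched in a few lines: rank the p-values from smallest to largest and find the largest rank k with p ≤ (k/m)·q. The p-values and the FDR level q below are made-up illustration values:

```python
# A minimal sketch of the Benjamini-Hochberg step-up procedure.
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of hypotheses rejected at FDR level q."""
    m = len(p_values)
    # Rank the p-values from smallest to largest, remembering positions
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k/m) * q
    cutoff = -1
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    # Reject every hypothesis up to and including that rank
    return sorted(order[:cutoff]) if cutoff > 0 else []

p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_vals))   # → [0, 1]
```

Note that a plain Bonferroni cut-off of 0.05/8 = 0.00625 would reject only the first hypothesis here; B-H is less conservative.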

4. Duncan’s new multiple range test (MRT)


Duncan's Multiple Range Test involves the computation of numerical
boundaries that allow the difference between any two treatment means to be classified
as significant or non-significant, identifying the pairs of means that stand apart. The
MRT is similar to the LSD, except that it uses a Q value instead of a t-value. An
Analysis of Variance (ANOVA) will tell you whether there is a difference among the
means; it will not, however, identify which specific pairs of means differ.

5. Dunn’s Multiple Comparison Test


Dunn's Multiple Comparison Test can identify which means differ considerably
from the others. It is a non-parametric post hoc test. It is one of the weaker multiple
comparison tests and is best used with caution, especially when there are many
comparisons. Dunn's test divides the overall alpha level by the number of comparisons,
so if 10 comparisons are conducted at an alpha level of .05, the approach produces a
per-comparison error rate of .005. The Dunn test is an alternative to the Tukey test
when you only want to test a small portion of all possible pairings; for larger sets of
pairwise comparisons, use Tukey's instead. Use Dunn's when you want to test a
specific set of comparisons chosen before conducting the ANOVA and are not
comparing against a control group. If you are comparing treatments to a control group,
the Dunnett test is the more suitable choice.

6. Dunnett’s correction
Dunnett's test compares each experimental, or treatment, group to a single
control group by generating a Student's t-statistic for each comparison. Since each
comparison shares the same control group, the procedure accounts for the
dependencies between these comparisons. Following the ANOVA, Dunnett's test may
be performed to determine which treatment groups differ significantly from the control.
Because all other samples are compared to one fixed "control" group, it should only
be used when such a group is available. If you don't have a control group, you can use
Tukey's test instead. Dunnett's test resembles Tukey's, but it compares every treatment
mean against the control mean rather than against every other mean.

7. Fisher’s Least Significant Difference (LSD)


Fisher’s Least Significant Difference (LSD) is a method for determining which
pairs of means differ statistically. It is essentially the same as Duncan’s MRT, but it
uses t-values instead of Q values. The least significant difference test, introduced by
Fisher in 1935, is only used after the overall hypothesis test rejects the null hypothesis.
The LSD estimates the smallest difference between two means that would be declared
significant, as though a test had been performed on them.
Formula:

LSD = t · √(2 · MSw / n)

Where: t = critical value from the t-distribution table (with the ANOVA’s
within-groups degrees of freedom)
MSw = mean square within, obtained from the results of your ANOVA test
n = number of scores used to calculate each mean (assuming equal group sizes).
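As a sketch, the LSD can be computed directly from the ANOVA quantities; the MSw, group size, and number of groups below are hypothetical illustration values:

```python
import math
from scipy import stats

# Hypothetical ANOVA results (illustration only)
ms_within = 4.0   # mean square within from the ANOVA table
n = 10            # scores per group (equal group sizes assumed)
k = 3             # number of groups
alpha = 0.05
df_within = k * (n - 1)        # within-groups degrees of freedom

t_crit = stats.t.ppf(1 - alpha / 2, df_within)   # two-tailed critical t
lsd = t_crit * math.sqrt(2 * ms_within / n)

# Any pair of group means differing by more than `lsd` is declared significant
print(f"LSD = {lsd:.3f}")
```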

8. Holm-Bonferroni Method
The usual Bonferroni method is sometimes criticized as too conservative.
Holm's sequential Bonferroni post hoc test is a less rigorous correction for multiple
comparisons. The Holm-Bonferroni Method (also known as Holm's Sequential
Bonferroni Procedure) is a procedure for controlling the familywise error rate (FWER)
of multiple hypothesis tests. It is an adjusted Bonferroni correction: just as simple to
calculate as the single-step Bonferroni approach, but more powerful.

Formula:

HB = Target α / (n − rank + 1)

Where: Target α = overall alpha level (usually .05),
n = number of tests,
rank = the position of each test when its p-value is ordered from smallest to largest.
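Holm's step-down rule can be sketched as follows: compare the ordered p-values against α/(n − rank + 1) and stop at the first non-rejection. The p-values below are made-up illustration values:

```python
# A minimal sketch of Holm's step-down procedure.
def holm_bonferroni(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by Holm's procedure."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    rejected = []
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha / (n - rank + 1):
            rejected.append(idx)   # reject and move to the next p-value
        else:
            break                  # stop at the first non-rejection
    return sorted(rejected)

p_vals = [0.01, 0.04, 0.03, 0.005]
print(holm_bonferroni(p_vals))   # → [0, 3]
```

The smallest p-value faces the full Bonferroni cut-off α/n, but each subsequent one faces a progressively looser cut-off, which is why Holm's method is uniformly more powerful than single-step Bonferroni.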

9. Newman-Keuls
Newman-Keuls (sometimes called Student–Newman–Keuls or SNK) is a post
hoc test for differences in means. Once an ANOVA has given a statistically
significant result, you can run a Newman-Keuls test to see which specific pairs of
means differ. The test uses the studentized range distribution. This post hoc test,
like Tukey's, identifies sample means that differ from one another.
Newman-Keuls uses different critical values for comparing pairs of means, depending
on how far apart the means lie in rank order. As a result, significant differences are
more likely to be found. Duncan's multiple range test (MRT) is a variation of the
Student–Newman–Keuls test that uses increasing alpha levels to compute the critical
values at each step, which makes it less conservative and raises the probability of a
Type I error. For each comparison the hypotheses are:
Ho: mean A = mean B,
Ha: mean A ≠ mean B.

Where A and B could be any possible pair.
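The span-dependent critical ranges can be sketched with `scipy.stats.studentized_range` (available in SciPy 1.7+); the ANOVA quantities below are hypothetical illustration values:

```python
import math
from scipy.stats import studentized_range

# Hypothetical ANOVA quantities (illustration only)
ms_within = 4.0   # mean square within
n = 10            # scores per group (equal sizes assumed)
df_within = 27    # within-groups degrees of freedom
alpha = 0.05

# In Newman-Keuls, two means that are r steps apart in the ordered list
# are compared against a critical range based on q(alpha, r, df) -- so
# adjacent means face a smaller critical range than widely separated ones.
for r in (2, 3, 4):
    q_crit = studentized_range.ppf(1 - alpha, r, df_within)
    critical_range = q_crit * math.sqrt(ms_within / n)
    print(f"span r = {r}: critical range = {critical_range:.3f}")
```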

10. Rodger’s Method


Some consider this to be the most powerful post hoc test for detecting group
differences. The test protects against the loss of statistical power as the degrees of
freedom increase.

11. Scheffé’s Test


Scheffé's Test is another popular post hoc test. It works similarly to
Tukey's HSD in that it adjusts the test statistic for the number of comparisons made,
but in a slightly different way. As a result, the test is less likely to make a Type I
error but has less power to detect effects. Looking at the confidence intervals from
Scheffé's test, we can see the following:

Table 1: Confidence intervals given by Scheffe’s test


Comparison Difference Scheffé CI
None vs Relevant 40.60 (28.35, 52.85)
None vs Unrelated 19.50 (7.25, 31.75)
Relevant vs Unrelated 21.10 (8.85, 33.35)

As we can see, these are somewhat broader than the intervals we got with Tukey's
HSD. If all other components are equivalent, this indicates that they are more likely to
contain zero. However, the findings are the same in our case, and we infer that the
three groups are different once more.
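One common form of Scheffé's criterion for a pairwise contrast scales an F quantile by (k − 1). The sketch below computes the resulting margin under assumed equal group sizes and hypothetical ANOVA numbers:

```python
import math
from scipy import stats

# Hypothetical ANOVA quantities (illustration only)
k = 3             # number of groups
n = 10            # scores per group (equal sizes assumed)
ms_within = 4.0   # mean square within
alpha = 0.05
df_between, df_within = k - 1, k * (n - 1)

# Scheffe's critical value scales an F quantile by (k - 1)
f_crit = stats.f.ppf(1 - alpha, df_between, df_within)
margin = math.sqrt((k - 1) * f_crit) * math.sqrt(ms_within * (1 / n + 1 / n))

# A pairwise difference larger than `margin` is significant; equivalently,
# the interval (diff - margin, diff + margin) must exclude 0.
print(f"Scheffe margin = {margin:.3f}")
```

Because the F quantile covers every possible contrast, this margin comes out wider than Tukey's HSD margin for the same data, consistent with the wider intervals in Table 1.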

12. Tukey’s Honest Significant Difference


Tukey’s Honest Significant Difference (HSD) is a very popular post hoc
analysis. Like Bonferroni's method, this approach adjusts for the number of
comparisons, but it adjusts the test statistic itself when comparing two groups. Each
comparison gives us an estimate of the difference between the groups as well as a
confidence interval for that estimate. This confidence interval is used the same way as
a confidence interval for a traditional independent-samples t-test: if it contains 0.00,
the groups are not different; if it does not contain 0.00, the groups are different.
The following is a list of the differences between the group means, along with Tukey's
HSD confidence intervals for those differences:

Table 2: Differences between the group means and the Tukey’s HSD confidence
intervals
Comparison Difference Tukey’s HSD CI
None vs Relevant 40.60 (28.87, 52.33)
None vs Unrelated 19.50 (7.77, 31.23)
Relevant vs Unrelated 21.10 (9.37, 32.83)

As we can see, none of these intervals contain 0.00, so we can conclude that all three
groups are different from one another.
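SciPy ships a ready-made implementation, `scipy.stats.tukey_hsd` (SciPy 1.8+). The sketch below runs it on three made-up samples; the data are illustration values and do not reproduce the numbers in Table 2:

```python
from scipy import stats

# Hypothetical data for three study conditions (illustration only)
none_ = [54, 60, 57, 62, 58, 55]
relevant = [95, 102, 98, 100, 97, 99]
unrelated = [75, 80, 77, 82, 79, 76]

# tukey_hsd performs all pairwise comparisons with the HSD adjustment
res = stats.tukey_hsd(none_, relevant, unrelated)
ci = res.confidence_interval(confidence_level=0.95)

# An interval that excludes 0.00 indicates a significant pairwise difference
for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(f"group {i} vs {j}: p = {res.pvalue[i, j]:.4f}, "
          f"CI = ({ci.low[i, j]:.2f}, {ci.high[i, j]:.2f})")
```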

There are many more post hoc tests, and they all approach the task in different
ways, some being more conservative and others more powerful. In general, though,
they will give highly similar answers. What is critical is being able to interpret a post
hoc analysis. If you are given confidence intervals for a post hoc analysis, read them
the same way you did confidence intervals in chapter 10: if they include zero, there is
no difference; if they do not include zero, there is.

References:

Foster et al. (2021, May 2). “Post Hoc Tests.” University of Missouri-St. Louis,
Rice University, & University of Houston, Downtown Campus. Retrieved
May 17, 2021, from https://stats.libretexts.org/@go/page/7154

Glen, Stephanie. (2015). “Post Hoc Definition and Types of Tests.” From
StatisticsHowTo.com: Elementary Statistics for the rest of us! Retrieved May
17, 2021, from https://www.statisticshowto.com/probability-and-statistics/statistics-
definitions/post-hoc

Glen, Stephanie. (2017). “Dunnett’s Test / Dunnett’s Method: Definition.” From
StatisticsHowTo.com: Elementary Statistics for the rest of us! Retrieved May
20, 2021, from https://www.statisticshowto.com/dunnetts-test/

Glen, Stephanie. (2021). “How to Calculate the Least Significant Difference (LSD).”
From StatisticsHowTo.com: Elementary Statistics for the rest of us! Retrieved
May 20, 2021, from https://www.statisticshowto.com/how-to-calculate-the-least-
significant-difference-lsd/

McNeese, Bill. (2009). “Comparing Multiple Treatment Means: Bonferroni's Method.”
BPI Consulting, LLC. Retrieved May 17, 2021, from
https://www.spcforexcel.com/knowledge/comparing-processes/bonferronis-
method

Vidhi, J. “Duncan’s Multiple Range Test (With Diagram) | Statistics.” Retrieved May
20, 2021, from https://www.biologydiscussion.com/vegetable-breeding/duncans-
multiple-range-test-with-diagram-statistics/68180
