
LESSON 3.5: POST HOC ANALYSIS


As noted earlier, the primary advantage of ANOVA (compared to t tests) is that it allows researchers to
test for significant mean differences when there are more than two treatment conditions. ANOVA
accomplishes this feat by comparing all the individual mean differences simultaneously within a single
test. Unfortunately, the process of combining several mean differences into a single test statistic creates
some difficulty when it is time to interpret the outcome of the test.

Specifically, when you obtain a significant F-ratio (reject H0), it simply indicates that somewhere among
the entire set of mean differences there is at least one that is statistically significant. In other words, the
overall F-ratio only tells you that a significant difference exists; it does not tell you exactly which means
are significantly different and which are not.

Consider, for example, a research study that uses three samples to compare three treatment conditions.
Suppose that the three sample means are M1 = 3, M2 = 5, and M3 = 10. In this hypothetical study there are
three mean differences:

1. There is a 2-point difference between M1 and M2.

2. There is a 5-point difference between M2 and M3.


3. There is a 7-point difference between M1 and M3.

If an ANOVA were used to evaluate these data, a significant F-ratio would indicate that at least one of the
sample mean differences is large enough to satisfy the criterion of statistical significance. In this example,
the 7-point difference is the biggest of the three and, therefore, it must indicate a significant difference
between the first treatment and the third treatment (𝜇1 ≠ 𝜇3). But what about the 5-point difference? Is
it also large enough to be significant? And what about the 2-point difference between M1 and M2? Is it
also significant? The purpose of post hoc tests is to answer these questions.

 Post hoc tests – or posttests – are additional hypothesis tests done after an ANOVA to
determine exactly which mean differences are significant and which are not.

As the name implies, post hoc tests are done after an ANOVA. More specifically, these tests are
done after an ANOVA when
1. You reject H0 and
2. There are three or more treatments (k ≥ 3)
Answer the following questions:

1. Imagine you are trapped in a dark room with no light for about one week. What
would your reaction be?
2. What is the greatest lesson you learned from your parents?
3. If you were blinded by the challenges in life, how were you able to overcome
them?

Rejecting H0 indicates that at least one difference exists among the treatments. If there are only two
treatments, then there is no question about which means are different and, therefore, no need for
posttests. However, with three or more treatments (k≥ 3), the problem is to determine exactly which
means are significantly different.

POSTTESTS AND TYPE I ERRORS

In general, a post hoc test enables you to go back through the data and compare the individual
treatments two at a time. In statistical terms, this is called making pairwise comparisons. For
example, with k = 3, we would compare 𝜇1 versus 𝜇 2, then 𝜇 2 versus 𝜇 3, and then 𝜇 1 versus
𝜇 3. In each case, we are looking for a significant mean difference. The process of conducting
pairwise comparisons involves performing a series of separate hypothesis tests, and each of these
tests includes the risk of a Type I error. As you do more and more separate tests, the risk of a
Type I error accumulates and is called the experimentwise alpha level.

We have seen, for example, that a research study with three treatment conditions produces three
separate mean differences, each of which could be evaluated using a post hoc test. If each test
uses 𝑎 = .05, then there is a 5% risk of a Type I error for the first posttest, another 5% risk for
the second test, and one more 5% risk for the third test. Although the probability of error is not
simply the sum across the three tests, it should be clear that increasing the number of separate
tests definitely increases the total, experimentwise probability of a Type I error. Whenever you
are conducting posttests, you must be concerned about the experimentwise alpha level.
Statisticians have worked on this problem and have developed several methods for controlling
Type I errors in the context of post hoc tests. Several of these alternatives are considered below.
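As a rough illustration, here is a minimal Python sketch of how quickly the experimentwise alpha level
grows as the number of pairwise comparisons increases. The 1 − (1 − α)^c figure assumes the separate
tests are independent, which is only an approximation for pairwise comparisons on the same data.

```python
# Sketch: growth of the experimentwise Type I error rate when several
# pairwise tests are each run at alpha = .05.
# (Pairwise comparisons on the same data are not strictly independent,
# so 1 - (1 - alpha)**c is only an approximation of the true risk.)

alpha = 0.05

for k in (3, 4, 5):                       # number of treatments
    c = k * (k - 1) // 2                  # number of pairwise comparisons
    experimentwise = 1 - (1 - alpha) ** c
    print(f"k = {k}: {c} comparisons, experimentwise alpha is about {experimentwise:.3f}")
```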
DIFFERENT KINDS OF POST HOC ANALYSIS

Tukey’s Honestly Significant Difference (HSD) test

- The first post hoc test we consider is Tukey’s HSD test. We selected Tukey’s HSD test because
it is a commonly used test in psychological research.

- Tukey’s test allows you to compute a single value that determines the minimum difference
between treatment means that is necessary for significance.

- This value, called the honestly significant difference, or HSD, is then used to compare any two
treatment conditions.

- If the mean difference exceeds Tukey’s HSD, then you conclude that there is a significant
difference between the treatments. Otherwise, you cannot conclude that the treatments are
significantly different.

The formula for Tukey’s HSD is

HSD = q √(MSwithin / n)

where the value of q is found in Table B.5 (Appendix B, p. 708), MSwithin is the within
treatments variance from the ANOVA, and n is the number of scores in each treatment. Tukey’s
test requires that the sample size, n, be the same for all treatments. To locate the appropriate
value of q, you must know the number of treatments in the overall experiment (k), the degrees
of freedom for MSwithin (the error term in the F-ratio), and you must select an alpha level
(generally the same 𝑎 used for the ANOVA).

Example:

To demonstrate the procedure for conducting post hoc tests with Tukey’s HSD, we use a
hypothetical study comparing scores in three different treatment conditions, together with the
summary statistics for each sample and the results from the overall ANOVA. With k = 3
treatments, dfwithin = 24, and α = .05, you should find that the value of q for the test is q = 3.53.
Therefore, Tukey’s HSD works out to 2.36.

Thus, the mean difference between any two samples must be at least 2.36 to be significant. Using
this value, we can make the following conclusions (a short Python sketch of the same comparisons
follows the list):

1. Treatment A is significantly different from treatment B (MA - MB = 2.44).


2. Treatment A is also significantly different from treatment C (MA -MC = 4.00).
3. Treatment B is not significantly different from treatment C (MB - MC = 1.56).
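The same comparisons can be written as a short Python sketch. The value q = 3.53 and the three mean
differences are taken from the example above; MSwithin = 4.00 and n = 9 are assumed values chosen to
be consistent with dfwithin = 24 and an HSD of 2.36 rather than figures quoted directly from the example.

```python
from math import sqrt

# Assumed inputs for the example above: q comes from the Studentized range
# table with k = 3 treatments and df = 24 at alpha = .05; MS_within = 4.00
# and n = 9 are consistent with df_within = 24 and an HSD of 2.36.
q = 3.53
ms_within = 4.00
n = 9

hsd = q * sqrt(ms_within / n)            # Tukey's honestly significant difference
print(f"HSD = {hsd:.2f}")                # about 2.36

# Mean differences reported in the example
differences = {"A vs B": 2.44, "A vs C": 4.00, "B vs C": 1.56}
for pair, diff in differences.items():
    verdict = "significant" if diff > hsd else "not significant"
    print(f"{pair}: difference = {diff:.2f} -> {verdict}")
```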

Scheffé Test

Because it uses an extremely cautious method for reducing the risk of a Type I error, the
Scheffé test has the distinction of being one of the safest of all possible post hoc tests
(smallest risk of a Type I error). The Scheffé test uses an F-ratio to evaluate the significance
of the difference between any two treatment conditions. The numerator of the F-ratio is an
MSbetween that is calculated using only the two treatments you want to compare. The
denominator is the same MSwithin that was used for the overall ANOVA. The “safety factor”
for the Scheffé test comes from the following two considerations (a small sketch follows
the list).

1. Although you are comparing only two treatments, the Scheffé test uses
the value of k from the original experiment to compute df between
treatments. Thus, df for the numerator of the F-ratio is k - 1.

2. The critical value for the Scheffé F-ratio is the same as was used to
evaluate the F-ratio from the overall ANOVA. Thus, Scheffé requires
that every posttest satisfy the same criterion that was used for the
complete ANOVA.
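A hedged sketch of a single Scheffé comparison in Python, using hypothetical summary values (the two
treatment means, n per group, MSwithin, dfwithin, and k are all assumed for illustration): SSbetween is
computed from only the two means being compared, while the degrees of freedom and the critical value
come from the full experiment.

```python
from scipy import stats

# Hypothetical summary values (not taken from the text): two treatment means,
# equal n per treatment, MS_within and df_within from the overall ANOVA,
# and k = number of treatments in the original experiment.
m1, m2 = 3.0, 10.0
n = 9
k = 3
ms_within, df_within = 4.00, 24
alpha = 0.05

grand = (m1 + m2) / 2
ss_between = n * ((m1 - grand) ** 2 + (m2 - grand) ** 2)   # only the two treatments compared
ms_between = ss_between / (k - 1)                          # df uses k from the full experiment
f_scheffe = ms_between / ms_within

f_crit = stats.f.ppf(1 - alpha, k - 1, df_within)          # same criterion as the overall ANOVA
print(f"Scheffe F = {f_scheffe:.2f}, critical F = {f_crit:.2f}")
```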
Bonferroni Procedure (Bonferroni Correction)

- This multiple-comparison post hoc correction is used when you are performing many
independent or dependent statistical tests at the same time.
- The problem with running many simultaneous tests is that the probability of
a significant result increases with each test run.
- This post hoc test sets the significance cutoff at α/n.

For example, if you are running 20 simultaneous tests at α = 0.05, the corrected cutoff would be
0.05/20 = 0.0025. The Bonferroni correction does suffer from a loss of power: because each
individual test uses such a strict cutoff, Type II error rates are high for each test. In other words, it
overcorrects for Type I errors.

Holm–Bonferroni Method

- The ordinary Bonferroni method is sometimes viewed as too conservative. Holm’s
sequential Bonferroni post hoc test is a less strict correction for multiple comparisons, as
shown in the sketch below.
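A brief Python sketch of both corrections with hypothetical p-values: the plain Bonferroni rule tests
everything against α/n, while Holm’s step-down version tests the smallest p-value against α/n, the next
smallest against α/(n − 1), and so on, stopping at the first failure.

```python
# Hypothetical p-values from several simultaneous tests
p_values = [0.001, 0.012, 0.021, 0.040, 0.300]
alpha = 0.05
n = len(p_values)

# Plain Bonferroni: every test uses the same cutoff alpha / n
bonferroni_cutoff = alpha / n
bonf = [p <= bonferroni_cutoff for p in p_values]

# Holm-Bonferroni: sort the p-values, test the smallest against alpha / n,
# the next against alpha / (n - 1), and stop rejecting at the first failure.
holm = [False] * n
order = sorted(range(n), key=lambda i: p_values[i])
for rank, i in enumerate(order):
    if p_values[i] <= alpha / (n - rank):
        holm[i] = True
    else:
        break

print("Bonferroni rejections:", bonf)   # only the smallest p-value passes 0.01
print("Holm rejections:      ", holm)   # the step-down rule passes one more
```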

Duncan’s new multiple range test (MRT)

- When you run an Analysis of Variance (ANOVA), the results will tell you whether there is a difference in
means. However, they won’t pinpoint the pairs of means that are different. Duncan’s Multiple
Range Test will identify the pairs of means (from at least three) that differ. The MRT is similar to
the LSD, but a Q value is used instead of a t-value.

Fisher’s Least Significant Difference (LSD)

- A tool to identify which pairs of means are statistically different. Essentially the same
as Duncan’s MRT, but with t-values instead of Q values.

Newman-Keuls
- Like Tukey’s, this post hoc test identifies sample means that are different from each other.
Newman-Keuls uses different critical values for comparing pairs of means. Therefore, it is
more likely to find significant differences.

Rodger’s Method
- Considered by some to be the most powerful post hoc test for detecting differences among
groups. This test protects against loss of statistical power as the degrees of freedom increase.

Dunnett’s correction
- Like Tukey’s, this post hoc test is used to compare means. Unlike Tukey’s, it compares every
mean to a control mean.

Benjamini–Hochberg (BH) procedure

- If you perform a very large number of tests, one or more of the tests will have a significant
result purely by chance alone. This post hoc test accounts for that false discovery rate, as
sketched below.
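A small Python sketch of the Benjamini–Hochberg step-up rule with hypothetical p-values: rank the
p-values from smallest to largest, find the largest rank i with p(i) ≤ (i/m)·q, and call everything up to that
rank significant. Here q is the desired false discovery rate, not the Studentized range statistic used earlier.

```python
# Hypothetical p-values from m simultaneous tests, and a desired false discovery rate q
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
q = 0.05
m = len(p_values)

# Step-up rule: find the largest rank i (1-based, smallest p first) with
# p(i) <= (i / m) * q, then reject every test with a p-value at or below p(i).
p_sorted = sorted(p_values)
cutoff = None
for i, p in enumerate(p_sorted, start=1):
    if p <= (i / m) * q:
        cutoff = p                       # keep the largest passing p-value

reject = [p <= cutoff for p in p_values] if cutoff is not None else [False] * m
print("BH-significant:", reject)         # only the two smallest p-values pass here
```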
Answer the following:

1. With k=2 treatments, are post hoc tests necessary when the null hypothesis
is rejected? Explain why or why not.

2. An ANOVA comparing three treatments produces an overall F-ratio with df = 2, 27.
If the Scheffé test was used to compare two of the three treatments, then
the Scheffé F-ratio would also have df = 2, 27. (True or false?)
LESSON 3.6: TWO-WAY ANALYSIS OF VARIANCE (ANOVA)
Having learned about the basic one-way Analysis of Variance, which is used to test the
equality of three or more means, we are now better prepared to study the two-way
ANOVA.

Listen to the song of James Ingram, “There’s No Easy Way,” and answer
the following questions:
1. Which line of the song strikes you the most?
2. What does the song convey?
3. TRUE or FALSE: There’s no easy way to break somebody’s heart.

Two-Way ANOVA - is used to estimate how the mean of a quantitative variable changes
according to the levels of two categorical variables.

- Used when you want to know how two independent variables, in combination,
affect a dependent variable.

Example:

You are researching which type of fertilizer and planting density produces the greatest
crop yield in a field experiment. You assign different plots in a field to a combination of fertilizer
type (1, 2, or 3) and planting density (1=low density, 2=high density), and measure the final
crop yield in bushels per acre at harvest time.

You can use a two-way ANOVA to find out if fertilizer type and planting density have an
effect on average crop yield.

How does the ANOVA test work?

- ANOVA tests for statistical significance using the F-test. The F-test is a groupwise
comparison test, which means it compares the variance in each group mean to the
overall variance in the dependent variable.

- If the variance within groups is smaller than the variance between groups, the
F-test will find a higher F-value, and therefore a higher likelihood that the
difference observed is real and not due to chance.

A two-way ANOVA with interaction tests three null hypotheses at the same time:
1. There is no difference in group means at any level of the first independent variable.
2. There is no difference in group means at any level of the second independent variable.
3. The effect of one independent variable does not depend on the effect of the other
independent variable (a.k.a. no interaction effect).
A two-way ANOVA without interaction (a.k.a. an additive two-way ANOVA) only tests the
first two of these hypotheses.

Assumptions of the two-way ANOVA

To use a two-way ANOVA your data should meet certain assumptions. Two-way ANOVA makes
all of the normal assumptions of a parametric test of difference:
1. Homogeneity of variance (a.k.a. homoscedasticity)

The variation around the mean for each group being compared should be similar among
all groups. If your data don’t meet this assumption, you may be able to use a
non-parametric alternative, like the Kruskal-Wallis test.

2. Independence of observations

Your independent variables should not be dependent on one another (i.e. one should
not cause the other). This is impossible to test with categorical variables – it can only
be ensured by good experimental design.

In addition, your dependent variable should represent unique observations – that is,
your observations should not be grouped within locations or individuals.

If your data don’t meet this assumption (i.e. if you set up experimental treatments
within blocks), you can include a blocking variable and/or use a repeated-measures
ANOVA.

3. Normally-distributed dependent variable

The values of the dependent variable should follow a bell curve. If your data don’t meet
this assumption, you can try a data transformation.

What is the difference between data for a One-Way ANOVA and a Two-Way ANOVA?
How to perform the two-way ANOVA?

Example:

Does noise have an effect on the students’ scores?

Does sex have an effect on the students’ scores?

Does sex affect how students react to noise?

Step 1: State the hypotheses

Null hypotheses: noise has no effect on scores, sex has no effect on scores, and the effect
of noise does not depend on sex (no interaction).


Step 2: Get the column and rows total

Step 3: Substitute with the formula

Step 4: Get the sum of squares total (SST).


Step 5: Get the sum of squares of column for Variation of Noise:

Step 6: Get the sum of squares of column for Variation of Sex:

Step 7: Get the sum of each group


Step 8: Get the sum of squares within group

= 16.33

Step 9: Get the residual sum of squares (SSE) (Error)

Step 10: Create a table


Step 11: Get the F-ratios

Step 12: Get the critical values at α = 0.05

Step 13: Create an interpretation (a Python sketch of the whole procedure follows these steps)
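To see how the same analysis is usually carried out in software, here is a hedged sketch in Python with
made-up scores; the column names score, noise, and sex are assumptions chosen to mirror the questions
above. statsmodels’ anova_lm reports an F-ratio and p-value for each main effect and for the interaction,
which is the interpretation step in one table.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data: scores for male/female students tested in quiet and noisy conditions
data = pd.DataFrame({
    "score": [12, 14, 11, 13, 9, 8, 10, 7, 15, 16, 14, 13, 6, 7, 5, 8],
    "noise": ["quiet"] * 4 + ["noisy"] * 4 + ["quiet"] * 4 + ["noisy"] * 4,
    "sex":   ["male"] * 8 + ["female"] * 8,
})

# Two-way ANOVA with interaction: score ~ noise + sex + noise:sex
model = ols("score ~ C(noise) * C(sex)", data=data).fit()
table = sm.stats.anova_lm(model, typ=2)   # Type II sums of squares
print(table)                              # F and p for each main effect and for the interaction
```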


LESSON 4: CORRELATION

In different fields like Psychology and Education, correlation is used as a measure of the
relationship between test scores and other measures of performance. With its help, it is
possible to form a correct idea of the working capacity of a person.

Answer the following:

1. What makes your relationship stronger?

2. If you have ever been in a relationship, which did you prefer: keeping it private or
public?
3. What are the things you would want to give up, or hold on to, in order to make your
relationship stronger?

Statisticians use a measure called the correlation coefficient to determine the strength of
the linear relationship between two variables. There are several types of correlation coefficients:

 The population correlation coefficient - denoted by the Greek letter 𝜌; it is the correlation
computed by using all possible pairs of data values (x, y) taken from a population.

 The linear correlation coefficient - computed from the sample data; it measures the strength
and direction of a linear relationship between two quantitative variables. The symbol for the
sample correlation coefficient is r.

The linear correlation coefficient explained in this section is called the Pearson product moment
correlation coefficient (PPMC), named after statistician Karl Pearson, who pioneered the research in this
area.

The range of the linear correlation coefficient is from −1 to +1. If there is a strong positive linear
relationship between the variables, the value of r will be close to +1. If there is a strong negative linear
relationship between the variables, the value of r will be close to −1.

When there is no linear relationship between the variables or only a weak relationship, the value
of r will be close to 0.
When the value of r is 0 or close to zero, it implies only that there is no linear relationship between
the variables. The data may be related in some other nonlinear way.
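As a quick illustration, here is a minimal Python sketch computing r for a small set of hypothetical
paired scores (the hours-studied and test-score values are made up); scipy’s pearsonr returns both the
sample correlation coefficient and a p-value for the test of no linear relationship.

```python
from scipy import stats

# Hypothetical paired data: hours studied (x) and test score (y)
x = [2, 4, 5, 7, 8, 10]
y = [55, 60, 62, 70, 75, 82]

r, p_value = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p_value:.4f}")   # r close to +1 means a strong positive linear relationship
```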
Levene’s Test for Equality of Variances

- It is used to check that variances are equal for all samples when your data come
from a non-normal distribution. You can use Levene’s test to check the assumption of
equal variances before running a test like a One-Way ANOVA.

The null hypothesis for Levene’s test is that the variances are equal across all samples. In more formal
terms, that’s written as:

H0: σ1² = σ2² = … = σk²

The alternate hypothesis (the one you’re testing) is that the variances are not equal for at least one pair:

Ha: σi² ≠ σj² for at least one pair (i, j)

The test statistic is a little ugly and involves a few summations:

W = [(N − k) / (k − 1)] · [ Σi ni (Z̄i − Z̄)² ] / [ Σi Σj (Zij − Z̄i)² ]

where N is the total number of scores, k is the number of groups, ni is the number of scores in group i,
Z̄i is the mean of the Zij in group i, and Z̄ is the overall mean of all the Zij.

Zij can take on three meanings, depending on whether you use the mean, median, or trimmed mean of
any subgroup (for example, Zij = |Yij − Ȳi| when the group mean is used). The three choices actually
determine the robustness and power of the test.

 Robustness is a measure of how well the test avoids falsely reporting unequal variances (when the
variances are actually equal).
 Power is a measure of how well the test correctly reports unequal variances.

According to Brown and Forsythe:

 Trimmed means work best with heavy-tailed distributions like the Cauchy distribution.
 For skewed distributions, or if you aren’t sure about the underlying shape of the distribution, the median may
be your best choice.
 For symmetric and moderately tailed distributions, use the mean.
Levene’s test is built into most statistical software. For example, the Independent Samples T Test in
SPSS generates a “Levene’s Test for Equality of Variances” column as part of the output. The result from
the test is reported as a p-value, which you can compare to your alpha level for the test. If the p-value is
larger than the alpha level, then you can say that the null hypothesis stands — that the variances are
equal; if the p-value is smaller than the alpha level, then the implication is that the variances are unequal.
Example:

Boys    Girls
26      36
15      32
21      42
20      33
38      29
19      29
28      46
49      33
18      25
16      35

First Step: Get the mean of each group

Boys    Girls
26      36
15      32
21      42
20      33
38      29
19      29
28      46
49      33
18      25
16      35
Mean: 25    Mean: 34
Second Step: Get the absolute differences of x-𝜇

Boys    Girls    Xboys = |x − μ|    Xgirls = |x − μ|
26      36       1                  2
15      32       10                 2
21      42       4                  8
20      33       5                  1
38      29       13                 5
19      29       6                  5
28      46       3                  12
49      33       24                 1
18      25       7                  9
16      35       9                  1
Mean: 25    Mean: 34

Third step: Get the mean of the absolute differences |x − μ| for each group

Boys    Girls    Xboys    Xgirls
26      36       1        2
15      32       10       2
21      42       4        8
20      33       5        1
38      29       13       5
19      29       6        5
28      46       3        12
49      33       24       1
18      25       7        9
16      35       9        1
Mean: 25    Mean: 34    Mean: 8.2    Mean: 4.6
Fourth step: Get the overall mean of the absolute differences across both groups

Boys    Girls    Xboys    Xgirls
26      36       1        2
15      32       10       2
21      42       4        8
20      33       5        1
38      29       13       5
19      29       6        5
28      46       3        12
49      33       24       1
18      25       7        9
16      35       9        1
Mean: 25    Mean: 34    Mean: 8.2    Mean: 4.6
Overall mean: 6.4

Fifth step: Subtract the overall mean (6.4) from each absolute difference to get the deviations xb and xg

Boys    Girls    Xboys    Xgirls    xb       xg
26      36       1        2         -5.4     -4.4
15      32       10       2         3.6      -4.4
21      42       4        8         -2.4     1.6
20      33       5        1         -1.4     -5.4
38      29       13       5         6.6      -1.4
19      29       6        5         -0.4     -1.4
28      46       3        12        -3.4     5.6
49      33       24       1         17.6     -5.4
18      25       7        9         0.6      2.6
16      35       9        1         2.6      -5.4
Mean: 25    Mean: 34    Mean: 8.2    Mean: 4.6
Overall mean: 6.4
Sixth step: Square the deviations xb and xg

Boys    Girls    Xboys    Xgirls    xb       xg       xb²      xg²
26      36       1        2         -5.4     -4.4     29.16    19.36
15      32       10       2         3.6      -4.4     12.96    19.36
21      42       4        8         -2.4     1.6      5.76     2.56
20      33       5        1         -1.4     -5.4     1.96     29.16
38      29       13       5         6.6      -1.4     43.56    1.96
19      29       6        5         -0.4     -1.4     0.16     1.96
28      46       3        12        -3.4     5.6      11.56    31.36
49      33       24       1         17.6     -5.4     309.76   29.16
18      25       7        9         0.6      2.6      0.36     6.76
16      35       9        1         2.6      -5.4     6.76     29.16
Mean: 25    Mean: 34    Mean: 8.2    Mean: 4.6
Overall mean: 6.4

Seventh step: Get the sum of xb² and xg², and the overall sum of squares

Boys    Girls    Xboys    Xgirls    xb       xg       xb²      xg²
26      36       1        2         -5.4     -4.4     29.16    19.36
15      32       10       2         3.6      -4.4     12.96    19.36
21      42       4        8         -2.4     1.6      5.76     2.56
20      33       5        1         -1.4     -5.4     1.96     29.16
38      29       13       5         6.6      -1.4     43.56    1.96
19      29       6        5         -0.4     -1.4     0.16     1.96
28      46       3        12        -3.4     5.6      11.56    31.36
49      33       24       1         17.6     -5.4     309.76   29.16
18      25       7        9         0.6      2.6      0.36     6.76
16      35       9        1         2.6      -5.4     6.76     29.16
Mean: 25    Mean: 34    Mean: 8.2    Mean: 4.6        Sum: 422    Sum: 170.8
Overall mean: 6.4                                     Sum of squares total: 592.8
Eighth step: Set up the ANOVA-style table and enter the total sum of squares

SS Df MS F
Between

Within

Total 592.8

Ninth step: Get the degrees of freedom between, within and total

SS Df MS F
Between 1

Within 18

Total 592.8 19

Tenth step: Get the sum of squares between groups (Excel formula: =10*((8.2-6.4)^2+(4.6-6.4)^2)),
which gives SSb = 10[(8.2 − 6.4)² + (4.6 − 6.4)²] = 64.8

SS        Df    MS    F
Between   64.8  1

Within          18

Total     592.8 19

Eleventh step: Get the sum of squares within by subtracting sum of squares total and sum of
squares between
SSw= SSt – SSb

SS Df MS F
Between 64.8 1

Within 528 18

Total 592.8 19
Twelfth step: Get the mean sum of squares by dividing each sum of squares by its degrees of
freedom

SS Df MS F
Between 64.8 1 64.8

Within 528 18 29.33

Total 592.8 19

Thirteenth step: Get the F-value by dividing MS between by MS within

SS Df MS F
Between 64.8 1 64.8 2.20934..

Within 528 18 29.33

Total 592.8 19

Fourteenth step: Get the significance by comparing the obtained F-value with the critical value of F at
df = 1, 18. At α = .05 the critical value is 4.41; because F = 2.21 is smaller than 4.41, the null hypothesis
is retained and the variances are treated as equal.
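As a check on the hand computation, here is a short Python sketch using scipy’s built-in Levene’s test.
Setting center='mean' matches the absolute-deviations-from-the-mean approach worked out above and
should reproduce the F-ratio of about 2.21.

```python
from scipy import stats

# The same data used in the worked example above
boys  = [26, 15, 21, 20, 38, 19, 28, 49, 18, 16]
girls = [36, 32, 42, 33, 29, 29, 46, 33, 25, 35]

# center='mean' uses |x - group mean|, matching the hand calculation;
# the default center='median' gives the Brown-Forsythe variant.
w, p = stats.levene(boys, girls, center="mean")
print(f"W = {w:.2f}, p = {p:.3f}")   # W is the F-ratio computed above (about 2.21)
```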
