
9 – Introduction to Factorial Experiments

Two-way ANOVA
Example - Capsule Dissolving Experiment (Capsule.JMP)

In this experiment researchers are interested in studying the effect of two factors or
treatments on the time to begin dissolving a capsule which is recorded as the time until
bubbles first appear (seconds). The factors of interest to the researchers are digestive
juice type - gastric or duodenal (Factor A) and capsule type - C or V (Factor B).

To conduct the experiment 5 capsules of each type are randomly assigned to each juice
giving us 5 observations or replicates for each of the four treatment combinations
(Gastric & C, Gastric & V, Duodenal & C, Duodenal & V). The data obtained from the
experiment are shown below:
                                Capsule Type
Type of Digestive Juice      C              V              Juice Type Means
-----------------------      ----           ----           ----------------
Gastric                      39.5           47.4
                             45.7           43.5
                             49.8           39.8           Ȳ1.. = 45.7
                             50.2           36.1
                             63.8           41.2
  (cell means)               Ȳ11. = 49.8    Ȳ12. = 41.6

Duodenal                     33.5           44.0
                             36.7           41.2
                             42.0           47.3           Ȳ2.. = 40.2
                             38.1           45.3
                             31.2           42.7
  (cell means)               Ȳ21. = 36.3    Ȳ22. = 44.1

Capsule Type Means           Ȳ.1. = 43.05   Ȳ.2. = 42.85   Grand Mean Ȳ... = 42.95
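The cell, marginal, and grand means in the table can be checked with a short Python sketch. The dictionary layout and variable names are illustrative choices and are not part of the JMP data file.

```python
from statistics import mean

# Time to bubbles (seconds), keyed by (juice, capsule); values from the table above.
times = {
    ("Gastric", "C"): [39.5, 45.7, 49.8, 50.2, 63.8],
    ("Gastric", "V"): [47.4, 43.5, 39.8, 36.1, 41.2],
    ("Duodenal", "C"): [33.5, 36.7, 42.0, 38.1, 31.2],
    ("Duodenal", "V"): [44.0, 41.2, 47.3, 45.3, 42.7],
}

# Mean of each treatment combination.
cell_means = {cell: mean(obs) for cell, obs in times.items()}

# Marginal means pool all observations that share a level of one factor.
juice_means = {
    juice: mean(y for (j, _), obs in times.items() if j == juice for y in obs)
    for juice in ("Gastric", "Duodenal")
}
capsule_means = {
    cap: mean(y for (_, c), obs in times.items() if c == cap for y in obs)
    for cap in ("C", "V")
}
grand_mean = mean(y for obs in times.values() for y in obs)

for cell, m in cell_means.items():
    print(cell, round(m, 2))
print(juice_means, capsule_means, round(grand_mean, 2))
```

Because every cell has the same number of replicates, each marginal mean is also the simple average of the two cell means it spans.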

We can construct plots to visualize the effects of each factor.


Digestive Juice: By plotting the mean time until bubbles for both digestive juices, we
can see that the mean dissolution time for duodenal juice is slightly smaller than that
for gastric juice (by about 5 seconds).

Capsule Type: By plotting the mean time until bubbles for both capsule types, we see
that the mean dissolution times for the capsule types are approximately equal.

Our preliminary conclusions would be first that fluid type has a small effect on the
dissolution time with duodenal juice dissolving capsules about 5 seconds quicker on
average, and secondly that capsule type has little or no effect.

These conclusions are completely WRONG!! Why?

When considering the effect of two factors on the response we cannot do so marginally,
i.e. individually. It is possible, for example, that the effect of digestive juice is not the
same for both capsule types. If we consider the means for each of the treatment
combinations above we see that for type C capsules the duodenal juice dissolves the
capsule quicker, while the exact opposite is true for type V capsules, gastric juice
dissolves the capsules faster.

A better display shows the means for each treatment combination. Here we have a
separate profile for each digestive juice showing how the capsule effect depends on the
type of digestive juice we are using. This is what we call an interaction.
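One way to put numbers on the interaction is to compute the juice effect separately for each capsule type. The sketch below uses the four treatment-combination means from the data table; the names `juice_effect` and `interaction_contrast` are illustrative only.

```python
# Treatment-combination means from the data table (seconds).
cell_means = {
    ("Gastric", "C"): 49.8, ("Gastric", "V"): 41.6,
    ("Duodenal", "C"): 36.3, ("Duodenal", "V"): 44.1,
}

# Simple effect of juice within each capsule type: Gastric minus Duodenal.
juice_effect = {
    cap: cell_means[("Gastric", cap)] - cell_means[("Duodenal", cap)]
    for cap in ("C", "V")
}

# With no interaction the two simple effects would be about equal,
# so their difference is a rough measure of interaction.
interaction_contrast = juice_effect["C"] - juice_effect["V"]

# Averaging the simple effects gives the marginal juice effect seen earlier.
marginal_juice_effect = (juice_effect["C"] + juice_effect["V"]) / 2

print(juice_effect)           # about +13.5 for C and -2.5 for V
print(interaction_contrast)   # about 16
print(marginal_juice_effect)  # about 5.5
```

The marginal difference of about 5.5 seconds is an average of two simple effects with opposite signs, which is why the marginal plots are misleading here.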

Questions of Interest in Two-way ANOVA:


1) Is there a significant interaction between the two factors being studied?
This question needs to be answered first, because if we conclude there is a
significant interaction then both factors are important and their effects
cannot be discussed individually. If we conclude there isn’t a significant
interaction between the factors being studied then we can test the effects
individually.
2) Is there a significant Factor A effect?
3) Is there a significant Factor B effect?
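The order of these questions can be written as a small decision rule. This is only a sketch: the function name is hypothetical and the 0.05 cutoff is an assumed significance level that the notes do not state. The two p-values passed in are the interaction p-values reported for the two examples in these notes.

```python
def effects_to_interpret(p_interaction, alpha=0.05):
    """Return which comparisons to examine, following the order of questions above.

    alpha = 0.05 is an assumed significance level, not one stated in the notes.
    """
    if p_interaction < alpha:
        # Significant interaction: compare levels of one factor
        # only within a fixed level of the other factor.
        return "compare treatment combinations"
    # No significant interaction: each factor can be examined on its own.
    return "test main effects of A and B separately"


print(effects_to_interpret(0.0049))  # capsule experiment
print(effects_to_interpret(0.9845))  # psychotherapy example
```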

As always it is important to quantify any significant differences using pair-wise
comparisons and CI’s for the differences in the population/treatment means.
Analysis in JMP
To fit the two-way model for these data select Fit Model from the Analyze menu and put
the response Time to Bubbles in the Y box and then highlight both Fluid & Capsule and
select Full Factorial from the Macros pull-down menu as shown below.

Then click Run Model to obtain the results on the next page.

These sections of output can be shut off, as our interest is primarily in identifying
which effects are significant. These results are in the Effect Tests box.

The Fluid*Capsule interaction is significant (p = .0049), so we know both fluid and
capsule type significantly affect the response.

The p-values for the effects suggest that the Fluid*Capsule interaction is significant
(p = .0049), which implies the main effect tests for Fluid and Capsule are of little interest.
It is interesting to note that the main effect of Capsule is not significant (p = .9361). This
happens because the presence of the Fluid*Capsule interaction "masks" the main effect of
capsule, as we saw in the marginal effect plots above. The main effect of fluid is only
partially masked by the Fluid*Capsule interaction and so it still tests as significant.

Because the interaction is significant we can ONLY MEANINGFULLY COMPARE
LEVELS OF ONE FACTOR WHILE HOLDING THE OTHER FACTOR FIXED! For
example, in this experiment we can compare juice types for a given capsule type or
conversely we can compare capsule types for a given digestive juice type. To do this
select LSMeans Tukey HSD from the Fluid*Capsule interaction pull-down menu.

Results of the treatment mean comparisons are shown below.

Here we see that the Gastric,C and Duodenal,C mean dissolution times differ significantly.
In particular, we estimate that type C capsules take between 3.57 and 23.429 seconds
longer, on average, to begin dissolving in gastric fluid than in duodenal fluid. In contrast,
type V capsules appear to dissolve equally well in either digestive juice.
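The interval for the type C capsules can be roughly reproduced by hand. The sketch below assumes the usual equal-sample-size Tukey HSD form, difference ± q·sqrt(MS Error / n), and takes q ≈ 4.05 as the 5% studentized range value for 4 means and 16 error degrees of freedom. That critical value is read from a standard table and is an assumption here; it is not given in the notes.

```python
from math import sqrt
from statistics import mean

# Time to bubbles (seconds), keyed by (juice, capsule).
times = {
    ("Gastric", "C"): [39.5, 45.7, 49.8, 50.2, 63.8],
    ("Gastric", "V"): [47.4, 43.5, 39.8, 36.1, 41.2],
    ("Duodenal", "C"): [33.5, 36.7, 42.0, 38.1, 31.2],
    ("Duodenal", "V"): [44.0, 41.2, 47.3, 45.3, 42.7],
}
n = 5  # replicates per treatment combination
cell_means = {cell: mean(obs) for cell, obs in times.items()}

# Pooled within-cell variance (MS Error) on ab(n - 1) = 16 degrees of freedom.
ss_error = sum((y - cell_means[cell]) ** 2 for cell, obs in times.items() for y in obs)
df_error = len(times) * (n - 1)
ms_error = ss_error / df_error

# ASSUMPTION: 5% studentized range critical value for 4 means and 16 error df,
# taken from a standard table (about 4.05).
q_crit = 4.05
half_width = q_crit * sqrt(ms_error / n)

diff = cell_means[("Gastric", "C")] - cell_means[("Duodenal", "C")]
lower, upper = diff - half_width, diff + half_width
print(round(diff, 2), round(lower, 2), round(upper, 2))
```

With the rounded table value the limits land within a few hundredths of the JMP interval. Applying the same half-width to the type V difference (41.6 − 44.1 = −2.5) gives an interval that contains 0, in line with the conclusion above.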

If the interaction between the two factors is NOT significant, we can use Tukey’s
pair-wise comparisons to compare the mean response across the levels of factor A and
across the levels of factor B individually.

Checking Two-way ANOVA Assumptions (Normality and constant variance)

Assumptions:
1. The observations between and within the treatment combinations are
independent.
2. The response is normally distributed for each treatment combination.
3. The variance of the response is the same for each treatment combination.

To check the constant variance assumption we can examine the residuals plotted vs. the
fitted values and each factor.  The fitted values are simply the observed mean response at
each of the four treatment combinations and the residuals are the deviations from the
treatment combination means. The spread of the residuals, i.e. the spread of the observed
response values about their respective treatment combination means, should be uniform
indicating constant response variation for the different treatment combinations.

A plot of the residuals vs. the fitted values is given each time we fit a model in JMP.  The
resulting plot is shown below:

There appears to be one potential outlier in this plot; otherwise the plot looks fine.
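The fitted values and residuals described above are easy to compute directly. The sketch below is a rough numerical companion to the residual plot rather than a replacement for it; the variable names are illustrative.

```python
from statistics import mean, stdev

# Time to bubbles (seconds), keyed by (juice, capsule).
times = {
    ("Gastric", "C"): [39.5, 45.7, 49.8, 50.2, 63.8],
    ("Gastric", "V"): [47.4, 43.5, 39.8, 36.1, 41.2],
    ("Duodenal", "C"): [33.5, 36.7, 42.0, 38.1, 31.2],
    ("Duodenal", "V"): [44.0, 41.2, 47.3, 45.3, 42.7],
}

# Fitted value = cell mean; residual = observation minus its cell mean.
residuals = {}
for cell, obs in times.items():
    fitted = mean(obs)
    residuals[cell] = [y - fitted for y in obs]

# Sample standard deviation within each cell, for judging constant variance.
cell_sd = {cell: stdev(obs) for cell, obs in times.items()}

# Largest residual in absolute value, a candidate for the potential outlier.
worst_cell, worst_resid = max(
    ((cell, r) for cell, res in residuals.items() for r in res),
    key=lambda pair: abs(pair[1]),
)
print({cell: round(sd, 2) for cell, sd in cell_sd.items()})
print(worst_cell, round(worst_resid, 1))
```

The largest residual belongs to the 63.8-second observation for type C capsules in gastric juice, which sits 14 seconds above its cell mean; this is presumably the point that stands out in the plot.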

To examine the normality assumption we assess the normality of the residuals.  Save the
residuals to the spreadsheet as shown below and use Analyze > Distribution to examine
them. With the exception of two mild outliers, normality seems satisfied.

STATISTICAL DETAILS (FYI, equal sample size case only)

Two-way ANOVA Model

\[ Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \qquad i = 1,\dots,a, \quad j = 1,\dots,b, \quad k = 1,\dots,n \]

where,

$Y_{ijk}$ = kth observed response value when level i of Factor A and level j of Factor B are used.
$\mu$ = overall mean response.
$\alpha_i$ = effect due to the fact level i of Factor A was used.
$\beta_j$ = effect due to the fact level j of Factor B was used.
$(\alpha\beta)_{ij}$ = effect due to the interaction of the ith level of Factor A and the jth level of Factor B.
$\varepsilon_{ijk}$ = the random error, which represents the variation in the response values when the ith level
of Factor A and the jth level of Factor B are used.

We assume that $\varepsilon_{ijk} \sim N(0, \sigma^2)$, i.e. the errors are normal and their variation is constant.

See your text for formulae used to estimate these quantities and those used to test the
hypotheses. The three questions of interest in a two-way ANOVA can be formulated in
terms of these parameter values.
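For a balanced design, the usual estimates of these parameters are simple differences of means. The sketch below applies them to the capsule data. It assumes the standard sum-to-zero constraints on the effects, which these notes leave to the textbook, so treat it as an illustration rather than as the text's own formulae.

```python
from statistics import mean

# Time to bubbles (seconds), keyed by (juice, capsule).
times = {
    ("Gastric", "C"): [39.5, 45.7, 49.8, 50.2, 63.8],
    ("Gastric", "V"): [47.4, 43.5, 39.8, 36.1, 41.2],
    ("Duodenal", "C"): [33.5, 36.7, 42.0, 38.1, 31.2],
    ("Duodenal", "V"): [44.0, 41.2, 47.3, 45.3, 42.7],
}
juices, capsules = ("Gastric", "Duodenal"), ("C", "V")

cell = {k: mean(v) for k, v in times.items()}

# ASSUMPTION: sum-to-zero constraints and equal cell sizes.
mu_hat = mean(list(cell.values()))  # grand mean
alpha_hat = {i: mean(cell[(i, j)] for j in capsules) - mu_hat for i in juices}
beta_hat = {j: mean(cell[(i, j)] for i in juices) - mu_hat for j in capsules}
ab_hat = {
    (i, j): cell[(i, j)] - mu_hat - alpha_hat[i] - beta_hat[j]
    for i in juices
    for j in capsules
}

print(round(mu_hat, 2))
print({k: round(v, 2) for k, v in alpha_hat.items()})
print({k: round(v, 2) for k, v in beta_hat.items()})
print({k: round(v, 2) for k, v in ab_hat.items()})
```

The estimated interaction terms (±4.0) are larger in size than either set of main-effect estimates (±2.75 for juice, ±0.1 for capsule), which matches the picture from the interaction plot.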

1. For testing the interaction between Factors A and B we have:
$H_o: (\alpha\beta)_{ij} = 0$ for all treatment combinations
$H_a: (\alpha\beta)_{ij} \neq 0$ for at least one treatment combination
2. For testing the Factor A effect we have:
$H_o: \alpha_i = 0$ for all i
$H_a: \alpha_i \neq 0$ for at least one i
3. For testing the Factor B effect we have:
$H_o: \beta_j = 0$ for all j
$H_a: \beta_j \neq 0$ for at least one j

As in one-way ANOVA the test procedure decomposes total response variation into
components that measure how much variation in the response is due to Factor A, Factor
B, the interaction between Factors A & B, and random error.

Sum of Squares: $SS_{Total} = SS_A + SS_B + SS_{AB} + SS_{Error}$

Degrees of Freedom: $N - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1) + ab(n - 1)$, where $N = abn$ is
the total number of observations.

SUM OF SQUARES FORMULAE:

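\[
\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (X_{ijk} - \bar{X}_{\cdot\cdot\cdot})^2
= nb\sum_{i=1}^{a} (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2
+ an\sum_{j=1}^{b} (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2
+ n\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot})^2
+ \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (X_{ijk} - \bar{X}_{ij\cdot})^2
\]

Here $X_{ijk}$ denotes the observed response (written $Y_{ijk}$ in the model above), and a dot in
a subscript means the average is taken over that index. The five terms are, in order,
$SS_{Total}$, $SS_A$, $SS_B$, $SS_{AB}$ and $SS_{Error}$.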
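As a worked illustration, the decomposition can be evaluated for the capsule experiment, taking juice as Factor A and capsule type as Factor B, so that a = 2, b = 2 and n = 5. The numbers come from the means in the data table; the error term sums the squared deviations within each of the four cells.

\[
\begin{aligned}
SS_A &= (5)(2)\left[(45.7 - 42.95)^2 + (40.2 - 42.95)^2\right] = 151.25 \\
SS_B &= (2)(5)\left[(43.05 - 42.95)^2 + (42.85 - 42.95)^2\right] = 0.20 \\
SS_{AB} &= 5\left[(4.0)^2 + (-4.0)^2 + (-4.0)^2 + (4.0)^2\right] = 320.00 \\
SS_{Error} &= 319.06 + 70.90 + 69.74 + 22.06 = 481.76 \\
SS_{Total} &= 151.25 + 0.20 + 320.00 + 481.76 = 953.21
\end{aligned}
\]

The degrees of freedom split the same way: 19 = 1 + 1 + 1 + 16.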

MEAN SQUARES (measures of variation)


The mean square for an effect is the effect sum of squares divided by the degrees of
freedom.

\[ MS_{effect} = \frac{SS_{effect}}{df_{effect}} \]

When the null hypothesis of “no effect” is true the mean squares are all estimates of $\sigma^2$,
the common response variance for all treatment combinations. If there is a significant
effect then we expect $MS_{effect} > MS_{Error}$ (within treatment combination
variation).

Testing Effect Significance


For testing the main effects (A & B) and the interaction effect (A × B) we simply
compare the size of $MS_{effect}$ to $MS_{Error}$. If $MS_{effect} \gg MS_{Error}$ we have
evidence that the effect is significant. If $MS_{effect} \approx MS_{Error}$ then we have little evidence
that the effect is significant. This is analogous to the comparison of the between group
variation to the within group variation in One-way ANOVA.

To compare the mean squares we use their ratio, which has an F-distribution when the null
hypothesis of no effect is true.

\[ F_o = \frac{MS_{effect}}{MS_{Error}} \sim F\text{-distribution} \]

(numerator df = df for the effect, denominator df = df for error)

Fo >> 1 will lead to the conclusion that the effect in question significantly impacts the
response. Large Fo values lead to small p-values which support effect significance.
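The F ratios for the capsule experiment can be computed from the balanced-design formulae above. The following is a minimal sketch using only the Python standard library, so it stops at the F ratios and does not compute p-values. The effect labels mirror the JMP output; the other names are illustrative.

```python
from statistics import mean

# Time to bubbles (seconds), keyed by (juice, capsule).
times = {
    ("Gastric", "C"): [39.5, 45.7, 49.8, 50.2, 63.8],
    ("Gastric", "V"): [47.4, 43.5, 39.8, 36.1, 41.2],
    ("Duodenal", "C"): [33.5, 36.7, 42.0, 38.1, 31.2],
    ("Duodenal", "V"): [44.0, 41.2, 47.3, 45.3, 42.7],
}
juices, capsules, n = ("Gastric", "Duodenal"), ("C", "V"), 5
a, b = len(juices), len(capsules)

cell = {k: mean(v) for k, v in times.items()}
grand = mean(y for obs in times.values() for y in obs)
row = {i: mean(cell[(i, j)] for j in capsules) for i in juices}  # juice means
col = {j: mean(cell[(i, j)] for i in juices) for j in capsules}  # capsule means

# Effect sums of squares (equal sample size case only).
ss = {
    "Fluid": n * b * sum((row[i] - grand) ** 2 for i in juices),
    "Capsule": n * a * sum((col[j] - grand) ** 2 for j in capsules),
    "Fluid*Capsule": n * sum(
        (cell[(i, j)] - row[i] - col[j] + grand) ** 2
        for i in juices
        for j in capsules
    ),
}
df = {"Fluid": a - 1, "Capsule": b - 1, "Fluid*Capsule": (a - 1) * (b - 1)}

# Error sum of squares and mean square.
ss_error = sum((y - cell[k]) ** 2 for k, obs in times.items() for y in obs)
df_error = a * b * (n - 1)
ms_error = ss_error / df_error

# F ratio = MS effect / MS Error.
f_ratio = {effect: (ss[effect] / df[effect]) / ms_error for effect in ss}
for effect, f in f_ratio.items():
    print(f"{effect:14s} df={df[effect]}  SS={ss[effect]:8.2f}  F={f:7.3f}")
```

The interaction ratio is far above 1, in line with the small p-value (.0049) in the Effect Tests output, while the Capsule ratio is close to 0 (p = .9361). Turning a ratio into a p-value requires the upper tail of the F-distribution, for example `scipy.stats.f.sf(F, df_effect, df_error)` if SciPy happens to be installed.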

Example 2 – Comparing the Effectiveness of Three Forms of
Psychotherapy for Alleviating Depression

Suppose that a clinical psychologist is interested in comparing the relative effectiveness of
three forms of psychotherapy for alleviating depression. Fifteen individuals are randomly
assigned to each of three treatment groups: cognitive-behavioral, Rogerian, and
assertiveness training. The Depression Scale of the MMPI serves as the response. The
psychologist also wished to incorporate information about the patients’ severity of
depression, so all subjects in the study were classified as having mild, moderate, or
severe depression. Thus we have two factors of interest in this study: the treatment the
patients received and the initial severity of their depression. It is possible that some forms
of therapy may be more effective for certain levels of depression, so a two-way ANOVA would
be an appropriate method of analysis. The results are presented in the table below.

Therapy Mild Moderate Severe


Cognitive-Behavioral 41 51 45
(CB) 43 43 55
50 53 56
54 60
46 58
62
62

Rogerian (R) 56 58 59
47 54 55
45 49 68
46 61 63
49 52
62

Assertiveness Training 43 59 55
(A) 56 46 69
48 58 63
46 54 56
47 62
67

The data are entered in JMP as shown on the left. The first column contains the Therapy
group, the second column contains the Degree of Severity of the patient's depression, and
the last column contains the MMPI Depression score.

Data File: MMPI Depression

In JMP select Analyze > Fit Model and set up the dialog box as shown below. First
highlight both Therapy and Degree of Severity by holding down the <Shift> key and
select Full Factorial from the Macros pull-down menu as shown below.

The output is shown on the next page.

[JMP output: Effect Tests table with test results for the Interaction, Therapy, and Degree of
Severity, followed by the Interaction Plot]

The interaction plot shown on the left shows no signs of non-parallelism, and hence of
interaction, and the p-value in the ANOVA table suggests we have absolutely no evidence
for its significance (p = .9845).

Because the interaction is not significant we can examine the main effects of therapy and
degree of severity separately.

The mean MMPI depression scores differ across therapy (p = .0422). In particular,
the mean depression score for patients in the cognitive-behavioral group appears to
be significantly lower than those for the other two therapies.

As expected, the initial severity of the depression is highly significant (p < .0001). The
mean depression scores follow the expected trend after treatment, with those with the
most severe initial depression having the highest mean depression score and those with
mild initial depression the lowest.

Multiple Comparisons for Therapy and Degree of Severity

We see that the mean MMPI depression score for cognitive-behavioral therapy was
significantly lower than that for patients undergoing Rogerian therapy. We estimate the
mean MMPI depression score is between 0.079 and 9.872 units smaller for patients
receiving cognitive-behavioral therapy vs. Rogerian therapy, regardless of their initial
depression severity.

All levels of initial severity of depression significantly differ from one another in terms of
mean MMPI depression score following therapy regardless of what therapy they
received.
