Professional Documents
Culture Documents
Introduction
To date we have considered designs in which there has been one, unstructured, set of
treatments. Often we are interested in conducting experiments in which we consider
the effect of each of a number of factors (attributes) on the response. Examples
of factors include nutrients, drugs, moisture content, temperature, pressure, colour,
shape, and so on.
For instance we might be interested in the effects of both print colour and screen
colour on the comfort of users of computer screens. We might use print colours of
white, black, yellow and green. These are called the levels of the factor print colour.
For the factor screen colour we might use levels of black, red and blue. Thus we
have a design with two factors. The first has four levels and the second has three
levels.
We define a treatment combination to be the specific levels of each of the factors
applied to a particular experimental unit.
In a complete factorial design each of the treatment combinations appears once.
In a replicated complete factorial experiment, each of the treatment combinations
appears the same number of times.
The goal of a factorial experiment is to say what the effect of each of the factors is
on the response, and whether two or more of the factors interact.
Factors can be quantitative (have levels that are some metrical values) or qualita-
tive (levels are categories). Both of the factors print colour and screen colour are
qualitative whereas factors like temperature and pressure are quantitative.
Consider an experiment involving two factors. The two factors are said to be ad-
ditive if the effect on the response of a change in the levels of one of the factors is
independent of the level of the other factor. If this is not true then the two factors
are said to interact.
A simple effect is the mean change in the response in one factor when we fix the
other factor to a single level
We define a simple effect of factor A to be a contrast between the levels of factor A
for a fixed level of factor B. Thus if a = b = 2 we get
for the simple effects of B. If a = 3 and b = 4 then there are various different ways
that we could define the simple effects of A and B.
B
A 1 2
1
Main effects
A main effect is the mean change in response averaged over all levels of the other
factor. We can also think of a main effect as the average of all of the simple effects
for that factor.
The main effect of a factor is the average of all the simple effects of that factor (all
defined using the same contrast) or, equivalently, it is the contrast applied to the
row (or column) means. To calculate the main effect of a factor F with f levels we
can use any set of f − l orthogonal contrasts. Thus if a = b = 2 we get
An interaction effect measures how different the simple effects for one factor are for
each level of another factor. If the simple effects are quite different, then the two
factors are said to interact. That is, the effect of one factor on the response depends
on which level the other factor is at.
An interaction effect of factors A and B is a contrast between the simple effects of A
or, equivalently, between the simple effects of B. Again to calculate the interaction
effect between two factors with levels a and b we must use a set of (a − 1)(b − 1)
orthogonal contrasts. Thus if a = b = 2 we get
We can investigate main effects using main effects plots, and both simple effects and
interaction effects using interaction plots. To obtain a main effects plot in Minitab,
we use Stat > ANOVA > Main Effects Plot. To obtain an interaction plot, we
use Stat > ANOVA > Interactions Plot.
EXAMPLE 1.
Artificial data to illustrate the difference between models with only row effects, only
column effects, additive effects of rows and columns, and a two-factor model with
interaction.
Data Set 1
9, 10, 11 8, 11, 12 9, 11, 13
28, 29, 30 27, 30, 31 31, 30, 29
Data Set 2
9, 10, 11 19, 20, 21 39, 40, 41
8, 10, 11 20, 21, 22 38, 40, 42
Data Set 3
18, 20, 22 27, 31, 33 48, 51, 54
36, 39, 41 47, 51, 53 69, 70, 71
Data Set 4
19, 21, 23 29, 33, 35 52, 55, 58
48, 47, 49 57, 61, 63 31, 30, 29
I n t e r a c t io n P lo t f o r D a t a S e t 1
re sp o n se
4 0
3 0
2 0
1 0
0
1 2 3
c o l F rid a y , M a rc h 1 4 , 2 0 1 4 0 1 :0 1 :3 2 P M 2
ro w 1 2
I n t e r a c t io n P lo t f o r D a t a S e t 2
re sp o n se
5 0
4 0
3 0
2 0
1 0
0
1 2 3
c o l
ro w 1 2
I n t e r a c t io n P lo t f o r D a t a S e t 3
re sp o n se
8 0
7 0
6 0
5 0
4 0
3 0
2 0
1 0
1 2 3
c o l F rid a y , M a rc h 1 4 , 2 0 1 4 0 1 :0 1 :3 2 P M 4
ro w 1 2
I n t e r a c t io n P lo t f o r D a t a S e t 4
re sp o n se
7 0
6 0
5 0
4 0
3 0
2 0
1 0
1 2 3
c o l
ro w 1 2
Analysis of Variance
As before we assume that the random error terms are independently identically
normal with constant variance.
We can investigate the structure of this model with a Hasse diagram. In this di-
agram, the models further down the diagram contain subsets of the effects in the
models above it. The model at the top of the diagram is the full model and the model
at the bottom is the minimal model. We can move between models by performing
hypothesis tests.
There are three null hypotheses that we want to test. Two involve the treatment
factors. These are
H0 : τ1 = τ2 = . . . = τa = 0
H1 : at least one τi 6= 0.
H0 : β1 = β2 = . . . = βb = 0
H1 : at least one βj 6= 0.
H0 : (τ β)ij = 0 ∀i, j
H1 : at least one (τ β)ij 6= 0.
yijk − y ... = y i.. − y ... + y .j. − y ... + y ij. − y i.. − y .j. + y ... + yijk − y ij.
= (y i.. − y ... ) + (y .j. − y ... ) + (y ij. − y i.. − y .j. + y ... ) + (yijk − y ij. )
Temperature
Material 15◦ F 70◦ F 125◦ F
1 130 155 74 180 34 40 80 75 20 70 82 58
2 150 188 159 126 136 122 106 115 25 70 58 45
3 138 110 168 160 174 120 150 139 96 104 82 60
proc glm data=lect.battery plots=diagnostics;
class temp material;
model life=temp material temp*material;
means temp material temp*material;
output out=battery r=resid;
run;
data battery;
set battery;
combined=material + 10*temp;
run;
T h e G L M P ro c e d u re
D e p e n d e n t V a r ia b le : life
S u m o f
S o u r c e D F S q u a r e s M e a n S q u a r e F V a lu e P r > F
M o d e l 8 5 9 4 1 6 .2 2 2 2 2 7 4 2 7 .0 2 7 7 8 1 1 .0 0 < .0 0 0 1
E r r o r 2 7 1 8 2 3 0 .7 5 0 0 0 6 7 5 .2 1 2 9 6
C o r r e c te d T o ta l 3 5 7 7 6 4 6 .9 7 2 2 2
R -S q u a r e C o e ff V a r R o o t M S E lif e M e a n
0 .7 6 5 2 1 0 2 4 .6 2 3 7 2 2 5 .9 8 4 8 6 1 0 5 .5 2 7 8
S o u r c e D F T y p e I S S M e a n S q u a r e F V a lu e P r > F
te m p 2 3 9 1 1 8 .7 2 2 2 2 1 9 5 5 9 .3 6 1 1 1 2 8 .9 7 < .0 0 0 1
m a t e r ia l 2 1 0 6 8 3 .7 2 2 2 2 5 3 4 1 .8 6 1 1 1 7 .9 1 0 .0 0 2 0
t e m p * m a t e r ia l 4 9 6 1 3 .7 7 7 7 8 2 4 0 3 .4 4 4 4 4 3 .5 6 0 .0 1 8 6
S o u r c e D F T y p e III S S M e a n S q u a r e F V a lu e P r > F
te m p 2 3 9 1 1 8 .7 2 2 2 2 1 9 5 5 9 .3 6 1 1 1 2 8 .9 7 < .0 0 0 1
m a t e r ia l 2 1 0 6 8 3 .7 2 2 2 2 5 3 4 1 .8 6 1 1 1 7 .9 1 0 .0 0 2 0
t e m p * m a t e r ia l 4 9 6 1 3 .7 7 7 7 8 2 4 0 3 .4 4 4 4 4 3 .5 6 0 .0 1 8 6
I n te r a c tio n P lo t fo r D a ta S e t 4 F rid a y , M a rc h 1 4 , 2 0 1 4 0 1 :0 1 :3 2 P M 7
T h e G L M P ro c e d u re
D e p e n d e n t V a r ia b le : life
I n te r a c tio n P lo t fo r D a ta S e t 4 F rid a y , M a rc h 1 4 , 2 0 1 4 0 1 :0 1 :3 2 P M 1 1
T h e G L M P ro c e d u re
I n te r a c tio n P lo t fo r D a ta S e t 4 F rid a y , M a rc h 1 4 , 2 0 1 4 0 1 :0 1 :3 2 P M 1 2
T h e U N IV A R IA T E P ro c e d u re
V a r ia b le : r e s id
M o m e n ts
N 3 6 S u m W e ig h t s 3 6
M e a n 0 S u m O b s e r v a t io n s 0
S t d D e v ia t io n 2 2 .8 2 2 7 6 4 3 V a r ia n c e 5 2 0 .8 7 8 5 7 1
S k e w n e ss - 0 .4 6 3 5 9 1 1 K u r t o s is 0 .0 9 6 6 3 6 0 5
U n c o r r e c te d S S 1 8 2 3 0 .7 5 C o r r e c te d S S 1 8 2 3 0 .7 5
C o e f f V a r ia t io n . S td E r r o r M e a n 3 .8 0 3 7 9 4 0 5
B a s ic S t a t is t ic a l M e a s u r e s
L o c a t io n V a r ia b ilit y
lif e lif e
M e a n 0 .0 0 0 0 0 S t d D e v ia t io n 2 2 .8 2 2 7 6
L e v e l o f L e v e l o f
te m p N M e a n S td D e v M e m d ai a t ne r i a 1 l .3 7 5 0 0
N V a M r i ae a n n c e S td D e v 5 2 0 .8 7 8 5 7
1 5 1 2 1 4 4 .8 3 3 3 3 3 3 1 .6 9 4 0 8 7 0 M o1 d e -4 .7 5 0 0 0 R a n g e
1 2 8 3 .1 6 6 6 6 7 4 8 .5 8 8 8 7 5 1
1 0 6 .0 0 0 0 0
7 0 1 2 1 0 7 .5 8 3 3 3 3 4 2 .8 8 3 4 7 5 0 2 1 2
I n t e r q u a r 4 t i 9 l e . 4 R7 2 a 3 n 6 g 7 e 6
1 0 8 .3 3 3 3 3 3
3 3 .6 2 5 0 0
1 2 5 1 2 6 4 .1 6 6 6 6 7 2 5 .6 7 2 1 7 5 7 3 1 2 1 2 5 .0 8 3 3 3 3 3 5 .7 6 5 5 4 5 5
T e s t s f o r L o c a t io n : M u 0 = 0
lif e
T e st S t a t is t ic p V a lu e
L e v e l o f L e v e l o f
te m p m a t e r ia l N M e a n S td D e v S t u d e n t 's t t 0 P r > |t | 1 .0 0 0 0
1 5 1 4 1 3 4 .7 5 0 0 0 0 4 5 .3 5 3 2 4 3 2 S ig n M 1 P r > = |M | 0 .8 6 7 9
1 5 2 4 1 5 5 .7 5 0 0 0 0 2 5 .6 1 7 3 7 6 9 S ig n e d R a n k S 3 .5 P r > = |S | 0 .9 5 7 1
1 5 3 4 1 4 4 .0 0 0 0 0 0 2 5 .9 7 4 3 4 6 3
7 0 1 4 5 7 .2 5 0 0 0 0 2 3 .5 9 9 0 8 1 9 T e s t s f o r N o r m a lit y
7 0 2 4 1 1 9 .7 5 0 0 0 0 1 2 .6 5 8 9 8 8 9 T e st S t a t is t ic p V a lu e
7 0 3 4 1 4 5 .7 5 0 0 0 0 2 2 .5 4 4 4 0 0 6 S h a p ir o - W ilk W 0 .9 7 6 0 5 7 P r < W 0 .6 1 1 7
1 2 5 1 4 5 7 .5 0 0 0 0 0 2 6 .8 5 1 4 4 3 2 K o lm o g o r o v - S m ir n o v D 0 .1 0 5 9 3 P r > D > 0 .1 5 0 0
1 2 5 2 4 4 9 .5 0 0 0 0 0 1 9 . 2 6 1 3 6 I 0 n 3 t e r a c t i o Cn P r a l o m t f e o r r - D v o a t n a M S e t i s 4 e s W - S q F r 0i d . 0a y 5 , 4 M 4 1 a 5r c h P 1 4 r , > 2 0 W 1 4 - S 0 q 1 : 0 >1 : 0 3 . 2 5 P 0 M 0 1 6
1 2 5 3 4 8 5 .5 0 0 0 0 0 1 9 .2 7 8 6 5 8 3
T h e G AL M n d e P r r s o o c n e - d D u a r r e l i n g A -S q 0 .3 4 0 3 3 7 P r > A -S q > 0 .2 5 0 0
L e v e n e 's T e s t f o r H o m o g e n e it y o f lif e V a r ia n c e
A N O V A o f S q u a r e d D e v ia t io n s f r o m G r o u p M e a n s
S u m o f M e a n
S o u r c e D F S q u a r e s S q u a r e F V a lu e P r > F
c o m b in e d 8 5 4 0 7 4 3 6 6 7 5 9 2 9 1 .4 8 0 .2 1 0 7
E r r o r 2 7 1 2 3 3 2 8 8 5 4 5 6 7 7 4
We could proceed as before and find the derivatives of the error sum of squares with
respect to each of the parameters in turn, set the resulting equations equal to 0 and
solve. Doing so gives rise to the normal equations, and we give these directly by
remembering the short-cut of summing over the subscripts which do not appear in
each of the model parameters in turn. You should be sure that you can get to the
normal equations by differentiating the error sum of squares.
Weights
Legs 50lb 75lb 100lb
Straight 27 29 32 34 37 37
28 27 31 35 35 38
27 29 33 34 33 35
25 28 32 31 34 37
27 30 34 33 35 38
30 34 36
Bent 26 27 31 30 32 34
24 26 32 34 33 35
27 25 30 31 31 34
27 24 31 32 36 32
25 23 34 30 32 33
24 31 34
proc glm data=lect.wtl plots=diagnostics;
class legs weight;
model bpfs=legs weight legs*weight;
means legs weight legs*weight;
output out=wtl2 r=resid;
run;
T h e G L M P ro c e d u re
D e p e n d e n t V a r ia b le : B P F S
S u m o f
S o u r c e D F S q u a r e s M e a n S q u a r e F V a lu e P r > F
M o d e l 5 8 3 2 .8 6 3 6 3 6 4 1 6 6 .5 7 2 7 2 7 3 7 6 .3 5 < .0 0 0 1
E r r o r 6 0 1 3 0 .9 0 9 0 9 0 9 2 .1 8 1 8 1 8 2
C o r r e c te d T o ta l 6 5 9 6 3 .7 7 2 7 2 7 3
R -S q u a r e C o e ff V a r R o o t M S E B P F S M e a n
0 .8 6 4 1 7 0 4 .7 4 3 9 6 4 1 .4 7 7 0 9 8 3 1 .1 3 6 3 6
S o u r c e D F T y p e I S S M e a n S q u a r e F V a lu e P r > F
L E G S 1 8 5 .2 2 7 2 7 2 7 8 5 .2 2 7 2 7 2 7 3 9 .0 6 < .0 0 0 1
W E IG H T 2 7 4 3 .2 7 2 7 2 7 3 3 7 1 .6 3 6 3 6 3 6 1 7 0 .3 3 < .0 0 0 1
L E G S * W E IG H T 2 4 .3 6 3 6 3 6 4 2 .1 8 1 8 1 8 2 1 .0 0 0 .3 7 3 9
S o u r c e D F T y p e III S S M e a n S q u a r e F V a lu e P r > F
L E G S 1 8 5 .2 2 7 2 7 2 7 8 5 .2 2 7 2 7 2 7 3 9 .0 6 < .0W0 0e 1d n e s d a y , F e b r u a r y 1 1 , 2 0 1 5 1 0 :5 3 :4 1 A M 4
W E IG H T 2 T 7 h4 3 e . 2 G 7 2 L 7 2 M 7 3 P r o 3 c 7 e 1 . d 6 u3 6 r 3 e 6 3 6 1 7 0 .3 3 < .0 0 0 1
L E G S * W E IG H T D e p e2 n 4 . d 3 6e 3 n 6 t3 6 V 4 a r ia b 2 l e . 1 : 8 B1 8 P 1 8 F 2 S 1 .0 0 0 .3 7 3 9
In te ra c tio n P lo t fo r B P F S
3 5
B P F S
3 0
2 5
1 2
L E G S
W E IG H T 1 2 3
T h e G L M P ro c e d u re
D e p e n d e n t V a r ia b le : B P F S
S u m o f
S o u r c e D F S q u a r e s M e a n S q u a r e F V a lu e P r > F
M o d e l 3 8 2 8 .5 0 0 0 0 0 0 2 7 6 .1 6 6 6 6 6 7 1 2 6 .5 8 < .0 0 0 1
E r r o r 6 2 1 3 5 .2 7 2 7 2 7 3 2 .1 8 1 8 1 8 2
C o r r e c te d T o ta l 6 5 9 6 3 .7 7 2 7 2 7 3
R -S q u a r e C o e ff V a r R o o t M S E B P F S M e a n
0 .8 5 9 6 4 3 4 .7 4 3 9 6 4 1 .4 7 7 0 9 8 3 1 .1 3 6 3 6
S o u r c e D F T y p e I S S M e a n S q u a r e F V a lu e P r > F
L E G S 1 8 5 .2 2 7 2 7 2 7 8 5 .2 2 7 2 7 2 7 3 9 .0 6 < .0 0 0 1
W E IG H T 2 7 4 3 .2 7 2 7 2 7 3 3 7 1 .6 3 6 3 6 3 6 1 7 0 .3 3 < .0 0 0 1
S o u r c e D F T y p e III S S M e a n S q u a r e F V a lu e P r > F
L E G S 1 8 5 .2 2 7 2 7 2 7 8 5 .2 2 7 2 7 2 7 3 9 .0 6 < .0 0 0 1
W E IG H T 2 7 4 3 .2 7 2 7 2 7 3 3 7 1 .6 3 6 3 6 3 6 1 7 0 .3 3 < .0 0 0 1
T h e G L M P ro c e d u re
D e p e n d e n t V a r ia b le : B P F S
F it D ia g n o s tic s fo r B P F S
3 2 2
2
1 1
1
R S tu d e n t
R S tu d e n t
R e s id u a l
0 0 0
-1
-1 -1
-2
-3 -2 -2
2 6 2 8 3 0 3 2 3 4 3 6 2 6 2 8 3 0 3 2 3 4 3 6 0 .0 6 0 .0 8 0 .1 0 0 .1 2
P r e d ic t e d V a lu e P r e d ic t e d V a lu e L e v e ra g e
0 .0 6
2 3 5
R e s id u a l
0 .0 4
C o o k 's D
B P F S
0 3 0
0 .0 2
-2
2 5
0 .0 0
-2 -1 0 1 2 2 5 3 0 3 5 0 2 0 4 0 6 0
Q u a n t ile P r e d ic t e d V a lu e O b s e r v a t io n
2 5 F it – M e a n R e s id u a l
2 0 4
2 O b s e rv a tio n s 6 6
1 5
P e rc e n t
0 P a ra m e te rs 4
1 0 E rro r D F 6 2
-2 M S E 2 .1 8 1 8
5 -4 R -S q u a re 0 .8 5 9 6
A d j R -S q u a re 0 .8 5 2 9
0 -6
-4 .4 -2 0 .4 2 .8 0 .0 0 .4 0 .8 0 .0 0 .4 0 .8
R e s id u a l P r o p o r t io n L e s s
These notes are only intended to provide an overview of the material. They are
supplemented by the discussion that takes place in the classroom, both in lectures
and in labs. The exercise sheets are an integral part of the subject and students
should attempt all of the questions.
Students who would like further reading about this topic have many options as this is
a standard topic that is covered in any book on designed experiments. Kuehl (2000)
and Montgomery (2007 and earlier editions) both cover this material in detail and
are well written. But any book that you find in the library that covers the material
in a way that you find helpful is suitable.
The ideas in this section can be extended to include more than two factors and the
sums of squares and ANOVA table can be calculated in a similar way to the way