## Are you sure?

This action might not be possible to undo. Are you sure you want to continue?

)

Asheber Abebe Discrete and Statistical Sciences Auburn University

ii

Contents

1 Completely Randomized Design 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . 1.2 The Fixed Eﬀects Model . . . . . . . . . . . . . . . 1.2.1 Decomposition of the Total Sum of Squares 1.2.2 Statistical Analysis . . . . . . . . . . . . . . 1.2.3 Comparison of Individual Treatment Means 1.3 The Random Eﬀects Model . . . . . . . . . . . . . 1.4 More About the One-Way Model . . . . . . . . . . 1.4.1 Model Adequacy Checking . . . . . . . . . 1.4.2 Some Remedial Measures . . . . . . . . . . 1 1 2 2 3 5 10 18 18 23 25 25 25 26 27 29 30 30 36 38 39 41 41 42 48 51 51 55 55 57 57 59 60 70 75 79

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

2 Randomized Blocks 2.1 The Randomized Complete Block Design . . . . . . 2.1.1 Introduction . . . . . . . . . . . . . . . . . . 2.1.2 Decomposition of the Total Sum of Squares . 2.1.3 Statistical Analysis . . . . . . . . . . . . . . . 2.1.4 Relative Eﬃciency of the RCBD . . . . . . . 2.1.5 Comparison of Treatment Means . . . . . . . 2.1.6 Model Adequacy Checking . . . . . . . . . . 2.1.7 Missing Values . . . . . . . . . . . . . . . . . 2.2 The Latin Square Design . . . . . . . . . . . . . . . 2.2.1 Statistical Analysis . . . . . . . . . . . . . . . 2.2.2 Missing Values . . . . . . . . . . . . . . . . . 2.2.3 Relative Eﬃciency . . . . . . . . . . . . . . . 2.2.4 Replicated Latin Square . . . . . . . . . . . . 2.3 The Graeco-Latin Square Design . . . . . . . . . . . 2.4 Incomplete Block Designs . . . . . . . . . . . . . . . 2.4.1 Balanced Incomplete Block Designs (BIBD’s) 2.4.2 Youden Squares . . . . . . . . . . . . . . . . . 2.4.3 Other Incomplete Designs . . . . . . . . . . . 3 Factorial Designs 3.1 Introduction . . . . . . . . . . . . . 3.2 The Two-Factor Factorial Design . 3.2.1 The Fixed Eﬀects Model . . 3.2.2 Random and Mixed Models 3.3 Blocking in Factorial Designs . . . 3.4 The General Factorial Design . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

iii

. .1 Introduction . . . . . . . . . . CONTENTS 83 83 83 83 86 89 90 99 101 101 101 103 104 104 107 109 129 129 129 134 140 146 151 157 157 157 158 158 162 166 170 174 174 180 . . . . . . . . . . . . . . . . . . . . . . . .1 Trend Analyses in One. . . . . . . . . . . . . . . . . . . . . 7. . . . . . . . . .1 Estimation : The Method of Least Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1 The Huynh-Feldt Sphericity (S) Structure 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Single Factor Designs with One Covariate . . . . . . . . . . . . . . .5. . . . . . . . . . . . . . . . . . . . . . . . 5. . . . . . . . . . . . . . . . .2 Nesting in the Treatment Structure . . . . . . .3 Tests of Hypotheses .2 One-Way Repeated Measurement Designs . . . . . . . . . . . . 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7. . . . . . . . . . . . . . .1. . . .1 The Mixed RCBD . . . . . 4. . . . . . .3 Crossover Designs . . . . . . . . . 6. . . . . . . . . .3 Two-Way Repeated Measurement Designs . . . .1 Regression Components of the Between Treatment SS (SSB ) . .2 RM Designs . . . . . . . . . . . . . . . . . . . . .1. . .2. . . . . . . . . . . . . . . . . . . . . . . . . .2 The 2k Factorial Design . . . . . . 5. . . . . 6 More on Repeated Measurement Designs 6. . . . 4. . . . . . . . . . . . 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1. . . . . . .1 Introduction . . . . . . . . . 185 . . . . .1 Two Groups. .1. . . . . . . . .2. . . . .3 ANCOVA in Randomized Complete Block Designs . . . . . . . . . . . . . . . . . . . . . . . 7. 4. . . . . . . . . . . . . . . . . . . . . . . 7. . . . . . . . . . . . . . .3 The 3k Design . . .1 Simple Linear Regression . . . . . . . . . . . One Covariate . . . . . . .4 The Unreplicated 2k Design 4. .4 ANCOVA in Two-Factor Designs . .2. . . . . . . . . . One Covariate . . . . . . . . . . 7. . . . . . . . . . . . . . . . . . .2. .2 Partitioning the Total SS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Multiple Groups. . . . . . . . . . . . . . . . . . . . . . . . . .2. 5 Repeated Measurement Designs 5. . . . . . .4 Two-Way Repeated Measurement Designs with Repeated Measures on 7 Introduction to the Analysis of Covariance 7. . . . . . . . . . . . . . . . . 6.and Two-way RM Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Both .1. . . . . . . . . . . . . . . 8 Nested Designs 181 8. . . . . .1 The 22 Design . . 7. . . . . . . . . . . . . . .3 The General 2k Design . . .2 The 23 Design . . . . . . . . . 5. . . . . . 6. . . . 6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 The One-Way RM Design : (S) Structure 5. .1. . . . . . . . 7. . . . . . .3 One-way RM Design : General . . . . . .2. . . 181 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 The Split-Plot Design . . . . . . . . . . . . . . . . . . . .5. . . . . . . . . . . . . . . 6. . . . . . . . . . . . . . . . . . . . . . .iv 4 2k and 3k Factorial Designs 4. . . . . . . . . . . . . . . . . .1 Nesting in the Design Structure . . . . .5 The Johnson-Neyman Technique: Heterogeneous Slopes 7. . . .2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4. . . .

1 Introduction Suppose we have an experiment which compares k treatments or k levels of a single factor. assign the second treatment to n2 units randomly selected from the remaining n − n1 units. The model is called the one-way classiﬁcation analysis of variance (one-way ANOVA). We shall describe the observations using the linear statistical model yij = µ + τi + where • yij is the jth observation on treatment i.1) describes two diﬀerent situations : 1. • µ is a parameter common to all treatments (overall mean). and so on until the kth treatment is assigned to the ﬁnal nk units. . .1) is a random error component. . Fixed Eﬀects Model : The k treatments could have been speciﬁcally chosen by the experimenter. 1 . We can assign the ﬁrst treatment to n1 units randomly selected from among the n. . y1n1 k yk1 yk1 . The typical data layout for a one-way ANOVA is shown below: Treatment 2 ··· y21 y21 . . i = 1. The goal here is to test hypotheses about the treatment means and estimate the model parameters (µ. τi . and • ij ij . ni . • τi is a parameter unique to the ith treatment (ith treatment eﬀect). Suppose we have n experimental units to be included in the experiment. y2n2 1 y11 y11 . Conclusions reached here only apply to the treatments considered and cannot be extended to other treatments that were not in the study. j = 1. Such an experimental design is called a completely randomized design (CRD). which is assumed constant for all treatments. · · · .Chapter 1 Completely Randomized Design 1. · · · . and σ 2 ). . In this model the random errors are assumed to be normally and independently distributed with mean zero and variance σ 2 . (1. k. yknk The model in Equation (1.

i=1 Denote by µi the mean of the ith treatment. )2 . Random : Choose ﬁve machines to represent machines as a class. Here are a few examples taken from Peterson : Design and Analysis of Experiments: 1. j=1 y. 3. Fixed : Measure the rate of production of ﬁve particular machines. Random Eﬀects Model : The k treatments could be a random sample from a larger population of treatments. at random. Further. let yi. H0 HA An equivalent set of hypotheses is H0 HA : τ1 = τ2 = · · · = τk = 0 : τi = 0 for at least one i : µ1 = µ2 = · · · = µk : µi = µj for at least one i. at random. ¯ measures the total variability in the data. three fungicides from a group of similar fungicides to study the action. = ¯ 1 n k ni yij i=1 j=1 The total sum of squares (corrected) given by k ni SST = i=1 j=1 (yij − y. Fixed : A scientist develops three new fungicides. µi = E(yij ) = µ + τi . i = 1. may be decomposed as . The τi are random variables. COMPLETELY RANDOMIZED DESIGN 2.1 Decomposition of the Total Sum of Squares k i=1 In the following let n = ni . Conclusions here extend to all the treatments in the population. Random : A scientist is interested in the way a fungicide works. k. We test hypotheses about the variability of τi . τi . The treatment eﬀects. 2. are expressed as deviations from the overall mean. The total sum of squares. His interest is in these fungicides only. We are interested in testing the equality of the k treatment means. we are not interested in the particular ones in the model. Random : Select. He selects. = ¯ 1 ni ni yij .. thus.. so that k τi = 0 .2 CHAPTER 1. Fixed : Conduct an experiment to obtain information about four speciﬁc soil types. four soil types to represent all soil types. · · · .2. j 1. 1. SST .2 The Fixed Eﬀects Model In this section we consider the ANOVA for the ﬁxed eﬀects model.

1.2. THE FIXED EFFECTS MODEL

3

k

ni

k

k

ni

(yij − y.. )2 = ¯

i=1 j=1 i=1

ni (¯i. − y.. )2 + y ¯

i=1 j=1

(yij − yi. )2 ¯

**The proof is left as an exercise. We will write SST = SSB + SSW ,
**

i ni (¯i. −¯.. )2 is called the between treatments sum of squares and SSW = i=1 j=1 (yij − y y where SSB = 2 yi. ) is called the within treatments sum of squares. ¯ One can easily show that the estimate of the common variance σ 2 is SSW /(n − k). Mean squares are obtained by dividing the sum of squares by their respective degrees of freedoms as

k i=1

k

n

M SB = SSB /(k − 1),

M SW = SSW /(n − k) .

1.2.2

Testing

Statistical Analysis

Since we assumed that the random errors are independent, normal random variables, it follows by Cochran’s Theorem that if the null hypothesis is true, then F0 = if F0 > Fk−1,n−k (α) . The following ANOVA table summarizes the test procedure: Source Between Within (Error) Total Estimation Once again consider the one-way classiﬁcation model given by Equation (1.1). We now wish to estimate the model parameters (µ, τi , σ 2 ). The most popular method of estimation is the method of least squares (LS) which determines the estimators of µ and τi by minimizing the sum of squares of the errors

k ni 2 ij i=1 j=1 k ni

M SB M SW

follows an F distribution with k − 1 and n − k degrees of freedom. Thus an α level test of H0 rejects H0

df k−1 n−k n−1

SS SSB SSW SST

MS M SB M SW

F0 F0 = M SB /M SW

L=

=

i=1 j=1

(yij − µ − τi )2 .

Minimization of L via partial diﬀerentiation provides the estimates µ = y.. and τi = yi. − y.. , for ˆ ¯ ˆ ¯ ¯ i = 1, · · · , k. By rewriting the observations as yij = y.. + (¯i. − y.. ) + (yij − yi. ) ¯ y ¯ ¯ one can easily observe that it is quite reasonable to estimate the random error terms by eij = yij − yi. . ¯ These are the model residuals.

4

CHAPTER 1. COMPLETELY RANDOMIZED DESIGN Alternatively, the estimator of yij based on the model (1.1) is yij = µ + τi , ˆ ˆ ˆ

which simpliﬁes to yij = yi. . Thus, the residuals are yij − yij = yij − yi. . ˆ ¯ ˆ ¯ An estimator of the ith treatment mean, µi , would be µi = µ + τi = yi. . ˆ ˆ ˆ ¯ Using M SW as an estimator of σ 2 , we may provide a 100(1 − α)% conﬁdence interval for the treatment mean, µi , yi. ± tn−k (α/2) M SW /ni . ¯ A 100(1 − α)% conﬁdence interval for the diﬀerence of any two treatment means, µi − µj , would be yi. − yj. ± tn−k (α/2) ¯ ¯ M SW (1/ni + 1/nj )

We now consider an example from Montgomery : Design and Analysis of Experiments. Example The tensile strength of a synthetic ﬁber used to make cloth for men’s shirts is of interest to a manufacturer. It is suspected that the strength is aﬀected by the percentage of cotton in the ﬁber. Five levels of cotton percentage are considered: 15%, 20%, 25%, 30% and 35%. For each percentage of cotton in the ﬁber, strength measurements (time to break when subject to a stress) are made on ﬁve pieces of ﬁber. 15 7 7 15 11 9 The corresponding ANOVA table is Source Between Within (Error) Total df 4 20 24 SS 475.76 161.20 636.96 MS 118.94 8.06 F0 F0 = 14.76 20 12 17 12 18 18 25 14 18 18 19 19 30 19 25 22 19 23 35 7 10 11 15 11

Performing the test at α = .01 one can easily conclude that the percentage of cotton has a signiﬁcant eﬀect on ﬁber strength since F0 = 14.76 is greater than the tabulated F4,20 (.01) = 4.43. The estimate of the overall mean is µ = y.. = 15.04. Point estimates of the treatment eﬀects are ˆ ¯ τ1 = y1. − y.. = 9.80 − 15.04 = −5.24 ˆ ¯ ¯ τ2 = y2. − y.. = 15.40 − 15.04 = 0.36 ˆ ¯ ¯ τ3 = y3. − y.. = 17.60 − 15.04 = −2.56 ˆ ¯ ¯ τ4 = y4. − y.. = 21.60 − 15.04 = 6.56 ˆ ¯ ¯ τ5 = y5. − y.. = 10.80 − 15.04 = −4.24 ˆ ¯ ¯ A 95% percent CI on the mean treatment 4 is 21.60 ± (2.086) 8.06/5 , which gives the interval 18.95 ≤ µ4 ≤ 24.25.

1.2. THE FIXED EFFECTS MODEL

5

1.2.3

Comparison of Individual Treatment Means

**Suppose we are interested in a certain linear combination of the treatment means, say,
**

k

L=

i=1

li µi ,

**where li , i = 1, · · · , k, are known real numbers not all zero. The natural estimate of L is
**

k k

ˆ L=

i=1

li µi = ˆ

i=1

li yi. . ¯

**Under the one-way classiﬁcation model (1.1), we have : ˆ 1. L follows a N (L, σ 2 2. √
**

ˆ L−L P 2 M SW ( k li /ni ) i=1 k 2 i=1 li /ni ),

**follows a tn−k distribution,
**

k 2 i=1 li /ni ),

ˆ 3. L ± tn−k (α/2) M SW ( 4. An α-level test of

H0 HA is ˆ L M SW (

: L=0 : L=0

k 2 i=1 li /ni )

> tn−k (α/2) .

**A linear combination of all the treatment means
**

k

φ=

i=1

ci µi

is known as a contrast of µ1 , · · · , µk if

k i=1 ci

**= 0. Its sample estimate is ˆ φ=
**

k

ci yi. . ¯

i=1

**Examples of contrasts are µ1 − µ2 and µ1 − µ. Consider r contrasts of µ1 , · · · , µk , called planned comparisons, such as,
**

k k

φi =

s=1

cis µs with

s=1

cis = 0 for i = 1, · · · , r ,

and the experiment consists of H0 : φ 1 = 0 HA : φ1 = 0 ··· ··· H0 : φ r = 0 HA : φr = 0

• if F0 < Fk−1.6 Example The most common example is the set of all k 2 CHAPTER 1. then reject H0 : µi = µj . • if |tij | > tn−k (α/2).025) M SW (1/5 + 1/5) = 2.086 2(8.75 . Stage 1 : Test H0 : µ1 = · · · = µk with F0 = M SB /M SW .n−k (α). − yj.n−k (α). for all i = j.05 is LSD = t20 (. | y ¯ M SW (1/ni + 1/nj ) • if |tij | < tn−k (α/2). then accept H0 : µi = µj . − yj. The following gives a summary of the steps. The ANOVA F -test rejected H0 : µ1 = · · · = µ5 . | > y ¯ LSD. : µi = µj : µi = µj pairs with |tij | = |¯i. where LSD = tn−k (α/2) M SW (1/ni + 1/nj ) . ¯ ¯ M SW (1/ni + 1/nj ) and comparing it to tn−k (α/2).06) = 3.75 would imply that the corresponding pair of population means are signiﬁcantly diﬀerent. An experimentwise 2 error occurs if at least one of the null hypotheses is declared signiﬁcant when H0 : µ1 = · · · = µk is known to be true. then go to Stage 2. − yj. The 5 = 10 pairwise diﬀerences among the treatment means 2 are . Example Consider the fabric strength example we considered above. The LSD at α = . we wish to test H0 : µi = µj . 5 Thus any pair of treatment averages that diﬀer by more than 3. This could be done using the t statistic t0 = yi. Stage 2 : Test H0 HA for all k 2 then declare H0 : µ1 = · · · = µk true and stop. The Least Signiﬁcant Diﬀerence (LSD) Method Suppose that following an ANOVA F test where the null hypothesis is rejected. µk . COMPLETELY RANDOMIZED DESIGN pairwise tests H0 HA : µi = µj : µi = µj for 1 ≤ i < j ≤ k of all µ1 . • if F0 > Fk−1. An equivalent test declares µi and µj to be signiﬁcantly diﬀerent if |¯i. · · · . The experiment consists of all k pairwise tests.

0 −2. ¯ ¯ y3.6 y4. The Scheﬀ´ method e controls the experimentwise error rate at level α. ¯ 9. ¯ ¯ y2. ¯ ¯ y3. ¯ 15.6 − 10. ¯ 10.8 y2.6 − 10.8 − 10. Sometimes we also ﬁnd that the LSD fails to ﬁnd any signiﬁcant pairwise diﬀerences while the F -test declares signiﬁcance. ¯ ¯ y2.8 − 21. ¯ ¯ y1.6 = 15.6 = 15.6 = 9. and the experiment consists of H0 : φ 1 = 0 HA : φ1 = 0 ··· ··· H0 : φ r = 0 HA : φr = 0 The Scheﬀ´ method declares φi to be signiﬁcant if e ˆ |φi | > Sα. − y3.8 − 15. This is due to the fact that the ANOVA F -test considers all possible comparisons.0∗ 6. Scheﬀ´’s Method for Comparing all Contrasts e Often we are interested in comparing diﬀerent combinations of the treatment means. ¯ ¯ y2.8∗ −1. ¯ ¯ y1. − y5.6 = 9.4 = 9.6 As k gets large the experimentwise error becomes large.1.i . THE FIXED EFFECTS MODEL y1. − y2.6 = 17.8 = 17. ¯ ¯ y4. not just pairwise comparisons.6∗ −7. r .i = (k − 1)Fk−1.6 − 21.6∗ −4. − y4.8 = −5.4 − 10.8 − 17.4 − 17.2∗ 4. where ˆ φi = and k k cis ys.8∗ 10.8 = 15. − y4. ¯ s=1 Sα. ¯ 21. − y5.8∗ 7 Using underlining the result may be summarized as y1. · · · .2. Scheﬀ´ (1953) has e proposed a method for comparing all possible contrasts between treatment means. . − y3.n−k (α) M SW s=1 (c2 /ni ) .4 − 21.8 = 21. − y5. − y4.2 −6.8 y5. ¯ ¯ = = = = = = = = = = 9. Consider the r contrasts k k φi = s=1 cis µs with s=1 cis = 0 for i = 1. consider the fabric strength data and suppose that we are interested in the contrasts φ1 = µ1 + µ3 − µ4 − µ5 and φ2 = µ1 − µ4 . is Example As an example. − y5.8∗ −11.4 y3. ¯ 17. ¯ ¯ y1.

06 1 1 + = 5. that is. µi and µj . − y4.01.1 = (k − 1)Fk−1.n−k (. we ﬁnd that q4.2 = (k − 1)Fk−1.43) 8. we ﬁnd that the following pairs of means do not signiﬁcantly diﬀer: µ1 and µ5 µ5 and µ2 µ2 and µ3 µ3 and µ4 Notice that this result diﬀers from the one reported by the LSD method.n−k (α) M SW 1 1 + 2 ni nj .01.01. to be signiﬁcantly diﬀerent if the absolute value of their sample diﬀerences exceeds Tα = qk. . The Tukey-Kramer Method The Tukey-Kramer procedure declares two means.2 . Thus. = −11.06(1 + 1 + 1 + 1)/5 = 10. = 5.20 (. | y ¯ exceeds 8. From the studentized range distribution table. the mean strengths of treatments 1 and 4 diﬀer signiﬁcantly. ¯ ¯ We compute the Scheﬀ´ 1% critical values as e k S. where qk.00 ¯ ¯ ¯ ¯ and ˆ φ2 = y1. would be declared signiﬁcantly diﬀerent if |¯i.43) 8. since |φ2 | > S. T.01) M SW s=1 (c2 /n2 ) 2s = 4(4. − y4.05 = 4.8 The sample estimates of these contrasts are CHAPTER 1. + y3.23.80 . a pair of means.58 ˆ Since |φ1 | < S. we conclude that the contrast φ1 = µ1 + µ3 − µ4 − µ5 is not signiﬁcantly diﬀerent ˆ from zero.69 and k S.1 .37 . − y5.05) = 4. − yj. Example Reconsider the fabric strength example.06(1 + 1)/5 = 7.01. we conclude that φ2 = µ1 − µ2 is signiﬁcantly diﬀerent from zero.01) M SW s=1 (c2 /n1 ) 1s = 4(4. However.n−k (α) is the α percentile value of the studentized range distribution with k groups and n − k degrees of freedom.n−k (.23 2 5 5 Using this value. µi and µj . COMPLETELY RANDOMIZED DESIGN ˆ φ1 = y1.

Lr is α ˆ Li ± tn−k 2r for i = 1. Exercise : Use underlining to summarize the results of the Bonferroni testing procedure. Our purpose here is to ﬁnd a set of (1 − α)100% simultaneous conﬁdence intervals for the k − 1 pairwise diﬀerences comparing treatment to control. · · · . The value dk−1. vs. The proof of this result is left as an exercise. · · · . We wish to test H0 : µ1 = · · · = µ5 at .0025) = 3.05 level of signiﬁcance. HA : µi = µj k M SW j=1 2 lij /nj . µk are k − 1 treatment means. Dunnett’s Method for Comparing Treatments to a Control Assume µ1 is a control mean and µ2 . for 1 ≤ i < j ≤ k. · · · .153 M SW (2/5) = 5. · · · . We may use this inequality to make simultaneous inference about linear combinations of treatment means in a one-way ﬁxed eﬀects ANOVA set up. · · · . Dunnett’s method rejects the null hypothesis H0 : µi = µ1 at level α if |¯i. | > dk−1. · · · . k. µi − µ1 . So the test rejects H0 : µi = µj in favor of HA : µi = µj if |¯i. r. This is done using t20 (. Ak be k arbitrary events with P (Ai ) ≥ 1 − α/k.153 . . · · · . − y1. Lr be r linear combinations of µ1 . r. k. A2 .66 . ¯ for i = 1. Let A1 . y ¯ for i = 2.2. | y ¯ α > tn−k M SW (1/ni + 1/nj ) 2 k 2 . Example Consider the tensile strength example considered above.n−k (α) M SW (1/ni + 1/n1 ) . − yj. | exceeds y ¯ 3. − yj. A Bonferroni α-level test of H0 : µ1 = µ2 = · · · = µk is performed by testing H0 : µi = µj with tij |¯i.05/(2 ∗ 10)) = t20 (. THE FIXED EFFECTS MODEL The Bonferroni Procedure 9 We start with the Bonferroni Inequality. for i = 2. A (1 − α)100% simultaneous conﬁdence interval for L1 .n−k (α) is read from a table. µk where Li = j=1 lij µj and Li = j=1 lij yj. L2 .1. There is no need to perform an overall F -test. · · · . Then P (A1 ∩ A2 ∩ · · · ∩ Ak ) ≥ 1 − α. k k ˆ Let L1 . · · · .

Assumptions 1. Such a model is known as a random eﬀects model. στ ). y ¯ Only the diﬀerences y3. i = 1.05) So the test rejects H0 : µi = µ5 if |¯i. the hypothesis of interest is 2 H0 : σ τ = 0 versus 2 HA : στ > 0 . If the τi are independent of the variance of an observation will be 2 Var(yij ) = σ 2 + στ . k. M SW (2/5) = 4. ij .8 and y4. i. · · · . | > 4. Our purpose is to estimate (and test. j = 1. if any) the variability among the treatments in the population. The ij are random errors which follow the normal distribution with mean 0 and common variance σ 2 . · · · . 2 The two variances.8 indicate any signiﬁcant diﬀerence. − y5. ni . = 6. The usual partition of the total sum of squares still holds: SST = SSB + SSW . Thus we ¯ ¯ ¯ ¯ conclude µ3 = µ5 and µ4 = µ5 . Testing is performed using the same F statistic that we used for the ﬁxed eﬀects model: F0 = An α-level test rejects H0 if F0 > Fk−1. τi .10 Example CHAPTER 1.76 . The mathematical representation of the model is the same as the ﬁxed eﬀects model: yij = µ + τi + ij . The treatment eﬀects. then we claim that there is a signiﬁcant diﬀerence among all the treatments.05) = 2. = 10. Thus the critical diﬀerence is d4.3 The Random Eﬀects Model The treatments in an experiment may be a random sample from a larger population of treatments. 2 2 If the hypothesis H0 : στ = 0 is rejected in favor of HA : στ > 0.76 1. Since we are interested in the bigger population of treatments.e.65. σ 2 and στ are known as variance components. except for the assumptions underlying the model. − y5. τi ∼ N (0.20 (. 2. The Dunnett critical value is d4. The estimators of the variance components are M SB M SW σ 2 = M SW ˆ .n−k (α). − y5. are a random sample from a population that is normally distributed with 2 2 mean 0 and variance στ . COMPLETELY RANDOMIZED DESIGN Consider the tensile strength example above and let treatment 5 be the control.20 (.

19 22.96 . n0 k ni − i=1 k 2 i=1 ni k i=1 ni .12 (.94 MS 29.75 111. 2 + σ2 σ τ 2 2 A 100(1 − α)% conﬁdence interval for στ /(σ 2 + στ ) is L U . that is the result of the diﬀerences among the treatments: 2 στ .1.90 ˆ and στ = ˆ2 29. The process engineer suspects that. They would like the looms to be homogeneous so that they obtain a fabric of uniform strength. Example A textile company weaves a fabric on a large number of looms. we conclude that the looms in the plant diﬀer signiﬁcantly. The data are given in the following table: Looms 1 2 3 4 The corresponding ANOVA table is Source Between (Looms) Within (Error) Total df 3 12 15 SS 89. The following example is taken from from Montgomery : Design and Analysis of Experiments.n−k (1 − α/2) .68 Observations 1 2 3 4 98 97 99 96 91 90 93 92 96 95 97 95 95 96 99 98 Since F0 > F3. The variance components are estimated by σ 2 = 1.73 − 1.90 F0 15. 1+L 1+U where L= and U= 1 n0 1 n0 . . 4 . M SB 1 −1 M SW Fk−1.73 1.90 = 6. THE RANDOM EFFECTS MODEL and στ = ˆ2 where n0 = 1 k−1 11 M SB − M SW . Var(yij ).05). To investigate this. in addition to the usual variation in strength within samples of fabric from the same loom. there may also be signiﬁcant variations in strength between looms.n−k (α/2) M SB 1 −1 M SW Fk−1. We are usually interested in the proportion of the variance of an observation. he selects four looms at random and makes four strength determinations on the fabric manufactured on each loom.3.

39 .47 − 1 = 0. 0 1 -1 -1.90 1 4.12 (.86 = 79%) is attributable to the diﬀerence among looms. poorly trained operators.22 = 0. From properties of the F distribution we have that Fa. QUIT. the variance of any observation on strength is estimated by σ 2 + στ = 8. QUIT. PROC GLM.625/1.3 (. From the F table we see that F3. RANDOM OFFICER. The engineer must now try to isolate the causes for the diﬀerence in loom performance (faulty set-up.73 1. MODEL TS=GROUP. MEANS GROUP/ CLDIFF BON CONTRAST ’PHI1’ GROUP 1 ESTIMATE ’PHI1’ GROUP 1 CONTRAST ’PHI2’ GROUP 1 ESTIMATE ’PHI2’ GROUP 1 RUN.192. Thus L= and U= 1 4 29. MODEL RATING=OFFICER. . A random eﬀects model may be analyzed using the RANDOM statement to specify the random factor: . RUN.625 which gives the 95% conﬁdence interval (0.95) . We conclude that the variability among looms accounts for between 39 and 95 percent of the variance in the observed strength of fabric produced. 2 2 Lets now ﬁnd a 95% conﬁdence interval for στ /(σ 2 + στ ).124/21. OPTIONS LS=80 PS=66 NODATE.625 = 0.124 1 4 29. Most of this variability ˆ ˆ2 (about 6. ). PROC GLM DATA=A1. CLASS OFFICER. Using SAS The following SAS code may be used to analyze the tensile strength example considered in the ﬁxed eﬀects CRD case. CLASS GROUP. DATA MONT. 0 1 -1 -1.a (1 − α). CARDS. .192 − 1 = 20.124 = 0.12 CHAPTER 1.025) = 1/5.96/8. . 0 0 -1 0.025) = 4. 7 1 7 1 15 1 11 1 9 1 12 2 17 2 12 2 18 2 18 2 14 3 18 3 18 3 19 3 19 3 19 4 25 4 22 4 19 4 23 4 7 5 10 5 11 5 15 5 11 5 . 20. /* print the data */ PROC PRINT DATA=MONT.b (α) = 1/Fb.73 1. TUKEY SCHEFFE LSD DUNNETT(’5’).12 (. COMPLETELY RANDOMIZED DESIGN Thus.975) = 1/F12. RUN.90 1 0. 0 0 -1 0.86. INPUT TS GROUP@@.47 and F3.

THE RANDOM EFFECTS MODEL SAS Output The SAS System Obs 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 TS 7 7 15 11 9 12 17 12 18 18 14 18 18 19 19 19 25 22 19 23 7 10 11 15 11 GROUP 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 2 1 13 The SAS System The GLM Procedure Class Level Information Class GROUP Levels 5 Values 1 2 3 4 5 Number of observations The SAS System The GLM Procedure Dependent Variable: TS Sum of Squares 475.0001 .2000000 25 3 Source Model Error DF 4 20 Mean Square 118.0600000 F Value 14.7600000 161.1.9400000 8.3.76 Pr > F <.

945 -5.0001 Source GROUP DF 4 Type III SS 475.745 9.745 7.600 -10.545 11.345 9.055 -7.9600000 CHAPTER 1.545 -0.200 6.855 1.745 -1.76 Pr > F <.255 5.05 20 8.055 -3. GROUP Comparison 4 4 4 4 3 3 3 3 2 2 2 2 5 5 5 5 3 2 5 1 4 2 5 1 4 3 5 1 4 3 2 1 Difference Between Means 4.855 4.800 -4.200 -2.200 10.055 -9.08596 3.945 10.055 8.345 -7.545 8.000 2.945 14.345 -2. Alpha Error Degrees of Freedom Error Mean Square Critical Value of t Least Significant Difference 0.839014 TS Mean 15.800 -6.200 4.9400000 F Value 14.545 -2.800 7.455 7.87642 Root MSE 2.055 -0. COMPLETELY RANDOMIZED DESIGN R-Square 0.7455 Comparisons significant at the 0.545 -10.855 -14.04000 Source GROUP DF 4 Type I SS 475.745 *** *** *** *** *** *** *** *** *** *** *** *** *** .746923 Coeff Var 18.545 3.000 6.255 2.800 -4.600 1.600 5.455 1.76 Pr > F <.800 -6.14 Corrected Total 24 636.9400000 F Value 14. not the experimentwise error rate.055 4.945 0.06 2.0001 4 t Tests (LSD) for TS NOTE: This test controls the Type I comparisonwise error rate.7600000 The SAS System The GLM Procedure Mean Square 118.545 15.7600000 Mean Square 118.800 11.545 -8.000 95% Confidence Limits 0.05 level are indicated by ***.

06 4.173 -12.373 7.05 20 8.973 -5.427 -11.173 -9.055 -4.827 3.800 -4.745 *** *** *** 15 The SAS System The GLM Procedure Tukey’s Studentized Range (HSD) Test for TS NOTE: This test controls the Type I experimentwise error rate.773 0.173 -10.173 -0. THE RANDOM EFFECTS MODEL 1 1 1 1 4 3 2 5 -11.000 -15.573 12.800 -5.973 -4.545 -9.173 9.600 -10.427 0.600 5.800 -6.373 -17.600 -1.427 2.973 10.373 -3.200 6.373 0. 5 Alpha Error Degrees of Freedom Error Mean Square Critical Value of Studentized Range Minimum Significant Difference 0.973 -6.800 11.200 4.173 1.055 -1.427 6.227 4.3.373 Comparisons significant at the 0.855 2.373 *** *** *** *** *** *** *** *** *** *** *** *** The SAS System The GLM Procedure Bonferroni (Dunn) t Tests for TS NOTE: This test controls the Type I experimentwise error rate.427 -1.600 1.800 -7.373 -6.427 -9.345 -4.573 -7. but it generally 6 .573 16.545 -11.05 level are indicated by ***.173 -13.427 -2.000 -11.173 17.773 6.000 2.000 Simultaneous 95% Confidence Limits -1.227 -16.200 -2.200 10.800 -5.173 1.173 13.600 -1.800 -7.000 6.800 -6. GROUP Comparison 4 4 4 4 3 3 3 3 2 2 2 2 5 5 5 5 1 1 1 1 3 2 5 1 4 2 5 1 4 3 5 1 4 3 2 1 4 3 2 5 Difference Between Means 4.800 -4.373 11.573 -0.23186 5.827 5.745 -8.800 7.1.427 -0.373 9.

but it generally has a higher Type II error rate than Tukey’s for all pairwise comparisons.862 -7.200 -2.462 13.138 -9.662 -6.538 3.462 17.662 0.800 -4.0796 Comparisons significant at the 0. 7 Alpha Error Degrees of Freedom Error Mean Square Critical Value of F Minimum Significant Difference 0.600 1.200 6.600 -1.138 -2.06 3.138 0.138 6.462 1.862 -1.662 11.262 -6.06 2.200 4.662 -3.262 -4.800 -7.662 -17. GROUP Comparison 4 4 4 4 3 3 3 3 2 2 2 2 5 5 5 5 1 1 1 1 3 2 5 1 4 2 5 1 4 3 5 1 4 3 2 1 4 3 2 5 Difference Between Means 4.800 11.462 -12.800 -4.062 -16.462 10.462 -0.600 5.05 20 8. . COMPLETELY RANDOMIZED DESIGN has a higher Type II error rate than Tukey’s for all pairwise comparisons.800 -6.462 -11.462 -10.6621 Comparisons significant at the 0.86608 6.262 11.062 4.138 1.262 -5.15340 5.000 6.000 Simultaneous 95% Confidence Limits -1.000 2.662 *** *** *** *** *** *** *** *** *** *** The SAS System The GLM Procedure Scheffe’s Test for TS NOTE: This test controls the Type I experimentwise error rate.662 7.138 2.462 -13.800 -5.000 -11.538 5.800 -6.200 10.600 -10.800 7.062 6.16 CHAPTER 1.05 20 8.862 12.138 -1.662 9.462 1.05 level are indicated by ***.05 level are indicated by ***.138 -11.862 16.062 -0. Alpha Error Degrees of Freedom Error Mean Square Critical Value of t Minimum Significant Difference 0.

800 -5.880 -13.360 3.080 -17.280 16.040 2.720 0. 8 Alpha Error Degrees of Freedom Error Mean Square Critical Value of Dunnett’s t Minimum Significant Difference 0.800 -6.120 4. GROUP Comparison 4 3 2 1 5 5 5 5 Difference Between Means 10.800 -4.160 -5.06 2.760 15.480 -0.800 7.1.200 4.080 12.720 1.720 -0.200 10.880 10.600 -1.880 -10.560 11.720 -1.480 5.800 6.080 -5.680 -7.720 5.080 *** *** *** *** *** *** *** *** *** *** The SAS System The GLM Procedure Dunnett’s t Tests for TS NOTE: This test controls the Type I experimentwise error for comparisons of all treatments against a control.600 5.000 Simultaneous 95% Confidence Limits 6.200 -2.080 0.800 -4.7602 Comparisons significant at the 0.880 -12.080 -3.280 -8.800 11.05 level are indicated by ***.720 -12.200 6.800 -6.880 -0.480 -16.000 Simultaneous 95% Confidence Limits -2.600 1.080 10.880 17.600 -1. THE RANDOM EFFECTS MODEL 17 GROUP Comparison 4 4 4 4 3 3 3 3 2 2 2 2 5 5 5 5 1 1 1 1 3 2 5 1 4 2 5 1 4 3 5 1 4 3 2 1 4 3 2 5 Difference Between Means 4.000 -11.680 11.120 3.05 20 8.000 6.720 -10.560 9.880 0.080 8.280 12.760 *** *** .65112 4.800 -7.720 1.000 2.280 -1.480 7.880 -11.680 -5.880 2.3.600 -10.680 -4.040 -0.800 4.880 13.

i. . QUIT.79555005 t Value -1.57 Pr > |t| 0. ˆ ¯ The Normality Assumption The simplest check for normality involves plotting the empirical quantiles of the residuals against the expected quantiles if the residuals were to follow a normal distribution.05) indicate that the residuals are fairly normal. TITLE1 ’STRENGTH VS. PROC GPLOT DATA=MONT. N (0. Consider the one-way CRD model i = 1. ni .1000000 F Value 3.88 43. where it is assumed that ij ∼i.2500000 348.0000000 -11. The results from the QQ-plot as well as the formal tests (α = . This is known as the normal QQ-plot. 7 1 7 1 15 1 11 1 9 1 12 2 17 2 12 2 18 2 18 2 14 3 18 3 18 3 19 3 19 3 19 4 25 4 22 4 19 4 23 4 7 5 10 5 11 5 15 5 11 5 . N (0.d.8000000 Standard Error 2. CARDS. PLOT TS*GROUP/FRAME. we additionally assume that 2 τi ∼i. RUN.19 Pr > F 0.4.2500000 348. Anderson-Darling.d. COMPLETELY RANDOMIZED DESIGN The SAS System The GLM Procedure 9 Dependent Variable: TS Contrast PHI1 PHI2 DF 1 1 Contrast SS 31.1000000 Mean Square 31. Shapiro-Wilk. Diagnostics depend on the residuals. στ ) independently of ij . INPUT TS GROUP@@. SYMBOL1 V=CIRCLE I=NONE. PERCENTAGE’.18 CHAPTER 1.i. σ 2 ).4 1. Example The following SAS code and partial output checks the normality assumption for the tensile strength example considered earlier.97 -6. Other formal tests for normality (Kolmogorov-Smirnov. SAS Code OPTIONS LS=80 PS=66 NODATE. DATA MONT. eij = yij − yij = yij − yi. · · · .0630 <. In the random eﬀects model.0630 <.1 More About the One-Way Model Model Adequacy Checking yij = µ + τi + ij . k.0001 Parameter PHI1 PHI2 Estimate -5. j = 1. · · · .0001 1. Cramer-von Mises) may also be performed to assess the normality of the residuals.53929124 1.

SYMBOL1 V=CIRCLE I=SM50.40000 Variability Std Deviation Variance Range Interquartile Range 2. RUN. BY PRED.2 . Partial Output The UNIVARIATE Procedure Variable: RES Moments N Mean Std Deviation Skewness Uncorrected SS Coeff Variation 25 0 2. PROC GPLOT DATA=DIAG. QQPLOT RES/NORMAL (L=1 MU=EST SIGMA=EST). TITLE1 ’RESIDUAL PLOT’. Sum Weights Sum Observations Variance Kurtosis Corrected SS Std Error Mean 25 0 6. TITLE1 ’QQ-PLOT OF RESIDUALS’. . PROC SORT DATA=DIAG. PLOT RES*PRED/FRAME. OUTPUT OUT=DIAG R=RES P=PRED. RUN. QUIT. QUIT. RUN. QUIT.11239681 161.1. PROC UNIVARIATE DATA=DIAG NORMAL.40000 -3.4. RUN.8683604 161. QUIT.71667 9.59165 6. MORE ABOUT THE ONE-WAY MODEL 19 PROC GLM.71666667 -0. VAR RES.59165327 0. CLASS GROUP. MODEL TS=GROUP.00000 NOTE: The mode displayed is the smallest of 7 modes with a count of 2.2 0.00000 0.00000 4.51833065 Basic Statistical Measures Location Mean Median Mode 0.

080455 0.1775 .9896 Tests for Normality Test Shapiro-Wilk Kolmogorov-Smirnov Cramer-von Mises Anderson-Darling --Statistic--W D W-Sq A-Sq 0.1818 0.5 0.0000 0.0885 0.5 -----p Value-----Pr > |t| Pr >= |M| Pr >= |S| 1.20 CHAPTER 1.2026 0.4244 0.518572 -----p Value-----Pr Pr Pr Pr < > > > W D W-Sq A-Sq 0.943868 0.162123 0. COMPLETELY RANDOMIZED DESIGN Tests for Location: Mu0=0 Test Student’s t Sign Signed Rank -Statistict M S 0 2.

The hypothesis of interest is 2 2 2 H0 : σ1 = σ2 = · · · = σk versus 2 2 HA : σi = σj for at least one pair i = j . The graphical tool we shall utilize in this class is the plot of residuals versus predicted values. The test statistic is q B0 = 2.3026 c where k q = (n − k) log10 M SW − i=1 2 (ni − 1) log10 Si .4. MORE ABOUT THE ONE-WAY MODEL 21 Constant Variance Assumption Once again there are graphical and formal tests for checking the constant variance assumption. One procedure for testing the above hypothesis is Bartlett’s test.1.

4 Mean Square 22. The tests provide no evidence that indicates the failure of the constant variance assumption. and then running the usual ANOVA F -test using the transformed observations.9198 . CLASS GROUP. Partial SAS Output The GLM Procedure Levene’s Test for Homogeneity of TS Variance ANOVA of Squared Deviations from Group Means Sum of Squares 91. Levene’s test proceeds by computing dij = |yij − mi | . RUN. MEANS GROUP/HOVTEST=BARTLETT HOVTEST=LEVENE. The following modiﬁcation to the proc GLM code given above generates both Bartlett’s and Levene’s tests. Partial SAS Code PROC GLM. The plot of residuals versus predicted values (see above) indicates no serious departure from the constant variance assumption. MODEL TS=GROUP. where mi is the median of the observations in group i. k−1 Bartlett’s test is too sensitive deviations from normality.6224 1015.7704 Bartlett’s Test for Homogeneity of TS Variance Source GROUP DF 4 Chi-Square 0.22 CHAPTER 1. So.9331 Pr > ChiSq 0. QUIT. A test which is more robust to deviations from normality is Levene’s test. COMPLETELY RANDOMIZED DESIGN 1 c=1+ 3(k − 1) We reject H0 if B0 > χ2 (α) k−1 k i=1 1 1 − ni − 1 n−k where χ2 (α) is read from the chi-square table. dij . yij . instead of the original observations.45 Pr > F 0.9056 50. it should not be used if the normality assumption is not satisﬁed.7720 Source GROUP Error DF 4 20 F Value 0. Example Once again we consider the tensile strength example.

28.0 33. is the sum of the ranks of group i. QUIT.5 20. The Kruskal-Wallis test statistic is KW0 = 1 S2 k i=1 2 n(n + 1)2 Ri.03 and KW0 = 19.4.0 14.1. and S2 = The test rejects H0 : µ1 = · · · = µk if KW0 > χ2 (α) . The SAS procedure NPAR1WAY may be used to obtain the Kruskal-Wallis test. we may wish to use nonparametric alternatives to the F -test. OPTIONS LS=80 PS=66 NODATE.5 24. of the observations are given in the following table: 15 2.0 113. Thus we 4 reject the null hypothesis and conclude that the treatments diﬀer.5 7.0 16.5 16. PROC NPAR1WAY WILCOXON. − ni 4 where Ri.5 16. DATA MONT.5 20.5 25.2 Some Remedial Measures The Kruskal-Wallis Test When the assumption of normality is suspect. k−1 Example For the tensile strength data the ranks.0 4.0 7.01) = 13.5 85. we ﬁrst rank all the observations. CARDS. .0 12.5 66.0 25 11. Rij .0 9.0 35 2.5 20 9. yij .5 1 n−1 k ni 2 Rij − i=1 j=1 n(n + 1)2 4 We ﬁnd that S 2 = 53. To perform the Kruskal-Wallis test.0 20. 7 1 7 1 15 1 11 1 9 1 12 2 17 2 12 2 18 2 18 2 14 3 18 3 18 3 19 3 19 3 19 4 25 4 22 4 19 4 23 4 7 5 10 5 11 5 15 5 11 5 .0 Ri. 27. in increasing order. CLASS GROUP. From the chi-square table we get χ2 (.0 23.4.0 30 20.5 16. VAR TS.0 12.5 7.0 2. INPUT TS GROUP@@.25.0 5. MORE ABOUT THE ONE-WAY MODEL 23 1. The Kruskal-Wallis test is one such procedure based on the rank transformation. Say the ranks are Rij . RUN.

A simple method of choosing the appropriate transformation is to plot log Si versus log yi.60 5 5 33. Kruskal-Wallis Test Chi-Square DF Pr > Chi-Square 19.634434 6.50 2 5 66.00 65.00 65.634434 13.20 3 5 85.0 14.0 14. 1/y.70 Average scores were used for ties. arcsin( y).00 4 5 113.634434 5.0008 Variance Stabilizing Transformations There are several variance stabilizing transformations one might consider in the case of heterogeneity of variance (heteroscedasticity). The common transformations are √ √ √ y. log(y).0 14.50 65.0637 4 0.24 CHAPTER 1. COMPLETELY RANDOMIZED DESIGN The NPAR1WAY Procedure Wilcoxon Scores (Rank Sums) for Variable TS Classified by Variable GROUP Sum of Expected Std Dev Mean GROUP N Scores Under H0 Under H0 Score --------------------------------------------------------------------1 5 27.634434 17. We then choose the transformation depending on the slope of the relationship. 1/ y . It uses the maximum likelihood method to simultaneously estimate the transformation parameter as well as the overall mean and the treatment eﬀects.00 65.0 14. The ¯ following table may be used as a guide: Slope 0 1/2 1 3/2 2 Transformation No Transformation Square root Log Reciprocal square root Reciprocal A slightly more involved technique of choosing a variance stabilizing transformation is the Box-Cox transformation. or regress ¯ log Si versus log yi.634434 22.50 65. .0 14. .

Naturally. · · · . and Related Designs 2. • βj is the eﬀect of the jth block. The simplest design that would accomplish this is known as a randomized complete block design (RCBD).1) is the random error term associated with the ijth observation. . . The following layout shows a RCBD with k treatments and b blocks. b (2. Block b Treatment 1 y11 y12 y13 . the treatment. .1 2. . There is one observation per treatment in each block and the treatments are run in a random order within each block. One of the ways in which this could be achieved is through blocking. Each block is divided into k subblocks of equal size. .Chapter 2 Randomized Blocks. i = 1.1 The Randomized Complete Block Design Introduction In a completely randomized design (CRD). ykb The statistical model for RCBD is yij = µ + τi + βj + where • µ is the overall mean. • τi is the ith treatment eﬀect. . The random error component arises because of all the variables which aﬀect the dependent variable except the one controlled variable. This creates diﬀerences among the blocks and makes observations within a block similar. the experimenter wants to reduce the errors which account for diﬀerences among observations within each treatment. 25 . · · · . treatments are assigned to the experimental units in a completely random manner. and • ij ij . .1. j = 1. Within each block the k treatments are assigned at random to the subblocks. Block 1 Block 2 Block 3 . y1b Treatment 2 y21 y22 y23 . This is done by identifying supplemental variables that are used to group experimental subjects that are homogeneous with respect to that variable. The design is ”complete” in the sense that each block contains all the k treatments. . Latin Squares. k. y2b ··· ··· ··· ··· ··· Treatment k yk1 yk2 yk3 .

i. − y. Deﬁne 1 yi. and the sum of squares due to error. the sum of squares due to the blocking. = ¯ b y. − y.. and ∼i. · · · . Symbolically.j + y. = ¯ i=1 i=1 j=1 1 b b y.j ¯ j=1 (yij − y. = ¯ One may show that k b k b yij . )2 + k y ¯ j=1 b (¯.d N (0. SST = SSTreatments + SSBlocks + SSE The degrees of freedom are partitioned accordingly as (n − 1) = (k − 1) + (b − 1) + (k − 1)(b − 1) . )2 = b ¯ i=1 j=1 i=1 k (¯i. )2 y ¯ + i=1 j=1 (yij − yi.. k 1 k 1 n yij .j = ¯ y.26 CHAPTER 2.1. )2 ¯ ¯ ¯ Thus the total sum of squares is partitioned into the sum of squares due to the treatments. b yij = 1 k k yi. i=1 k b j = 1.. RANDOMIZED BLOCKS We make the following assumptions concerning the RCBD model: • • • ij k i=1 τi b j=1 = 0.j − y.. · · · . βj = 0.. We are mainly interested in testing the hypotheses H0 : µ1 = µ2 = · · · = µk HA : µi = µj for at least one pair i = j Here the ith treatment mean is deﬁned as µi = 1 b b (µ + τi + βj ) = µ + τi j=1 Thus the above hypotheses may be written equivalently as H0 : τ1 = τ2 = · · · = τk = 0 HA : τi = 0 for at least one i 2. σ 2 ).2 Decomposition of the Total Sum of Squares b Let n = kb be the total number of observations. j=1 k i = 1.

¯ ¯ i = 1.1. ˆ ¯ ¯ ˆ βj = y.1. ˆ ¯ τi = yi. However.. a large value of FB would indicate that the blocking variable is probably having the intended eﬀect of reducing noise.j − y.j − y. · · · . The following ”cleanliness” readings (higher=cleaner) were obtained using a special device for three diﬀerent types of common stains. ¯ ¯ ¯ ¯ ¯ = yi..3 Testing Statistical Analysis The test for equality of treatment means is done using the test statistic F0 = where M STreatments = An α level test rejects H0 if F0 > Fk−1.j − y.1). The ANOVA table for RCBD is Source Treatments Blocks Error Total df k−1 b−1 (k − 1)(b − 1) n−1 SS SSTreatments SSBlocks SSE SST MS M STreatments M SBlocks M SE F -statistic M STreatments M SE M SBlocks FB = M SE M STreatments M SE SSE ... k j = 1.(k−1)(b−1) (α) . we can see that the estimated values of yij are ˆ yij = µ + τi + βj ˆ ˆ ˆ = y.2.. and βj are obtained via minimization of the sum of squares of the errors k b 2 ij i=1 j=1 k b L= The solution is = i=1 j=1 (yij − µ − τi − βj )2 . ¯ ¯ ¯ Example An experiment was designed to study the performance of four diﬀerent detergents for cleaning clothes. + yi. τi . + y. Estimation Estimation of the model parameters is performed using the least squares procedure as in the case of the completely randomized design. THE RANDOMIZED COMPLETE BLOCK DESIGN 27 2.. − y.. b From the model in (2. · · · . + y. (k − 1)(b − 1) SSTreatments k−1 and M SE = F0 = Since there is no randomization of treatments across blocks the use of FB = M SBlocks /M SE as a test for block eﬀects is questionable. Is there a signiﬁcant diﬀerence among the detergents? . µ = y. − y. The estimators of µ.

1666667 Mean Square 36.6 19/6 which has a p-value < .9722222 67.0022 Coeff Var 3. CLASS STAIN SOAP.1666667 Type III SS 110.53 F Value 11.68 Pr > F 0.1388889 F Value 15.78 21.0833333 18.762883 DF 3 2 DF 3 2 Root MSE 1.0018 Type I SS 110. The corresponding output is The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total R-Square 0.78 21.28 Stain 1 45 47 48 42 182 Stain 2 43 46 50 37 176 CHAPTER 2. 1 1 45 1 2 47 1 3 48 1 4 42 2 1 43 2 2 46 2 3 50 2 4 37 3 1 51 3 2 52 3 3 55 3 4 49 .0063 0.0018 Pr > F 0.5833333 The SAS Type I analysis gives the correct F = 11. CARDS. An incorrect analysis of the data using a one-way ANOVA set up (ignoring the blocking factor) is .0063. Thus we claim that there is a signiﬁcant diﬀerence among the four detergents. INPUT STAIN SOAP Y @@. RUN. RANDOMIZED BLOCKS Stain 3 51 52 55 49 207 Total 139 145 153 128 565 Detergent Detergent Detergent Detergent Total 1 2 3 4 Using the formulæ for SS given above one may compute: SST = 265 SSTreatments = 111 SSBlocks = 135 SSE = 265 − 111 − 135 = 19 Thus F0 = 111/3 = 11. DATA WASH.53 Pr > F 0.01.928908 Source SOAP STAIN Source SOAP STAIN DF 5 6 11 Sum of Squares 246. QUIT.9166667 135. The following SAS code gives the ANOVA table: OPTIONS LS=80 PS=66 NODATE. MODEL Y = SOAP STAIN.771691 Y Mean 47.2166667 3.9166667 Mean Square 49.8333333 264.5833333 Mean Square 36.08333 F Value 11. PROC GLM.9166667 135.0063 0.9722222 67.78 with a p-value of .

CLASS SOAP.139 ˆ2 σr = ˆ2 (b − 1)M SBlocks + b(k − 1)M SE (2)(67.139) = = 14. .9166667 154.2.0000000 264. dfb = (k − 1)(b − 1) = 6.1.2048 Notice that H0 is not rejected indicating no signiﬁcant diﬀerence among the detergents. From the ANOVA table for the RCBD we see that M SE = 3.58) + (3)(3)(3.92 Pr > F 0. R is the increase in the number of replications required if a CRD to achieve the same precision as a RCBD.139. respectively.86 kb − 1 12 − 1 (b − 1)M SBlocks + b(k − 1)M SE kb − 1 The relative eﬃciency of RCBD to CRD is estimated to be (dfb + 1)(dfr + 3) σr ˆ2 ˆ R= · 2 (dfb + 3)(dfr + 1) σb ˆ (6 + 1)(8 + 3)(14.1.5 as many replications to obtain the same precision as obtained by blocking on stain types.86) = = 4. 2. MODEL Y = SOAP. THE RANDOMIZED COMPLETE BLOCK DESIGN PROC GLM.2500000 F Value 1. we may estimate σr and σb as σb = M SE ˆ2 σr = ˆ2 Example Consider the detergent example considered in the previous section. dfr = kb − k = 8 Thus σb = M SE = 3.5 (6 + 3)(8 + 1)(3.9722222 19. 29 The corresponding output is The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total DF 3 8 11 Sum of Squares 110. A natural question to ask is ”How much more eﬃcient is the RCBD compared to a CRD?” One way to deﬁne this relative eﬃciency is 2 (dfb + 1)(dfr + 3) σr R= · 2 (dfb + 3)(dfr + 1) σb 2 2 where σr and σb are the error variances of the CRD and RCBD.4 Relative Eﬃciency of the RCBD The example in the previous section shows that RCBD and CRD may lead to diﬀerent conclusions.9166667 Mean Square 36. and dfr and dfb are the corresponding error degrees of freedom. RUN. QUIT.139) This means that a CRD will need about 4. 2 2 Using the ANOVA table from RCBD.

¯ We also have T.90)(1. This makes the test on the treatment means less sensitive. ¯ 46. The sample treatment means are y1. A quick graphical check for nonadditivity is to plot the residuals. then. We use the regression approach of testing by ﬁtting the full and reduced models. 2 b where qk. RANDOMIZED BLOCKS Another natural question is ”What is the cost of blocking if the blocking variable is not really important.67 y1. ¯ y2. i. according to our model.6 (.33 48. We start out by ﬁtting the model yij = µ + τi + βj + γτi βj + Then testing the hypothesis H0 : γ = 0 is equivalent to testing the presence of nonadditivity.05) 3. Thus in all the equations we replace ni by b. µi and µj . ¯ y4. = 46. The only diﬀerence here is that we use the number of blocks b in place of the common sample size. we are interested in multiple comparisons to ﬁnd out which treatment means diﬀer. to be signiﬁcantly diﬀerent if the absolute value of their sample diﬀerences exceeds M SE 2 Tα = qk. In reality. ¯ y3.e. ˆ yij .33. This setup rules out the possibility of interactions between blocks and treatments. diﬀerences among the means will remain undetected. = 42. We may use any of the multiple comparison procedures discussed in Chapter 1. the way the treatment aﬀects the outcome may be diﬀerent from block to block.5 Comparison of Treatment Means As in the case of CRD. the expected increase of the response in block 1 and treatment 1 is 6. ˆ A formal test is Tukey’s one degree of freedom test for nonadditivity.00. 2.30 CHAPTER 2. ¯ 42.33. i. Suppose we wish to make pairwise comparisons of the treatment means via the Tukey-Kramer procedure. = 51.1.(k−1)(b−1) (α) .139/3 = (4. The Tukey-Kramer procedure declares two treatment means.6 Model Adequacy Checking Additivity The initial assumption we made when considering the model yij = µ + τi + βj + ij is that the model is additive.(k−1)(b−1) (α) is the α percentile value of the studentized range distribution with k groups and (k − 1)(b − 1) degrees of freedom. = 48. Example Once again consider the detergent example of the previous section. Here is the procedure: ij .00 y2. If the ﬁrst treatment increases the expected response by 2 and the ﬁrst block increases it by 4.67. ¯ 2.1. versus the ﬁtted values. ¯ y3.33 51.023) = 5. Any nonlinear pattern indicates nonadditivity. Notice that we are using (k − 1)(b − 1) degrees of freedom in the RCBD as opposed to kb − k in the case of a CRD. if blocking was not necessary?” The answer to this question lies in the diﬀering degrees of freedom we use for the error variable.05 = q4. Thus we lose b − 1 degrees of freedom unnecessarily.0127 Thus using underlining y4.e. eij = yij − yij .

ﬁt the model eij = α + γrij + Let γ be the estimated slope. corresponding to observation ij in resulting from ﬁtting the model above. • Let zij = yij and ﬁt ˆ2 zij = µ + τi + βj + ij ˆ • Let rij = zij − zij be the residuals from this model. cards. respectively. 25 5 3 1 Pressure 30 35 40 4 6 3 1 4 2 1 3 1 45 5 3 2 . model Im = Temp Pres. quit. The data is given below.2.40. output. We will use temperature as a blocking variable. ˆ • The sum of squares due to nonadditivity is k b 2 rij i=1 j=1 ij SSN = γ 2 ˆ • The test statistic for nonadditivity is F0 = Example SSN /1 (SSE − SSN )/[(k − 1)(b − 1) − 1] The impurity in a chemical product is believed to be aﬀected by pressure.e. Temp 100 125 150 The following SAS code is used. proc print. class Temp Pres. run.1. end. Options ls=80 ps=66 nodate. i. Input Temp @. title "Tukey’s 1 DF Nonadditivity Test". • Regress eij on rij . THE RANDOMIZED COMPLETE BLOCK DESIGN • Fit the model yij = µ + τi + βj + ij 31 ˆ • Let eij and yij be the residual and the ﬁtted value. Do Pres = 25.30. proc glm. Input Im @. Data Chemical.35.45. 100 5 4 6 3 5 125 3 1 4 2 3 150 1 1 3 1 2 .

32 output out=out1 predicted=Pred. run.95 10. Example Consider the detergent example above.948516 Source Temp Pres Psquare Source Temp Pres Psquare Coeff Var 17. proc glm.76786 DF 7 7 14 Sum of Squares 35. quit.36 with 1 and 7 degrees of freedom.32 1.90147783 36. The graphic tools are the QQ-plot and the histogram of the residuals.09852217 F Value 42. quit.33333333 11. CHAPTER 2. model Im = Temp Pres Psquare. run.66666667 2.25864083 1. Normality The diagnostic tools for the normality of the error terms are the same as those use in the case of the CRD. The following SAS code gives the normality diagnostics.09852217 Thus F0 = 0.5660. . run.4634 0.09852217 Type III SS 1.09624963 0.42 Pr > F 0.933333 Mean Square 11. set out1.27406241 0.03185550 1. Psquare = Pred*Pred.5660 Pr > F 0. It has a p-value of 0. class Temp Pres.1690 0.09852217 Mean Square 0. Thus we have no evidence to declare nonadditivity. quit. RANDOMIZED BLOCKS The following is the corresponding output.93333333 Im Mean 2.5660 Mean Square 5.68 0.00455079 0.60000000 0.0042 0.27163969 F Value 18.521191 DF 2 4 1 DF 2 4 1 Type I SS 23. /* Form a new variable called Psquare.62932041 0. */ Data Tukey. Formal tests like the Kolmogorov-Smirnov test may also be used to assess the normality of the errors.01 0.90000000 0. Tukey’s 1 DF Nonadditivity Test The GLM Procedure Dependent Variable: Im Source Model Error Corrected Total R-Square 0.36 F Value 2.0001 0.0005 Root MSE 0.36 Pr > F 0.

RUN.2. RUN. CARDS. PROC UNIVARIATE NOPRINT. CLASS STAIN SOAP. The associated output is (ﬁgures given ﬁrst): . INPUT STAIN SOAP Y @@. PROC GPLOT. QQPLOT RES / NORMAL (L=1 MU=0 SIGMA=EST). MEANS SOAP/ TUKEY LINES. OUTPUT OUT=DIAG R=RES P=PRED. 1 1 45 1 2 47 1 3 48 1 4 42 2 1 43 2 2 46 2 3 50 2 4 37 3 1 51 3 2 52 3 3 55 3 4 49 . QUIT. PLOT RES*STAIN. MODEL Y = SOAP STAIN. PLOT RES*SOAP. 33 PROC GLM.1. HIST RES / NORMAL (L=1 MU=0 SIGMA=EST). DATA WASH. RUN. QUIT. THE RANDOMIZED COMPLETE BLOCK DESIGN OPTIONS LS=80 PS=66 NODATE. QUIT. PLOT RES*PRED.

34 CHAPTER 2.333 46.252775 Goodness-of-Fit Tests for Normal Distribution Test Cramer-von Mises Anderson-Darling ---Statistic---W-Sq A-Sq 0.333 42.05 Error Degrees of Freedom 6 Error Mean Square 3.01685612 0.13386116 -----p Value----Pr > W-Sq Pr > A-Sq >0. but it generally has a higher Type II error rate than REGWQ. Tukey Groupi ng A A A A A Mean 51.250 >0.0076 Means with the same letter are not significantly different. . Alpha 0. RANDOMIZED BLOCKS Tukey’s Studentized Range (HSD) Test for Y NOTE: This test controls the Type I experimentwise error rate.667 N 3 3 3 3 SOAP 3 2 1 4 B B B RCBD Diagnostics The UNIVARIATE Procedure Fitted Distribution for RES Parameters for Normal Distribution Parameter Mean Std Dev Symbol Mu Sigma Estimate 0 1.89559 Minimum Significant Difference 5.250 Th QQ-plot and the formal tests do not indicate the presence of nonnormality of the errors.000 48.138889 Critical Value of Studentized Range 4.

and residuals versus predicted values. The spread of the residuals seems to diﬀer from detergent to detergent. THE RANDOMIZED COMPLETE BLOCK DESIGN Constant Variance 35 The tests for constant variance are the same as those used in the case of the CRD. We may need to transform the values and rerun the analsis. The plots we need to examine in this case are residuals versus blocks.1. such as Levene’s test or perform graphical checks to see if the assumption of constant variance is satisﬁed.2. residuals versus treatments. . One may use formal tests. The plots below (produced by the SAS code above) suggest that there may be nonconstant variance.

We then substitute yij and carry out the ANOVA as usual.17 6 kTi.2 = 4(91) + 3(139) − 528 = 42.1. We proceed to estimate the rest in a similar fashion.j is the total for block j. = 91. The analysis is then performed as if the block or treatment is nonexistent.. Since the substituted value adds no practical information to the design. The estimate is y4. We repeat this process until convergence. however. • T. i.e. then we substitute a value yij = where • Ti. diﬀerence between consecutive estimates is small. One way to proceed would be to use a multiple regression analysis. each treatment appears once in every block. T. When more than one value is missing. it should not be used in computations of means. be a loss of one degree of freedom from both the total and error sums of squares. Suppose y4. (k − 1)(b − 1) . T. We then estimate the one missing value using the procedure above.36 CHAPTER 2. say yij . Example Consider the detergent comparison example. is the grand total. for instance.7 Missing Values In a randomized complete block design.j − T. There will.. and • T.2 = 139.2 = 37 is missing. We ﬁrst guess the values of all except one. If several observations are missing from a single block or a single treatment group. A missing observation would mean a loss of the completeness of the design. Another way would be to estimate the missing value. Note that the totals (without 37) are T4. If only one value is missing. we usually eliminate the block or treatment in question. is the total for treatment i. they may be estimated via an iterative process. + bT. when performing multiple comparisons. We then estimate the second one using the one estimated value and the remaining guessed values. = 528. RANDOMIZED BLOCKS 2..

7398167 Mean Square 23.9768528 53.1564917 Root MSE 0. The SAS System The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total R-Square 0. 1 1 45 1 2 47 1 3 48 1 4 42 2 1 43 2 2 46 2 3 50 2 4 .956218 Y Mean 47. QUIT.30 Pr > F 0. We then need to modify the F value by hand using the correct degrees of freedom.05) = 5.9305583 107.92 Pr > F 0.41.5 (. QUIT. THEN Y = 42.8699083 So.1.0008 0.51417 F Value 26. MODEL Y = SOAP STAIN.17. The following is the associated SAS output. the correct F value is F0 = 71.9305583 107.4861167 185.9768528 53. 3 1 51 3 2 52 3 3 55 3 4 49 .93/3 = 21.7398167 Type III SS 71. .92 F Value 26.012490 DF 3 2 DF 3 2 Type I SS 71. DATA WASH.84 5. RUN. PROC GLM.0008 0.2.0001 Pr > F 0.0001 Mean Square 35.6703750 5. SET WASH.22 58. IF Y = . OPTIONS LS=80 PS=66 NODATE.49/5 which exceeds the tabulated value of F3. RUN. CARDS. INPUT STAIN SOAP Y @@.970370 Source SOAP STAIN Source SOAP STAIN DF 5 6 11 Sum of Squares 179.0002 Coeff Var 2. THE RANDOMIZED COMPLETE BLOCK DESIGN 37 Now we just plug in this value and perform the analysis.8699083 Mean Square 23. /* Replace the missing value with the estimated value. The following SAS code will perform the RCBD ANOVA.9143528 F Value 39. */ DATA NEW.9340750 0.22 58. CLASS STAIN SOAP.

PROC TABULATE. B. However. and treatments. row and column hereafter. RUN. These observations are then placed in a p × p grid made up of. TABLE ROWS. and only once. QUIT. Consider a situation where we have two blocking variables. RUN. OUTPUT OUT=LAT44 ROWS NVALS=(1 2 3 4) COLS NVALS=(1 2 3 4) TMTS NVALS=(1 2 3 4). PROC PLAN SEED=12345. and D and two factors to control. we need p2 observations. ------------------------------------------------------A 4 BY 4 LATIN SQUARE DESIGN ______________________________________ | | COLS | | |___________________| | | 1 | 2 | 3 | 4 | | |____|____|____|____| | |TMTS|TMTS|TMTS|TMTS| | |____|____|____|____| | |Sum |Sum |Sum |Sum | |__________________|____|____|____|____| |ROWS | | | | | |__________________| | | | | |1 | 1| 2| 3| 4| |__________________|____|____|____|____| |2 | 2| 3| 4| 1| |__________________|____|____|____|____| |3 | 3| 4| 1| 2| |__________________|____|____|____|____| |4 | 4| 1| 2| 3| |__________________|____|____|____|____| . VAR TMTS. Say we have 4 treatments.2 The Latin Square Design The RCBD setup allows us to use only one factor as a blocking variable.38 CHAPTER 2. p rows and p columns. TREATMENTS TMTS=4 CYCLIC. C. COLS*TMTS. in each row and column. To build a Latin square design for p treatments. RANDOMIZED BLOCKS 2. QUIT. OPTIONS LS=80 PS=66 NODATE. FACTORS ROWS=4 ORDERED COLS=4 ORDERED /NOPRINT. A. in particular the Latin square design. in such a way that each treatment occurs once. TITLE ’A 4 BY 4 LATIN SQUARE DESIGN’. The following SAS code gives the above basic 4 × 4 design. A basic 4 × 4 Latin square design is Column 2 3 B C C D D A A B Row 1 2 3 4 1 A B C D 4 D A B C The SAS procedure Proc PLAN may be used in association with Proc TABULATE to generate designs. sometimes we have two or more factors that can be controlled. One design that handles such a case is the Latin square design. CLASS ROWS COLS.

· · · .. − y.... and • ijk ∼i. 2. .. − y. − y.. )2 y ¯ (yijk − yi. σ 2 ). ) ¯ ¯ ¯ y eijk We have SST = SSRow + SST rt + SSCol + SSE where SST = SSRow = p SST rt = p SSCol = p SSE = (yijk − y.. )+ y ¯ τj + ˆ (¯. )+ y ¯ ˆ βk + (yijk − yi.. • βk is the kth block 2 (column) eﬀect.j. − y. and error..2. )+ y ¯ αi + ˆ (¯.1 Statistical Analysis The total sum of squares. )2 y ¯ (¯.. • αi is the ith block 1 (row) eﬀect.. − y. )2 ¯ ¯ ¯ y Thus the ANOVA table for the Latin square design is Source Treatments Rows Columns Error Total df p−1 p−1 p−1 (p − 2)(p − 1) p2 − 1 SS SST rt SSRow SSCol SSE SST MS M ST rt M SRow M SCol M SE F -statistic F0 = M ST rt M SE The test statistic for testing for no diﬀerences in the treatment means is F0 .. + ¯ µ+ ˆ (¯i. and treatments.k − y. partitions into sums of squares due to columns... THE LATIN SQUARE DESIGN The statistical model for a Latin square design is i = 1. An intuitive way of identifying the components is (since all the cross products are zero) yijk = = y. p j = 1. p 39 yijk = µ + αi + τj + βk + ijk where • µ is the grand mean. SST ..j.. An α level test rejects null hypothesis if F0 > Fp−1.2..j. )2 y ¯ (¯. There is no interaction between rows. · · · . rows.. )2 ¯ (¯i. · · · .j.. − y.... − y.i.2.k + 2¯. • τj is the jth treatment eﬀect.. the model is completely additive.. treatments. − y...k − y. columns.d. The only diﬀerence is that b is replaced by p and the error degrees of freedom becomes (p − 2)(p − 1) instead of (k − 1)(b − 1)..k + 2¯.(p−2)(p−1) (α).. p k = 1. N (0. Multiple comparisons are performed in a similar manner as in the case of RCBD.

RUN. INPUT CARDS.1875000 54.5625000 49.8750000 247.0026 <. MODEL MILK = DIET PERIOD COW.0625000 18.05 . QUIT. OPTIONS LS=80 PS=66 NODATE.38 22. DATA NEW.40 Example CHAPTER 2.0026 <.38 22.901388 MILK Mean 35.6875000 Mean Square 13.17 Pr > F 0.6875000 147.0625000 18.6875000 147.0012 Pr > F 0.980298 Source DIET PERIOD COW Source DIET PERIOD COW DF 9 6 15 Sum of Squares 242. CLASS COW DIET PERIOD.2291667 Mean Square 13.0001 0. During each lactation period the cows receive a diﬀerent diet.525780 DF 3 3 3 DF 3 3 3 Type I SS 40.1875000 54.4375000 Root MSE 0.68750 F Value 16.0002 Coeff Var 2. Period 1 2 3 4 1 A=38 B=32 C=35 D=33 Cow 2 3 B=39 C=45 C=37 D=38 D=36 A=37 A=30 B=35 4 D=41 A=30 B=32 C=33 The following gives the SAS analysis of the data. Lactation period and cows are used as blocking variables.44 F Value 16. Assume that there is a washout period between diets so that previous diet does not aﬀect future results. A 4 × 4 Latin square design is implemented.69 60.9513889 0. ------------------------------------------------Dependent Variable: MILK Source Model Error Corrected Total R-Square 0. RANDOMIZED BLOCKS Consider an experiment to investigate the eﬀect of four diﬀerent diets on milk production of cows.5625000 4. COW PERIOD DIET MILK @@.69 60.44 Pr > F 0. RUN. 1 1 1 2 1 2 3 1 3 4 1 4 . QUIT.0001 0.8125000 F Value 33. 38 39 45 41 1 2 3 4 2 2 2 2 2 3 4 1 32 37 38 30 1 2 3 4 3 3 3 3 3 4 1 2 35 36 37 32 1 2 3 4 4 4 4 4 4 1 2 3 33 30 35 33 PROC GLM.2291667 Tukey’s Studentized Range (HSD) Test for MILK Alpha 0. There are four cows in the study.6875000 Type III SS 40. MEANS DIET/ LINES TUKEY.5625000 49.0012 Mean Square 26.

7500 N 4 4 4 4 DIET 3 4 2 1 Thus there diet has a signiﬁcant eﬀect (p-value=0..2064 41 Means with the same letter are not significantly different. RCBDcol ) = pM SE Similarly. .2.2 Missing Values Missing values are estimated in a similar manner as in RCBD’s. the estimated relative eﬃciency of a Latin square design with respect to a CRD M SRow + M SCol + (p − 1)M SE ˆ R(Latin. we employ an iterative procedure similar to the one in RCBD. RCBDrow ) = pM SE Furthermore.8125) Thus a RCBD design with just cows as blocks would cost about 16 times as much as the present Latin square design to achieve the same sensitivity. .0000 34. RCBDcows ) = = = 15. the estimated relative eﬃciency of a Latin square design with respect to a RCBD with the columns omitted and the rows as blocks is M SCol + (p − 1)M SE ˆ R(Latin. and grand totals of the available observations. 2. CRD) = (p + 1)M SE For instance. + T. Tukey Grouping A A A B B B Mean 37.3 Relative Eﬃciency The relative eﬃciency of the Latin square design with respect to other designs is considered next. THE LATIN SQUARE DESIGN Error Degrees of Freedom Error Mean Square Critical Value of Studentized Range Minimum Significant Difference 6 0.k ) − 2T.5000 37. The estimated relative eﬃciency of a Latin square design with respect to a RCBD with the rows omitted and the columns as blocks is M SRow + (p − 1)M SE ˆ R(Latin.8125) ˆ R(Latin. The same result holds for diets A and B. . are the row i. treatment j.j..0026) on milk production.2. (p − 1)(p − 2) where Ti.5000 33.8125 4. we see that if we just use cows as blocks. considering the milk production example..k . The Tukey-Kramer multiple comparison procedure indicates that diets C and D do not diﬀer signiﬁcantly. it is estimated by yijk = p(Ti.2. and T. All other pairs are declared to be signiﬁcantly diﬀerent. If only yijk is missing. T. T..89559 2.85 pM SE 4(. 2...06 + 3(.2. If more than one value is missing. + T.j.. column k. respectively.. we get M SPeriod + (p − 1)M SE 49.

diﬀerent columns : 3 . same columns : 3 1 2 3 1 A B C 1 C B A 1 B A C 2 B C A 2 B A C 2 A C B 3 C A B 3 A C B 3 C B A replication 1 4 5 6 2 7 8 9 diﬀerent rows.4 Replicated Latin Square Replication of a Latin square is done by forming several Latin squares of the same dimension.42 CHAPTER 2. same rows. This may be done using • same row and column blocks • new rows and same columns • same rows and new columns • new rows and new columns Examples The following 3×3 Latin square designs are intended to illustrate the techniques of replicating Latin squares.2. same columns : 1 A B C 1 C B A 1 B A C 2 B C A 2 B A C 2 A C B 3 C A B 3 A C B 3 C B A replication 1 1 2 3 1 2 3 2 1 2 3 diﬀerent rows. RANDOMIZED BLOCKS 2.

.l − y. )2 ¯ (¯.. Here ψl represents the eﬀect of the lth replicate... · · · . r where r is the number of replications. ...2. p l = 1.. − y. · · · . )2 y ¯ k=1 r SSCol = np SSRep = p2 l=1 (¯.2. The associated ANOVA table is Source Treatments Rows Columns Replicate Error Total df p−1 p−1 p−1 r−1 (p − 1)[r(p + 1) − 3] rp2 − 1 SS SST rt SSRow SSCol SSRep SSE SST MS M ST rt M SRow M SCol M SRep M SE F -statistic F0 = M ST rt M SE where SST = p (yijkl − y... However. it adds a parameter (or parameters) in our model.j.. − y. The analysis of variance depends on the type of replication... THE LATIN SQUARE DESIGN 1 A B C 4 C B A 7 B A C 2 B C A 5 B A C 8 A C B 3 C A B 6 A C B 9 C B A replication 1 43 1 2 3 4 5 6 2 7 8 9 3 Replication increases the error degrees of freedom without increasing the number of treatments. p j = 1.. )2 y ¯ (¯.. · · · .. · · · . Replicated Square This uses the same rows and columns and diﬀerent randomization of the treatments within each square. p yijkl = µ + αi + τj + βk + ψl + ijkl k = 1.... )2 y ¯ and SSE is found by subtraction. The statistical model is i = 1. − y. thus increasing the complexity of the model... )2 y ¯ j=1 p SST rt = np SSRow = np i=1 p (¯i..k.

SINGLE LATIN SQUARES 1 ------------------------------------------.0 1 1 2 3 28.47 F Value Pr > F 0. INPUT SQUARE COL ROW TREAT YIELD. MODEL YIELD= SQUARE COL ROW TREAT.0076 Pr > F Mean Square 10. MODEL YIELD= COL ROW TREAT.SQUARE=1 -------------------------------------------The GLM Procedure Dependent Variable: YIELD Source Model Error Corrected Total R-Square 0.72222222 38.3 1 2 1 3 25.7 2 2 2 1 24. RUN. PROC GLM.43 Pr > F 0.0137 Coeff Var 1.96777778 11.3 2 3 1 1 25.9 2 2 1 2 28.0123 0.4 2 1 2 2 31.5 2 3 3 2 29. QUIT.22000000 0. DATA ADDITIVE. PROC GLM. A B & C) were tested for gas eﬃciency by three drivers (ROWS) using three diﬀerent tractors (COLUMNS).56222222 Type III SS Mean Square 0.29555556 64.70333333 0.1325 0. CLASS COL ROW TREAT.6 1 2 3 2 28. TITLE ’REPLICATED LATIN SQUARES SHARING BOTH ROWS AND COLUMNS’.28111111 Mean Square .3 2 2 3 3 29. CLASS SQUARE COL ROW TREAT. The variable measured was the yield of carbon monoxide in a trap.1 2 1 1 3 32.86111111 19.14777778 F Value 72. RUN.51555556 Root MSE 0.7 2 1 3 1 24. Here is the SAS analysis.0 1 2 2 1 23. QUIT.3 1 3 2 2 28.995419 Source COL ROW TREAT Source DF 2 2 2 DF DF 6 2 8 Sum of Squares 64.5 1 3 3 3 30. CARDS.8 2 3 2 3 30.4 1 3 1 1 21. RANDOMIZED BLOCKS Three gasoline additives (TREATMENTS.7 1 1 3 1 25.384419 YIELD Mean 26. The experiment was repeated twice. TITLE ’SINGLE LATIN SQUARES’. 1 1 1 2 26.93555556 23. BY SQUARE.44 Example CHAPTER 2.55 80.460434 Type I SS 1.32222 F Value 6.26 130.2 .

93555556 23.65333333 REPLICATED LATIN SQUARES SHARING BOTH ROWS AND COLUMNS The GLM Procedure Dependent Variable: YIELD Source Model Error Corrected Total R-Square 0.01444444 7.39388889 .3012222 F Value 8.0114 0.516978 YIELD Mean 27.60055556 47.24000000 1.56 20.65333333 Mean Square 3.78777778 Type III SS 22.0114 0.72222222 38.60055556 47.2.74333333 1.2563 0.0003 Mean Square 18.19 Pr > F 0.793725 YIELD Mean 28.56 1.94 45.74333333 1.42778 F Value 9.00055556 8.2244 0.8576984 2.0215 5 Mean Square 11.44666667 57.56222222 0.0018 Coeff Var 5.0038889 23.39388889 Mean Square 22.56 1.0076 ------------------------------------------.00722222 3.20111111 94.96777778 11.1325 0.20666667 0.94 1.20111111 94.78777778 Mean Square 22.48 F Value 5.22333333 28.981606 Source COL ROW TREAT Source COL ROW TREAT DF 2 2 2 DF 2 2 2 DF 6 2 8 Sum of Squares 67.0122222 155.79 Pr > F 0.74 1.0215 Pr > F 0.50000000 Root MSE 0.1441 0.2.56 20.44666667 57.00722222 3.3399 0.00055556 4.94 1.0003 Pr > F 0.0123 0.781748 Type I SS 7.00055556 4.2244 0.26 130.74 1.0542 Coeff Var 2.60 F Value 9.94 45.SQUARE=2 -------------------------------------------The GLM Procedure Dependent Variable: YIELD Source Model Error Corrected Total R-Square 0.47 0.48666667 2.30666667 Type III SS 7. THE LATIN SQUARE DESIGN 45 COL ROW TREAT 2 2 2 1.530809 Type I SS 22.86111111 19.48 Pr > F 0.63000000 F Value 17.00055556 8.0161111 Root MSE 1.1441 0.30666667 Mean Square 3.48666667 2.53333 F Value 5.22333333 28.60 Pr > F 0.28111111 6.3399 0.851549 Source SQUARE COL ROW TREAT Source SQUARE COL ROW TREAT DF 1 2 2 2 DF 1 2 2 2 DF 7 10 17 Sum of Squares 132.01444444 7.2563 0.26000000 68.55 80.

RANDOMIZED BLOCKS In this instance the rows of diﬀerent squares are independent but the columns are shared. · · · . The model is i = 1.7 1 1 3 1 25.0 1 1 2 3 28.9 2 2 4 2 28. QUIT.3 2 3 4 1 25. DATA ADDITIVE. RUN.8 2 3 5 3 30. The associated ANOVA table is Source Treatments Rows Columns Replicate Error Total df p−1 r(p − 1) p−1 r−1 (p − 1)[rp − 2] rp2 − 1 SS SST rt SSRow SSCol SSRep SSE SST MS M ST rt M SRow M SCol M SRep M SE F -statistic F0 = M ST rt M SE Example We will reanalyze the previous example.3 1 2 1 3 25. The statistical model in shows this by nesting row eﬀects within squares. · · · . · · · . i. INPUT SQUARE COL ROW TREAT YIELD. TITLE ’REPLICATED LATIN SQUARES SHARING ONLY COLUMNS’.4 2 1 5 2 31.18 Pr > F <.0001 7 . the treatment eﬀect may be diﬀerent for each square.4 1 3 1 1 21.46 Replicated Rows CHAPTER 2. p j = 1.3 2 2 6 3 29.7746296 F Value 33.6 1 2 3 2 28. CARDS. CLASS SQUARE COL ROW TREAT.2 . PROC GLM. REPLICATED LATIN SQUARES SHARING ONLY COLUMNS The GLM Procedure Dependent Variable: YIELD Source Model DF 9 Sum of Squares 150. p l = 1.3 1 3 2 2 28.0 1 2 2 1 23.e diﬀerent rows but same columns.7 2 1 6 1 24. p k = 1. Thus.7 2 2 5 1 24.5 2 3 6 2 29. MODEL YIELD= SQUARE COL ROW(SQUARE) TREAT.1 2 1 4 3 32. r yijkl = µ + αi(l) + τj + βk + ψl + ijkl where αi(l) represents the eﬀect of row i nested within replicate (square) l.5 1 3 3 3 30.9716667 Mean Square 16. this time assuming we have six diﬀerent drivers. · · · . 1 1 1 2 26.

6 1 2 3 2 28.0161111 Root MSE 0.52 7. DATA ADDITIVE.4 2 4 5 2 31.0002 0.00722222 6. INPUT SQUARE COL ROW TREAT YIELD.42778 F Value 43.16888889 94. CARDS.78777778 Mean Square 22.7 1 1 3 1 25.54222222 47. · · · .39388889 Replicated Rows and Columns The diﬀerent Latin squares are now independent.973910 Source SQUARE COL ROW(SQUARE) TREAT Source SQUARE COL ROW(SQUARE) TREAT 8 17 4.592351 DF 1 2 4 2 DF 1 2 4 2 YIELD Mean 27.1 2 4 4 3 32. r yijkl = µ + αi(l) + τj + βk(l) + ψl + ijkl The associated ANOVA table is Source Treatments Rows Columns Replicate Error Total df p−1 r(p − 1) r(p − 1) r−1 (p − 1)[r(p − 1) − 1] rp2 − 1 SS SST rt SSRow SSCol SSRep SSE SST MS M ST rt M SRow M SCol M SRep M SE F -statistic F0 = M ST rt M SE Example Lets reanalyze the previous example.94 93.7 2 4 6 1 24.93 12. · · · .0127 0.0 1 1 2 3 28.4 1 3 1 1 21.00055556 8. this time assuming six diﬀerent drivers and six diﬀerent tractors.78777778 Type III SS 22. · · · .3 1 3 2 2 28.52 7.00055556 8.75 F Value 43.94 93.01444444 26. The statistical model is i = 1. p j = 1.39388889 Mean Square 22.0002 0.0127 0.7 2 5 5 1 24.711024 0. 1 1 1 2 26.00055556 4.3 1 2 1 3 25.0 1 2 2 1 23.00722222 6.0001 Type I SS 22.54222222 47.5055556 Coeff Var 2.2.16888889 94.75 Pr > F 0.0014 <.5 1 3 3 3 30.00055556 4.3 . p k = 1.9 2 5 4 2 28.0001 Pr > F 0. THE LATIN SQUARE DESIGN 47 Error Corrected Total R-Square 0. · · · .0014 <.2.93 12. p l = 1. Both row and column eﬀects are nested within the squares.01444444 26.0444444 155.

3 The Graeco-Latin Square Design Graeco-Latin squares are used when we have three blocking variables. RANDOMIZED BLOCKS PROC GLM.78777778 Type III SS 22. with the exception of p = 6. columns. 5 6 6 6 6 4 5 6 3 1 3 2 29.42222222 26.39388889 2.89 107.35555556 6. In a Graeco-Latin square. An example of a Graeco-Latin square for p = 4 is Cγ Dδ Aα Bβ The statistical model is Aβ Bα Cδ Dγ Dα Cβ Bγ Aδ Bδ Aγ Dβ Cα .00055556 2.662906 YIELD Mean 27.0001 Pr > F 0. and only once.0004 0.36 14. and with each Greek letter.06 5.0161111 Root MSE 0. and treatments. each row.0350 0.6366667 155.3 25. in each column.78777778 Mean Square 22.00055556 2.2 CHAPTER 2.85 F Value 50.36 14.0004 0.0002 16 Coeff Var 2.39388889 Mean Square 22.416915 Type I SS 22.52 Pr > F 0.16888889 94.0350 0. CLASS SQUARE COL ROW TREAT.35555556 6.54222222 47.48 2 2 2 2 .42778 F Value 50.0029 <. REPLICATED LATIN SQUARES (INDEPENDENT) The GLM Procedure Dependent Variable: YIELD Source Model Error Corrected Total R-Square 0.85 Pr > F 0.16888889 94.8526768 0. RUN.0001 Mean Square 13.8 30. MODEL YIELD= SQUARE COL(SQUARE) ROW(SQUARE) TREAT.4394444 F Value 31.54222222 47.982991 Source SQUARE COL(SQUARE) ROW(SQUARE) TREAT Source SQUARE COL(SQUARE) ROW(SQUARE) TREAT DF 1 4 4 2 DF 1 4 4 2 DF 11 6 17 Sum of Squares 152. Thus we investigate four factors: rows.0029 <.00055556 9.06 5.00055556 9. Greek letters. Greek letters are used to represent the blocking in the third direction.42222222 26. Graeco-Latin squares exist for each p ≥ 3.89 107.5 29. each treatment appears once. TITLE ’REPLICATED LATIN SQUARES (INDEPENDENT)’. QUIT.3794444 2.

. )2 ¯ (¯. p l = 1.. − y.3.... (2) diﬀerences among stores.k.... . The following example is taken from Petersen : Design and Analysis of Experiments (1985). THE GRAECO-LATIN SQUARE DESIGN i = 1. The following table contains the results of his trial. − y. p k = 1. ﬁve diﬀerent stores assigned to the column classiﬁcation..l − y.. C.. These included: (1) day of the week. There were a number of sources of variation.. Example A food processor wanted to determine the eﬀect of package design on the sale of one of his products. The associated ANOVA table is Source Treatments Greek letters Rows Columns Error Total df p−1 p−1 p−1 p−1 (p − 1)(p − 3) p2 − 1 SS SST rt SSG SSRow SSCol SSE SST MS M ST rt M SG M SRow M SCol M SE F -statistic F0 = M ST rt M SE where SST = p (yijkl − y. )2 y ¯ j=1 p SST rt = p SSRow = p i=1 p (¯i. He decided to conduct a trial using a Graeco-Latin square design with ﬁve weekdays corresponding to the row classiﬁcation. D. He had ﬁve designs to be tested : A. and (3) eﬀect of shelf height... · · · ..... )2 y ¯ k=1 r SSCol = p SSG = p l=1 (¯. · · · .j. p j = 1. B. and ﬁve shelf heights corresponding to the Greek letter classiﬁcation. )2 y ¯ and SSE is found by subtraction. E.. )2 y ¯ (¯. · · · . · · · . − y. Day Mon Tue Wed Thur Fri 1 E α (238) D δ (149) B (222) C β (187) A γ (65) 2 C δ (228) B β (220) E γ (295) A (66) D α (118) Store 3 B γ (158) A α (92) D β (104) E δ (242) C (279) 4 D (188) C γ (169) A δ (54) B α (122) E β (278) 5 A β (74) E (282) C α (213) D γ (90) B γ (176) The following is the SAS analysis.. p 49 yijkl = µ + αi + τj + βk + ψl + ijkl where ψl is the eﬀect of the lth Greek letter..2.

9600 115462.1366 . 1 1 5 1 238 1 2 3 4 228 1 3 2 3 158 1 4 4 5 188 1 5 1 2 74 2 1 4 4 149 2 2 2 2 220 2 3 1 1 92 2 4 3 3 169 2 5 5 5 282 3 1 2 5 222 3 2 5 3 295 3 3 4 2 104 3 4 1 4 54 3 5 3 1 213 4 1 3 2 187 4 2 1 5 66 4 3 5 4 242 4 4 2 1 122 4 5 4 3 90 5 1 1 3 65 5 2 4 1 118 5 3 3 5 279 5 4 5 2 278 5 5 2 4 176 . MODEL Y = ROW COL TRT GREEK. RANDOMIZED BLOCKS The SAS System The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total R-Square 0.5400 2213.50 OPTIONS LS=80 PS=66 NODATE.5600 1544. CLASS ROW COL TRT GREEK.42 31.3600 F Value 1.7919 <.946929 Source ROW COL TRT GREEK Source ROW COL TRT GREEK DF 16 8 24 Sum of Squares 131997. PROC GLM.1600 8852.9600 115462.1600 Mean Square 1534. DATA GL.9200 139395.92 2 Pr > F 0.6400 386.64304 DF 4 4 4 4 DF 4 4 4 4 Type I SS 6138.8400 7397. CARDS.0400 Pr > F 0.1366 Pr > F 0.66 0.2400 28865. RUN.7919 <.0019 Coeff Var 17.1600 8852.0400 Mean Square 1534.1600 Type III SS 6138.40954 Y Mean 172.2400 28865.66 0. INPUT ROW COL TRT GREEK Y.7600 Root MSE 30.5600 1544.42 31. QUIT.39 F Value 1.5400 2213.39 Mean Square 8249.0001 0.21 2.6400 386.8650 924.7400 F Value 8. CHAPTER 2.2510 0.21 2.2510 0.0001 0.

and C where two treatments are run in every block. Thus using three blocks 1 A B block 2 B C 3 A C 3 2 =3 Now consider ﬁve treatments. The statistical model is i = 1. B. k yij = µ + τi + βj + ij j = 1. 51 2.4. • Let λ be the number of times each pair of treatments appear together in the same block. however.2. INCOMPLETE BLOCK DESIGNS There are highly signiﬁcant diﬀerences in mean sales among the ﬁve package designs. 1 A C D 2 B D E 3 B C E 4 C D E block 5 6 A A B B C .1 Balanced Incomplete Block Designs (BIBD’s) In a BIBD setup. A. · · · . · · · . each block is selected in a balanced manner so that any pair of treatments occur together the same number of times as any other pair. D. and E where 3 treatments appear per block. There are ways of choosing 2 out of three.D 7 A D E 8 B C D k a 9 A B E 10 A C E Usually. Suppose there are k treatments and b blocks. B. . One way to construct a BIBD is by using k blocks and assigning diﬀerent combination of treatments a to every block. • Let r be the number of blocks in which each treatment appears. The following two examples illustrate this procedure. A.4. Thus the total sample size is ab = kr. Each block can hold a treatments. C. Examples Consider three treatments. where a < k. We can also show that λ(k − 1) = r(a − 1) and thus λ = r(a − 1)/(k − 1). Designs where only some of the treatments appear in every block are known as incomplete block designs. We use 10 blocks. b with the same assumptions as the RCBD model.4 Incomplete Block Designs Some experiments may consist of a large number of treatments and it may not be feasible to run all the treatments in all the blocks. blocks. BIBD’s may be obtained using fewer than Statistical Analysis We begin by introducing some notation. 2.

while making all pairwise comparisons. 0 otherwise b φij T. )2 ¯ • SSBlocks = k • Let Ti.j − y. we have SST = SSTrt(adj) + SSBlocks + SSE where • SST = k i=1 b j=1 (yij b y j=1 (¯. λb 1 Qj = T. if |ˆi − τj | exceeds τ ˆ Bonferroni: and 2a qk. i=1 The standard error of the adjusted treatment i mean. we declare treatment i and j to be signiﬁcantly diﬀerent. and error. For instance. and φij = Let Qi = Ti. τi . M SE λk 2 The following example is taken from em Montgomery: Design and Analysis of Experiments.. The diﬀerence here is that the sum of squares due to treatments needs to be adjusted for incompleteness.j − r Multiple Comparisons φij Ti. RANDOMIZED BLOCKS We partition the total sum of squares in the usual manner. respectively..kr−k−b+1 (α) √ . Thus. λk Thus individual as well as simultaneous inference may be made concerning the treatment means. − 1 a 1 if trt i in block j .52 CHAPTER 2. ˆ ¯ where τi = ˆ aQi . Tukey: tkr−k−b+1 α 2 k 2 M SE 2a λk . with MEER=α.j j=1 The quantity Qi is the ith treatment total minus the average of the block totals containing treatment i.. λk k ˆ βj = rQj . The corresponding ANOVA table is Source Treatments Blocks Error Total df k−1 b−1 kr − k − b + 1 kr − 1 SS SSTrt(adj) SSBlocks SSE SST MS M STrt(adj) M SBlocks M SE F -statistic F0 = MS Trt(adj) M SE Estimates of the model are µ = y. is ˆ aM SE . . Now k a i=1 Q2 i SSTrt(adj) = λk • SSE = SST − SSTrt(adj) − SSBlocks . blocks. into sum of square due to treatments. and T. )2 ¯ − y.j be the ith treatment and the jth block totals.

INCOMPLETE BLOCK DESIGNS Example 53 A chemical engineer thinks that the time of reaction for a chemical process is a function of the type of catalyst employed. and observing the reaction time.75000000 Mean Square 12.95833333 F Value 19. each batch is only large enough to permit three catalysts to be run. λ = 2. MODEL TIME = CATALYST BATCH. The experimental procedure consists of selecting a batch of raw material. INPUT CATALYST BATCH TIME. The following SAS code is used to analyze the above data. Treatment (Catalyst) 1 2 3 4 Block (Batch) 1 2 3 4 73 74 71 75 67 72 73 75 68 75 72 75 Thus k = 4. OPTIONS LS=80 PS=66 NODATE. DATA CHEM. b = 4. CLASS CATALYST BATCH. Since variations in the batches of raw material may aﬀect the performance of the catalysts. the engineer decides to use batches of raw materials as blocks. The associated SAS output is The SAS System The GLM Procedure Dependent Variable: TIME Source Model DF 6 Sum of Squares 77. CARDS. However. 1 1 73 1 2 74 1 4 71 2 2 75 2 3 67 2 4 72 3 1 73 3 2 75 3 3 68 4 1 75 4 3 72 4 4 75 . QUIT. RUN. PROC GLM. loading the pilot plant.2. CONTRAST ’1 VS 2’ CATALYST 1 -1 0 0.94 Pr > F 0.4. LSMEANS CATALYST / TDIFF PDIFF ADJUST=BON STDERR. This is known as a symmetric design since k = b. LSMEANS CATALYST / TDIFF ADJUST=TUKEY. Four catalysts are being investigated. r = 3. The following table summarizes the results.0024 52 . a = 3. applying each catalyst in a separate run of the pilot plant. ESTIMATE ’1 VS 2’ CATALYST 1 -1 0 0.

0415 0.08333333 Type III SS 22.0000 4.0001 LSMEAN Number 1 2 3 4 Least Squares Means for Effect CATALYST t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: TIME i/j 1 2 3 4 0.0000000 Standard Error 0.08333333 The SAS System Mean Square 3.0000000 LSMEAN Number 1 2 3 4 3 -0.58333333 22.83378 0.89514 0.89514 1.89 F Value 11.89 Pr > F 0.0284 The SAS System The GLM Procedure Least Squares Means Adjustment for Multiple Comparisons: Tukey-Kramer CATALYST 1 2 3 4 TIME LSMEAN 71.88888889 22.112036 DF 3 3 DF 3 3 Root MSE 0.806226 TIME Mean 72.35806 0.0001 <.0010 53 Type I SS 11.29669 0.53709 4 -5.0130 -4.02777778 Mean Square 7.0001 <.50000 F Value 5.0107 0.54 CHAPTER 2.4868051 Pr > |t| <.0000 4.191833 0.0464 Least Squares Means for Effect CATALYST t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: TIME i/j 1 2 0.3750000 71.8085 -0.0000 0.0000000 75.895144 1.66666667 66.0001 <.83378 .6250000 72.0000 -0.67 33.53709 1.0464 54 4 -5.296689 0.3750000 71.25000000 81.00000000 0.0000 5.19183 0.4868051 0.35806 1.537086 1.9825 3 -0.75000000 66.0284 -4.0209 1 2 -0.19183 0.4868051 0.358057 1.959877 Source CATALYST BATCH Source CATALYST BATCH 5 11 3.6250000 72.0010 Pr > F 0.0209 -4.0000000 75.358057 1 2 -0.02777778 The GLM Procedure Least Squares Means Adjustment for Multiple Comparisons: Bonferroni CATALYST 1 2 3 4 TIME LSMEAN 71.4868051 0. RANDOMIZED BLOCKS Error Corrected Total R-Square 0.0000 0.98 33.65000000 Coeff Var 1.833775 0.

4. Row 1 2 3 4 5 1 A B C D E Column 2 3 B C C D D E E A A B 4 D E A B C A Youden square may be considered as a symmetric BIBD with rows corresponding to blocks and each treatment occurring exactly once in each position of the block. 4 columns.000 y4.0281 55 3 4 Parameter 1 VS 2 Estimate -0. and 5 rows.8085 5. The following example shows a Youden square with 5 treatments.0175 -4.0130 0. in which the number of columns is not equal to the number of rows. ¯ 75.375 y2.7349 We may see from the output that F0 = 11.9462 0. Both the Bonferroni and the Tukey-Kramer procedures give us y1.3 Other Incomplete Designs There are other incomplete designs that will not be discussed here.13 Pr > F 0.4.7349 4.29669 0.000 2. ¯ 72.9825 0.537086 0. Thus we declare the catalysts to be signiﬁcantly diﬀerent.2.36 Pr > |t| 0. 2. These include the partially balanced incomplete block design and lattice designs such as square. ¯ 71. INCOMPLETE BLOCK DESIGNS 0. .08333333 Standard Error 0.0281 55 0.625 y3.191833 0. ¯ 71.0107. cubic. and rectangular lattices.9462 4.833775 0.2 Youden Squares These are incomplete Latin squares.25000000 t Value -0.4.296689 0.67 with a p-value of 0.08333333 Mean Square 0.0175 The SAS System The GLM Procedure Dependent Variable: TIME Contrast 1 VS 2 DF 1 Contrast SS 0.69821200 F Value 0.895144 0.

56 CHAPTER 2. RANDOMIZED BLOCKS .

Then the experiment is run once. The following plots. with a and b levels. The eﬀect of factor A is the change in response due to a chang in the level of A. 57 B1 20 40 B2 25 10 . consider a two factor experiment in which the two factors A and B have two levels each. the main eﬀect of B is 40 + 30 30 + 20 − = 10 2 2 In this case there is no interaction since the eﬀect of factor A is the same at all levels of B: 40 − 30 = 10 and 30 − 20 = 10 B1 30 40 B2 20 30 Sometimes the eﬀect of the ﬁrst factor may depend on the level of the second factor under consideration. Each replicate in a two-factor factorial design will contain all the a × b treatment combinations. For instance. we may have two factors A and B. This is know as the main eﬀect of factor A.Chapter 3 Factorial Designs 3. increasing factor A from level 1 to level 2 causes an increase of 10 units in the response. In a similar fashion. A1 A2 The average eﬀect of factor A is then 40 + 30 30 + 20 − = 10 2 2 Thus.1 Introduction This chapter focuses on the study of the eﬀects of two or more factors using factorial designs. A factorial design is a design in which every combination of the factors is studied in every trial (replication). A1 A2 The eﬀect of factor A at the ﬁrst level of B is 40 − 20 = 20 . say. respectively. The eﬀect of A depends on the level of B chosen. display the two situations. The following is the resulting output. known as proﬁle plots. The following table shows two interacting factors. For example. and at the second level 10 − 25 = −15 .

106. 74. 82. 155. 115 174. Material 1 2 3 15 130. 104. 58 25. 160 Temperature 65 34. Four batteries are tested at each combination of plate material and temperature. 180 150. and all 36 tests are run in random order. The results are shown below. 80.58 CHAPTER 3. 75 136. 60 . 82. Example The maximum output voltage of a particular battery is thought to be inﬂuenced by the material used in the plates and the temperature in the location at which the battery is installed. and the experiment repeated 4 times. FACTORIAL DESIGNS B2 Response B1 A1 A2 Factor A B1 Response B2 A1 A2 Factor A The following example taken from Montgomery : Design and Analysis of Experiments considers two factors. 70. 168. each with 3 levels. 122. 45 96. 58. 139 80 20. 126 138. 120. 40. 70. 110. 159. 188. 150.

· · · . j). a The statistical model is i = 1. . Average Output Voltage 100 150 Type 3 0 50 Type 1 Type 2 40 50 60 70 80 90 Temperature 3. τi is the eﬀect of the ith level of factor A.1) where µ is the overall mean. a j = 1. Let the factors be A and B each with a and b levels. . · · · . and ijk is the random error associated with the kth replicate in cell (i. · · · .3. βj is the eﬀect of the jth level of factor B. y12n y221 . y22n ya21 . THE TWO-FACTOR FACTORIAL DESIGN 59 As the following proﬁle plot shows the interaction between temperature and material type may be signiﬁcant. yabn yijk = µ + τi + βj + (τ β)ij + ijk . . y11n y211 . · · · . ya2n ··· b y1b1 . · · · . Suppose the experiment is run n times at each combination of the levels of A and B. The proﬁle plot is constructed using the average response for each cell. · · · . n 1 y111 . · · · . · · · . The following table displays the data arrangement of such an experiment. y1bn y2b1 .2 The Two-Factor Factorial Design We shall now study the statistical properties of the two-factor design. ya1n Factor B 2 y121 . We will perform formal statistical tests to determine whether the interaction is signiﬁcant in the next section. y2bn yab1 .2. · · · . · · · . b k = 1. · · · . (3. Factor A 1 2 . · · · . (τ β)ij is the eﬀect of the interaction between the ith level of factor A and the jth level of factor B. y21n ya11 .

.. − y. · · · . · · · . b (τ β)ij = yij.. ˆ ¯ τi = yi. ˆ ˆ ˆ ¯ Thus. ..1). + y. The model residuals are obtained as eijk = yijk − yijk = yijk − yij. we can easily see that the observation yijk is estimated by ˆ yijk = µ + τi + βj + (τ β)ij = yij. ˆ ¯ Inference In the two-factor ﬁxed eﬀects model. − y.1) along with the assumptions τi = i=1 j=1 βj = i=1 (τ β)ij = j=1 (τ β)ij = 0 . we are interested in the hypotheses A main eﬀect: H0 : τ1 = · · · = τa = 0 HA : at least one τi = 0 B main eﬀect: H0 : β1 = · · · = βb = 0 HA : at least one βj = 0 AB interaction eﬀect: H0 : (τ β)11 = · · · = (τ β)ab = 0 HA : at least one (τ β)ij = 0 The following is the two-factor ﬁxed eﬀects ANOVA table: Source A B AB Error Total df a−1 b−1 (a − 1)(b − 1) ab(n − 1) abn − 1 SS SSA SSB SSAB SSE SST MS M SA M SB M SAB M SE F -statistic M SA FA = M SE FB = M SB M SE FAB = M SAB M SE . ˆ ¯ ¯ ˆj = y. β ¯ ¯ i = 1. . · · · . b i = 1. ¯ ¯ ¯ ¯ Using the model in (3.60 CHAPTER 3.. a j = 1. − yi..j. Estimation The estimators of the model parameters are obtained via the least squares procedure.. FACTORIAL DESIGNS 3. every observation in cell (i.1 The Fixed Eﬀects Model a b a b The ﬁxed eﬀects model is given by (3. .. j) is estimated by the cell mean... · · · . σ 2 ). a j = 1. and ijk ∼iid N (0. They are µ = y.2.j. − y.

)2 ¯ i=1 j=1 k=1 SST = We then declare the A. + y. MODEL LIFE=MAT TEMP MAT*TEMP.ab(n−1) (α). or FAB > F(a−1)(b−1). QUIT.j.. QUIT.. The following SAS code may be used to produce the two-factor factorial ANOVA table along with an interaction plot (given above). DATALINES. 1 1 130 1 1 155 . )2 y ¯ j=1 a b SSB = an SSAB = n i=1 j=1 a b (¯ij. PROC MEANS NOPRINT. CLASS MAT TEMP..00 Pr > F <.0001 ... Example Consider the battery life experiment given above and assume that the material type and temperatures under consideration are the only ones we are interested in. The corresponding SAS output is Dependent Variable: LIFE Source Model DF 8 Sum of Squares 59416.. THE TWO-FACTOR FACTORIAL DESIGN where a 61 SSA = bn i=1 b (¯i.. − y. FB > Fb−1. PLOT MN*TEMP=MAT. PROC GPLOT. )2 y ¯ ¯ ¯ n SSE = i=1 j=1 k=1 a b n (yijk − yij. DATA BATTERY.22222 Mean Square 7427. respectively.. − y. BY MAT TEMP. RUN. PROC GLM.02778 F Value 11. 3 3 60 . INPUT MAT TEMP LIFE.j. − y..2. )2 ¯ (yijk − y. OUTPUT OUT=OUTMEAN MEAN=MN. )2 y ¯ (¯.ab(n−1) (α). − yi. RUN.3. B.. OPTIONS PS=66 LS=80 NODATE. RUN. or AB eﬀect to be signiﬁcant if FA > Fa−1..ab(n−1) (α). SYMBOL I=JOIN.. VAR LIFE. QUIT.

98486 LIFE Mean 105.21296 Coeff Var 24.0020 <.36111 2403.91 28.72222 9613.j. ’s).. ’s) and the means of factor B pooled over all y levels of factor A (¯.0186 Type I SS 10683.77778 Mean Square 5341. ¯ 1 10. ’s). = 20 ¯ 40.91 28.62 CHAPTER 3.36111 2403. ¯ 30 30 y. On the other hand. we declare that both main eﬀects as well as the interaction eﬀect are signiﬁcant. = 30 ¯ .86111 19559.62372 DF 2 2 4 DF 2 2 4 Root MSE 25.56 F Value 7.86111 19559. y The following example shows the need for comparing the means of one factor within each level of the other. 10.5278 F Value 7. 50.765210 Source MAT TEMP MAT*TEMP Source MAT TEMP MAT*TEMP 27 35 18230. 20. 50 y12.97 3.0186 Pr > F 0. 20 y22.77778 Type III SS 10683. 40..44444 Therefore.75000 77646.j. B A 1 2 y. In the case where the interaction between factors A and B is not signiﬁcant.72222 9613.97 3. we need to y compare the means of one factor within each level of the other factor (¯ij. Multiple Comparisons The manner in which we perform multiple comparisons is dependent on whether or not the interaction eﬀect is signiﬁcant. = 40 ¯ 0.0001 0.56 Pr > F 0.72222 39118. = 50 ¯ 35 2 30.72222 39118. we may compare the means of factor A pooled over all levels of factor B (¯i.0020 <. FACTORIAL DESIGNS Error Corrected Total R-Square 0.0001 0. = 10 ¯ 25 yi.97222 675. if the interaction between A and B is signiﬁcant.. 30 y11. 60 y21..44444 Mean Square 5341.

.3. The standard errors are SE(¯ij. .2. The Tukey-Kramer method declares τi to be signiﬁcantly diﬀerent from τi in level j of factor B if |¯ij. ) = y ¯ and 2M SE na Thus. when interaction is not present. = 30. ¯ ¯ Notice that within each level of B. − y12.ab(n−1) (α) √ 2 2M SE . ¯ ¯ Generally.j. however. j ).. the a factor A means may be compared via the Tukey-Kramer procedure as : declare τi to be signiﬁcantly diﬀerent from τi if SE(¯. ) = y ¯ |¯i.. − y11. | > y ¯ qb. n where (i. we declare βj to be signiﬁcantly diﬀerent from βj if |¯.. = 0. = −30 . − yi j. nb 2M SE nb Similarly.ab(n−1) (α) √ 2 2M SE . | > y ¯ qa. the eﬀect of A is nonzero. THE TWO-FACTOR FACTORIAL DESIGN 63 B1 Response B2 A1 A2 Factor A The interaction plot shows that factor A may have an eﬀect on the response. − y1.j ..j . − yi j . | > y ¯ qab. − y. ) = y ¯ 2M SE . j) = (i . na When interaction is present we use the ab cell means. we use the following standard errors in our comparisons of means: SE(¯i. n The following SAS code provides the Tukey-Kramer intervals for the battery life example considered above. y21.j..ab(n−1) (α) √ 2 2M SE . − yi . y2. − y. for example. − yi . ¯ ¯ y22.

0001 1 2 3.0628 3.09272 0.64 PROC GLM.500000 144.604127 <.578956 0.0044 -4.951318 0. ---------------------------------------------------------------Adjustment for Multiple Comparisons: Tukey MAT 1 2 3 LIFE LSMEAN 83.2718 3 -3.0001 4.0044 -7. RUN.57896 0.500000 155.37236 0.500000 LSMEAN Number 1 2 3 4 5 6 7 8 9 Least Squares Means for Effect MAT*TEMP .083333 LSMEAN Number 1 2 3 CHAPTER 3.0010 Adjustment for Multiple Comparisons: Tukey MAT 1 1 1 2 2 2 3 3 3 TEMP 1 2 3 1 2 3 1 2 3 LIFE LSMEAN 134.60413 <.51141 0.333333 125.750000 57.092717 0.750000 85.833333 107. MODEL LIFE=MAT TEMP MAT*TEMP.166667 LSMEAN Number 1 2 3 Least Squares Means for Effect TEMP t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: LIFE i/j 1 2 3 -3.95132 0.583333 64.166667 108. LSMEANS MAT | TEMP /TDIFF ADJUST=TUKEY.0014 1 2 -2.51141 0.2718 Adjustment for Multiple Comparisons: Tukey TEMP 1 2 3 LIFE LSMEAN 144.0010 3 7.372362 0. QUIT.250000 57.000000 145.750000 49. FACTORIAL DESIGNS Least Squares Means for Effect MAT t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: LIFE i/j 1 2 3 2.0014 -1. CLASS MAT TEMP.750000 119.0628 1.

53749 0.0172 1.52389 0.42179 1.0475 1.0000 5.347209 0.503427 0.523887 0.14312 0.0000 0.0065 -4.4354 1.23836 0.0004 3.0000 -3.279077 0.0022 0.415038 0.34721 0.0015 1.0005 1.9953 -4.9953 -3.38793 0.0000 4.9999 -4.0172 1.59867 0.63969 0.82332 0.0005 -0.959283 0.0018 -4.802964 0.095243 1.0001 -0.18383 0.9997 -1.81637 0.544245 0.86404 0.5819 -5.0604 Notice that SAS has labelled the 9 cells consecutively as .0019 0.3.387926 0.5819 3.0022 0.0172 7 -0.0000 4.54425 0.86404 0.95928 0.0000 9 2.70772 0.0475 -0.42179 1.95928 0.5819 7 0.0604 -3.0172 5 1.23836 0.0460 -0.0004 -1. THE TWO-FACTOR FACTORIAL DESIGN t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: LIFE i/j 1 2 3 4 5 6 7 -4.0001 3.183834 0.782605 0.0065 0.72133 0.81657 0.9991 -1.27908 0.8347 4 -0.31979 0.360815 0.639488 0.2017 -1.823323 0.01361 1.9616 -0.0018 3 4.959283 0.8823 -1.0000 5.0014 -4.142915 0.0003 -5.0743 8 9 3.0460 -3.63949 0.9997 -3.9165 -5.41504 0.680408 0.401533 0.0019 4 -1.143117 0.2017 2 4.09524 1.14291 0.8823 -5.81657 0.2.63969 0.8282 3 4.0006 6 4.0067 -0.5819 -3.816368 0.6420 Least Squares Means for Effect MAT*TEMP t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: LIFE i/j 8 9 6 5.9995 -4.707721 0.80296 0.6420 -1.9999 1 2 4.435396 1.2179 0.9616 -5.0000 5.68041 0.8347 3.319795 0.0003 3.36082 0.823323 0.9995 -2.20429 0.537493 0.0015 0.9991 5 0.0006 8 -0.204294 0.2179 0.0067 1.0014 1.59867 0.0743 Adjustment for Multiple Comparisons: Tukey Least Squares Means for Effect MAT*TEMP t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: LIFE i/j 8 9 1 0.40153 0.50343 0.8282 -1.78261 0.82332 0.9165 65 Least Squares Means for Effect MAT*TEMP t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: LIFE i/j 1 2 3 4 5 6 7 5.013606 1.721327 0.

PROC UNIVARIATE NOPRINT.75 2 5 8 3 6 9 CHAPTER 3. FACTORIAL DESIGNS y11. QUIT. RUN.75 y21. ¯ 155. PLOT RES*TEMP. ¯ 57. ¯ 49. ¯ 57. and • factor B. RUN. RUN. ¯ 134.5 Model Diagnostics Diagnostics are run the usual way via residual analysis.50 57.00 y12. SYMBOL V=CIRCLE. OUTPUT OUT=DIAG R=RES P=PRED. ¯ Graphical checks for equality of variances as well as unusual observations are plots of residuals versus • yij. MODEL LIFE=MAT TEMP MAT*TEMP. .25 y22. ¯ 57.75 144.75 y21. 145.75 y23. ¯ 119.50 y13.5 144. ¯ y33. 85. The graphical check for normality is the QQ-plot of the residuals.50 y22. ¯ y31.75 y32. PROC GLM. ¯ 155. ¯ 145. ¯ 134. ¯ • factor A. QUIT. CLASS MAT TEMP. For the battery life example the following SAS code may be used to produce the required plots.00 y11.25 y23. ¯ 119. PLOT RES*PRED.75 y33. QQPLOT RES / NORMAL (L=1 MU=0 SIGMA=EST). ¯ 49. PLOT RES*MAT. PROC GPLOT. HIST RES / NORMAL (L=1 MU=0 SIGMA=EST). Recall that the residuals for the two-factor factorial model are given by eijk = yijk − yij. .50 85. QUIT. ¯ y13.75 ¯ y32. Temperature within Material: Material = 1 Material = 2 Material = 3 Material within Temperature: Temperature = 1 Temperature = 2 Temperature = 3 y12.66 1 4 7 We use underlining to summarize the results. ¯ ¯ y31.

250 . The residual plots show a mild deviation from constant variance. We may need to transform the data.05586092 0. Goodness-of-Fit Tests for Normal Distribution Test Cramer-von Mises Anderson-Darling ---Statistic---W-Sq A-Sq 0. THE TWO-FACTOR FACTORIAL DESIGN 67 Formal tests for normality indicate no deviation from normality.250 >0.2. The QQ-plot shows no signs of nonnormality.3.34769847 -----p Value----Pr > W-Sq Pr > A-Sq >0.

FACTORIAL DESIGNS There is no command in SAS to perform Levene’s test for equality of variances.. DATA BATTERY..68 CHAPTER 3. 1 1 130 1 1 155 . RUN. CLASS CELL.1) + TEMP. CELL = 3*(MAT . INPUT MAT TEMP LIFE. MODEL LIFE=CELL. -------------------------------------------------Levene’s Test for Homogeneity of LIFE Variance ANOVA of Squared Deviations from Group Means . 3 3 82 3 3 60 . MEANS CELL/ HOVTEST=LEVENE. QUIT. The partial SAS code is given along with the output. PROC GLM. DATALINES. OPTIONS LS=80 PS=66 NODATE. The following trick of relabelling the cells and running a one-factor ANOVA model may be used to perform Levene’s test.

2107 . THE TWO-FACTOR FACTORIAL DESIGN Sum of Squares 5407436 12332885 Mean Square 675929 456774 69 Source CELL Error DF 8 27 F Value 1.3.48 Pr > F 0.2.

70

CHAPTER 3. FACTORIAL DESIGNS

3.2.2

Random and Mixed Models

We shall now consider the case where the levels of factor A or the levels of factor B are randomly chosen from a population of levels. The Random Eﬀects Model In a random eﬀects model the a levels of factor A and the b levels of factor B are random samples from populations of levels. The statistical model is the same as the one given in (3.1) where τi , βj , (τ β)ij , and ijk 2 2 2 are randomly sampled from N (0, στ ), N (0, σβ ),N (0, στ β ), and N (0, σ 2 ) distributions, respectively. Thus, the variance of any observation is 2 2 2 V ar(yijk ) = στ + σβ + στ β + σ 2 The hypotheses of interest are A main eﬀect:

2 H0 : σ τ = 0 2 HA : σ τ = 0

B main eﬀect:

2 H0 : σβ = 0 2 HA : σβ = 0

**AB interaction eﬀect:
**

2 H0 : στ β = 0 2 HA : στ β = 0

**The ANOVA table needs some modiﬁcations. This is seen examining the expected mean squares.
**

2 2 E(M SA ) = σ 2 + nστ β + bnστ 2 2 E(M SB ) = σ 2 + nστ β + anσβ 2 E(M SAB ) = σ 2 + nστ β

**E(M SE ) = σ 2 Therefore, the two-factor random eﬀects ANOVA table is:
**

Source A B AB Error Total df a−1 b−1 (a − 1)(b − 1) ab(n − 1) abn − 1 SS SSA SSB SSAB SSE SST MS M SA M SB M SAB M SE F -statistic

M SA FA = M SAB M SB FB = M SAB FAB = M SAB M SE

From the expected mean squares, we get the estimates of the variance components as σ 2 = M SE ˆ M SAB − M SE στ β = ˆ2 n M SA − M SAB στ = ˆ2 bn M SB − M SAB σβ = ˆ2 an

3.2. THE TWO-FACTOR FACTORIAL DESIGN Example

71

Consider the battery life example once again. This time assume that the material types and temperatures are randomly selected out of several possibilities. We may then use the RANDOM statement in PROC GLM of SAS to analyze the data as a random eﬀects model. Here are the SAS code and associated output.

OPTIONS LS=80 PS=66 NODATE; DATA BATTERY; INPUT MAT TEMP LIFE; DATALINES; 1 1 130 1 1 155 ....... 3 3 60 ; PROC GLM; CLASS MAT TEMP; MODEL LIFE=MAT TEMP MAT*TEMP; RANDOM MAT TEMP MAT*TEMP / TEST; RUN; QUIT; --------------------------------------------------------The GLM Procedure Source MAT TEMP MAT*TEMP Type III Expected Mean Square Var(Error) + 4 Var(MAT*TEMP) + 12 Var(MAT) Var(Error) + 4 Var(MAT*TEMP) + 12 Var(TEMP) Var(Error) + 4 Var(MAT*TEMP) The GLM Procedure Tests of Hypotheses for Random Model Analysis of Variance Dependent Variable: LIFE Source MAT TEMP Error: MS(MAT*TEMP) Source MAT*TEMP Error: MS(Error) DF 2 2 4 DF 4 27 Type III SS 10684 39119 9613.777778 Type III SS 9613.777778 18231 Mean Square 5341.861111 19559 2403.444444 Mean Square 2403.444444 675.212963 F Value 3.56 Pr > F 0.0186 F Value 2.22 8.14 Pr > F 0.2243 0.0389

Notice that variability among material types is the only factor that is not signiﬁcant. The estimates of the components of variance are (values in parentheses are percent contributions of the components) σ 2 = M SE = 675.21 (24.27%) ˆ 2403.44 − 675.21 M SAB − M SE = = 432.06 (15.53%) στ β = ˆ2 n 4 M SA − M SAB 5341.86 − 2403.44 στ = ˆ2 = = 244.87 (8.80%) bn 12 M SB − M SAB 19559 − 2403.44 σβ = ˆ2 = = 1429.58 (51.39%) an 12

72 Mixed Models

CHAPTER 3. FACTORIAL DESIGNS

Let us now consider the case where one factor is ﬁxed and the other is random. Without loss of generality, assume that factor A is ﬁxed and factor B is random. When a factor is random, its interaction with any other factor is also random. The statistical model, once again, has the same form given in (3.1). This time we assume that • τi are ﬁxed eﬀects such that τi = 0,

ijk

2 2 • βj ∼iid N (0, σβ ), (τ β)ij ∼iid N (0, a−1 στ β ), and a

∼iid N (0, σ 2 ).

**The hypotheses of interest are H0 : τ1 = · · · = τa = 0, The expected mean squares are given by
**

2 E(M SA ) = σ 2 + nστ β + 2 E(M SB ) = σ 2 + anσβ 2 E(M SAB ) = σ 2 + nστ β 2 H0 : σβ = 0, 2 H0 : στ β = 0

bn τi2 a−1

E(M SE ) = σ 2 The ﬁxed factor eﬀects are estimated the usual way as µ = y... , ˆ ¯ and the variance components are estimated as M SAB − M SE , n The ANOVA table for the mixed model is σ 2 = M SE , ˆ στ β = ˆ2

Source A (ﬁxed) B (random) AB Error Total df a−1 b−1 (a − 1)(b − 1) ab(n − 1) abn − 1 SS SSA SSB SSAB SSE SST

τi = yi.. − y... ˆ ¯ ¯

and

σβ = ˆ2

M SB − M SE an

MS M SA M SB M SAB M SE

F -statistic

M SA FA = M SAB M SB FB = M SE FAB = M SAB M SE

Example Consider the battery life example and assume that temperature is a random factor while material type is a ﬁxed factor. We use PROC MIXED in SAS to generate the output. PROC GLM does not provide the correct analysis!

OPTIONS LS=80 PS=66 NODATE; DATA BATTERY; INPUT MAT TEMP LIFE; DATALINES; 1 1 130 1 1 155 ....... 3 3 60 ; PROC MIXED COVTEST; CLASS MAT TEMP;

41258855 327.66 432. Covariance Parameter Estimates Cov Parm TEMP MAT*TEMP Residual Estimate 1429. THE TWO-FACTOR FACTORIAL DESIGN MODEL LIFE=MAT.67 Pr Z 0. ----------------------------------------------------------------------The Mixed Procedure Model Information Data Set Dependent Variable Covariance Structure Estimation Method Residual Variance Method Fixed Effects SE Method Degrees of Freedom Method WORK.1560 0.22 Pr > F 0.3.00000000 3 4 12 1 36 36 0 36 Convergence criteria met.9 334. QUIT.06 675.21 Standard Error 1636.35 183.BATTERY LIFE Variance Components REML Profile Model-Based Containment 73 Class Level Information Class MAT TEMP Levels 3 3 Values 1 2 3 1 2 3 Dimensions Covariance Parameters Columns in X Columns in Z Subjects Max Obs Per Subject Observations Used Observations Not Used Total Observations Iteration History Iteration 0 1 Evaluations 1 1 -2 Res Log Like 352.2243 327.09 427.01 3.7 331.77 Z Value 0.9 333.0001 Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) The Mixed Procedure Type 3 Tests of Fixed Effects Effect MAT Num DF 2 Den DF 4 F Value 2. RANDOM TEMP MAT*TEMP.1911 0.91147422 Criterion 0.2.2 . RUN.87 1.

74 CHAPTER 3. however. .1911 and 0. FACTORIAL DESIGNS From the output we observe that the ﬁxed eﬀect (MAT) is not signiﬁcant. Neither of the random eﬀects are signiﬁcant (p-values of 0. Proc Mixed uses the restricted maximum likelihood (REML) technique to estimate the variance components. show that the estimates of the variance components obtained here are identical to those using the expected mean squares. the results are not the same. When there is imbalance. In a balanced design the REML method gives identical estimates as those obtained using the expected mean squares. Exercise: For this example.1560).

6) T2 P2 (9. The model assumes that treatment-block interactions are negligible. and k = 1. T2 P1 . j = 1.3 Blocking in Factorial Designs Blocking may be implemented in factorial designs using the same principles.3.5) T1 P1 (14. The ﬁeld layout and the yield in kg/ha are shown below: I T2 P2 (8.1) BLOCK II III T2 P1 (11. b.1) T1 P2 (17. Plant type (T ) at two levels • T1 = short.3. T2 P2 . He thought that the plant types might respond diﬀerently to fertilization. δk is the eﬀect of the kth block.1) T1 P1 (14. so.6) T2 P3 (16.6) T1 P2 (18.2) T1 P2 (17. n.9) T2 P2 (12. T1 P3 .3) T2 P1 (11.6) T2 P2 (10. · · · . erect 2.7) T1 P3 (18.k − y. T1 P2 .5) ..8) T2 P3 (17. he decided to do a factorial experiment with two factors: 1. The ANOVA table for a random eﬀects model is Source A B AB Blocks Error Total where SSBlocks = ab k=1 df a−1 b−1 (a − 1)(b − 1) n−1 (ab − 1)(n − 1) abn − 1 SS SSA SSB SSAB SSBlocks SSE SST n MS M SA M SB M SAB M SBlocks M SE F -statistic M SA FA = M SAB M SB FB = M SAB FAB = M SAB M SE (¯.6) T1 P3 (18. Phosphorous rate (P ) at three levels • P1 = none • P2 = 25kg/ha • P3 = 50kg/ha Using the full factorial set of combinations he had six treatments: T1 P1 . · · · . bushy • T2 = tall.5) T2 P1 (12.. · · · .5) T2 P3 (15.2) T1 P1 (13.3) T2 P3 (16. )2 .6) T1 P3 (17.7) T2 P1 (12. In a randomized block design.2) T1 P2 (17. every block contains all possible treatment combinations.1) IV T1 P3 (18..0) T1 P1 (11. y ¯ Example An agronomist wanted to study the eﬀect of diﬀerent rates of phosphorous fertilizer on two types of broad bean plants. a. T2 P3 He conducted the experiment using a randomized block design with four blocks of six plots each. BLOCKING IN FACTORIAL DESIGNS 75 3. The statistical model for a two-factor blocked factorial design with 1 replication per block is yijk = µ + τi + βj + (τ β)ij + δk + ijk where i = 1.

18. 17.9 15.8 P3 18.6.5. 9.6 Phosphorous P2 17.6. FACTORIAL DESIGNS The data layout is (observations in a cell are in increasing block order I. 18.3. IV) Type T1 T2 P1 11.6.2.1. 18.7. 12. 14. 11.6.0.1.7. 16. III.5 11.2.6. 10. 14. II. 13.1 8.76 CHAPTER 3. 16.5 The following SAS code and output give the analysis. .3. 12.2. 17.5. 17. 12.1. 17.

5 1 1 2 13.6 1 3 3 18.3 2 2 2 10.7 2 3 2 16.6 1 1 3 14.1 1 3 1 18.7 2 3 3 16.0 2 1 2 11.5 2 2 3 9. DATA FERT.2 2 1 3 12.6 2 2 1 8. INPUT TYPE PH BLOCK YIELD.3.3 1 1 4 14. 77 . 1 1 1 11. RUN.3.6 1 2 3 17.8 2 3 1 15.5 1 2 1 17.2 1 3 4 18. CLASS TYPE PH BLOCK.1 1 2 2 17.9 2 1 1 11.6 1 2 4 18. MODEL YIELD=BLOCK TYPE|PH.5 .6 2 3 4 17. QUIT.1 2 1 4 12.2 1 3 2 17. BLOCKING IN FACTORIAL DESIGNS OPTIONS LS=80 PS=66 NODATE. PROC GLM.1 2 2 4 12. DATALINES.

3762500 Root MSE 0. Short. bushy plants seem to show their greatest yield increase with the ﬁrst increment of added phosphorous.0001 Pr > F 0.195813 DF 3 1 2 2 DF 3 1 2 2 Type I SS 13.44041667 77.10583333 Mean Square 4.33 38.0001 The interaction between plant type and phosphorous level is signiﬁcant.0024 <. The eﬃciency needs to be investigated further.78 ------------------------------------------------The GLM Procedure Dependent Variable: YIELD Source Model Error Corrected Total R-Square 0.13 F Value 7.81 86.5784167 F Value 50.68 133.05291667 Pr > F 0.760537 YIELD Mean 14. while tall.93625000 22.81 86.0001 <.87250000 44.0001 Coeff Var 5.40041667 99.3375000 0. Blocking seems to be working here judging from the corresponding F value.964350 Source BLOCK TYPE PH TYPE*PH Source BLOCK TYPE PH TYPE*PH DF 8 15 23 Sum of Squares 234.7000000 8.05291667 Mean Square 4. This means that all comparisons of means of one factor would have to be done within the levels of the other factor.63750 F Value 7.32125000 77.10583333 Type III SS 13.68 133.44041667 77. erect plants seem to show no yield increase with 25kg/ha of phosphorous.72 CHAPTER 3.40041667 49.32125000 77.40041667 99. The rates of phosphorous fertilizer and the type of plant both aﬀect yield signiﬁcantly. Diﬀerent plant types respond diﬀerently to diﬀerent levels of the fertilizer.87250000 44.0001 <.0024 <.0001 <.40041667 49. FACTORIAL DESIGNS Pr > F <. The main eﬀects are also signiﬁcant.93625000 22.0001 <.6762500 243.33 38.13 Mean Square 29. .

· · · . · · · . consider a two-factor factorial experiment with 3 levels of factor A and 2 levels of factor B. A special case is the 3 factor factorial design with factors A. · · · . We need two or more replications to be able to test for all possible interactions. i = 1. · · · . b. and l = 1. · · · . The statistical model is yi1 i2 ···it l = µ + τ1i1 + τ2i2 + · · · + τtit + (τ1 τ2 )i1 i2 + · · · (τt−1 τt )it−1 it + · · · + (τ1 τ2 · · · τt )i1 i2 ···it + i1 i2 ···it l where i1 = 1. · · · . · · · . We will use Latin letters to represent the 3 × 2 = 6 treatment combinations. The only diﬀerence here is that every factor combination is considered to be a treatment. respectively. In general. f2 . j = 1. The statistical model is yijkl = µ + τi + βj + γk + δl + (τ β)ij + where • τi . k = 1. c. etc.4 The General Factorial Design Consider an experiment in which we have t factors F1 . i2 = 1. and C with levels a. · · · . f1 . are the eﬀects of the kth row and the lth column. The statistical model is yijkl = µ + τi + βj + γk + (τ β)ij + (τ γ)ik + (βγ)jk + (τ βγ)ijk + where i = 1.3. a. a is the eﬀect of the ith level of factor A. and c. A A1 A1 A2 A2 A3 A3 B B1 B2 B1 B2 B1 B2 Treatment A B C D E F Column 3 4 C D D E E F F A A B B C We then form the 6 × 6 basic Latin square cyclically as Row 1 2 3 3 3 3 1 A B C D E F 2 B C D E F A 5 E F A B C D 6 F A B C D E We then randomize the rows and the columns. respectively. b is the eﬀect of the jth level of factor B. respectively. · · · . ab. • γk and δl . j = 1. THE GENERAL FACTORIAL DESIGN A Factorial Experiment with Two Blocking Factors 79 This is dealt with by implementing a Latin square design in a similar manner as one factor experiments. Considering a ﬁxed eﬀects model. B. k. ft levels. the ANOVA table is ijkl . b. · · · . n. consider two factors : factor A with a levels and factor B with b levels. l = 1. • βj .4. · · · . ijkl 3. To illustrate this. Ft with f1 .

This experiment is run twice and the deviations from the target volume are recorded. Pressure (B) 25 psi 30 psi Line Speed (C) Line Speed (C) 200 250 200 250 -3. INPUT CARB PRES SPEED VOL. 3 6. 11 Carbonation (A) 10 12 14 We will use SAS to analyze the data. PROC GLM.80 Source A B C AB AC BC ABC Error Total df a−1 b−1 c−1 (a − 1)(b − 1) (a − 1)(c − 1) (b − 1)(c − 1) (a − 1)(b − 1)(c − 1) abc(n − 1) abcn − 1 SS SSA SSB SSC SSAB SSAC SSBC SSABC SSE SST MS CHAPTER 3. DATA BOTTLE. two levels of B and two levels of C are considered to set up a 3 × 2 × 2 factorial experiment. 1 0. 5 5. 0 1. RUN. . FACTORIAL DESIGNS F -statistic M SA FA = M SE FB = M SB M SE FC = M SC M SE FAB = M SAB M SE FAC = M SAC M SE FBC = M SBC M SE S FABC = MMABC SE M SA M SB M SC M SAB M SAC M SBC M SABC M SE The following example is taken from Montgomery : Design and Analysis of Experiments Example A soft drink bottler is studying the eﬀect of percent carbonation (A). 9 10. and line speed (C) on the volume of beverage packaged in each bottle. MODEL VOL = CARB|PRES|SPEED. 6 7. -1 -1. operating pressure (B). The data are given below. CARDS. 1 1 1 -3 1 1 1 -1 1 1 2 -1 1 1 2 0 1 2 1 -1 1 2 1 0 1 2 2 1 1 2 2 1 2 1 1 0 2 1 1 1 2 1 2 2 2 1 2 1 2 2 1 2 2 2 1 3 2 2 2 6 2 2 2 5 3 1 1 5 3 1 1 4 3 1 2 7 3 1 2 6 3 2 1 7 3 2 1 9 3 2 2 10 3 2 2 11 . CLASS CARB PRES SPEED. 4 7. 1 2. 0 -1. 1 2. The SAS code and output are as follows: OPTIONS LS=80 PS=66 NODATE. Three levels of A.

. CLASS CARB PRES SPEED. all pairwise comparisons comparisons of the levels of factor A.3750000 2.0416667 0.0001 <.5833333 1. all the main eﬀects appear to be signiﬁcant.41 64.3750000 5.7500000 45.37500000 LSMEAN Number 1 2 3 Least Squares Means for effect CARB Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: VOL i/j 1 2 3 1 <.11 Pr > F <.7083333 Mean Square 126.05.3.0001 <. LSMEANS CARB/PDIFF ADJUST=TUKEY.1250000 8.76 Pr > F <.71 31.8295455 0.47 0.5000000 336.5416667 F Value 42. RUN.0558 0.6250000 Type I SS 252.0001 81 F Value 178.0001 <.0001 3 <. MODEL VOL = CARB|PRES|SPEED.50000000 7.0416667 0. THE GENERAL FACTORIAL DESIGN QUIT. As an example we will run the Tukey-Kramer procedure on factor A.2916667 1.0001 <.06 3.0416667 0.4.12 0. QUIT.41 1. --------------------------------------------------------------Adjustment for Multiple Comparisons: Tukey CARB 1 2 3 VOL LSMEAN -0.3750000 45.0001 2 <.0001 0.0001 Thus. are signiﬁcantly diﬀerent at MEER = 0. One may perform multiple comparisons at the highest level of the factors.2486 0.0833333 Mean Square 29.0416667 1. none of the interactions is signiﬁcant.6715 0. percent carbonation.0001 0.6250000 22. PROC GLM of SAS is modiﬁed as follows: PROC GLM.2500000 22. However.4869 As we can see. ------------------------------------------------------------Source Model Error Corrected Total Source CARB PRES CARB*PRES SPEED CARB*SPEED PRES*SPEED CARB*PRES*SPEED DF 11 12 23 DF 2 1 2 1 2 1 2 Sum of Squares 328.50000000 2.

82 CHAPTER 3. FACTORIAL DESIGNS .

This involves a 2 × 2 × · · · × 2 = 2k factorial experiment. We shall consider special cases where k = 2. 1. to identify important factors. This is known as a 2k factorial design. 3 before looking at the general 2k factorial design. 4. 2) is known as a 3k factorial design. The experiment is replicated three times.2. A similar situation where each of the k factors has three levels (0. 4.1 The 22 Design Let A and B denote the two factors of interest with two levels (”low (−)” and ”high (+)”) each. As an example consider an experiment where the time a chemical reaction takes is investigated. The symbol (1) is used to represent the treatment where every factor is at a low level. The absence of letter indicates the factor is at a low level. Every interaction term has only 1 degree of freedom.2 The 2k Factorial Design This is particularly useful in the early stages of an experiment as a factor screening mechanism. The two factors of interest are reactant concentration (A at 15% (−) and 25% (+)) and catalyst (B with absence (−) and presence (+)).Chapter 4 2k and 3k Factorial Designs 4. denoted by + and −. Factor A B − − + − − + + + Replicate 1 2 3 28 25 27 36 32 32 18 19 23 31 30 29 83 Treatment (1) a b ab Total 80 100 60 90 .e. i. There are 22 = 4 treatments designated as: Treatment (1) a b ab A − + − + B − − + + The presence of a letter indicates that the factor is at a high level.1 Introduction Often we consider general factorial designs with k factors each with 2 levels.

5. The sum of squares for each factor and the interaction may be obtained in a very simple manner..34 SST = 323.5. .5.33 M SE = 3. −. .33 SSE = 31.33.5.34 The ANOVA table is Source A B AB Error Total df 1 1 1 4(n − 1) = 8 4n − 1 = 11 SS SSA = 208. −. For example.5).33 SSB = 75.84 CHAPTER 4. A similar notation is use for ¯ all the other means. ..5. SSA = n × (Main eﬀect of A)2 SSB = n × (Main eﬀect of B)2 SSAB = n × (Interaction eﬀect of AB)2 The total sum of squares is deﬁned as usual 2 2 n SST = (yijk − y. SSB = 75.5. )2 ¯ i=1 j=1 k=1 and the error sum of squares is obtained by subtraction as SSE = SST − SSA − SSB − SSAB .00 y ¯ y ¯ 2 2 Now the contrast coeﬃcients are (−.5). 2K AND 3K FACTORIAL DESIGNS Let y (A+ ) denote the mean of the response where factor A is at high level. −. this may be given as a contrast with coeﬃcients (−.00.15 FB = M SB = 19.33. .13 M SE . the contrast coeﬃcients become (.5). For the example above these yield SSA = 208.00 M SAB = 8. We can now deﬁne the main eﬀects of a factor.5. The AB interaction eﬀect is the average diﬀerence between the eﬀect of A at the high level of B and the eﬀect of A at the low level of B 1 1 {¯(A+ B+ ) − y (A− B+ )} − {¯(A+ B− ) − y (A− B− )} = 1.67 y ¯ y ¯ 2 2 Using the treatment means. SST = 323.00 SSAB = 8. Similarly.13 M SE FAB = M SAB = 2. SSE = 31. −.33 y ¯ y ¯ 2 2 Using the treatment means. The main eﬀect of A is y (A+ ) − y (A− ) ¯ ¯ This is equivalent to 1 1 {¯(A+ B+ ) + y (A+ B− )} − {¯(A− B+ ) + y (A− B− )} = 8.5. .33 M SB = 75.00.92 F -statistic M SA FA = M SE = 53. the main eﬀect of B is given by y (B+ ) − y (B− ) ¯ ¯ which is equivalent to 1 1 {¯(A+ B+ ) + y (A− B+ )} − {¯(A+ B− ) + y (A− B− )} = −5. SSAB = 8.5. y (A− B+ ) is the mean of the response in the case where factor A is at low ¯ level and factor B is at high level. Let n be the number of replicates in the study.00 MS M SA = 208.

0000000 8.1828 85 27 32 23 29 One may use a multiple regression model to estimate the eﬀects in a 22 design. The estimated model coeﬃcient are now ˆ β0 = y.3333333 75. MODEL Y = A|B.0000000 8.. ----------------------------------------------Source A B A*B DF 1 1 1 Type I SS 208. MODEL Y = X1 X2 X1X2. X1X2 = 2*X1*X2. ¯ ˆ β1 = Main eﬀect of A ˆ β2 = Main eﬀect of B ˆ β12 = AB Interaction eﬀect Using SAS OPTIONS LS=80 PS=66 NODATE. -1 -1 28 -1 -1 25 -1 -1 1 -1 36 1 -1 32 1 -1 -1 1 18 -1 1 19 -1 1 1 1 31 1 1 30 1 1 .19 19.15 2. PROC GLM. RUN. CARDS. X2 = B/2.3333333 75. This is done by forming two variables (x1 .5 = −.5 = . -1 -1 28 -1 -1 25 -1 -1 1 -1 36 1 -1 32 1 -1 -1 1 18 -1 1 19 -1 1 1 1 31 1 1 30 1 1 .3333333 F Value 53. DATA BOTTLE. QUIT.0024 0. QUIT. RUN.5 27 32 23 29 .13 Pr > F <. THE 2K FACTORIAL DESIGN Using SAS OPTIONS LS=80 PS=66 NODATE.0001 0. X1 = A/2. PROC GLM. INPUT A B Y @@. INPUT A B Y @@. DATA BOTTLE. x2 ) as A− A+ B− B+ and ﬁtting the regression model y = β0 + β1 x1 + β2 x2 + β12 (2x1 x2 ) + .4.2. CARDS. ---------------------------------------- x1 x1 x2 x2 = −.. CLASS A B.3333333 Mean Square 208.5 = .

1828 Pr > F <.0024 0.3333333 75.0000000 Root MSE 1.3333333 Mean Square 208.14260910 1.0001 0. Treatment (1) a b ab c ac bc abc I 1/4 1/4 1/4 1/4 1/4 1/4 1/4 1/4 A -1/4 1/4 -1/4 1/4 -1/4 1/4 -1/4 1/4 B -1/4 -1/4 1/4 1/4 -1/4 -1/4 1/4 1/4 Factorial Eﬀects AB C AC 1/4 -1/4 1/4 -1/4 -1/4 -1/4 -1/4 -1/4 1/4 1/4 -1/4 -1/4 1/4 1/4 -1/4 -1/4 1/4 1/4 -1/4 1/4 -1/4 1/4 1/4 1/4 BC 1/4 1/4 -1/4 -1/4 -1/4 -1/4 1/4 1/4 ABC -1/4 1/4 1/4 -1/4 1/4 -1/4 -1/4 1/4 For example.13 Pr > |t| <. the data is .14260910 1.38 1.902993 Source X1 X2 X1X2 Parameter Intercept X1 X2 X1X2 DF 3 8 11 Sum of Squares 291.3333333 323.0024 0.00000000 1.15 2.33333333 -5.0001 0. This setup uses a total of 23 = 8 experiments that are represented as Treatment (1) a b c A − + − − B − − + c C − − − + Treatment c ac bc abc A − + − + B − − + + C + + + + The following table gives the contrast coeﬃcients for calculating the eﬀects.6666667 31.50000000 8.2 The 23 Design In this subsection we consider 3-factor factorial experiments each with two levels.3333333 t Value 48.57130455 1.196571 DF 1 1 1 Estimate Type I SS 208. the main eﬀect of A is 1 [−¯(A− B− C− ) + y (A+ B− C− ) − y (A− B+ C− ) + y (A+ B+ C− ) y ¯ ¯ ¯ 4 −¯(A− B− C+ ) + y (A+ B− C+ ) − y (A− B+ C+ ) + y (A+ B+ C+ )] y ¯ ¯ ¯ The sum of squares are SSeﬀect = 2n(eﬀect)2 Consider the bottling experiment in Chapter 3.2.82 Pr > F 0.19 19.0000000 8.50000 F Value 53.14260910 27.9166667 F Value 24.0000000 8.0002 Coeff Var 7.66666667 4.3333333 75.14 7.2222222 3. 2K AND 3K FACTORIAL DESIGNS Dependent Variable: Y Source Model Error Corrected Total R-Square 0.86 CHAPTER 4.0001 <.29 -4.1828 Mean Square 97.46 Standard Error 0. With only the ﬁrst two levels of factor A.979057 Y Mean 27.

--------------------------------------------------Source Model Error Corrected Total R-Square 0. MODEL VOL = A|B|C.60 19.00000000 78.0005 0.00000000 1.60 32. one may consider the following coding: A− A+ B− B+ C− C+ x1 x1 x2 x2 x3 x3 = −. PROC GLM.935897 Source A B A*B C A*C B*C A*B*C DF 7 8 15 Sum of Squares 73.5 = .25000000 12.00000000 To employ multiple regression.0022 0.25000000 12.40 1.00000000 1.0943 0. -1 c: -1.00000000 20.25000000 0.4.5447 0.60 0.25000000 2.60 Pr > F <.5 = .05694 DF 1 1 1 1 1 1 1 Type I SS 36.40 3.5 = −.62500000 F Value 16.25000000 1.25000000 1.60 1. 1 ab: 2. 0 bc: 1.00000000 Root MSE 0.2.00000000 5. RUN. CARDS. CLASS A B C. 3 abc: 6.25000000 2.5 .5 = . 1 a: 0.2415 Mean Square 10.25000000 0.69 Pr > F 0. 0 b: -1. INPUT A B C VOL. 5 87 Carbonation (A) 10 12 The SAS analysis is given below: OPTIONS LS=80 PS=66 NODATE.790569 VOL Mean 1.00000000 Mean Square 36.0001 0. QUIT.00000000 20. 1 ac: 2.0003 Coeff Var 79. THE 2K FACTORIAL DESIGN Pressure (B) 25 psi 30 psi Line Speed (C) Line Speed (C) 200 250 200 250 (1): -3.000000 F Value 57. -1 -1 -1 -3 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 0 -1 1 -1 -1 -1 1 -1 0 -1 1 1 1 -1 1 1 1 1 -1 -1 0 1 -1 -1 1 1 -1 1 2 1 -1 1 1 1 1 -1 2 1 1 -1 3 1 1 1 6 1 1 1 5 .42857143 0.5 = −. DATA BOTTLE.2415 0.

5447 0.63 1.26 1. CARDS.39528 0. INPUT A B C VOL.50000 Standard Error 0.79057 1.25000 0. β12 is the AB interaction eﬀect. Considering the bottling experiment. PROC REG.00000 3.39528 0.59 5.25000 1.0005 0. X1X3 = 2*X1*X3.00000 2. RUN.39528 0.39528 0.05694 Mean Square 10.0022 0.62500 F Value 16.43 1.69 4.88 The regression model is CHAPTER 4. X3 = C/2.00000 78.39528 t Value 5. X1 = A/2. X1X2 = 2*X1*X2.0001 0.00000 5. the estimate of the regression coeﬃcients correspond to the eﬀects of the factors. For ˆ example. the following SAS code is an implementation the regression approach.19764 0.75000 0. --------------------------------------------------Dependent Variable: VOL Analysis of Variance Source Model Error Corrected Total DF 7 8 15 Sum of Squares 73.90 0.50000 0.0003 Root MSE Dependent Mean Coeff Var R-Square Adj R-Sq 0. 2K AND 3K FACTORIAL DESIGNS y = β0 + β1 x1 + β2 x2 + β3 x3 + β12 (2x1 x2 ) + β13 (2x1 x3 ) +β23 (2x2 x3 ) + β123 (4x1 x2 x3 ) + Once again.9359 0. QUIT.00000 79.8798 Parameter Estimates Variable Intercept X1 X2 X3 X1X2 X1X3 X2X3 X1X2X3 DF 1 1 1 1 1 1 1 1 Parameter Estimate 1.75000 0.2415 .06 7.39528 0. OPTIONS LS=80 PS=66 NODATE. X1X2X3 = 4*X1*X2*X3. DATA BOTTLE.39528 0. X2X3 = 2*X2*X3.2415 0.0943 0. -1 -1 -1 -3 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 0 -1 1 -1 -1 -1 1 -1 0 -1 1 1 1 -1 1 1 1 1 -1 -1 0 1 -1 -1 1 1 -1 1 2 1 -1 1 1 1 1 -1 2 1 1 -1 3 1 1 1 6 1 1 1 5 .0010 <.42857 0.00000 0. MODEL VOL = X1 X2 X3 X1X2 X1X3 X2X3 X1X2X3. X2 = B/2.69 Pr > F 0.26 Pr > |t| 0.

89 4. . SSFk−1 Fk . in the 22 design (k = 2). . the coeﬃcients are ±21−2 = ±2−1 = ±1/2. Fk k two-factor interactions 2 F 1 F2 F 1 F3 . We then ﬁt the regression equation y = β0 + β1 x1 + · · · + βk xk + β12 (2x1 x2 ) + · · · + βk−1. k = 1 1 2 3 k k-factor interaction.2.4. k = 1 k-factor interaction k F 1 F2 · · · Fk Error Total df 1 1 . . . . i. . . (1). k two-factor interaction eﬀects. The sums of squares are now 2 SSeﬀect = n × 2k−2 × (eﬀect) . . · · · . There are k = k main eﬀects. Thus the sum of the degrees of freedom due to the factors (main and interaction) is k i=1 k i = 2k − 1 and the total degrees of freedom are 2k n − 1. We will use the regression approach to estimate the eﬀects.e. · · · . . . . . Thus we get the error degrees of freedom to be (2k n − 1) − (2k − 1) = 2k (n − 1) . For example. . . xk each taking values ±1/2 depending on whether the corresponding factor is at high or low level. the coeﬃcients become ±1/4 as expected. . Similarly. For instance. . f1 . . . . k three-factor interaction eﬀects. Each main eﬀect as well as interaction eﬀect has one degree of freedom. . .3 The General 2k Design Consider k factors F1 . SSFk SSF1 F2 SSF1 F3 . . in the 23 design. in the bottling experiment. f1 . Fk−1 Fk . THE 2K FACTORIAL DESIGN One may obtain sums of squares using SS = 2n(eﬀect)2 . We will use contrast coeﬃcients ±21−k to linearly combine the 2k cell means. fk . The partial ANOVA table for the 2k design is Source k main eﬀects F1 F1 . 1 2k (n − 1) 2k n − 1 SS SSF1 SSF1 . . . . SSF1 F2 ···Fk SSE SST We denote the 2k treatments using the standard notation. As always contrasts are of interest since they represent factor eﬀects. · · · . We will now create k variables x1 . .2. 1 1 1 . Suppose the experiment is replicated n times. Fk each with 2 levels. 1 . . SSA = 2 × 2 × 32 = 36 .k (2xk−1 xk ) + .

1/1 The following table gives the results: A -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 -1 1 B -1 -1 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 1 1 C -1 -1 -1 -1 1 1 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 -1 -1 1 1 1 1 D -1 -1 -1 -1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 -1 -1 -1 -1 -1 -1 -1 -1 E -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 Yield 246 303 276 336 258 344 265 313 249 310 318 363 212 249 283 219 379 326 344 349 389 359 283 363 . which in turn means that hypotheses cannot be tested. The multiplier is obtained as 2(number of factors in interaction−1) . 2hr • C = ﬁlter pressure : 1atm. He knew that he could run one trial per day in each vat. He selected the following factors: • A = reaction temperature : 300◦ C. The following example. Example A research chemist wanted to study the eﬀect of a number of factors on the yield of a new high-impact plastic. and dried. it is often impossible to ﬁnd enough subjects to replicate the design. i. In such a situation it is impossible to estimate the error. 2atm • D = drying temperature : 200◦ C. The chemist worked in a laboratory that contained 32 experimental vats and a ﬁlter and dryer with each vat. He decided to study ﬁve factors using a single replication of a 25 design. and the plastic settles to the bottom of the vat. ˆ Now the estimates. 150◦ C • B = reaction time : 4hr. taken from Petersen : Design and Analysis of Experiments. An approach is to ignore some high-order interactions.4 The Unreplicated 2k Design Since the number of factor level combinations in a 2k design may be large. illustrates these approaches.e. 4.2. 100◦ C • E = resin/extender ratio : 2/1. The process takes place in a heated reaction vat. The plastic is produced by mixing resin with an extender in a solvent. The materials are allowed to react for a period of time.90 CHAPTER 4. 2K AND 3K FACTORIAL DESIGNS +β12···k (2k−1 x1 x2 · · · xk ) + . Another approach is to use a QQ-plot to identify the signiﬁcant factors. are the eﬀects of the corresponding factor. β’s. ﬁltered on a screen. assume that interactions with three or higher factors do not exist.

without the line which contains the names of the variables. CD=C*D. MODEL YIELD = A B C D E AB AC AD AE BC BD BE CD CE DE ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE ABCD ABCE ABDE ACDE BCDE ABCDE. ACE=A*C*E. BC=B*C. THE 2K FACTORIAL DESIGN -1 1 -1 1 -1 1 -1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 313 336 370 336 322 352 367 374 91 We will ﬁt the following regression model to estimate the eﬀects: y = β0 + β1 x1 + · · · + β12345 (24 x1 x2 x3 x4 x5 ) + We save the above data as it is in a ﬁle called chem. QUIT. ABCDE=A*B*C*D*E. ABE=A*B*E. BCDE = B*C*D*E. INPUT A B C D E YIELD. CDE=C*D*E.86694 . AB=A*B. BD=B*D. QUIT. ----------------------------------------------------The SAS System The REG Procedure Model: MODEL1 Dependent Variable: YIELD Analysis of Variance Source Model Error DF 31 0 Sum of Squares 73807 0 Mean Square 2380. ABCE=A*B*C*E. SET CHEM. ABD=A*B*D. DATA CHEM2.DAT".4. PROC REG. BCD=B*C*D. RUN. ABCD = A*B*C*D. BCE=B*C*E. F Value . ABC=A*B*C. ABDE=A*B*D*E. DATA CHEM. AE=A*E. Pr > F . The following code gives the analysis: OPTIONS LS=80 PS=66 NODATE. DE=D*E. AD=A*D. and we call the ﬁle using the SAS command inﬁle. INFILE "C:\ASH\S7010\SAS\CHEM. ACDE=A*C*D*E. 61 . RUN. BDE=B*D*E.dat. CE=C*E. ACD=A*C*D. ADE=A*D*E. QUIT.2. AC=A*C. RUN. BE=B*E.

. The estimates of the eﬀects are now the parameter estimate multiplied by 2.18750 3. . . . . . . . . .25000 8. . . . .62500 -1. .75000 3. .0000 . .31250 7. . .62500 -5. . . .00000 0. .75000 11. . . . . . . Pr > |t| .62500 -9. .5 = 15. . . . . .00000 31.25000 -1. . . . . . . . . . . . . .81250 3.56250 11.12500 3. .92 Corrected Total 31 73807 . .12500 2. . . . . . .25000 9.31250 2. t Value . . . . For instance.81250 -2. .93750 0. . . . . 315. . .18750 6.62500 -6. . . .50000 -6. . . . . .81250 Standard Error . . . . . . .00000 -7.31250 -6. . . Parameter Estimates Variable Intercept A B C D E AB AC AD AE BC BD BE CD CE DE ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE ABCD ABCE ABDE ACDE BCDE ABCDE DF 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Parameter Estimate 315. . .93750 -4.81250 . the ABE eﬀect is 2 ∗ 7. .31250 -5. . R-Square Adj R-Sq CHAPTER 4. .93750 6. 2K AND 3K FACTORIAL DESIGNS Root MSE Dependent Mean Coeff Var 1.18750 6. . . . . . . .43750 -7.25000 -10.81250 11.

12500 420.50000 2850.12500 1922.00000 28.12500 128.00000 1682. RUN.1300 0.58 1.5484 Mean Square 3727.00000 28.12500 684. QUIT.12500 1922.34375 F Value 3.1300 0.72 28.0001 0.3252 0.50 2.5484 Pr > F 0.2378 0.12500 1152.50000 F Value 3.00000 32385.50 2.2. MODEL YIELD = A B C D E AB AC AD AE BC BD BE CD CE DE ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE ABCD ABCE ABDE ACDE BCDE ABCDE.66 1.8760 0. DROP YIELD INTERCEPT _RMSE_.37500 17893.0767 0.4101 <. RUN.12500 128.96 0.61 2.76 0.50000 73806.12500 1152.4.12500 420.4101 <. DATA EFFECTS.76 0.66 1.72 3.7395 0.00000 3081.0001 0.38 Pr > F 0.3015 0. QUIT.12500 1922.3015 0.50000 1275.50000 1275.03 1.12500 1404.12500 1404.14 0.12500 420.2790 0.12500 1404.00000 4095.58 1.55 1. THE 2K FACTORIAL DESIGN 93 We will now analyze the data ignoring 3 and higher factor interactions.00000 32385.33 Pr > F 0.72 3.12500 800.26 1.50000 Mean Square 4005.12500 800.03 0.26 1.12500 128.03 1.50000 Type III SS 4005.12500 1152.11 0.50000 2850.55833 1118.72 28.12500 420. CLASS A B C D E AB AC AD AE BC BD BE CD CE DE.12500 684.3252 0.0737 0.2790 0.00000 4095.7395 0.12500 128.12500 1404. SET EFFECTS.50000 1275.61 2.50000 Mean Square 4005.00000 1682.00000 3081.12500 800.00000 4095.00000 1682.38 F Value 3.00000 4095.12500 1152.00000 3081.00000 32385. .14 0. The following partial SAS code follows the one given above.12500 1922.50000 1275.1164 0.2084 0.12500 684.87500 Type I SS 4005.2084 0.00000 32385.1164 0.0767 0.12500 800.8760 0. The following SAS code that provides the QQ-plot continues the above: PROC REG OUTEST=EFFECTS.00000 28.4454 0.96 0.4454 0.00000 3081.50000 2850. MODEL YIELD = A B C D E AB AC AD AE BC BD BE CD CE DE. ---------------------------------------------------------Dependent Variable: YIELD Source Model Error Corrected Total Source A B C D E AB AC AD AE BC BD BE CD CE DE Source A B C D E AB AC AD AE BC BD BE CD CE DE DF 15 16 31 DF 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 DF 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Sum of Squares 55913.12500 684.0111 The only signiﬁcant factor appears to be resin/extender ratio.03 0.00000 28.11 0.00000 1682.0737 0.55 1. Let us now plot the QQ-plot of the eﬀects in the full model in an eﬀort to identify the important factors.50000 2850.2378 0. PROC GLM. RUN.

VAR EFFECT. QUIT. PROC RANK DATA=EFFECTS NORMAL=BLOM. RUN. PROC TRANSPOSE DATA=EFFECTS OUT=EFFECTS. RUN. CHAPTER 4. GOPTION COLORS=(NONE). RANKS RANKEFF. RUN. DATA EFFECTS. RUN. BY EFFECT. DROP COL1. PLOT RANKEFF*EFFECT=_NAME_. QUIT. PROC GPLOT. 2K AND 3K FACTORIAL DESIGNS .94 QUIT. SET EFFECTS. QUIT. EFFECT = COL1*2. QUIT. RUN. PROC SORT DATA=EFFECTS. SYMBOL V=CIRCLE. QUIT.

4. THE 2K FACTORIAL DESIGN 95 .2.

BD=B*D. AC=A*C. CARDS. Example A chemical product is produced in a pressure vessel. QUIT. BCD=B*C*D. AD=A*D. -1 -1 -1 -1 -1 -1 -1 1 -1 -1 1 -1 -1 -1 1 1 -1 1 -1 -1 -1 1 -1 1 -1 1 1 -1 -1 1 1 1 1 -1 -1 -1 1 -1 -1 1 1 -1 1 -1 1 -1 1 1 1 1 -1 -1 1 1 -1 1 1 1 1 -1 1 1 1 1 . AB=A*B. The data are given below: A0 B0 D0 D1 C0 45 43 C1 68 75 C0 48 45 B1 C1 80 70 B0 C0 71 100 C1 60 86 A1 B1 C0 C1 65 65 104 96 This is an unreplicated 24 design.96 CHAPTER 4. DATA EFFECTS. SET EFFECTS. The following example is taken from Montgomery : Design and Analysis of Experiments. OPTIONS LS=80 PS=66 DATA FILTER. We start by drawing the QQ plot of the eﬀects to identify the potentially signiﬁcant factors. RUN. RUN. INPUT A B C D Y. . 45 43 68 75 48 45 80 70 71 100 60 86 65 104 65 96 DATA FILTER2. RUN. MODEL Y = A B C D AB AC AD BC BD CD ABC ABD ACD BCD ABCD. RUN. A factorial experiment was conducted in the pilot plant to study the factors thought to inﬂuence the ﬁltration rate of this product. ACD=A*C*D. DROP Y INTERCEPT _RMSE_. and stirring rate (D). reactant concentration (C). RUN. QUIT. QUIT. ABCD = A*B*C*D. QUIT. SET EFFECTS. DATA EFFECTS. The four factors are temperature (A). NODATE. pressure (B). PROC TRANSPOSE DATA=EFFECTS OUT=EFFECTS. CD=C*D. QUIT. PROC REG OUTEST=EFFECTS. 2K AND 3K FACTORIAL DESIGNS Once again the only variable that appears to be signiﬁcant is resin/extender ratio. SET FILTER. ABC=A*B*C. BC=B*C. ABD=A*B*D.

PROC GPLOT. QUIT. VAR EFFECT. QUIT. SYMBOL V=CIRCLE. DROP COL1. RUN.2. 97 . PROC RANK DATA=EFFECTS NORMAL=BLOM. BY EFFECT. GOPTION COLORS=(NONE). RANKS RANKEFF. QUIT. RUN. PROC SORT DATA=EFFECTS.4. RUN. RUN. PLOT RANKEFF*EFFECT=_NAME_. QUIT. THE 2K FACTORIAL DESIGN EFFECT = COL1*2.

2K AND 3K FACTORIAL DESIGNS .98 CHAPTER 4.

.562500 5.6475 0. Computations of eﬀects and sums of squares are direct extensions of the 2k case. 11.562500 1105.0031 <..062500 1314.562500 Mean Square 1870. 1.47 Pr > F <. The treatment 01. 22.062500 1314. 20.13 49.062500 855. for a 32 design the treatments are 00. CLASS A C D. PROC GLM DATA=FILTER. 21. D and 2 replications. ab.3. 12.062500 10. -----------------------------------------------------------Dependent Variable: Y Source Model Error Corrected Total Source A C A*C D A*D C*D A*C*D DF 7 8 15 DF 1 1 1 1 1 1 1 Sum of Squares 5551.38 58.562500 1105.500000 5730. .0001 0. We now have 3k treatments which we denote by k digit combinations of 0.27 0. QUIT. and 2 (high). 1 (intermediate).0001 0. Thus. 01.937500 Type I SS 1870. 02. a.0003 0. and 2 instead of the standard notation (1).23 0.57 38. . RUN. ignoring factor B and any interactions involving factor B we run the analysis. The ANOVA table for a 3k design with n replications is .562500 390. 10. AC. For example.062500 22. THE 3K DESIGN 99 The QQ-plot identiﬁes A.562500 390.5120 Mean Square 793. for instance. C. b. MODEL Y = A|C|D.562500 F Value 83. The following partial SAS code performs the analysis.062500 855. AD as signiﬁcant.062500 10. is the combination of the low level of factor A and the intermediate level of factor B.437500 179.0001 0. D.0001 4.37 17.3 The 3k Design We now consider the analysis of a k-factor factorial design where each factor has 3 levels: 0 (low). C.562500 5.437500 F Value 35. This means the resulting analysis is a 23 design with factors A.35 Pr > F <.4.

8 . 2k 3k (n − 1) 3k n − 1 SS SSF1 SSF1 . . . SSFk−1 Fk SSF1 F2 F3 SSF1 F2 F4 . Fk−2 Fk−1 Fk . . . . . . . . SSFk−2 Fk−1 Fk . . . 2 4 4 . . . . . SSF1 F2 ···Fk SSE SST .100 Source k main eﬀects F1 F1 . . Fk−1 Fk k three-factor interactions 3 F 1 F2 F3 F 1 F2 F4 . k = 1 k-factor interaction k F 1 F2 · · · Fk Error Total CHAPTER 4. Fk k two-factor interactions 2 F 1 F2 F 1 F3 . . . . . . . SSFk SSF1 F2 SSF1 F3 . . 2K AND 3K FACTORIAL DESIGNS df 2 2 . . 4 8 8 .

This situation is sometimes referred to as the one-way repeated measurement design. 5. · · · . yaj ) is 101 . · · · .Chapter 5 Repeated Measurement Designs 5. σ 2 ) random variables. · · · . one obtains and Thus.1 The Mixed RCBD Consider a single factor experiment with a levels of the factor. 2 • β1 . 2 σβ 2 σ 2 + σβ for i = i . All variances of observations are equal within a block and all covariances within a block are equal. • ij are ab iid N (0. Such observations are rarely independent as they are measured on the same subject. Under these conditions. The statistical model for the mixed randomized complete block design is yij = µ + τi + βj + where • a i=1 τi ij . ρ= 2 V ar(yij ) = σ 2 + σβ 2 Cov(yij . We assume that a random sample of b blocks (subjects) is available from a large population of blocks (subjects). βb is a random sample from a N (0. Let yij be the observation on level i of A for the jth subject. Suppose we have a blocking variable B that is random.1. yi j ) = σβ . and • all the b + ab random variables (blocks and errors) are independent. · · · .1 Introduction Sometimes we take observations repeatedly on the same experimental subjects under several treatments. σβ ) distribution. say A. i = 1. is the common correlation of measurements made on the same subject for any pair of distinct treatment A levels. b = 0. These designs are extensions of the randomized complete block design where blocks are random. Each of the a levels of factor A is observed with each subject. i and i . The variance-covariance matrix of observations on each subject yj = (y1j . a j = 1.

| y ¯ 2M SAB b > qa. . 2 ρσy 2 ρσy 2 σy . structure called the Huynh-Feldt sphericity (S) structure. 2 σy 2 ρσy . . 2 ρσy ··· ··· ··· 2 ρσy 2 ρσy . we have 1. . 2 σy α-level Tests Recalling the situation in the mixed two factor analysis. RCBD’s that satisfy the (S) condition are known as one-way repeated measurement (RM) designs. An α-level simultaneous Tukey-Kramer test of H0 : τi = τi for 1 ≤ i < i ≤ a is |¯i.(a−1)(b−1) (α) M SB > Fb−1. . where level 1 of a is control. | y ¯ 2M SAB b > t(a−1)(b−1) (α/2) 3.(a−1)(b−1) (α) M SAB It can be shown that these tests remain valid for a more general variance-covariance. . An α-level simultaneous Dunnett test of H0 : τi = τ1 for 2 ≤ i ≤ a. is |¯i. . An α-level test of H0 : τi = τi is |¯i. − yi . − yi . . . | y ¯ 2M SAB b 2 5. σaa Under the assumptions of the mixed RCBD Σ satisﬁes compound symmetry. . An α-level test of H0 : σβ = 0 is > da−1. σa2 ··· ··· ··· σ1a σ2a . An α-level test of H0 : τ1 = · · · = τa = 0 is M SA > Fa−1. . . .102 Σ= CHAPTER 5. REPEATED MEASUREMENT DESIGNS σ11 σ21 . Compound symmetry (CS) is the case where all variances are equal and all covariances are equal: Σ= 2 2 where σy = σ 2 + σβ . Σ. .(a−1)(b−1) (α) √ 2 4.(a−1)(b−1) (α) M SAB 2. It is recommended that all mixed RCBD’s be analyzed as one-way RM designs since we can test for the (S) condition in a similar manner as Levene’s test. − yi . σa1 σ12 σ22 .

The following example illustrates a typical RM design. This is the order eﬀect. It is assumed that D1 is administered and then enzyme X measured (y11 ). y31 for subject 1 cannot be obtained simultaneously. · · · . D1 . D2 is administered and enzyme X measured. or a diﬀerent dosages of the same drug. The a levels of A may be a diﬀerent drugs. If all the six subjects are treated with D1 followed by D2 followed by D3 . let t1 < t2 < · · · < ta be a times and let τi be the eﬀect of the drug D at time ti . it is important to control the possible order eﬀects. The major diﬀerence between mixed RCBD’s and RM designs is that in RM designs the levels of factor A cannot be observed simultaneously. this may be done as follows. or measurements of the same drug at the same dosage level over a period of a times. and D3 are to be compared with respect to suppression of enzyme X. . The following table is one possible randomization: subject 1 2 3 4 5 6 D1 D2 D1 D3 D3 D2 order D2 D3 D2 D1 D1 D3 D3 D1 D3 D2 D2 D1 Note that each drug is observed ﬁrst by two subjects. (D3 . and D3 from the fact that the drugs were given in the order (D1 . Subject 1 takes the drugs in the order D1 . In certain RM designs the order eﬀect is impossible to eliminate. Example Suppose three drugs D1 . D2 . Usually. In the above example. y21 . Assume each of n = 6 subjects is to be observed with each of the three drugs. D2 ) and randomly assign two subjects to each order. D3 . D3 . D2 . then how can one distinguish observed diﬀerences between D1 . D. is given to all subjects at the same dosage level. Assume the drug. the subjects are assumed to be randomly selected from a population of subjects. Hence. (D2 . etc. ta . D3 )? Are these diﬀerences due to true diﬀerences between the drugs or the order in which they are observed? In RM designs. say t1 . D2 . In all these cases the drug (dosage) mean responses are to be compared for these a levels. D2 . ONE-WAY REPEATED MEASUREMENT DESIGNS 103 5. the subject factor (B) will be considered random in RM designs. Consider three orders: (D1 . and third by two subjects. D2 . D3 ).2 One-Way Repeated Measurement Designs The major diﬀerence between the mixed RCBD and the one-way RM design is in the conceptualization of a ’block’. That is.5. In many cases the ’block’ is a human or an animal and all the a levels of A are observed on the same subject. which is produced and secreted by the liver. a test of H0 : τ 1 = · · · = τ a = 0 is needed. For example. After the eﬀect of D1 is worn-oﬀ.2. The following is an example for a = 4. This raises another issue to be considered and controlled in RM designs. D1 ). Note that y11 . second by two subjects.

is free of ¯ δ ˆ ˆ Cov(φc . a ˆ and φd = a i=1 di yi. All pairwise comparisons of treatments have the same variance V ar(¯i. yi j ) = 0. The more general (S) structure for Σ is now introduced. b ci = 0. observations taken on diﬀerent subjects are independent.2. . · · · . for j = j . measurements are made on each subject a = 4 times in the same order. The (S) structure implies the following (prove!): 1. It is given by a δ ˆ V ar(φ) = b ˆ 3. 5. − yi . is free of γ1 . i=1 5. · · · . Cov(yij . This correlation structure violates the (CS) structure assumed under the mixed RCBD. . • the b subjects are a random sample from a population of subjects following a normal distribution with . i.e. i i=1 a ¯ i=1 ci yi. φd ) = b ci di .2 The One-Way RM Design : (S) Structure The statistical model for the one way RM design under the (S) structure is yij = µ + τi + βj + where • a i=1 τi ij = 0. The variance of any sample contrast φ = a ¯ i=1 ci yi.2. observations observed closer together in time are more highly correlated than observations further apart in time. γa . REPEATED MEASUREMENT DESIGNS description enzyme X baseline measured prior to administration of drug D enzyme X measured 1 day after administration of drug D enzyme X measured 3 days after administration of drug D enzyme X measured 7 days after administration of drug D 2 3 Day 3 4 Day 7 Hence.1 The Huynh-Feldt Sphericity (S) Structure The Huynh-Feldt sphericity structure is given by σii = 2γi + δ γi + γi if i = i if i = i Of course. γa . say. φc = γ1 . With this type of design. ) = y ¯ ˆ 2.104 i 1 ti t1 t2 t3 t4 time start of study Day 1 CHAPTER 5. The covariance between any two contrasts. 2δ . It is given by c2 .

The order eﬀect was partially removed by selecting one of the four drug orders below for each dog: . Actually. Example In a pre-clinical trial pilot study b = 6 dogs are randomly selected and each dog is given a standard dosage of each of 4 drugs D1 . | y ¯ 2M SAB b > t(a−1)(b−1) (α/2) . − yi . b > qa.5. a test of H0 : τi = τ1 at MEER=α is |¯i. and Feldt. an α-level test of H0 : τi = τi is |¯i. assuming the eﬀect of previously-administered drugs have worn oﬀ. D3 . S. Journal of the American Statistical Association.2. yj = (y1j .(a−1)(b−1) (α) √ . H. Four measures of the stabilization of the heart function are obtained for each dog for each drug type. − yi . ”Conditions under which the mean square ratios in the repeated measurements designs have exact F distributions”.(a−1)(b−1) (α) y ¯ . L. The variance-covariance matrix of the observations on the same subject. For the proof of this. · · · . 65.(a−1)(b−1) (α) . ) ± qa. a set of (1 − α)100% Tukey-Kramer conﬁdence intervals for all (¯i. an α-level test of H0 : τ1 = · · · = τa = 0 is M SA /M SAB > Fa−1. 2 and 4. please see Huynh. in the one-way RM design. for all the tests to hold a necessary and suﬃcient condition is that Σ satisﬁes the (S) condition. − y1 ) ± da−1. satisﬁes the (S) condition. b i. The following example is taken from Mike Stoline’s class notes.(a−1)(b−1) (α) y ¯ i. D2 . These measures are diﬀerences between rates measured immediately prior the injection of the drug and rates measured one hour after injection in all cases. ONE-WAY REPEATED MEASUREMENT DESIGNS 1. (1970). Under the one-way RM model where Σ satisﬁes the (S) condition we have 1. These drugs are compounds that are chemically quite similar and each is hypothesized to be eﬀective in the stabilization of the heart function. − yi . These results depend on the (S) condition. yaj ). a set of (1 − α)100% Dunnet conﬁdence intervals for all a − 1 comparisons τi − τ1 (treatments versus control) is 2M SAB (¯i. | y ¯ 2M SAB b a 2 pairwise comparisons τi − τi is M SAB . 3. 2. 1582-1589.e. − y1 | y ¯ 2M SAB b > da−1.(a−1)(b−1) (α) . E(yij ) = µ + τi 105 2. and D4 .e. a test of H0 : τi = τi at MEER=α is |¯i.

106 drug order 1 2 3 4 CHAPTER 5.47 Pr > F <.86666667 4.5 Assuming that the (S) condition is satisﬁed.6 5.5 6 3 5.9 6 2 5.32791667 30. D2 D2 .0001 Mean Square 3.8 4.3 3.1 5 1 3.8 4. D1 .3 5 2 5.46 11. D2 .3 5.2 D4 4.3 6 1 3. D4 . OPTIONS LS=80 PS=66 NODATE.5 Level D3 5. INPUT DOG DRUG Y @@. --------------------------------------------------------Source Model Error Corrected Total Source DRUG DOG DF 8 15 23 DF 3 5 Sum of Squares 28.1 5. MODEL Y=DRUG DOG. we may use the following SAS code to perform the ANOVA F -test as well as follow-up Tukey-Kramer analysis.55000000 LSMEAN Number 1 2 3 4 Least Squares Means for effect DRUG Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y .1 6.1 4 1 2.15519444 F Value 22.8 3 3 7.71 Pr > F <.52520833 0.8 4.9 2 2 5.2 .4 4 2 3. D3 .43486111 1.2 6. CARDS. D3 D1 . QUIT. CLASS DRUG DOG.4 3.38333333 5.0 5.77941667 F Value 41. D4 .2 5. 1 2 3 4 5 6 4 4 4 4 4 4 4.2 5. 1 1 2.6 1 3 5.6 3.9 5.0 3.2 5.0 3.6 1 2 4. D1 .5 PROC GLM.2 3 2 5. D3 . DATA RM1.9 Drug D2 4.1 5. D2 .20166667 2. D1 D4 .3 3 1 4.9 4.2 5 3 6.30458333 8. RUN.0001 0.8 4.52958333 Type III SS 19.9 4 3 5. D4 The data are (a large entry indicates high stabilization of heart-rate) Dog 1 2 3 4 5 6 D1 2.2 2 1 3.1 2 3 6.2 2.01666667 5.0001 Adjustment for Multiple Comparisons: Tukey DRUG 1 2 3 4 Y LSMEAN 3.0 5.8 3.3 7.89708333 Mean Square 6. REPEATED MEASUREMENT DESIGNS order of administration D3 . LSMEANS DRUG/PDIFF ADJUST=TUKEY.

e = 1?” The answer to this question is the Huynh-Feldt modiﬁcation of the Mauchly (1940) test for sphericity.e.3 3.0001 0.2 4. p-value < . The following SAS code produces Mauchly’s test as well as the G-G e-adjusted F -test.1 4.0001 0. DATA RM2.2134 0.5 .0 4.(a−1)(b−1)ˆ(α) . Hence. 2. we get the following: • Drug 3 is superior to all other drugs.3 5.0.2 5. e e M SAB (a − 1)[ The G-G e-adjusted test reduces to the usual F -test if e = 1.9 5. Example We will reconsider the dogs’ heart-rate example once more.3 5. INPUT D1 D2 D3 D4. • Drug 1 is inferior to the other drugs.0. .8 2. 5.3 One-way RM Design : General a2 (¯· − σ·· )2 σ ¯ 2 σij − 2a σj + a2 σ·· ] ¯2 ¯2 An estimate of the departure of Σ from the (S) structure is e= where • σ·· is the mean of all a2 entries of Σ. OPTIONS LS=80 PS=66 NODATE. • Drugs 2 and 4 are the same.8 3.0001). MODEL D1-D4 = /NOUNI.5 5. there is a signiﬁcant diﬀerence among the four drugs (F = 41.8 7. Whenever the (S) condition is not met we use the Greenhouse-Geisser (G-G) e-adjusted test of H0 : τ 1 = · · · = τ a = 0 given by M SA > F(a−1)ˆ.9 5.0095 0.2.5. The value of e satisﬁes 1/(a − 1) ≤ e ≤ 1.0001 <.6 4.2 4. The question now is ”How can it be determined whether Σ satisﬁes (S). we will always use the ˆ G-G e-adjusted test regardless of the result of Mauchly’s test.2 3.1 5.4 3. ONE-WAY REPEATED MEASUREMENT DESIGNS i/j 1 2 3 4 1 <.9 5.0002 4 0.6 5.2134 3 <.46.0006 2 <.0 3. The (S) condition is satisﬁed if and only if e = 1. PROC GLM.0001 0. CARDS. ¯ • σ· is the mean of the diagonal entries of Σ.0006 0.0095 0.0002 107 Thus.1 6.2. From the Tukey-Kramer procedure. i. ¯ ¯ • σj is the mean of row j entries of Σ.2 6. in this class.

0002 D14 6 1.1177 . INPUT D1 D2 D3 D4.85 0.8500000 0.4666667 0.2 6.6735).7576 1.32791667 Mean Square 6.0026 D23 6 0.5 . RUN. ---------------------------------------------------Sphericity Tests Variables Transformed Variates Orthogonal Components DF 5 5 Mauchly’s Criterion 0. D23 = D3-D2.0196 D24 6 -0. The G-G estimate of e is e = 0.1 4.2 4.5 5.3 3.0001 <.2472066 -1.30458333 2.15519444 F Value 41. The G-G e-adjusted test for H0 : τ1 = τ1 = τ3 = τ4 = 0 is rejected with p-value ˆ < .43486111 0.1 6.2392 0. D13 = D3-D1. This is done using PROC MEANS in SAS to get pairwise t-test statistics.0001 D13 6 2.0 3. VAR D12 D13 D14 D23 D24 D34. Follow-up t-tests may be performed without assuming equality of variances.1627138 0.2 3.0001 Adj Pr > F G .8 7.89 0.9 5.9 5.38 0.1173788 13.2108185 5.0001.92 <.4 3. D14 = D4-D1. REPEATED MEASUREMENT DESIGNS Chi-Square 6.1720748 Pr > ChiSq 0.6333333 0.1666667 0.53 0. PROC MEANS N MEAN STDERR T PRT. -------------------------------------------The MEANS Procedure Variable N Mean Std Error t Value Pr > |t| -------------------------------------------------------------------D12 6 1. QUIT.F <.46 Pr > F <. D12 = D2-D1.1 5.6 5.2 5.2513298 3. OPTIONS LS=80 PS=66 NODATE.2522124 9.108 REPEATED DRUG/PRINTE NOM.2 4.0001 Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon 0.4833333 0.3 5.7586721 3. CARDS. D34 = D4-D3.G H .3 5.9 5. 2.7576.0 4. DATA RM2. D24 = D4-D2. RUN.6735 The GLM Procedure Repeated Measures Analysis of Variance Univariate Tests of Hypotheses for Within Subject Effects Source DRUG Error(DRUG) Source DRUG Error(DRUG) DF 3 15 Type III SS 19.8 3.4225 Thus the test for H0 : e = 1 is not rejected using Mauchly’s criterion (p-value = 0.8 2.426476 CHAPTER 5. QUIT.6 4.

3. For instance.19 0. . Let A be the between-subject factor with a ﬁxed levels and B be the within-subject (repeated) factor with b levels. y21n2 . 1 y111 . ya1na y. . . ¯ y. . TWO-WAY REPEATED MEASUREMENT DESIGNS D34 6 -1. . . . . .2535306 -5. . ¯ 2 ··· y2. b k = 1. We will deﬁne the model for the general unbalanced cased but we will conﬁne our attention to the balanced case as it is the most common design.b.. yabna y.3166667 0. ya11 . ¯ a ··· ··· ya. Group Means y1. ¯ . . . Within treatment comparisons involve the same subjects and are analyzed similarly to the one-way RM design. . . A random sample of ni subjects are selected and assigned to the ith level of A. y2bn2 . Sometimes it is important to consider an unbalanced case. . .5. . ni where • τi is the ith group eﬀect ( i τi = 0) j • βj is the jth treatment eﬀect ( βj = 0) .. 5.3 Two-Way Repeated Measurement Designs In this section we will consider the two-way RM model with repeated measures on one factor. .. .. . Sn1 S1 . . . The layout for the classic two-way RM design looks like the following: Treatments (B) Groups (A) 1 subjects S1 .. The classic two-way RM model is i = 1. . . j = 1. . . yab1 . . . . S1 . Sn2 . . . . ¯ Sna Treatment Means This is a very widely used design. y11n1 y211 . y1bn1 y2b1 . . Between group comparisons involve distinct subjects and hence are similar to such comparisons in the single factor CRD model. The b levels of B are observed for each subject in each A group. . Suppose that two of the three clinics serve cities that are mediumsized while the third clinic is serving a large metropolitan population.1. ¯ ··· ··· b y1b1 . The recommended sample sizes are : n1 = n2 = n and n3 = 4n.0035 -------------------------------------------------------------------- 109 Once again the only drugs that are not signiﬁcantly diﬀerent are D2 and D4 . . . . . This is known as the classic two-way RM design. . suppose the groups are 3 clinics serving diﬀerent subjects in diﬀerent cities. a yijk = µ + τi + βj + (τ β)ij + πk(i) + (βπ)jk(i) + ijk . Suppose that the third clinic services a population that is four times as large as the other two. . .

and are mutually independent. then we have the following α-level tests: 1.. Cov((βπ)jk(i) . )2 y ¯ df1 and a b ni M S2 = i=1 j=1 k=1 (yijk − yi. a i=1 ni πk(i) random variables • (βπ)jk(i) is the random joint interaction eﬀect of subject k and treatment j nested within group i. k (i ) ) = 0 if k = k or i = i .. In this model we have V ar((τ β)ij ) = Let N = a i=1 a−1 2 στ β a and Cov((τ β)ij . df2 The ANOVA table for the classic two-way RM design is Source of variation Between Subjects A Subjects within groups Within Subjects B AB B× Subjects within groups df N −1 a−1 df1 N (b − 1) b−1 (a − 1)(b − 1) df2 M SB M SAB M S2 2 σ 2 + σβπ + N MS M SA M S1 E(MS) 2 σ 2 + bσπ + 2 + bσ 2 σ π b F Pa i=1 2 ni τi a−1 M SA /M S1 Pa i=1 σ2 + σ2 + 2 σβπ 2 σβπ Pa b−1 b P i=1 2 βi M SB /M S2 M SAB /M S2 + ni (τ β)2 ij (a−1)(b−1) j=1 Comparisons of means may be in order if the main eﬀects are signiﬁcant. )2 ¯ ¯ ¯ . Cov((βπ)jk(i) . a ni . If the interaction eﬀect is not signiﬁcant. The N = 2 are assumed to follow a N (0. b−1 σβπ ). b 2. (βπ)j 3. − yi . (τ β)i j ) = 2 −στ β .. (βπ)jk(i) follows a N (0. (βπ)jk(i) ..110 CHAPTER 5. A test of H0 : τi = τi is |¯i. (βπ)’s are assumed to satisfy 2 1. ijk • The πk(i) . and df2 = (N − a)(b − 1). σ 2 ) distribution. σπ ) distribution. df1 = N − a. + yi.k − yi.k − yij. let a ni M S1 = i=1 k=1 b(¯i. Note that the variables (βπ)jk(i) have a (CS) structure similar to the (CS) structure of (τ β)ij in the mixed two factor model: yijk = µ + τi + βj + (τ β)ij + ijk where τi is ﬁxed and βj is random. REPEATED MEASUREMENT DESIGNS • (τ β)ij is the interaction of the ith group and the jth treatment ( i (τ β)ij = j (τ β)ij = 0) • πk(i) is the random eﬀect of subject k nested within group i. is a random error term which is assumed to follow a N (0. | y ¯ M S1 b 1 ni + 1 ni > tdf1 (α/2) . Further. (βπ)j • ijk k(i) ) = 2 −σβπ b for j = j .

n1 = n2 = · · · = na = n.j. − yi j. − yij . If the AB interaction eﬀect is signiﬁcant. Standard Error of Estimator 2M S1 bn 2M S2 an 2M S2 n 2M S3 n df df1 df2 df2 df3 yij.3. ) = y ¯ 1 1 + ni ni 2 σ 2 + σπ + b−1 2 1 1 σβπ =: + b ni ni M and M cannot be unbiasedly estimated by either M S1 or M S2 . slightly more problematic since V ar(¯ij. N = na and the ANOVA table is Source of variation Between Subjects A (Groups) Subjects within groups Within Subjects B AB B× Subjects within groups df N −1 a−1 a(n − 1) N (b − 1) b−1 (a − 1)(b − 1) a(n − 1)(b − 1) MS M SA M S1 F FA = M SA /M S1 M SB M SAB M S2 FB = M SB /M S2 FAB = M SAB /M S2 Comparisons of means in the balanced case are summarized in the following table: Parameter τi − τi βj − βj µij − µij µij − µi j Estimator yi.. − yij . − yij ¯ ¯ . ¯ Since 2M S2 V ar(¯ij. For the balanced case. − y..j ¯ ¯ yij.e. Let µij be the mean of cell (i. − yij . i. ¯ ¯ y. TWO-WAY REPEATED MEASUREMENT DESIGNS 2. . − yi . ¯ ¯ . The comparison of groups i and i is. comparisons of the means of one factor needs to be performed within the levels of the other factor. − yi j.j. | y ¯ 2M S2 ni > tdf2 (α/2) .5. | y ¯ 2M S2 a2 a 1 i=1 ni > tdf2 (α/2) One may make simultaneous comparisons by making the appropriate adjustments in the above tests. − y. However. Using the Satterthwaite approximation formula. however. M S3 = (df1 )(M S1 ) + (df2 )(M S2 ) df1 + df2 is an unbiased estimate of M . ) = y ¯ ni the α-level test of H0 : µij = µij (comparison of treatments j and j within Group i) |¯ij. The estimator of µij is yij. . we get the degrees of freedom associated with M S3 as [(df1 )(M S1 ) + (df2 )(M S2 )]2 df3 = (df1 )(M S1 )2 + (df2 )(M S2 )2 Thus an α-level test of H0 : µij = µi j is |¯ij.j . A test of H0 : βj = βj is 111 |¯. | y ¯ M S3 1 ni + 1 ni > tdf3 (α/2) . j).

RUN. QUIT. DO B = 1. 1 72 86 81 77 85 86 83 80 2 78 83 88 81 82 86 80 84 3 71 82 81 75 71 78 70 75 4 72 83 83 69 83 88 79 81 5 66 79 77 66 86 85 76 76 6 74 83 84 77 85 82 83 80 7 62 73 78 70 79 83 80 81 8 69 75 76 70 83 84 78 81 . PROC SORT DATA=RM.112 In the balanced case M S3 and df3 are given by M S3 = and df3 = CHAPTER 5. n female human subjects were randomly assigned to each drug. RUN. CARDS. DATA RM.3.3. VAR Y. 69 66 84 80 72 65 75 71 73 62 90 81 72 62 69 70 72 67 88 77 69 65 69 65 74 73 87 72 70 61 68 65 . BY A B S.2.2.4. OUTPUT. After the drug was administered. PROC MEANS MEAN NOPRINT. DRUG Person within drug 1 2 3 4 5 6 7 8 AX23 T1 72 78 71 72 66 74 62 69 T2 86 83 82 83 79 83 73 75 T3 81 88 81 83 77 84 78 76 T4 77 81 75 69 66 77 70 70 T1 85 82 71 83 86 85 79 83 BWW9 T2 86 86 78 88 85 82 83 84 T3 83 80 70 79 76 83 80 78 T4 80 84 75 81 76 80 81 81 T1 69 66 84 80 72 65 75 71 CONTROL T2 73 62 90 81 72 62 69 70 T3 72 67 88 77 69 65 69 65 T4 74 73 87 72 70 61 68 65 The following SAS code performs the analyses: OPTIONS LS=80 PS=66 NODATE. The following table contains results from one such study. END. INPUT Y @@. QUIT. OUTPUT OUT=OUTMEAN MEAN=YM. BY A B. At the start of the study. DO A=1. REPEATED MEASUREMENT DESIGNS M S1 + (b − 1)M S2 b a(n − 1)[M S1 + (b − 1)M S2 ]2 2 2 M S1 + (b − 1)M S2 The following example is taken from Milliken and Johnson : The Analysis of Messy Data (Vol 1) Example An experiment involving d drugs was conducted to study each drug eﬀect on the heart rate of humans. END. the heart rate was measured every ﬁve minutes for a total of t times. INPUT S @.

6145833 531.0090 Tests of Hypotheses Using the Type III MS for B*S(A) as an Error Term Source B A*B DF 3 6 Type III SS 282. .5000000 81.541667 F Value 5.083333 Mean Square 657.657785 .7500000 72. .95 12.483631 94.166667 458.0001 -------------------------------------------------------------------------------Least Squares Means A 1 1 1 1 2 2 2 2 3 3 3 3 B 1 2 3 4 1 2 3 4 1 2 3 4 Y LSMEAN 70. F Value .5.204861 88. MODEL Y = A S(A) B A*B B*S(A).3750000 71.527778 7. PROC GPLOT DATA=OUTMEAN. TEST H=A E=S(A).7500000 84. . TWO-WAY REPEATED MEASUREMENT DESIGNS GOPTIONS DISPLAY. PROC GLM DATA=RM.2048611 88. TITLE3 ’DRUG BY TIME’.16 Pr > F <.614583 531.489583 Type III SS 1315.083333 2320.2500000 . Pr > F .1666667 Mean Square 94.5277778 F Value 12.3. . TITLE1 ’HEART RATE DATA’. SYMBOL2 V=TRIANGLE L=1 I=JOIN CV=BLACK. -------------------------------------------------------------------------------Dependent Variable: Y Source Model Error Corrected Total Source A S(A) B A*B B*S(A) DF 95 0 95 DF 2 21 3 6 63 Sum of Squares 4907. SYMBOL1 V=DIAMOND L=1 I=JOIN CV=BLUE. 113 Tests of Hypotheses Using the Type III MS for S(A) as an Error Term Source A DF 2 Type III SS 1315.1250000 81.489583 0.5000000 80. QUIT.468750 Mean Square 657.6250000 79. TEST H=B A*B E=B*S(A). QUIT. PLOT YM*B=A. SYMBOL3 V=CIRCLE L=1 I=JOIN CV=ORANGE.156250 282.7500000 72.277282 F Value .5000000 71.000000 4907. . CLASS A B S.0001 <. Mean Square 51. LSMEANS A*B.0000000 73. . . Pr > F .95 Pr > F 0. . RUN. RUN.0000000 78.541667 110.

35 8 Thus any |¯ij.5 73. n 2M S2 = n (2)(7. Compare times within drugs The common standard error is se(¯ij.6 79.3 T3 71.00(1.8 81.5 T2 72. ) = y ¯ The least signiﬁcant diﬀerence (LSD) is LSD = t63 (.5 81. | that exceeds 2.70 .8 84.0 ------------ . REPEATED MEASUREMENT DESIGNS The interaction plot as well as the F -test show a signiﬁcant interaction.8 T2 T3 80.025) 2M S2 = 2.70 indicates a signiﬁcant diﬀerence between µij and µij . − yij .1 -----------DRUG 2 : BWW9 T3 T4 T1 T2 78.0 ---------------------------------DRUG 3 : CONTROL T4 71.35) = 2.4 T1 72.28) = 1. − yij . y ¯ This gives us the following set of underlining patterns: DRUG 1 : AX23 T1 T4 70.114 CHAPTER 5.

| that exceeds 5. ) = y ¯ with df3 = a(n − 1)[M S1 + (b − 1)M S2 ]2 3(7)[110.5.5 72.042(2.5 + (3)(7.1 ---------------BWW9 79. .876 (2)(8) The least signiﬁcant diﬀerence (LSD) is LSD = t30 (.87 indicates a signiﬁcant diﬀerence between µij and µi j . Example This experiment involved studying the eﬀect of a dose of a drug on the growth of rats.28)]2 = = 29.0 ------------ TIME 4 CONTROL AX23 71.7 ≈ 30 2 + (b − 1)M S 2 M S1 (110.3 73. TWO-WAY REPEATED MEASUREMENT DESIGNS ---------------------------- 115 Compare drugs within times The common standard error is se(¯ij.75 TIME 3 CONTROL 71.28)) = 2. The weights were obtained each week for 4 weeks.5)2 + (3)(7.8 -------------TIME 2 CONTROL 72.8 The following example taken from Milliken and Johnson illustrates how SAS can be used to make comparisons of means in the absence of a signiﬁcant interaction.5 BWW9 AX23 78.0 -----------BWW9 81.6 81.5 84.87 . y ¯ This gives us the following set of underlining patterns: TIME 1 AX23 CONTROL 70.4 AX23 BWW9 80.876) = 5. − yij . The data set consists of the growth of 15 rats.28)2 2 2M S3 = n 2(M S1 + (b − 1)M S2 ) = nb 2(110. − yi j. n Thus any |¯ij.025) 2M S3 = 2. where 5 rats were randomly assigned to each of the 3 doses of the drug.3.5 + (3)(7.

54 60 63 74 69 75 81 90 77 81 87 94 64 69 77 83 51 58 62 71 62 71 75 81 68 73 81 91 94 102 109 112 81 90 95 104 64 69 72 78 59 63 66 75 56 66 70 81 71 77 84 80 59 64 69 76 65 70 73 77 . PROC SORT DATA=RM1. DO B = 1. RUN. QUIT.3. BY A B. RUN. OUTPUT. SYMBOL2 V=TRIANGLE L=1 I=JOIN CV=BLACK.4. PLOT YM*B=A. END. DO A=1.2.116 CHAPTER 5. .2.3. BY A B S. GOPTIONS DISPLAY. OUTPUT OUT=OUTMEAN MEAN=YM.4.2. CARDS. QUIT. PROC MEANS MEAN NOPRINT. DATA RM1. DO S=1. PROC GPLOT DATA=OUTMEAN. SYMBOL1 V=DIAMOND L=1 I=JOIN CV=BLUE. SYMBOL3 V=CIRCLE L=1 I=JOIN CV=ORANGE. REPEATED MEASUREMENT DESIGNS Week 2 3 60 63 75 81 81 87 69 77 58 62 71 73 102 90 69 63 66 77 64 70 75 81 109 95 72 66 70 84 69 73 Dose 1 Rat 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 54 69 77 64 51 62 68 94 81 64 59 56 71 59 65 4 74 90 94 83 71 81 91 112 104 78 75 81 80 76 77 2 3 The SAS code and output are given below: OPTIONS LS=80 PS=66 NODATE. VAR Y.3.5. END. END. INPUT Y @@.

TWO-WAY REPEATED MEASUREMENT DESIGNS TITLE3 ’DOSE BY TIME’. QUIT.1095 0.183333 32. .700000 Mean Square 1073. 117 Tests of Hypotheses Using the Type III MS for S(A) as an Error Term Source A DF 2 Type III SS 2146. . QUIT.394444 4.5.366667 Mean Square 892.38 Pr > F 0.727778 5. Pr > F . TITLE1 ’RAT BODY WEIGHT DATA’.7764 0.3854 The GLM Procedure Least Squares Means Standard Errors and Probabilities Calculated Using the Type III MS for S(A) as an Error Term A 1 2 3 Y LSMEAN 72.3.366667 177.95226 .216667 450. PROC GLM DATA=RM1.1345 Tests of Hypotheses Using the Type III MS for B*S(A) as an Error Term Source B A*B DF 3 6 Type III SS 2678.1095 0.18333 0. . ---------------------------------------------------------Dependent Variable: Y Source Model Error Corrected Total Source A S(A) B A*B B*S(A) DF 59 0 59 DF 2 12 3 6 36 Sum of Squares 10440.394444 F Value 180.433333 5405. LSMEANS A/PDIFF E=S(A). TEST H=A E=S(A).216667 F Value 2.00000 10440. RUN. Pr > F . . RUN. CLASS A B S. LSMEANS B/PDIFF E=B*S(A). . .458333 892.936111 F Value .18333 Type III SS 2146. F Value .0664 . MODEL Y = A S(A) B A*B B*S(A).7764 2 0.727778 5.0664 3 0. TEST H=B A*B E=B*S(A).6000000 70.0000000 83.86 1. .0500000 LSMEAN Number 1 2 3 Least Squares Means for effect A Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 1 0. Mean Square 176.500000 2678. .09 Pr > F <.183333 32.0001 0.433333 Mean Square 1073.

REPEATED MEASUREMENT DESIGNS Least Squares Means Standard Errors and Probabilities Calculated Using the Type III MS for B*S(A) as an Error Term B 1 2 3 4 Y LSMEAN 66.0001 2 <. we have considered the analysis of the classic two-way RM design under the assumption that the (S) condition is satisﬁed for each level of A.5 WEEKS : Two-Way RM Design : General Case So far. .118 CHAPTER 5.1 72. Here we consider the analysis of a two-way RM design where the (S) condition may not hold.0001 3 <. the means of A and B can be compared at the highest level.6000000 84.0001 <.6 -------------------W1 66.5 W3 77.5333333 77.3 W2 72.0001 4 <.0001 <.2666667 72.0001 <.0001 Since there is no signiﬁcant interaction.0001 <.0001 <.0 83. Using the output from LSMEANS one obtains the following underlining patterns: DOSES : D3 D1 D2 70.6 W4 84.0001 <.4666667 LSMEAN Number 1 2 3 4 Least Squares Means for effect B Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 4 1 <.0001 <.0001 <.

yi1k . 119 2. . .5. yi1ni ··· ··· ··· ··· B j yij1 . 1 54 60 63 74 1 69 75 81 90 1 77 81 87 94 1 64 69 77 83 1 51 58 62 71 2 62 71 75 81 2 68 73 81 91 2 94 102 109 112 2 81 90 95 104 2 64 69 72 78 3 59 63 66 75 3 56 66 70 81 3 71 77 84 80 3 59 64 69 76 3 65 70 73 77 . The following SAS code is used to get the tests for B and AB eﬀects. . DATA RM3. REPEATED B 4/ PRINTE. .3. RUN. QUIT. for level i of A. . INPUT A Y1 Y2 Y3 Y4. we test H0 : µi1 = · · · = µib using the data Subject 1 . . That is. we test H0 : µ1j = · · · = µaj and make multiple comparisons of means using the data A i yij1 . PROC GLM DATA=RM3. Run a complete one-way ANOVA of the the levels of A within each level of B. CARDS. . k . . TWO-WAY REPEATED MEASUREMENT DESIGNS The analysis strategy will be as follows: 1. yijni ··· ··· ··· ··· a yib1 . . . for level j of B. . yibni 117 . y1jn1 ··· ··· ··· ··· ··· ··· a yaj1 . . . . . Test the B main eﬀect and the AB interaction eﬀect using the G-G e-adjusted F -test. -------------------------------------------------------------SELECTED OUTPUT -------------------------------------------------------------RAT BODY WEIGHT DATA : 2 1 yi11 . yijni 1 y1j1 . . CLASS A. . yibk . yijk . . TITLE1 ’RAT BODY WEIGHT DATA : 2’. OPTIONS LS=80 PS=66 NODATE. . Run a complete one-way RM analysis of the levels of B within each level of A using the G-G eadjusted one-way RM F -tests followed by multiple comparisons of means. . ni Example We will revisit the body weight of rats data considered above. That is. MODEL Y1-Y4 = A. . yajna 3. .

3854 Adj Pr > F G .0001 0.27398 Chi-Square 4.3811 <.183333 32.A=1 -------------------------------------Sphericity Tests Variables Transformed Variates Orthogonal Components DF 5 5 Mauchly’s Criterion 0.433333 5405.9445682 3. QUIT.216667 450.458333 F Value 2.936111 F Value 180.8269 Using the G-G e-adjusted tests one observes that the AB interaction eﬀect is not signiﬁcant while the B main eﬀect is signiﬁcant both at α = .G H .6058 0. BY A.09 Pr > F <. QUIT.0385 The GLM Procedure Repeated Measures Analysis of Variance Tests of Hypotheses for Between Subjects Effects Source A Error DF 2 12 Type III SS 2146. PROC GLM DATA=RM3.0001 0.031436 11.0001 0.366667 177.1345 118 RAT BODY WEIGHT DATA : 2 The GLM Procedure Repeated Measures Analysis of Variance Univariate Tests of Hypotheses for Within Subject Effects Source B B*A Error(B) Source B B*A Error(B) DF 3 6 36 Type III SS 2678. REPEATED MEASUREMENT DESIGNS Sphericity Tests Variables Transformed Variates Orthogonal Components DF 5 5 Mauchly’s Criterion 0. ----------------------------------------------------------------SELECTED OUTPUT ----------------------------------------------------------------RAT BODY WEIGHT DATA : 2 119 ------------------------------------.0001 0. RUN. REPEATED B 4/PRINTE.5244612 Pr > ChiSq 0.3345438 Chi-Square 33.05.0459293 0.700000 Mean Square 892.394444 4.4227 0. RUN. PROC SORT DATA=RM3.38 Pr > F 0. Mauchly’s test for the (S) condition is signiﬁcant indicating that the analyses run earlier may not be the appropriate ones.500000 Mean Square 1073. We now run one-way RM analyses of B within each level of A. MODEL Y1-Y4=/NOUNI.6197 .86 1.727778 5.3846 Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon 0.F <.120 CHAPTER 5. BY A.1626147 0.740697 Pr > ChiSq <.

2149884 2.3000000 Mean Square 224.A=2 -------------------------------------Sphericity Tests Variables Transformed Variates Orthogonal Components DF 5 5 Mauchly’s Criterion 0.000000 Mean Square 338.0001 Adj Pr > F G .0021 0.200000 F Value 284.333333 F Value 78.0001 <.200000 1.1555 The GLM Procedure Repeated Measures Analysis of Variance Univariate Tests of Hypotheses for Within Subject Effects Source B Error(B) Source B Error(B) DF 3 12 Type III SS 672.0001 <.3.33 Pr > F <.0004 Greenhouse-Geisser Epsilon 0.9500000 111.F <.6286 1. TWO-WAY REPEATED MEASUREMENT DESIGNS 121 The GLM Procedure Repeated Measures Analysis of Variance Univariate Tests of Hypotheses for Within Subject Effects Source B Error(B) Source B Error(B) DF 3 12 Type III SS 1023.000000 52.0001 Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon 0.0001 Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon 0.6370 1.G H .7575 The GLM Procedure Repeated Measures Analysis of Variance Univariate Tests of Hypotheses for Within Subject Effects Source B Error(B) Source B Error(B) DF 3 12 Type III SS 1014.F 0.F <.G H .0001 Adj Pr > F G .G H .A=3 -------------------------------------Sphericity Tests Variables Transformed Variates Orthogonal Components DF 5 5 Mauchly’s Criterion 0.3166667 9.0126045 Pr > ChiSq 0.0055188 0.6252267 Pr > ChiSq 0.2055 ------------------------------------.600000 14.400000 Mean Square 341.4800 .3812237 Chi-Square 7.052686 Chi-Square 14.19 Pr > F <.5.00 Pr > F <.0706227 0.1713 ------------------------------------.0147 0.000000 4.2051 0.0001 Adj Pr > F G .15447 8.2750000 F Value 24.

1876 . MODEL Y=A. ARRAY Z Y1-Y4. RUN. OUTPUT. we may compare the means of B using the M SE as a denominator. PROC GLM DATA=RM4. Y = Z(B).2. B. S = _N_. Y columns and runs: • the one-way ANOVA for the factor A within each level of B. QUIT. QUIT.066667 110. QUIT. The repeated factor B is also signiﬁcant in all the cases.900000 F Value 1. QUIT. RUN.B=1 -------------------------------------The GLM Procedure Dependent Variable: Y Source Model Error DF 2 12 Sum of Squares 428. RUN. Thus. S. QUIT. BY B. DO B=1. CLASS B S.93 Pr > F 0.3.122 Huynh-Feldt Epsilon CHAPTER 5. REPEATED MEASUREMENT DESIGNS 0. The following SAS code re-creates the data as A.2.4. to compare the means of B in the particular group which does not satisfy the (S) condition. PROC SORT DATA=RM4.133333 1330. DROP Y1-Y4.800000 Mean Square 214. RUN. CLASS A. RUN. as shown in the last example of Section 5. BY A. SET RM3. BY B. ------------------------------------------------------------SELECTED OUTPUT ------------------------------------------------------------RAT BODY WEIGHT DATA : 2 140 ------------------------------------. BY A. one uses Welch t-tests. LSMEANS A/PDIFF. PROC GLM DATA=RM4. DATA RM4. • comparisons of the A means for each level of B. and • comparisons of the B means within each level of A. MODEL Y=B S. In the situation where the (S) condition is not satisﬁed in one or more of the groups. END. PROC SORT DATA=RM4. LSMEANS B/ PDIFF.6772 Sphericity is satisﬁed in all the three cases.

B=2 -------------------------------------The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total Source A DF 2 12 14 DF 2 Sum of Squares 538.8000000 62.266667 111.0884 0.1876 The GLM Procedure Least Squares Means A 1 2 3 Y LSMEAN 63.0000000 LSMEAN Number 1 2 3 Least Squares Means for effect A Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 1 0.1309 0.0000000 73.41 Pr > F 0.0757 3 0.9300 0.0884 0.1018 3 0.5333333 Mean Square 269.0000000 LSMEAN Number 1 2 3 Least Squares Means for effect A Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 1 0.41 Pr > F 0.0000000 68.8831 2 0. TWO-WAY REPEATED MEASUREMENT DESIGNS 123 Corrected Total Source A 14 DF 2 1758.1018 ------------------------------------.200000 1879.6000000 81.3.1319 Mean Square 269.766667 F Value 2.1319 The GLM Procedure Least Squares Means A 1 2 3 Y LSMEAN 68.B=3 -------------------------------------The GLM Procedure Dependent Variable: Y Source DF Sum of Squares Mean Square F Value Pr > F .5.533333 1341.93 Pr > F 0.1309 0.2666667 F Value 2.8831 0.0757 ------------------------------------.0666667 F Value 1.933333 Type III SS 428.1333333 Mean Square 214.9300 2 0.733333 Type III SS 538.

124

Model Error Corrected Total Source A 2 12 14 DF 2 587.200000 1636.400000 2223.600000 Type III SS 587.2000000

**CHAPTER 5. REPEATED MEASUREMENT DESIGNS
**

293.600000 136.366667 2.15 0.1589

Mean Square 293.6000000

F Value 2.15

Pr > F 0.1589

The GLM Procedure Least Squares Means A 1 2 3 Y LSMEAN 74.0000000 86.4000000 72.4000000 LSMEAN Number 1 2 3

Least Squares Means for effect A Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 1 0.1190 0.8321 2 0.1190 0.0824 3 0.8321 0.0824

------------------------------------- B=4 -------------------------------------The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total DF 2 12 14 Sum of Squares 624.933333 1274.800000 1899.733333 The GLM Procedure Least Squares Means A 1 2 3 Y LSMEAN 82.4000000 93.2000000 77.8000000 LSMEAN Number 1 2 3 Mean Square 312.466667 106.233333 F Value 2.94 Pr > F 0.0913

Least Squares Means for effect A Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 1 0.1235 0.4939 2 0.1235 0.0359 3 0.4939 0.0359

RAT BODY WEIGHT DATA : 2

152

------------------------------------- A=1 -------------------------------------The GLM Procedure Dependent Variable: Y Sum of

**5.3. TWO-WAY REPEATED MEASUREMENT DESIGNS
**

Source Model Error Corrected Total Source B S DF 7 12 19 DF 3 4 Squares 2733.600000 14.400000 2748.000000 Type III SS 1023.600000 1710.000000 The GLM Procedure Least Squares Means B 1 2 3 4 Y LSMEAN 63.0000000 68.6000000 74.0000000 82.4000000 LSMEAN Number 1 2 3 4 Mean Square 341.200000 427.500000 F Value 284.33 356.25 Pr > F <.0001 <.0001 Mean Square 390.514286 1.200000 F Value 325.43 Pr > F <.0001

125

Least Squares Means for effect B Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 4 1 <.0001 <.0001 <.0001 2 <.0001 <.0001 <.0001 3 <.0001 <.0001 <.0001 4 <.0001 <.0001 <.0001

------------------------------------- A=2 -------------------------------------The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total Source B S DF 7 12 19 DF 3 4 Sum of Squares 4326.800000 52.000000 4378.800000 Type III SS 1014.000000 3312.800000 The GLM Procedure Least Squares Means B 1 2 3 4 Y LSMEAN 73.8000000 81.0000000 86.4000000 93.2000000 LSMEAN Number 1 2 3 4 Mean Square 338.000000 828.200000 F Value 78.00 191.12 Pr > F <.0001 <.0001 Mean Square 618.114286 4.333333 F Value 142.64 Pr > F <.0001

Least Squares Means for effect B Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 4 1 0.0001 <.0001 <.0001 2 0.0001 0.0015 <.0001 3 <.0001 0.0015 0.0002 4 <.0001 <.0001 0.0002

126

CHAPTER 5. REPEATED MEASUREMENT DESIGNS

------------------------------------- A=3 -------------------------------------The GLM Procedure Dependent Variable: Y Source Model Error Corrected Total Source B S DF 7 12 19 DF 3 4 Sum of Squares 1055.650000 111.300000 1166.950000 Type III SS 672.9500000 382.7000000 Mean Square 224.3166667 95.6750000 F Value 24.19 10.32 Pr > F <.0001 0.0007 Mean Square 150.807143 9.275000 F Value 16.26 Pr > F <.0001

The GLM Procedure Least Squares Means B 1 2 3 4 Y LSMEAN 62.0000000 68.0000000 72.4000000 77.8000000 LSMEAN Number 1 2 3 4

Least Squares Means for effect B Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 4 1 0.0089 0.0002 <.0001 2 0.0089 0.0414 0.0003 3 0.0002 0.0414 0.0159 4 <.0001 0.0003 0.0159

Using underlining to summarize the results Group(A) --------1 2 3 B levels -------------B1 B2 B3 B4 B1 B2 B3 B4 B1 B2 B3 B4

Treatment(B) -----------1 2 3 4

A levels -----------------A3 A1 A2 -----------------A3 A1 A2 -----------------A3 A1 A2 -----------------A3 A1 A2 -------------------

TWO-WAY REPEATED MEASUREMENT DESIGNS 127 .3.5.

128 CHAPTER 5. REPEATED MEASUREMENT DESIGNS .

j = 1. consider the fabric strength experiment considered in Chapter 1. one may be interested in the pattern of the response variable over time. To this end we want to consider applications of regression procedures within the ANOVA framework. 129 . the treatments may be several dosages of the same drug.Chapter 6 More on Repeated Measurement Designs In this chapter we will further investigate one. which is tested using F0 = M SB /M SW . This is the usual CRD framework that is represented by the model yij = µ + τi + where our interest lies in testing H0 HA : : τ1 = τ2 = · · · = τ5 = 0 τi = 0 for at least one i ij .1 6.and two-way RM designs as our ﬁrst section. Each percentage of cotton is randomly assigned to ﬁve randomly selected experimental units.e y = f (x) + ε For example.1 Trend Analyses in One. y.and two-way repeated measurement designs. Later sections consider special cases of the two-way RM design. Thus. in a one-way CRD model. if we consider a regression type relationship between x and y. 5. one may consider the simple linear regression model y = β0 + β1 x + ε and test H0 : β1 = 0 to determine if the response is linearly related to the levels of the treatment. · · · . 6. We can get more insight into the nature of the relationship of the response.and Two-way RM Designs Regression Components of the Between Treatment SS (SSB ) Often the treatments in an experiment consist of levels of a quantitative variable. This may be used to ﬁnd the optimal dosage level. One is usually interested in developing an equation for a curve that describes the dose-response relationship. · · · . x. 5 . The treatment consists of ﬁve diﬀerent levels of cotton percentages and the response is the strength of the fabric produced.1. i = 1. Since RM designs usually involve a time factor. For instance. and the levels of the treatment variable. i. we shall consider trend analysis in one. As an illustration.

This. Thus. the hypothesis that all the treatment means are the same against the alternative that at least two are diﬀerent. FR is a test statistic for testing H0 : β1 = 0 . the hypothesis that there is no linear relationship between the response and the levels of the treatment against the alternative that there is a signiﬁcant linear relationship. SSB = SSR + SSL where SSR is the sum of squares due to the linear regression and SSL is the sum of squares due to lack of ﬁt (i. seems to be the hard way as the F values may have to be computed by hand. To use this approach we may deﬁne contrasts using the deviations of the treatment levels from the treatment mean as our contrast coeﬃcients.e failure of the linear regression to describe the relationship between x and y). assume that we have a balanced CRD model where r represents the number of replications per treatment level. FL is the test statistic for testing H0 : E(y) = β0 + β1 x . the hypothesis that the simple linear regression model describes the data against the alternative that a simple linear model is not suﬃcient. The F ratios in the above ANOVA table are used to test the following hypotheses: 1. Without loss of generality. MORE ON REPEATED MEASUREMENT DESIGNS The variability in the response that is explained by the treatment may now be partitioned into that due to the linear regression and that due to the remainder that cannot be explained by the regression model. Assume also that we have k treatment levels. 2.130 Partitioning SSB CHAPTER 6. . 3. Then k φR = i=1 (xi − x)µi ¯ is a contrast ( k i=1 (xi − x) = 0) whose estimator is ¯ ˆ φR = k (xi − x)¯i ¯y i=1 ˆ From φR we get SSR = and ˆ rφ2 R k i=1 (xi − x)2 ¯ SSL = SSB − SSR . however. F0 is a test statistic for testing H0 : τ1 = · · · = τk = 0 . An easier way is to ﬁnd a set of coeﬃcients to deﬁne a contrast among the treatment means. The ANOVA table for the CRD is Source Between Linear Regression Lack of Fit Within (Error) Total df k−1 1 k−2 n−k n−1 SS SSB SSR SSL SSW SST MS M SB M SR M SL M SW F0 F0 = M SB /M SW FR = M SR /M SW FL = M SL /M SW One may obtain SSR by ﬁtting an ordinary linear regression model of y on x.

j = 1. run.86E-16 -0.316228 0. This is very widely used in practice even though it sometimes entail premature stopping. . i = 1.316228 3.7171372 -0.3162278 0. k −1. . .632456 -0. This means that SSB can be completely decomposed using a polynomial of degree k − 1 with no lack of ﬁt term. 40. say.478091 0. quit. print xp.4).5345225 -0. 10. print xp. TREND ANALYSES IN ONE. 131 In such a case we may replace (xi − x) by a simple set of multipliers known as orthogonal polynomial ¯ coeﬃcients.AND TWO-WAY RM DESIGNS Orthogonal Polynomials The ﬁtting of curves within the ANOVA framework can be greatly simpliﬁed if 1.4472136 -0.534522 -0. The orthogonal polynomial coeﬃcients for all four terms of the polynomial y = β0 + β1 x + β11 x2 + β111 x3 + β1111 x4 are computed as proc iml. The usual procedure is to ﬁt successive terms of the polynomial starting with the linear term until lack of ﬁt becomes non-signiﬁcant.5345225 -0. components each with one degree of freedom. and 50. These coeﬃcients enable us to partition SSB into orthogonal linear.3162278 0. 30. etc. and. .1195229 The ﬁrst column represents the intercept term. run.4472136 0.6324555 1. Note that in SAS the treatment levels need not be equally spaced. quadratic. cubic.4472136 0. For example proc iml.1195229 -0. x={10 20 30 40 50}.267261 0. For instance. xp=orpol(x.3). .6324555 0. . Assume that cij . . quartic. xp=orpol(x. .1. the design is balanced.6. Proc IML in SAS may be used to generate orthogonal polynomial coeﬃcients. . The output is 0.478091 0. 2.4472136 0. consider a case where the treatment has k = 4 levels. k be the ith order polynomial coeﬃcient for the jth treatment level. quit. These coeﬃcients are extensively tabulated but we will use Proc IML in SAS to generate them. the treatment levels are equally spaced.632456 0.267261 -0. and 2 Si = rL2 i k 2 j=1 cij is the 1 df sum of squares associated with the ith term. 20.4472136 0.588E-18 0. x={0 25 75 100}. Then k Li = j=1 cij yj ¯ is the contrast of means associated with the ith order term of the polynomial.

Symbol1 C=Black V=Triangle L=1 I=Spline.267261 0. Title1 "Orthogonal Polynomials". End.5 -0. Title2 "5 Equallyspaced Levels". quit.316228 1.1195229 -0.5345225 -0.267261 -0.6324555 -0.3162278 0.5 0. Symbol4 C=Black V=Circle L=1 I=Spline.534522 -0. Plot Coef*X=Trend / VAxis=Axis1.316228 0.5. Proc GPlot Data=Five.632456 0. Symbol3 C=Black V=Diamond L=1 I=Spline. Output.5 0.3162278 The following code and associated plot show how the contrast coeﬃcients are related to the polynomial terms.632456 -0.316228 0.132 gives CHAPTER 6. Quit.6324555 0.1). Symbol2 C=Black V=Square L=1 I=Spline. Length Trend $9. run.316228 0.6324555 quadratic 0. Input Coef @. Datalines.1195229 .632456 -0.7171372 -0. MORE ON REPEATED MEASUREMENT DESIGNS 0. Axis1 Label=(A=90 "Coefficient") Order=(-1 To 1 By .5 -0.8.478091 0.3162278 0. Data Five.6.6324555 6.3162278 quartic 0.969E-17 0. .5 0. Do X=4.632456 0.5345225 cubic -0.501E-17 -0.5 -0.478091 0.5 -0.7. Run.5 0. proc print. Input Trend @. linear -0.

267261 -0. 10 62 10 55 10 30 26 30 36 30 50 16 50 15 50 70 10 70 11 70 90 13 90 11 90 .501E-17 -0. quit. Fifteen samples of the antibiotic were obtained and three samples. class temp. data potency.1. contrast ’linear’ temp -0. cards.7171372 -0. model pot=temp temp*temp.478091 0.1195229 -0. input temp pot @@. After 30 days of storage the samples were tested for potency.478091 0. output out=quadmod p=p.969E-17 0.3162278. . model pot = temp. run.5345225 -0. The results are given below: 10◦ 62 55 57 Temperature 30◦ 50◦ 70◦ 26 16 10 36 15 11 31 23 18 90◦ 13 11 9 The contrast coeﬃcients in the following code were generated using Proc IML in SAS. contrast ’quartic’ temp 0. contrast ’cubic’ temp -0.6324555.5345225. were stored at each of ﬁve temperatures. selected at random.632456 -0. proc glm.1195229.316228 1. TREND ANALYSES IN ONE.3162278 0. run.AND TWO-WAY RM DESIGNS 133 EXAMPLE: An experiment was conducted to determine the eﬀect of storage temperature on potency of an antibiotic.267261 0.6.632456 0.534522 -0.6324555 6. 57 31 23 18 9 proc glm. contrast ’quadratic’ temp 0.316228 0.

which may cause or explain the insomnia. .100000 3763. MORE ON REPEATED MEASUREMENT DESIGNS proc gplot data=quadmod. who have no medically-diagnosed physical or mental disease or symptoms.0001 <.859329 36. .j or φ = i=1 cj µij be a polynomial contrast of the main B means or the simple B means speciﬁc to Group i.042866 160. < tb . will be investigated in a greater detail in a later chapter. The following is the ANOVA table that is produced by SAS: Source temp linear quadratic cubic quartic Error Corrected Total DF 4 1 1 1 1 10 14 Type I SS 4520.and two-way RM designs.9597 Over the 30 day storage period used in this study temperature had a highly signiﬁcant eﬀect on the potency of the antibiotic (P < .1. quit.80 − 1. .042866 16. CHAPTER 6.199609 720. Let φ1 . 6. ˆ The issue of trend analysis involving more than one independent variable.01x2 ˆ where y is the expected potency and x is the 30-day storage temperature.134 quit.859329 36. . One may use the POLYNOMIAL transformation option of SAS to obtain trend analysis. potency may be described by the equation y = 71.400000 3763. So we will ﬁt a quadratic model using SAS’ Proc GLM. involves b time measurements b b at times t1 < t2 < . run.300220 0.000000 F Value 70. .1629 0. In the following we will consider two examples: one.00 Pr > F <.05 2. φ2 . Over the temperature range 10◦ to 90◦ . The scale used in the study is: . where φi is degree i.63 235. respectively.400000 Mean Square 1130. A Likert scale is used to determine ease in going to sleep. and response surface methodology in general.000000 4680.20 45. Example : One-way RM Design In a small pilot clinical trial dose-response drug study. Personal Communication.300220 0. a pharmaceutical company research team is concerned with the quickness that a new drug can sedate a person so that they can go to sleep. symbol1 c=black V=square L=1 I=spline. plot p*temp/ vaxis=axis1. Let φ = i=1 cj µ. denoted by B hereafter. . The linear and quadratic terms are the only signiﬁcant trend components. axis1 label=(A=90 "Y").2 RM Designs Consider those cases where the within subject factor. The following is due to Michael Stoline.0001 0. φb−1 be the b − 1 orthogonal polynomial contrasts.60x + .0001).0001 <. A sample b = 8 people is selected from a population of insomniacs.27 0.199609 720.

In particular we would like to test the signiﬁcance of the linear quadratic and cubic orthogonal trend components of sleep diﬃculty as a function of time.6. A standard dosage of the new drug is administered daily to each of the eight subjects under the care of a physician.7679593549 -. The data are given below: Subject 1 2 3 4 5 6 7 8 1 week 4 2 2 4 3 3 1 2 Sleep Diﬃculty Scale 2 weeks 4 weeks 8 weeks 2 2 1 2 1 1 2 2 2 3 3 2 2 1 1 1 2 1 1 2 1 2 1 2 The following SAS code is used to perform orthogonal trend analysis. four. PROC GLM DATA=RM1. RUN.1. TREND ANALYSES IN ONE. The related output is ------------------------------------------------------------------------------TIME_N represents the nth degree polynomial contrast for TIME M Matrix Describing Transformed Variables B1 TIME_1 TIME_2 TIME_3 -. in the above Likert scale. REPEATED TIME 4 (1 2 4 8) POLYNOMIAL / PRINTM SUMMARY.AND TWO-WAY RM DESIGNS Scale 1 2 3 4 Time to go to sleep (in minutes) < 10 10 − 20 20 − 40 > 40 135 Each of the subjects in the insomniac population has chronic sleep diﬃculties.3975732840 B4 0.5296271413 -. CARDS. is measured for each patient after one. 4 2 2 1 2 2 1 1 2 2 2 2 4 3 3 2 3 2 1 1 3 1 2 1 1 1 2 1 2 2 1 2 . and eight weeks.3442576419 0. and each consistently scores a value of 4 in their daily evaluation of time required to go to sleep.7951465679 B3 0. two. The goal of the study is to assess the relationship of the ease of going to sleep as a function of the length of time under medication.1059254283 0.4543694674 B2 -. QUIT.5128776445 0.7926290870 0.0466252404 -. INPUT B1 B2 B3 B4.3263766829 -. MODEL B1-B4 = / NOUNI. DATA RM1. using the above instrument. Sleep diﬃculty. TITLE ’ONE-WAY RM TREND’.0567961834 .

F 0.66 Pr > F 0. MORE ON REPEATED MEASUREMENT DESIGNS The GLM Procedure Repeated Measures Analysis of Variance Univariate Tests of Hypotheses for Within Subject Effects Source TIME Error(TIME) DF 3 21 Type III SS 6.15625000 Mean Square 2.50772516 F Value 9.38839286 F Value 5.19791667 0.0168).0061 Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon 0.9549 The GLM Procedure Repeated Measures Analysis of Variance Analysis of Variance of Contrast Variables TIME_N represents the nth degree polynomial contrast for TIME Contrast Variable: TIME_1 Source Mean Error DF 1 7 Type III SS 4.52862903 0.1764 Error 7 2.95244565 3.81653226 2.1391 Contrast Variable: TIME_3 Source DF Type III SS Mean Square F Value Pr > F Mean 1 0.36123272 ------------------------------------------------------------------------------- The linear trend is the only signiﬁcant trend (P = .6766 0.82477209 2.81653226 0.136 CHAPTER 6.07354488 Mean Square 0.0153 0. .95244565 0.0053 Source TIME Error(TIME) Adj Pr > F G .G H .82477209 0.0168 Contrast Variable: TIME_2 Source Mean Error DF 1 7 Type III SS 0.29622070 F Value 2.59375000 8.26 0.75 Pr > F 0.55407609 Mean Square 4.78 Pr > F 0.

. AX23 may have a quadratic trend 2. Control may have a constant trend The following is the SAS analysis of the trend components: OPTIONS LS=80 PS=66 NODATE. TREND ANALYSES IN ONE. DRUG Person within drug 1 2 3 4 5 6 7 8 AX23 T1 72 78 71 72 66 74 62 69 T2 86 83 82 83 79 83 73 75 T3 81 88 81 83 77 84 78 76 T4 77 81 75 69 66 77 70 70 T1 85 82 71 83 86 85 79 83 BWW9 T2 86 86 78 88 85 82 83 84 T3 83 80 70 79 76 83 80 78 T4 80 84 75 81 76 80 81 81 T1 69 66 84 80 72 65 75 71 CONTROL T2 73 62 90 81 72 62 69 70 T3 72 67 88 77 69 65 69 65 T4 74 73 87 72 70 61 68 65 The proﬁle plot (given below) indicates that the trend patterns may be diﬀerent for all the three levels of A. After the drug was administered. BWW9 may have a cubic trend 3. At the start of the study. the heart rate was measured every ﬁve minutes for a total of t times.6.1. n female human subjects were randomly assigned to each drug. DATA RM1. The signiﬁcance of the AB interaction implies that the three levels of A must be treated separately.AND TWO-WAY RM DESIGNS Example : Two-way RM Design 137 Consider again the experiment involving d drugs was conducted to study each drug eﬀect on the heart rate of humans. An inspection of the above proﬁle plot shows that: 1. The following table contains results from one such study.

PROC GLM DATA=RM1.86339286 F Value 0. RUN.37 Pr > F 0. RUN.0748 ------------------------------------. MODEL Y1-Y4 = /NOUNI. QUIT. MORE ON REPEATED MEASUREMENT DESIGNS 86 86 78 88 85 82 83 84 83 80 70 79 76 83 80 78 80 84 75 81 76 80 81 81 3 3 3 3 3 3 3 3 69 66 84 80 72 65 75 71 73 62 90 81 72 62 69 70 72 67 88 77 69 65 69 65 74 73 87 72 70 61 68 65 : TREND’. REPEATED B 4 POLYNOMIAL/ PRINTM SUMMARY.10 Pr > F 0.7187500 Mean Square 639. 1 72 86 81 77 2 85 1 78 83 88 81 2 82 1 71 82 81 75 2 71 1 72 83 83 69 2 83 1 66 79 77 66 2 86 1 74 83 84 77 2 85 1 62 73 78 70 2 79 1 69 75 76 70 2 83 .50625000 4.0312500 5.138 INPUT A Y1 Y2 Y3 Y4 @@. CHAPTER 6. BY A.0001 DF 1 7 Type III SS 28. Selected output ------------------------------------.05625000 6.41339286 F Value 4.8169643 F Value 109. CARDS.50625000 34. TITLE1 ’HEART RATE DATA PROC SORT DATA=RM1. CLASS A.A=1 -------------------------------------B_N represents the nth degree polynomial contrast for B Contrast Variable: B_1 Source Mean Error Contrast Variable: B_2 Source Mean Error Contrast Variable: B_3 Source Mean Error DF 1 7 Type III SS 0. QUIT.05625000 44.0312500 40.04375000 Mean Square 0. BY A.86 Pr > F <.89375000 Mean Square 28.A=2 -------------------------------------Contrast Variable: B_1 Source DF Type III SS Mean Square F Value Pr > F .7564 DF 1 7 Type III SS 639.

18 0.No trend .17410714 F Value 2. AX23 . Control .1.50625000 28.62 Pr > F 0.75625000 69.80625000 47.99375000 51. TREND ANALYSES IN ONE.9511 DF 1 7 Type III SS 11.cubic 3.16 Pr > F 0.5562500 130.74553571 F Value 0.6.21875000 Mean Square 2.94375000 Mean Square 79.5562500 18.53125000 1.2937500 Mean Square 11.quadratic 2. BWW9 .03125000 54.7332 DF 1 7 Type III SS 0.0112 ------------------------------------.84910714 F Value 11.99910714 5.14375000 Mean Square 0.21875000 Mean Square 0.1855 DF 1 7 Type III SS 79.80625000 6.0570 DF 1 7 Type III SS 2.03125000 7.13 Pr > F 0.00 Pr > F 0.75625000 9.02053571 F Value 0.50625000 4.53125000 8.A=3 -------------------------------------Contrast Variable: B_1 Source Mean Error Contrast Variable: B_2 Source Mean Error Contrast Variable: B_3 Source Mean Error DF 1 7 Type III SS 0.AND TWO-WAY RM DESIGNS 139 Mean Error Contrast Variable: B_2 Source Mean Error Contrast Variable: B_3 Source Mean Error 1 7 51.6133929 F Value 0.65 Pr > F 0.4566 The results match our prior expectations: 1.

. Notice that our design restricts the number of whole plots to be a multiple of a. Then each plot is divided into four subplots and the four diﬀerent fertilizers are applied in random order. The ANOVA table for this split-plot design is Source of Variation Whole Plot Analysis A Error(Whole Plot) Subplot Analysis B AB Error(Subplot) df a−1 a(r − 1) b−1 (a − 1)(b − 1) a(b − 1)(r − 1) Notice that the ANOVA table looks like a two-way RM design ANOVA where the whole plot analysis here corresponds to the between subject analysis of the RM design and the subplot analysis corresponds to the within subject analysis of the RM design. N2 . This results in a design known as the split-plot design which is a generalization of the RCBD (or a subset of the classic two-way RM design. The large units are then divided into smaller units. a j = 1.2 The Split-Plot Design In some multi-factor designs that involve blocking. One such experiment considers three methods of seedbed preparation (S1 . r yijk = µ + τi + ij + βj + (τ β)ij + eijk . MORE ON REPEATED MEASUREMENT DESIGNS 6. b k = 1. and the levels of the second factor are assigned at random to the small units within the larger units. as we shall see later). we may not be able to completely randomize the order of runs within each block. N1 . N3 ) as the subplot factor. The following example is taken from Milliken and Johnson.140 CHAPTER 6. b is the number of levels of the subplot factor. An agronomist may be interested in the eﬀect of tillage treatments and fertilizers on yield. Tillage machinery requires large plots while fertilizers can be applied to small plots. · · · . S3 ) as a whole-plot factor and four rates of nitrogen fertilizer applied by hand (N0 . We may have unequal number of repetitions of the whole plot factor if the number of whole plots is not a multiple of a. and r is the number of times a whole plot factor is repeated. The analysis is divided into two parts: whole plot and subplot. where µ + τi + ij is the whole plot part of the model and βj + (τ β)ij + eijk is the subplot part. Whole S2 N0 N3 N2 N1 Plots S1 N3 N2 N0 N1 S3 N3 N2 N1 N0 S1 N2 N3 N0 N1 S3 N0 N1 N3 N2 S2 N1 N0 N3 N2 The statistical model associated with the split-plot design is i = 1. The three methods of land preparation are applied to the whole plots in random order. · · · . The levels of one factor are assigned at random to large experimental units within blocks of such units. S2 . We shall only consider the simplest split-plot design that involves two factors and an incomplete block. · · · . Here a is the number of levels of the whole plot factor. There are two sizes of experimental units: whole plots are the larger units and subplots or split-plots are the smaller units.

7697 8.8794 7.DAT’ FIRSTOBS=2. 30.dat” in the following format: moist fert tray yield 10 2 1 3. MODEL YIELD = MOIST TRAY(MOIST) FERT MOIST*FERT / SS1. The levels of moisture were randomly assigned to the trays. or 8 mg per pot.5173 3.. 4 pots could be put into each tray.95113 6.4373 8.7486 3. /* SPLIT PLOT USING GLM */ CLASS TRAY FERT MOIST. /* SPLIT PLOT MJ 24.9081 13. /* MEANS OK FOR BALANCED DATA */ .5034 40 8 12 10. where the water was absorbed by the peat pots.3170 4.4997 4.0490 5.5144 40 8 11 12.91310 6.2626 8.97584 10 4 1 4.0482 Level of Fertilizer 4 6 4.2 */ DATA EXPT. 6.9334 10.29741 5.1372 9.7133 14.5129 10. INPUT MOIST FERT TRAY YIELD. 4.8393 6.9157 15.0265 10.3458 4.4730 7.0444 10 2 3 1. RUN.9575 15. The wheat seeds were planted in each pot and after 30 days the dry matter of each pot was measured.5168 13. QUIT.3776 5.6.5034 10. INFILE ’C:\ASH\S7010\SAS\SPLIT2.8397 .97584 5.1180 13.. There where 48 diﬀerent peat pots and 12 plastic trays.3170 10 4 2 4.0444 1.2750 15.1413 6. 40 8 10 12.2811 The following SAS code uses two ways (PROC GLM and PROC MIXED) to perform the split-plot analysis.2811 20 30 40 The data was placed in a ﬁle called ”split2. Level of Moisture 10 Tray 1 2 3 4 5 6 7 8 9 10 11 12 2 3.1413 10 4 3 3.0702 10.4367 8 5. or 40 ml of water per pot per day to the tray. The four levels of fertilizer were randomly assigned to the four pots in each tray so that each fertilizer occurred once in each tray.5572 4.0842 10.2.6654 11.27853 6.8397 4.6332 12.3654 6.3458 10 2 2 4. RANDOM TRAY(MOIST) / TEST. PROC GLM. 20.5144 12. The levels of fertilizer were 2.8376 9.56933 8. /* TESTS USING RANDOM STATEMENT */ TEST H = MOIST E = TRAY(MOIST). THE SPLIT-PLOT DESIGN Example 141 The following data are taken from an experiment where the amount of dry matter was measured on wheat plants grown in diﬀerent levels of moisture and diﬀerent amounts of fertilizer. OPTIONS LINESIZE=80 PAGESIZE=37.9419 10. The moisture treatment consisted of adding 10.7348 12.3934 7. /* TESTS USING TEST STATEMENT */ MEANS MOIST / LSD E = TRAY(MOIST).

QUIT. /* SAVE LS MEANS DATASET */ /* INTERACTION PLOTS FROM GLM */ PROC MIXED DATA=EXPT. LSMEAN | 17.2284771 F Value 119.5 | -----------------------------------------------------------10 20 30 40 The Mixed Procedure Type 3 Tests of Fixed Effects Effect moist fert fert*moist Num DF 3 3 9 Den DF 8 24 24 F Value 26. RUN.5513647 18.65 5.0003 Plot of LSMEAN*moist.7521392 36.0003 .0 | 6 | 4 | | 4 7.7298499 3. MORE ON REPEATED MEASUREMENT DESIGNS MEANS FERT / LSD. RANDOM TRAY(MOIST).51 Pr > F <.0562942 Mean Square 89. CLASS TRAY FERT MOIST.142 CHAPTER 6.0001 0. /* SPLIT PLOT USING MIXED */ ---------------------------------------------------------------Sum of Source DF Squares Mean Square F Value Model Error Corrected Total Source moist tray(moist) fert fert*moist 23 24 47 DF 3 8 3 9 631. QUIT.0001 Pr > F <.65 5.5 | | | | 8 15.1895496 27.5 | | 2 | 8 2 4 | 2 5.0001 0. PLOT LSMEAN*MOIST=FERT. LSMEANS MOIST*FERT / PDIFF STDERR OUT=LSM.0540027 38.34 131.30 4. PROC PLOT DATA=LSM.0 | | 8 | | 12.4064398 99. RUN.0019 <.62 27.0180009 4.62 Pr > F 0.0513405 649.6027051 Type I SS 269.0 | 6 | 4 | | 2 2.0002 <.0001 0. Symbol is value of fert. RUN.53 131. MODEL YIELD = MOIST | FERT. PLOT LSMEAN*FERT=MOIST. QUIT.2515182 297.4587550 0.5 | 6 | 8 | | 6 10.

Now a block of land is divided into three whole-plots and the three methods of land preparation are applied to the plots in random order. Each of the four fertilizer levels was randomly assigned to one whole plot within a block. βj is the eﬀect of the jth block. The ANOVA table for this split-plot design is Source of Variation Replication A Error(Whole Plot) B AB Error(Subplot) df r−1 a−1 (a − 1)(r − 1) b−1 (a − 1)(b − 1) a(b − 1)(r − 1) One ﬁnal note here is that there is no appropriate error term to test for signiﬁcance of replication eﬀect.6. The following table shows the experimental plan: Block S3 N3 N2 N1 N0 I S1 N2 N3 N0 N1 S2 N0 N3 N2 N1 S1 N3 N2 N0 N1 II S3 N0 N1 N3 N2 S2 N1 N0 N3 N2 The following rearrangement of the above table shows that a split-plot design is analyzed as a classic two-way RM design with blocks treated as ”subjects”: S1 I N2 N3 N0 N1 The model for such designs is yijk = µ + τi + βj + eij } + πk + (τ π)ik + ijk } S2 II N3 N2 N0 N1 I N0 N3 N2 N1 II N1 N0 N3 N2 I N3 N2 N1 N0 S3 II N0 N1 N3 N2 whole plot part of the model subplot part of the model where τi is the eﬀect of the ith level of the whole plot factor. and πk is the eﬀect of the kth level of the subplot factor. The ﬁeld was divided into two blocks with four whole plots. Example Two varieties of wheat (B) are grown in two diﬀerent fertility regimes (A). and each variety of wheat was randomly assigned to one subplot within each whole plot.2. Consider once again the land preparation and fertilizers example. This is done (replicated) for two blocks of land. THE SPLIT-PLOT DESIGN 143 We may add grouping in the split-plot design to reduce the whole plot variability. Then each plot is divided into four subplots and the four diﬀerent fertilizers are applied in random order. . Each whole plot was divided into two subplots.

QUIT.8 36.8 4 39.8 A4 44. YIELD = N1.1075000 F Value 7.1025000 40. CARDS.1025000 13.1900000 6. PROC GLM. DROP N1--N4. /* SPLIT PLOT MJ 24. PROC MIXED. RANDOM BLOCK BLOCK*FERT. VAR=1.2500000 0. INPUT FERT N1 N2 N3 N4.5 47.6 42. QUIT.07 0.3966667 2.1 */ OUTPUT.6 OPTIONS NOCENTER PS=64 LS=76.3 2 36.1900000 Mean Square 131.3091667 2.3599 0.8 36.7 41.0200000 8. BLOCK=1.3966667 F Value 56.25 Pr > F 0.8612 Mean Square 16.2 A3 34.9275000 2. Block 2 Variety Fertility B1 B2 A1 41. OUTPUT. VAR=1.80 Pr > F 0.0014 0. LSMEANS VAR VAR*FERT / PDIFF STDERR.77 5.6 42. YIELD = N3. YIELD = N4.5 47.4476 0.4500000 Type III SS 131.4 37.5 40.85 Pr > F 0.4 A4 39.1025000 13.6 A3 43.4300000 190. MODEL YIELD = BLOCK FERT BLOCK*FERT VAR VAR*FERT. TEST H = BLOCK FERT E = BLOCK*FERT.6 40.10 1. DATA B. MORE ON REPEATED MEASUREMENT DESIGNS Block 1 Variety Fertility B1 B2 A1 35. RANDOM BLOCK BLOCK*FERT.6 .5166667 F Value 62.9 41.0530 0.6 40. BLOCK=2. MODEL YIELD = FERT VAR VAR*FERT. CLASS BLOCK VAR FERT. RUN.1025000 40. 1 35. QUIT.0914 .7 38.7 38.4 37.144 CHAPTER 6.5 40 44. YIELD = N2. VAR=2. VAR=2.36 1.5472727 2. OUTPUT.6 3 34.0 SAS was used to analyze the data.9 A2 36. RUN.3 A2 42.2500000 1.5500000 Mean Square 131. BLOCK=2. LSMEANS FERT | VAR.0048 0. LSMEANS FERT / PDIFF STDERR E = BLOCK*FERT.7 41.21 6.2 42. DATA SPLIT. -------------------------------------------------------------Dependent Variable: YIELD Source Model Error Corrected Total Source BLOCK FERT BLOCK*FERT VAR VAR*FERT DF 11 4 15 DF 1 3 3 1 3 Sum of Squares 182. BLOCK=1. OUTPUT. SET SPLIT. CLASS BLOCK VAR FERT. RUN.0306 Tests of Hypotheses Using the Type III MS for BLOCK*FERT as an Error Term Source BLOCK FERT DF 1 3 Type III SS 131.4 43.

5500000 8. If this test were signiﬁcant. them multiple comparisons would have to be carried out to determine the signiﬁcant diﬀerences.5166667 2. A slight modiﬁcation of the SAS code above will provide the necessary tests.80 1.2500000 0.2500000 1.0914 0.1025000 13.2.8612 BLOCK 1 FERT 3 BLOCK*FERT=Error(Whole plot) 3 VAR 1 VAR*FERT 3 Error(Subplot) 4 There is no signiﬁcant diﬀerence among fertilizers.3091667 2.9275000 2. This is left as an exercise.3599 0.1900000 6.3966667 2.4300000 Mean Square 131. .6.25 Pr > F 0. THE SPLIT-PLOT DESIGN 145 We may reorganize the ANOVA table in the output as Source DF SS 131.1075000 F Value 5.07 0.1025000 40.

b1 by subjects in sequence level 2 (a2 ) as shown below: Sequence: A a1 a2 Period: C c1 c2 b1 b2 b2 b1 There are three factors: sequence A (a1 and a2 ).1 − Y.1 Y... then τ1 − τ2 ¯ ¯ =0 A : E(Y1. i. b. Further. MORE ON REPEATED MEASUREMENT DESIGNS 6. period C (c1 and c2 ). A Latin square type design approach is used in the construction of the order eﬀects so that each level j of B occurs equally often as the ith observation for i = 1.. 2. E(Y111 ) = µ + β1 + γ1 E(Y122 ) = µ + τ1 + β2 + γ2 One can show that ¯ ¯ E(Y1. Let Yijk be the random variable which represents the observation corresponding to sequence i.1. ) = τ1 − τ2 2 E(Y221 ) = µ + β2 + γ1 E(Y212 ) = µ + τ2 + β1 + γ2 ¯ ¯ E(Y. k = 1. The model incorporates carry-over eﬀects in the second observations but not in the observations made ﬁrst. − Y2.2.2 ) = E τ1 − τ2 2 = (τ1 − τ2 ) − τ1 + τ2 2 Y212 − Y221 Y111 − Y122 − 2 2 The last expression shows that the AB and C eﬀects are identical.. ) = (β1 − β2 ) − ¯ ¯ E(Y111 − Y221 ) = β1 − β2 ¯ ¯ E(Y. AB and C are confounded. and treatment B (b1 and b2 ) in the experiment. ) = (β1 − β2 ) . We will only consider the two-period crossover design as an illustration.146 CHAPTER 6...e. ¯ Y2. 2.2.. − Y.2 Mean ¯ Y1. ) = 2 ¯ ¯ B : E(Y. − Y2. let τ1 be the eﬀect on the second observation if b1 is observed ﬁrst and τ2 be the eﬀect on the second observation if b2 is observed ﬁrst. j = 1.. and γk be the kth period eﬀect. · · · .3 Crossover Designs A crossover design is a two-way RM design in which the order (A) of administration of the repeated levels of a single factor (B) is accounted for as a between subject factor in the design. b2 by subjects in sequence level 1 (a1 ) and in the order b2 . − Y. Thus τ1 and τ2 are the carry-over eﬀects observed in the second observation whenever b1 and b2 are observed ﬁrst. If there is no sequence eﬀect (H0 : τ1 = τ2 ). treatment j and period k. a crossover design is a basic one-way RM design on the levels of B that has been analyzed as a two way RM design to account for the order or sequence eﬀect caused by the fact that the B levels are given one at a time in diﬀerent orders.. Here the repeated factor B has two levels b1 and b2 which are observed in the order b1 . b2 ) a2 = (b2 . Let βj be the jth treatment eﬀect.. The following table gives the observations: Sequence: A a1 = (b1 . Hence. b1 ) Mean Period: C c1 c2 Y111 Y122 Y221 Y212 ¯ ¯ Y. respectively.1.

E(Y. y22n2 a2 = (b2 . n2 Treatment : B b1 b2 y111 y121 . . y21n2 y12n1 y221 . .2. the unweighted mean. b1 ) One can immediately observe that this is a classic two-way RM layout discussed in Chapter 5. then go to Step 2B. Now assume that there are n1 + n2 subjects available. − y. Step 2A: Test for B main eﬀects using µ. .2. 2.2 1. . .3. CROSSOVER DESIGNS 147 ¯ ¯ Thus. Otherwise. The ANOVA table is Source of variation A : Sequence Subjects within A B AB B× Subjects within A df 1 n1 + n2 − 2 1 1 n1 + n2 − 2 MS M SA M S1 M SB M SAB M S2 F FA = M SA /M S1 FB = M SB /M S2 FAB = M SAB /M S2 The recommended analysis strategy is Step 1: Test for sequence eﬀects H0 : µ1. using FA .1. − y22. )/2. b2 ) Subject 1 .n1 +n2 −2 (α) using SAS Type III SS. n1 1 . • If FA is not signiﬁcant. .1 − µ. In this case the test H0 : µ111 = µ221 of the simple B eﬀects is a valid test of H0 : β1 = β2 ¯ ¯ since E(Y111 − Y221 ) = β1 − β2 even when H0 : τ1 = τ2 is not true. − Y. . An α level test rejects H0 if FB > F1.1. = µ2. and n1 are randomly assigned to sequence 1 while the remaining n2 are assigned to sequence 2. ) = y ¯ with the approximate degrees of freedom dfw = (n1 + n2 − 2)(M S1 + M S2 )2 (M S1 )2 + (M S2 )2 M S 1 + M S2 2 1 1 + n1 n2 M S2 2 1 1 + n1 n2 . .1 − µ. A 100(1 − α)% conﬁdence interval for µ. then go to Step 2A. ) ± tn1 +n2 −2 (α/2) y ¯ where y.1. ¯ y ¯ Step 2B: Test for B main eﬀects using µ11 − µ22 The standard error of the estimator of µ11 − µ22 is se(¯11. .6. . + y21. y11n1 y211 . = (¯11. • If FA is signiﬁcant. . . The data layout is Sequence : A a1 = (b1 . the B main eﬀect is a valid test if there is no sequence eﬀect.2 is (¯. ) is a biased estimate of β1 −β2 . .

9 2 8 .2 27 -1.7 -1.3 -. RUN.2 2 1 . ------------------------------------------------------ .7 16 1.4 1 6 1.6 0 2 4 -. 1 1 .4 0.8 -0.4 2 6 1.3 1.3 .7 15 0. INPUT SEQ PERSON A B.2 1. CLASS TRT SEQ PERSON. An α-level test of H0 : µ11 = µ22 rejects H0 if |¯11.2 22 -2.9 2.5 -6. DATA CROSS2.8 0. ) y ¯ y ¯ Example The following data are taken from Grizzle . SET CROSS.4 Mean -1.148 CHAPTER 6.0 -0.9 2 7 -.9 -2. PROC GLM.0 Total -9. MORE ON REPEATED MEASUREMENT DESIGNS 1.8 3. − y22. | y ¯ > tdfw (α/2) se(¯11.2 1 1 2 0 -. Y = A. MODEL Y = SEQ PERSON(SEQ) TRT TRT*SEQ.9 .5 1.3 -1.4 26 -2.2 2.7 -2.6 1.2 1.5333 Period 1 2 Total Treatment B A 21 1. LSMEANS TRT TRT*SEQ.6 1.9 0.2 5.2375 0. pp 467-480. RANDOM PERSON(SEQ) / TEST.2 12 0. QUIT.5 1.3 2 3 . Period 1 2 Total Treatment A B 11 0. TEST H=SEQ E=PERSON(SEQ). TRT = ’A’.3 0. TRT = ’B’.4375 The SAS code and partial output are given below OPTIONS LINESIZE=80 PAGESIZE=66.2 28 -2.9 3. CARDS. − y22.3000 .6 Subject 24 25 -0.3 23 0. DROP A B. ) ± tdfw (α/2)se(¯11.4 -0.3 -2.3 0.9 1.6 0.8 2 5 -1 -.2 1 4 . Y = B.Biometrics .8 . DATA CROSS.7 Subject 13 14 -0.0 -1.3 2 2 1 -2.9 -2. RUN.9 -0.7 -0. A 100(1 − α)% conﬁdence interval for µ11 − µ22 is (¯11.3 -1.1 1 5 .0 -1. OUTPUT. LSMEANS SEQ / E=PERSON(SEQ).1965.9 1.7 Total 1.0 0. − y22. − y22.1 -0. ) y ¯ 2. The responses are diﬀerences between pre-treatment and post-treatment hemoglobin levels.0 Mean .0 1.6 0. OUTPUT.7 1 3 -.1 -1. QUIT.

69 .1165 0.50 Pr > F 0. CROSSOVER DESIGNS 149 Dependent Variable: Y Source Model Error Corrected Total Source SEQ PERSON(SEQ) TRT TRT*SEQ DF 15 12 27 DF 1 12 1 1 Sum of Squares 27.1165). = .3688 ¯ and y.94416667 42.57333333 1.2435 Tests of Hypotheses for Mixed Model Analysis of Variance Dependent Variable: Y Source DF Type III SS Mean Square F Value SEQ 1 4. (1)2 + (1.23750000 Using SAS Type III analysis we get the following summary for α = . The diﬀerence is not signiﬁcant (p-value = . 6 8 2 12(1 + 1.6446 0.573333 4.10 (for illustrative purposes): α = .40000000 Pr > F 0.1.80 2.000 + 1.0538).00055556 3.43750000 0. = −.0538 Least Squares Means TRT A B TRT A A B B Y LSMEAN 0. the t-statistic is 1 1 1. We have ¯ ¯ se(¯11.05 : Sequence eﬀects are not signiﬁcant (p-value = .96583333 14.245)2 = 23.245)2 .006667 1.53333333 -1.35208333 SEQ 1 2 1 2 Y LSMEAN 0. there are no treatment ¯ eﬀects at α = .2375) = 2. − y22.2.0794 0.57 Error 12 12.6.56297619 6.36875000 -0.05 and α = .0449 Mean Square 1.24297619 Mean Square 4. = .0538). α = .24297619 F Value 3.01 Pr > F 0.245 + = 0.572 .86438889 1.56297619 6. Analyze treatment eﬀects at time period 1 using y11.000556 Error: MS(PERSON(SEQ)) * This test assumes one or more other fixed effects are zero.2375.3000 − (−1.7 ≈ 24 .573333 4. ) = y ¯ and dfw = Therefore.3. Analyze treatment eﬀects using y. = −1.24534722 F Value 1.41666667 -0.91000000 Type III SS 4.86 5. Therefore.10 : Sequence eﬀects are signiﬁcant (p-value = .3521.3000 and y22.05 level of signiﬁcance.572 .57333333 12.30000000 0. * Least Squares Means SEQ 1 2 Y LSMEAN 0.00666667 3.67 0.

0128.10.150 CHAPTER 6. there are treatment eﬀects at α = . MORE ON REPEATED MEASUREMENT DESIGNS which has a two-sided p-value of 0. . Thus.10 level of signiﬁcance. the diﬀerence is signiﬁcant at α = . Therefore.

. . . y1cn ··· C1 yb11 . The following table contains the expected MS for the two-way RM design with repeated measures on both factors. where Fixed Eﬀects : • βj and γk ate the Bj and Ck main eﬀects and • (βγ)jk is the Bj Ck interaction eﬀect. Treatments (B) Subjects 1 . . . . The following table gives the data layout of such designs.4 Two-Way Repeated Measurement Designs with Repeated Measures on Both Factors Consider a design where factors B and C are within subject factors with levels b and c. 2 • (γS)kl ∼ N (0. . d2 σ2 ) = Bj by subject l interaction.6. and l = 1. 2 • (βγS)jkl ∼ N (0. . n. . yb1n Bb ··· Cc ybc1 . Source Between Subjects Subjects (S) Within Subjects B B×S C C ×S BC BC × S df n−1 b−1 (b − 1)(n − 1) c−1 (c − 1)(n − 1) (b − 1)(c − 1) (b − 1)(c − 1)(n − 1) MS M SS M Sb M S1 M Sc M S2 M Sbc M S3 E(M S) 2 σ 2 + bcσ1 nc 2 2 βj σ 2 + cσ2 + b−1 2 σ 2 + cσ2 nb 2 2 2 σ + bσ3 + c−1 γk 2 2 σ + bσ3 n 2 σ 2 + σ4 + (b−1)(c−1) 2 2 σ + σ4 (βγ)jk The following ANOVA table gives the F tests to test for main eﬀects: . k = 1.4. d4 σ4 ) = Bj by Ck by subject l interaction. Random Eﬀects : 2 • Sl ∼ N (0. • ejkl ∼ N (0. . . y11n B1 ··· Cc y1c1 . Assume that a random sample of n subjects are observed on each of the bc crossed levels of B and C. . . b. c. ybcn The statistical model for such designs is yjkl = µ + βj + γk + (βγ)jk + Sl + (βS)jl + (γS)kl + (βγS)jkl + ejkl for j = 1. respectively. TWO-WAY REPEATED MEASUREMENT DESIGNS WITH REPEATED MEASURES ON BOTH FACTORS151 6. . n C1 y111 . . σ1 ) = subject l eﬀect 2 • (βS)jl ∼ N (0. We assume that the random eﬀects interactions satisfy some (CS) conditions while all other random variables are assumed to be independent. One should notice that the ejkl are confounded with (βγS)jkl and hence cannot be estimated. These are used in the construction of F tests. . σ 2 ). . . . . d3 σ3 ) = Ck by subject l interaction. . .

Fc . Vol. Ybc = / NOUNI. ¯ ¯ yjk. ¯ ¯ The degrees of freedoms for the simple main eﬀects are approximated using Satterthwaite approximation formulae: (df1 M S1 + df3 M S3 )2 df4 = df1 (M S1 )2 + df3 (M S3 )2 df5 = (df2 M S2 + df3 M S3 )2 df2 (M S2 )2 + df3 (M S3 )2 Three sphericity conditions corresponding to Fb .. The data are given as follows: Son T2 11 13 13 18 14 6 17 Person Father T1 T2 T3 18 19 22 18 19 22 19 18 22 23 23 26 15 15 19 15 16 19 17 17 21 Mother T2 T3 16 19 16 19 16 20 22 26 17 20 19 21 20 23 Family 1 2 3 4 5 6 7 T1 12 13 12 18 15 6 16 T3 14 17 16 21 16 10 18 T1 16 16 17 23 17 18 18 . and mother.. The generic SAS Proc GLM code for analyzing two-way repeated measurement designs with repeated measures on both subjects is PROC GLM. I. Standard Error 2M S1 cn 2M S2 bn 2 (df1 M S1 +df3 M S3 ) n (df1 +df3 2 (df2 M S2 +df3 M S3 ) n (df2 +df3 df df1 df2 df4 df5 y. then G-G e-adjusted F tests followed by paired t tests for mean comparisons need to be performed.. and Fbc need to be checked. µjk − µjk yjk. The following is part of an example taken from Milliken and Johnson. The Analysis of Messy Data. − yjk .k.152 Source Between Subjects Subjects (S) Within Subjects B B×S C C ×S BC BC × S CHAPTER 6. The attitiudes of families were measured every six months for three time periods. − yj ¯ ¯ . father. each family consisting of a son. − y. C c. MODEL Y11 Y12 . If the (S) condition is not satisﬁed. MORE ON REPEATED MEASUREMENT DESIGNS df n−1 b−1 (b − 1)(n − 1) = df1 c−1 (c − 1)(n − 1) = df2 (b − 1)(c − 1) (b − 1)(c − 1)(n − 1) = df3 MS M SS M Sb M S1 M Sc M Sbc M S3 Fb = M Sb /M S1 Fc = M Sc /M S2 Fbc = M Sbc /M S3 F Mean comparisons are made using the following standard errors: Parameter Main Eﬀects βj − βj γk − γk Simple Main Eﬀects µjk − µj k Estimate yj. − yj ¯ ¯ k. REPEATED B b.. The data were obtained for seven families.k .

B = ROUND(I/3).4. ARRAY Z Y1--Y9. DATA RM2. OUTPUT. RUN. TEST H=B*C E=B*C*S. ELSE DO. RUN.3) = 0) THEN DO. C = ROUND(I/B). LSMEANS C/PDIFF E=C*S. QUIT. 19 19 18 23 15 16 17 22 22 22 26 19 19 21 16 16 17 23 17 18 18 16 16 16 22 17 19 20 19 19 20 26 20 21 23 PROC GLM. INPUT Y1-Y9. MODEL Y = B|C|S. PROC GLM DATA = RM2. Some selected output from a run of the above program is: Sphericity Tests(B) . END. DO I=1 TO 9. LSMEANS B*C/PDIFF E=B*C*S. MODEL Y1--Y9 = / NOUNI. S = _N_. END. TEST H=C E=C*S. OUTPUT. TWO-WAY REPEATED MEASUREMENT DESIGNS WITH REPEATED MEASURES ON BOTH FACTORS153 The SAS analysis of the data is given as follows: DATA RM. SET RM. TEST H=B E=B*S. Y = Z[I].3). 12 11 14 18 13 13 17 18 12 13 16 19 18 18 21 23 15 14 16 15 6 6 10 15 16 17 18 17 . REPEATED B 3. END. QUIT. C 3/ PRINTE NOM. RUN. CARDS. IF (MOD(I. QUIT.6. B = FLOOR(I/3) + 1. CLASS S B C. DROP Y1-Y9. LSMEANS B/ PDIFF E=B*S. C = MOD(I.

3188 The GLM Procedure Repeated Measures Analysis of Variance Univariate Tests of Hypotheses for Within Subject Effects Adj Pr > F G .2804233 F Value 258.5262 Adj Pr > F G .82 Pr > F 0.1904762 12.F 0.0007 Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon 0.4285714 0.6961315 0.7187896 Pr > ChiSq 0.0001 <.0007 Source B Error(B) DF 2 12 Type III SS 350.403681 Pr > ChiSq 0.3511 Sphericity Tests(BC) Mauchly’s Criterion 0.159756 10.4780 0.3809524 146.0936229 Pr > ChiSq 0.4043 0.8571429 3.G H .0948413 Variables Transformed Variates Orthogonal Components DF 9 9 Chi-Square 13.9348 Source B*C Error(B*C) DF 4 24 Type III SS 1.2212 Source C Error(C) DF 2 12 Type III SS 144.1555 0.7451 0.6578854 Variables Transformed Variates Orthogonal Components DF 2 2 Chi-Square 0.0012 0.2089947 F Value 14.G H .8735 0.6981 Sphericity Tests(C) Mauchly’s Criterion 0.0508148 0.33333333 0.40740741 F Value 0.3650794 Mean Square 72.0001 Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon 0.8110836 0.28 Pr > F <.5237 . MORE ON REPEATED MEASUREMENT DESIGNS Mauchly’s Criterion 0.G H .2705497 2.8660974 Variables Transformed Variates Orthogonal Components DF 2 2 Chi-Square 1.F 0.F <.154 CHAPTER 6.8819 1.77777778 Mean Square 0.0001 Adj Pr > F G .947328 0.35 Pr > F 0.33333333 9.5079365 Mean Square 175.

0001 <.1428571 LSMEAN Number 1 2 3 4 5 6 7 8 9 .5714286 17.0952381 19.8571429 18.1428571 21.0007 0.3992 <.1428571 13.8627 3 0.6.0000000 LSMEAN Number 1 2 3 Least Squares Means for effect B Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 1 2 0.0000000 17.8571429 18.0005 0.0000000 21.0001 3 <.1428571 16. TWO-WAY REPEATED MEASUREMENT DESIGNS WITH REPEATED MEASURES ON BOTH FACTORS155 B 1 2 3 Y LSMEAN 14.2857143 16.3992 0.0007 0.0005 0.8627 C 1 2 3 Y LSMEAN 16.0001 <.0001 B 1 1 1 2 2 2 3 3 3 C 1 2 3 1 2 3 1 2 3 Y LSMEAN 13.1904762 19.5714286 LSMEAN Number 1 2 3 Least Squares Means for effect C Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 1 2 0.4.4285714 19.

0001 <.0001 0.0001 <.0001 <.0001 <.0001 4 <.2 -------------- 6.0001 <.0001 <.0000 <.0001 <.0001 <. MORE ON REPEATED MEASUREMENT DESIGNS Least Squares Means for effect B*C Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 4 5 6 7 8 9 1 2 1. The BC(Person by Time) interaction is not signiﬁcant (P = 0.0001 8 <. 2.0001 0.0001 <.0001 <.0001 <.0001 <.0001 9 <.0001) 7.3 16.0 19.4106 <.0001 <. The (S) condition is satisﬁed for all three tests B.0001 <.0001 <. Comparison of B(Person) means B1 14.6791 <.2212 <.2212 7 <. and BC. 4.0001 <.0001 <.0001 3 <.0001 5 <.0001 <. The B(Person) main eﬀect is signiﬁcant (P = .0001 The following is a summary of the results: 1.0001 <.0001 <.6791 0.0001 6 <.0001 <.0001 <.4106 <.0001 <.6791 <.0001 <.4106 0.0001 <.5262). 3.6 .0001 <.0001 <. C.0001 <.0001 <.0001 <.0001 1.6791 <.0001 0.4106 <.0001 0.0001 <.6791 <.5262).0001 <.0000 0.0001 <.6791 <.0001 <.0001 0. The BC interaction is not signiﬁcant (P = 0.156 CHAPTER 6.0001 0.0007) 5.0001 <.0001 0.0001 <.4 -------------C3 19. The C(Time) main eﬀect is signiﬁcant (P < .0001 0.0001 <.0000 0.0001 <.0000 1.1 B3 B2 19.0001 1. Comparison of C(Time) means C1 C2 16.0001 <.

ˆ In other words. y1 ).1 Estimation : The Method of Least Squares n n ˆ ˆ The least squares (LS) estimators β0 and β1 of β0 and β1 . where β0 is the intercept and β1 is the slope. We will assume that the random errors.Chapter 7 Introduction to the Analysis of Covariance Analysis of covariance (ANCOVA) methods combine regression and ANOVA techniques to investigate the relationship of a response variable with a set of ’treatments’ as well as other additional ’background’ variables. 7. the method of least squares gives the least possible sum of squared residuals. 2. y2 ). In this chapter we will focus our attention to situations where y depends on x in a linear fashion. · · · . are the values of β0 and β1 that minimize L(β0 . Our goal is to estimate and make inferences about the unknown regression coeﬃcients. In a data setting.1 Simple Linear Regression Let y be a measured response variable that is believed to depend on a predictor x up to a random error. ˆ Using diﬀerential calculus. (x2 . Thus the observation yi is estimated by the ﬁtted value ˆ ˆ yi = β0 + β1 xi . 2. · · · . β1 ) = i=1 ε2 = i i=1 (yi − β0 − β1 xi )2 . (xn . yi = β0 + β1 xi + εi . n . . εi . suppose we have n experimental units giving rise to observations (x1 . yi − yi . y = f (x) + ε. we get the LS estimators as 157 . i = 1. that is. · · · .1. yn ). are independent random variables with mean 0 and constant variance σ 2 . Then our model becomes yi = f (xi ) + εi . 7. that is. f (x) = β0 + β1 x . respectively. In a data setting. β0 and β1 . n . where β0 and β1 are unknown. i = 1.

the total sum of squares (yi − y )2 partitions into smaller variabilities: the ¯ variability in the response explained by the regression model and the unexplained variability. another way to pose the question will be ”Is there a signiﬁcant linear relationship between x and y?” In terms of statistical hypotheses.3 Tests of Hypotheses One very important question is ”Does x truly inﬂuence y?”.1. and β0 = y − β1 x . one may set up an ANOVA table as follows: Source Regression Residual Total df 1 n−2 n−1 SS SSR SSE SST MS M SR = SSR /1 M SE = SSE /(n − 2) F F = M SR /M SE We reject the null if the F -test is signiﬁcant at the desired level of signiﬁcance. ¯ 7. imply that the model is ”good”.2 Partitioning the Total SS Similar to ANOVA models. however. X 16 15 20 13 15 Y 32 26 40 27 30 . As before the degrees of freedom for SST partitions into regression df and error df as n − 1 = 1 + (n − 2) . Example Let X be the length (cm) of a laboratory mouse and let Y be its weight (gm). we are interested in testing H0 : β1 = 0 HA : β1 = 0 The rejection of H0 indicates that there is a signiﬁcant linear relationship between x and y. ¯ ¯ Sxx where Sxy = i=1 n n (xi − x)(yi − y ) and Sxx = ¯ ¯ i=1 (xi − x)2 . Consider the data for X and Y given below. 7.e.1.158 CHAPTER 7. Using the partition of the total variability. SST = SSR + SSE . Since this is under an assumption of a linear relationship. Here SSR represents the sum of squares due to regression while SST and SSE are the familiar SS’s due to total and error. INTRODUCTION TO THE ANALYSIS OF COVARIANCE Sxy ˆ ˆ β1 = .. It does not. This is done as n n n (yi − y )2 = ¯ i=1 i=1 (ˆi − y )2 + y ¯ i=1 (yi − y )2 ˆ i. respectively.

RUN.1. RUN. QUIT. PROC GPLOT DATA=REGOUT. PROC GLM. QUIT. SYMBOL V=CIRCLE I=R. DATA SLR. TITLE1 ’SCATTER PLOT OF Y VS X OVERLAYED WITH REGRESSION FIT’. PROC GPLOT. CARDS. Selected output from SAS is given below: . MODEL Y = X. PLOT Y*X. TITLE1 ’RESIDUAL PLOT’. TITLE1 ’REGRESSION OF Y ON X’. RUN. 32 26 40 27 30 38 34 43 64 45 46 39 SYMBOL V=CIRCLE I=NONE. TITLE1 ’SCATTER PLOT OF Y VS X’. INPUT X Y. PLOT Y*X. PLOT R*P. QUIT. QUIT. OUTPUT OUT=REGOUT R=R P=P. 16 15 20 13 15 17 16 21 22 23 24 18 .7. PROC GPLOT DATA=SLR. SIMPLE LINEAR REGRESSION 17 16 21 22 23 24 18 38 34 43 64 45 46 39 159 The following SAS code is used to perform the simple linear regression. RUN.

7638231 F Value 21.0009 Parameter Intercept X Estimate -5.70503506 0.7638231 Mean Square 813.0009 R-Square 0.36 Pr > F 0.090284 F Value 21.7638231 Mean Square 813.56 0.171733 Y Mean 38.7638231 F Value 21.428909953 2.681164 Coeff Var 15.66667 Source X DF 1 Type I SS 813.96138 Root MSE 6.666667 Sum of Mean Square 813.0009 .763823 38.405213270 Error 9.160 Dependent Variable: Y CHAPTER 7.902844 1194.62 0.52036911 Standard t Value Pr > |t| -0.36 Pr > F 0.763823 380.36 Pr > F 0.5882 4.0009 Source X DF 1 Type III SS 813. INTRODUCTION TO THE ANALYSIS OF COVARIANCE Source Model Error Corrected Total DF 1 10 11 Squares 813.

1. . From the scatter plot and the ﬁtted line. one observes that the relationship between length and weight is an increasing relationship.0009).7.41x The length of a mouse is signiﬁcantly linearly related to the weight of a mouse (P = 0. SIMPLE LINEAR REGRESSION 161 The ﬁtted regression line is y = −5.43 + 2.

then these lines would be parallel to each other. This says that if we were to ﬁt regression lines for each one of the k groups independently. Thus. β. where β An essential assumption in the ANCOVA is that there is a common slope parameter.e. In SAS PROC GLM. µi is unbiasedly estimated by µi = yi. − X. is measured along with the response variable. i. . A covariate is a variable that is thought to have an eﬀect on the response. The ANCOVA model which includes the covariate in a linear model is ¯ Yij = µ + τi + β(Xij − X. Test whether the covariate X is important: (a) Assuming heterogeneous slopes. − X. MODEL Y = A A*X /NOINT. . suppose that for each experimental a covariate. Let A denote the factor of interest. ) + ij ij where µi = µ + τi is the mean of treatment i.. . . µi. CLASS A. Y2n2 X2n2 ··· k Y Yk1 Yk2 Yknk X Xk1 Xk2 Xknk . Yij . ). However. of µi in the ANCOVA model. H0 : β1 = β2 = · · · = βk . The following is the analysis strategy to follow: 1. Xij . H0 : β = 0. β. CLASS A. The data layout for a single factor ANCOVA model is 1 Y Y11 Y12 Y1n1 X X11 X12 X1n1 Treatment 2 ··· Y X ··· Y21 X21 · · · X22 · · · Y22 . . in the ANCOVA model ˆ ¯ ˆ ¯ ¯ ¯ ¯ yi. − β(Xi.adj = yi... . If this hypothesis is true. we test H0 : β = 0 . then we need a follow up test of the importance of the covariate variable as a predictor. In SAS PROC GLM. . In the one-way ANOVA model. MODEL Y = A X. The homogeneity of the group slope parameters is an assumption that needs to be tested.. . . (b) Assuming homogeneous slopes. . In the one-way ANOVA model. we test H0 : β1 = β2 = · · · = βk = 0 . INTRODUCTION TO THE ANALYSIS OF COVARIANCE 7.162 CHAPTER 7. i.e. This assumption is known as the parallelism assumption. .2 Single Factor Designs with One Covariate Analysis of Covariance (ANCOVA) unites analysis of variance (ANOVA) and regression. ) is the adjusted estimator ¯ ˆ ¯ ˆ is the least squares estimator of the common slope parameter. The P -value corresponding to A ∗ X is the P -value for the test of interest. ) + This model may be written as ¯ Yij = µi + β(Xij − X. is an unbiased estimator of µi + β(Xi.. .

2. Using SAS PROC GLM. DATA DRUGTEST. The following example is adapted from Snedecor and Cochran (1967) (See also SAS Online Documentation). Test whether there is a common slope: We test H0 : β1 = β2 = · · · = βk using PROC GLM. then the covariate X is not needed. If either one of the tests is signiﬁcant.two antibiotics (A and D) and a control (F) • PreTreatment . CLASS A. CLASS A. This will be addressed in a later section. The P -value of interest corresponds to the A ∗ X term. ) + ij and tests H0 : τ1 = τ2 = · · · = τk = 0. the we use the ANCOVA model. Go to Step 2. Follow-up pairwise comparisons (H0 : τi = τj ) may be performed by including LSMEANS A/PDIFF. Example Ten patients are selected for each treatment (Drug).a pre-treatment score of leprosy bacilli • PostTreatment . then we perform the ANCOVA analysis. A 11 6 A 8 0 A 6 4 A 10 13 D 6 0 D 6 2 POST @@. (a) If the test is signiﬁcant. in the preceding SAS code. The following is the SAS code used to analyze the data. ﬁts the classic ANCOVA model ¯ Yij = µ + τi + β(Xij − X.7. MODEL Y = A X.. and six sites on each patient are measured for leprosy bacilli.a post-treatment score of leprosy bacilli The covariate (a pretreatment score) is included in the model for increased precision in determining the eﬀect of drug treatments on the posttreatment count of bacilli. It is given along with a partial output. (b) If the test is not signiﬁcant. The variables in the study are • Drug .2. input DRUG $ PRE CARDS. MODEL Y = A X A*X. A A D 5 6 7 2 1 3 A 14 A 11 D 8 8 8 1 A 19 11 A 3 0 D 18 18 . then we follow a Johnson-Neyman analysis. SINGLE FACTOR DESIGNS WITH ONE COVARIATE 163 If both are non-signiﬁcant.

D 19 14 F 13 10 F 12 5 CHAPTER 7.564915 F Value 8. MODEL POST = DRUG DRUG*PRE / NOINT. MODEL POST = DRUG PRE.900000 597.0001 .01 Pr > F 0. QUIT.0001 Dependent Variable: POST Source DRUG PRE PRE*DRUG Error Total DF 2 1 2 24 29 Sum of Squares 293.6446451 397.86 34. CLASS DRUG.966667 199. MODEL POST = DRUG PRE DRUG*PRE. QUIT. CLASS DRUG.6000000 577. RUN.542048 397.58 12.000000 Mean Square 721.8974030 Mean Square 146.180683 16.8974030 F Value 9.557952 3161. MEANS DRUG. RUN.0001 0. INTRODUCTION TO THE ANALYSIS OF COVARIANCE D 8 9 F 11 18 F 12 16 D F F 5 9 7 1 5 1 D 15 9 F 21 23 F 12 20 PROC GLM.0001 <.564915 F Value 43.557952 1288. RUN. PROC GLM.0013 <. --------------------------------------------------------------------------Dependent Variable: POST Source DRUG PRE*DRUG Error Total DF 3 3 24 30 Sum of Squares 2165.8000000 577.8974030 9.8974030 19. QUIT. CLASS DRUG. PROC GLM.02 Pr > F <.6000000 577.8223226 16.8000000 577.5606 Dependent Variable: POST Source DRUG PRE DF 2 1 Sum of Squares 293.164 D 8 4 F 16 13 F 16 12 .700000 Mean Square 146.89 0.15 36.0010 <.59 Pr > F 0. LSMEANS DRUG/ STDERR PDIFF TDIFF.

76211904 5.3000000 6. P = 0.0001).3000000 10.046254 165 Level of DRUG A D F N 10 10 10 -------------POST-----------Mean Std Dev 5.3159234 4.9521 3 -1.0001 <. None of the pairwise diﬀerences is signiﬁcant at α = . P < 0.0835 An observation is that the results of MEANS and LSMEANS are quite diﬀerent. H0 : τ1 = τ2 = τ3 = 0. SINGLE FACTOR DESIGNS WITH ONE COVARIATE Error Total 26 29 417. .64399254 6.202597 1288. Is the covariate important? (a) The hypothesis H0 : β1 = β2 = β3 = 0 is rejected (F = 12. Thus.1611017 Pr > |t| <. P = 0.9521 1.7149635 6.95671019 DRUG A D F POST LSMEAN 6.0835 1. (b) The hypothesis H0 : β = 0 is rejected (F = 36. is signiﬁcant (F = 9.82646 0.0000000 12.14998057 -------------PRE------------Mean Std Dev 9. The test for treatment eﬀects.5606).24933858 3.0607 0.1000000 12.01.826465 0. Thus.15449249 7.2884943 1.05. the covariate is important and needs to be included in the model.2724690 1.060704 0.0010).0001 <.3000000 Standard Error 1.8239348 10.15.59.80011 0. LSMEANS gives the adjusted means while MEANS gives the raw means.7. Here is a summary of the results: 1. P < 0.0793 -1.0001 Least Squares Means for Effect DRUG t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: POST i/j 1 2 3 0. Do we have a common slope? We fail to reject the hypothesis H0 : β1 = β2 = β3 (F = 0.2.800112 0.02.9000000 LSMEAN Number 1 2 3 4. the assumption of a common slope is a valid assumption.0793 1 2 -0. 2.0001).700000 16.

CLASS A. Let A denote the factor of interest and B be the blocking factor. we test H0 : β1 = β2 = · · · = βk = 0 . The analysis of such a design proceeds in the same manner as the single factor design considered above. The P -value corresponding to A ∗ X is the P -value for the test of interest. then the covariate X is not needed. · · · . ) + ij where i = 1. Test whether there is a common slope: We test H0 : β1 = β2 = · · · = βk using PROC GLM. MODEL Y = A B X. (b) If the test is not signiﬁcant. (b) Assuming homogeneous slopes. CLASS A. and γj is the eﬀect of the jth block of n blocks. n. The following is the analysis strategy to follow: 1. 2. (a) If the test is signiﬁcant. j = 1. the we use the ANCOVA model. . If either one of the tests is signiﬁcant. The statistical model for the RCBD ANCOVA is ¯ Yij = µ + τi + γj + β(Xij − X..166 CHAPTER 7. MODEL Y = A B X A*X. MODEL Y = A B A*X /NOINT. · · · . INTRODUCTION TO THE ANALYSIS OF COVARIANCE 7. MODEL Y = A B X. then we perform the ANCOVA analysis. In SAS PROC GLM. The P -value of interest corresponds to the A ∗ X term. k. CLASS A. Go to Step 2. CLASS A. In SAS PROC GLM.3 ANCOVA in Randomized Complete Block Designs We will consider the case where a single covariate is observed along with the response in a RCBD. If both are non-signiﬁcant. then we follow a Johnson-Neyman type analysis. we test H0 : β = 0 . Test whether the covariate X is important: (a) Assuming heterogeneous slopes. Using SAS PROC GLM.

) + ij 167 and tests H0 : τ1 = τ2 = · · · = τk = 0.3. $ B Y X.7. ANCOVA IN RANDOMIZED COMPLETE BLOCK DESIGNS ﬁts the RCBD ANCOVA model ¯ Yij = µ + τi + γj + β(Xij − X.. 64 68 54 62 65 69 60 66 72 70 57 61 54 62 51 53 51 64 47 50 57 60 46 41 . Follow-up pairwise comparisons (H0 : τi = τj ) may be performed by including LSMEANS A/PDIFF. INPUT A CARDS. Example Yields for 3 varieties of a certain crop in a randomized complete block design with 4 blocks are considered. The variables of interest are • X = yield of a plot in a previous year • Y = yield on the same plot for the experimental year The data are as follows: Block 1 X Y 2 X Y X Y X Y Varieties A B C 54 51 57 64 65 72 62 68 51 54 53 62 64 69 47 60 50 66 60 70 46 57 41 61 3 4 The SAS analysis of the data is as follows: DATA CROP. The following example is taken from Wishart (1949). in the preceding SAS code. A 1 A 2 A 3 A 4 B 1 B 2 B 3 B 4 C 1 C 2 C 3 C 4 .

RUN.0000000 24.6790698 F Value 2.00000 8503.0000000 84.95 5. CLASS A B.57 0.6046512 4.26 Pr > F 0.00000 Dependent Variable: Y Source A B X X*A Error Total DF 2 3 1 2 3 11 Sum of Squares 24.6046512 17.92772 Total 12 49476.3953488 324.0000000 Mean Square 12. ----------------------------------------------------------------------Dependent Variable: Y Sum of Source DF Squares Mean Square F Value Pr > F A 3 49176. CLASS A B.0057 X*A 3 42.0000000 84.00000 16392.9277178 F Value 6. QUIT.6046512 23.57 Pr > F 0.0001 B 3 252.1229 Dependent Variable: Y Source A B X Error Total DF 2 3 1 5 11 Sum of Squares 24.0856 0.168 CHAPTER 7.1712 0. PROC GLM.0000000 24. RUN.56 17.30 0. QUIT. CLASS A B.6046512 8.6121953 5.32 <. MODEL Y = A B X A*X.0000000 24. MODEL Y = A B X.0704 .00000 43. MODEL Y = A B A*X/ NOINT.0000000 Mean Square 12.78315 1.8060976 1.21685 14. LSMEANS A/ STDERR PDIFF TDIFF. RUN.0000000 252. INTRODUCTION TO THE ANALYSIS OF COVARIANCE PROC GLM. PROC GLM. QUIT.22 43.76 4.7831535 324.0000000 252.57 12.07228 7.0042 0.0375 0.0684 Error 3 5.0000000 24.00000 84.0057 0.

0684 for heterogeneous slopes and P = 0.0516 3 -2. Hence. with P -values close to 0.868582 0.9302326 B C 65.0001 <.86858 0. the assumption of a common slope is justiﬁed.0697674 -------------------------- .668975 0.545014 0. The results of the pairwise testing are summarized below using underlining. Group A 60.54501 0. we fail to detect any diﬀerence in yield among the 3 diﬀerent crop varieties after adjusting for yield diﬀerences of the previous year.1778789 LSMEAN Number 1 2 3 169 A A B C Y LSMEAN 60.05.5332 The covariate X does not appear to be very important (P = 0.7.9302326 65. The test for treatment eﬀect is not signiﬁcant.0001 <. However.0704 for a single slope). we will keep it in the analysis for the sake of illustration.3.0001 Least Squares Means for Effect A t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: Y i/j 1 2 3 2.0516 2. Thus.0697674 Pr > |t| <.0351 -0.1778789 1. The test for the equality of slopes is not rejected (P = 0.0351 1 2 -2.0000000 66.66898 0. this is the conservative route to follow.0815579 1. Besides. ANCOVA IN RANDOMIZED COMPLETE BLOCK DESIGNS Least Squares Means Standard Error 1.1229).0000000 66.5332 0.

. and ijk ∼iid N (0. Kutner. . and µij are estimated by the adjusted means ˆx µi.1) 1 (1.. ) + ijk . Applied Linear Statistical Models.. σ 2 ). .. b-1) ab . . We may use the one-way ANCOVA methods to perform the test by rewriting the model as ysk = µs + βs (xsk − x.. ) ˆ ¯ ¯ The major assumption. . n under the same assumptions as above. . and Wasserman. . n where a b a b τi = i=1 j=1 γj = i=1 (τ γ)ij = j=1 (τ γ)ij = 0 . . . . .. . . . a yijk = µ + τi + γj + (τ γ)ij + βij (xijk − x. Nachtsheim. . . . . . .j. . . . − x. . ) ˆ ¯ ¯ ˆx µ. . . We shall consider the case where a covariate xijk is observed along with the response for each experimental unit. a yijk = µ + τi + γj + (τ γ)ij + ijk . . b k = 1.. If the slopes are not heterogeneous.. − β(¯. − β(¯i. . . . The following example is taken from Neter. test main and simple eﬀects using the adjusted means.4 ANCOVA in Two-Factor Designs We will start by recalling the balanced two-factor ﬁxed eﬀects analysis of variance model i = 1. ab = 1. ¯ j = 1. . − x. Test for the homogeneity of the slopes. once again. . ) ˆ ¯ ¯ ˆx µij = yij. 2.. If that assumption is not satisﬁed. INTRODUCTION TO THE ANALYSIS OF COVARIANCE 7. then our model may be written as i = 1. the hypothesis H0 : β11 = ..j = y.j. − β(¯ij. .j) s (1. . use a Johnson-Neyman analysis. Thus.. . µ. b k = 1. s k (a. The corresponding two-factor ﬁxed eﬀects ANCOVA model is i = 1. .j . .. the means µi. ... j) is according to the formula s = b(i − 1) + j .1 = 1. ¯ j = 1. is the homogeneity of the slopes. . = yi. As in the one-way model.b) ab which is obtained by ”rolling-out” the cells into one big vector as (i. − x. . . . .2) 2 ··· ··· The correspondence between s and (i. If the slopes are heterogeneous. . . = βab = β is of interest. ) + ijk . . b k = 1.. j = 1.170 CHAPTER 7. a yijk = µ + τi + γj + (τ γ)ij + β(xijk − x. The analysis of two-way ANCOVA is performed as follows: 1. n (a. ) + ¯ sk . . . n under the same assumptions.

ANCOVA IN TWO-FACTOR DESIGNS Example 171 A horticulturist conducted an experiment to study the eﬀects of ﬂower variety (factor A) and moisture level (factor B) on yield of salable ﬂowers (Y ).7. A partial output is given following the code: DATA FLOWERS. C = 2*(A-1)+B. QUIT. . TITLE1 ’TWO-WAY ANCOVA’. TITLE1 ’HOMOGENEITY OF SLOPES’. The data are presented below: Factor B B1 (low) B2 (high) Yi1k Xi1k Yi2k Xi2k 98 15 71 10 60 4 80 12 77 7 86 14 80 9 82 13 95 14 46 2 64 5 55 3 55 60 75 65 87 78 4 5 8 7 13 11 76 68 43 47 62 70 11 10 2 3 7 9 Factor A A1 (variety LP) A2 (variety WB) The following SAS code is used to analyze the above example. CARDS. CLASS C. 71 80 86 82 46 55 76 68 43 47 62 70 10 12 14 13 2 3 11 10 2 3 7 9 PROC GLM. INPUT A B Y X @@. the horticulturist wished to use plot size (X) as the concomitant variable. MODEL Y = C X C*X. Six replications were made for each treatment. RUN.4. PROC GLM. Because the plots were not of the same size. 1 1 98 15 1 2 1 1 60 4 1 2 1 1 77 7 1 2 1 1 80 9 1 2 1 1 95 14 1 2 1 1 64 5 1 2 2 1 55 4 2 2 2 1 60 5 2 2 2 1 75 8 2 2 2 1 65 7 2 2 2 1 87 13 2 2 2 1 78 11 2 2 .

0283916 1.87 0.0365780 Pr > |t| <.796738 F Value 4.21 Pr > F 0.481183 5086.7304444 0.7246356 0.29 544.042244 3994.733375 108.0283916 1.288483 F Value 15.000000 Source C X X*C Error Total DF 3 1 3 16 23 Mean Square 29.9576613 B 1 2 Y LSMEAN 73.0001 <.0001 LSMEAN Number 1 2 3 4 Least Squares Means for Effect A*B .0001 H0:LSMEAN=0 Pr > |t| <. INTRODUCTION TO THE ANALYSIS OF COVARIANCE CLASS A B.55 635.0001 A 1 1 2 2 B 1 2 1 2 Y LSMEAN 76.6807796 66.601826 323.172 CHAPTER 7.0001 Least Squares Means Standard Error 0.747808 5086.849473 16.518817 6.3192204 H0:LSMean1=LSMean2 t Value Pr > |t| 7. ----------------------------------------------------------------------------HOMOGENEITY OF SLOPES Dependent Variable: Y Sum of Squares 87.5423387 70.0001 H0:LSMean1=LSMean2 t Value Pr > |t| 3.0961022 Standard Error 1.0009 <.0212 <.8192204 65.0242739 1.5423387 67.399785 3703.0001 <.18 <.309097 3.518817 119. QUIT.0001 0.849473 16. RUN.309097 10.1267 <.92 0.0009 A 1 2 Y LSMEAN 72. MODEL Y = A B A*B X.7246356 H0:LSMEAN=0 Pr > |t| <.0423387 67.6704 TWO-WAY ANCOVA Dependent Variable: Y Sum of Squares 96.0001 0.0001 <.577792 6.0001 <.50 2.042244 3994.601826 323.36 51.53 Pr > F 0.0001 <. LSMEANS A B A*B / STDERR PDIFF TDIFF.133262 3703.7304444 Standard Error 0.000000 Source A B A*B X Error Total DF 1 1 1 1 19 23 Mean Square 96.

Thus the inclusion of the covariate is justiﬁed.9371 0. The ANCOVA model with a common slope is valid. ANCOVA IN TWO-FACTOR DESIGNS t for H0: LSMean(i)=LSMean(j) / Pr > |t| Dependent Variable: Y i/j 1 2 3 4 -6.0001). (d) The hypothesis H0 : β = 0 is rejected (P < 0. 2.0009).1127 3.781373 <. Comparisons of means results: (a) The AB interaction eﬀect fails to be signiﬁcant (P = 0.254261 0. We may compare A and B means at the main eﬀect level.937098 0.937098 0.9371 0.25426 0.663 0.0362 4 7. . (b) The A main eﬀect is signiﬁcant (P = 0.0009 -2.4.216274 <.0009 -7.0362 -1. (c) The B main eﬀect is signiﬁcant (P < 0.6704).0009 173 2.1127 -3.662999 0.78137 <.7.1267).0001).0001 -3.21627 <.0001 1 2 6.0001 3 3. The parallelism hypothesis H0 : β11 = β12 = β21 = β22 = β is not rejected (P = 0.0009 Thus 1.0001 1.

174 CHAPTER 7. Example Suppose that the data in the following table are based on an experiment in which the treatments consist of two methods of therapy. The following example is taken from Huitema (1980) : The Analysis of Covariance and Alternatives. Heterogeneous slopes present a problem in ANCOVA in that it is impossible to claim signiﬁcance or non-signiﬁcance of the treatment eﬀect throughout the range of the covariate under consideration. The dependent variable is the aggressiveness score on a behavioral checklist. This happens when the treatment eﬀect is dependent upon the value of the covariate.5 7. Thus we wish to test if there is a signiﬁcant treatment eﬀect for a chosen value of the covariate X. there may be a signiﬁcant diﬀerence between the lines for values of X around 2. INTRODUCTION TO THE ANALYSIS OF COVARIANCE 7. however. Scores on a sociability scale are employed as a covariate. The Johnson-Neyman procedure generalizes this process by identifying the values of the covariate for which there is a signiﬁcant diﬀerence between the levels of the treatment. One Covariate We shall now consider ANCOVA designs where the assumption of equal slopes is not satisﬁed. The diﬀerence between the two lines is clearly not signiﬁcant for values of X near 6. In other words. X.5. The following plot gives a case where the heterogeneity of the slopes is clear. there is a signiﬁcant interaction between the levels of the treatment and the covariate variable.5. .1 The Johnson-Neyman Technique: Heterogeneous Slopes Two Groups.

we get the following two regression lines: The slopes are heterogeneous.5 2.5 16 175 X 1 2 2 3 4 5 5 6 6 7 8 8 9 10 11 Using SAS.5 4. The following SAS code may be used to ﬁt the heterogeneous slope ANCOVA and test for signiﬁcance of the diﬀerence between the two methods of therapy on a case by case basis.5.5 4.5 11 12.5 3. 1 1 10 1 2 10 1 2 11 1 3 10 1 4 11 1 5 11 1 5 10 1 6 11 .5 8 9 10 1 Y 5 6 6 7 8 9 9 9 10.5 12 12 11 11 12.5 12.7. THE JOHNSON-NEYMAN TECHNIQUE: HETEROGENEOUS SLOPES Therapy 1 Y 10 10 11 10 11 11 10 11 11.5 14 14. input grp x y. cards.5 12 Therapy X 1 1. A partial output is given following the code.5 5 6 6 7 7 7. data jn.

5963693 59. symbol1 c=black v=dot l=1 i=r. CLASS GRP. 6 7 8 8 9 10 11 1 1.5 11 12.9199841 0.5 12 5 6 6 7 8 9 9 9 10.7 PDIFF TDIFF.5 3.5963693 141. -------------------------------------------------------------------------Sum of Source DF Squares Mean Square F Value Pr > F grp 1 64.5 12.37 <.5 4.7551731 117.0001 x*grp 1 59. X=6. proc gplot.79 <. PROC GLM DATA=JN. symbol2 c=black v=square l=1 i=r.0001 Error 26 10.5 12 12 11 11 12.90 <. run. MODEL Y = GRP X LSMEANS GRP/ AT LSMEANS GRP/ AT LSMEANS GRP/ AT LSMEANS GRP/ AT RUN.1707155 152. GRP*X. plot y*x=grp.5 5 6 6 7 7 7.7551731 280.5 2.5 8 9 10 11. quit. QUIT.1707155 64. X=8 PDIFF TDIFF.5 14 14.4199994 .176 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 .5 PDIFF TDIFF. INTRODUCTION TO THE ANALYSIS OF COVARIANCE goptions reset=symbol. X=6.5 16 CHAPTER 7.5 4.0001 x 1 117. X=5 PDIFF.

7.5. THE JOHNSON-NEYMAN TECHNIQUE: HETEROGENEOUS SLOPES Total 29 176.9666667 Least Squares Means at x=5 H0:LSMean1= LSMean2 grp y LSMEAN Pr > |t| 1 10.8997955 <.0001 2 9.3402754 Least Squares Means at x=6.5 H0:LSMean1=LSMean2 y LSMEAN t Value Pr > |t| 11.2126789 0.07 0.9461 11.1957508 Least Squares Means at x=6.7 H0:LSMean1=LSMean2 y LSMEAN t Value Pr > |t| 11.2543967 -0.74 0.4636 11.4431475 Least Squares Means at x=8 H0:LSMean1=LSMean2 y LSMEAN t Value Pr > |t| 11.5255624 -4.89 <.0001 13.0512261

177

grp 1 2

grp 1 2

grp 1 2

Thus, there is a signiﬁcant diﬀerence between the two methods of therapy for an individual with a sociability scale of 5 (P < 0.0001), while we fail to ﬁnd a signiﬁcant treatment eﬀect for an individual with a sociability scale of 6.5 (P = 0.9461). ˆ ˆ ˆ ˆ Let the regression lines ﬁtted individually for groups 1 and 2 be Y1 = α1 + β1 X1 and Y2 = α2 + β2 X2 , ˆ ˆ ¯ ¯ respectively. Let X1 and X2 be the sample means of the covariate associated with the two groups while

n1

SXX1 =

i=1

¯ (X1i − X1 )2 , and SXX2 =

n2 i=1

¯ (X2i − X2 )2

where n1 and n2 are the respective sample sizes. We can identify the lower and upper limits, XL and XU , of the region of non-signiﬁcance on X using the following formulæ: √ −B − B 2 − AC XL = √A −B + B 2 − AC XU = A where ˆ ˆ + (β1 − β2 )2 SXX1 SXX2 ¯ ¯ X1 X2 ˆ B=d + + (ˆ 1 − α2 )(β1 − β2 ) α ˆ ˆ SXX1 SXX2 ¯2 ¯2 1 X1 X2 1 + + + + (α1 − α2 )2 ˆ ˆ C = −d n1 n2 SXX1 SXX2 A = −d + 1 1

178 Here

CHAPTER 7. INTRODUCTION TO THE ANALYSIS OF COVARIANCE

d = (tn1 +n2 −4 (α/2))2 s2 p where s2 = p (n1 − 2)s2 + (n2 − 2)s2 1 2 n1 + n2 − 4

is the pooled variance. The values of s2 and s2 represent the error mean of squares when the two individual 1 2 regression lines are ﬁt. Example Consider the preceding example. The following SAS code can be used to compute the quantities in the formula. PROC SORT; BY GRP; RUN; QUIT; PROC MEANS MEAN CSS N; VAR X; BY GRP; RUN; QUIT; PROC GLM DATA=JN; MODEL Y = X; BY GRP; RUN; QUIT; /* Get t value squared from the T-table*/ DATA T; TVAL = TINV(.975, 26); TVAL = TVAL*TVAL; RUN; QUIT; PROC PRINT DATA=T; RUN; QUIT; --------------------------------------------------------------------------------------------------------------------------- grp=1 --------------------------Analysis Variable : x Mean 5.8000000 Corrected SS 130.4000000 N 15

--------------------------------------------- grp=2 --------------------------Analysis Variable : x

7.5. THE JOHNSON-NEYMAN TECHNIQUE: HETEROGENEOUS SLOPES

179

Mean 5.5333333

Corrected SS 99.2333333

N 15

--------------------------------------------- grp=1 --------------------------Sum of Source DF Squares Mean Square F Value Pr > F x 1 5.67361963 5.67361963 19.62 0.0007 Error 13 3.75971370 0.28920875 Total 14 9.43333333 Standard Error 0.30641368 0.04709414

Parameter Intercept x

Estimate 9.856850716 0.208588957

t Value 32.17 4.43

Pr > |t| <.0001 0.0007

--------------------------------------------- grp=2 --------------------------------------------Sum of Source DF Squares Mean Square F Value Pr > F x 1 151.8397296 151.8397296 275.68 <.0001 Error 13 7.1602704 0.5507900 Total 14 159.0000000 Standard Error 0.45460081 0.07450137

Parameter Intercept x Obs 1 TVAL 4.22520

Estimate 3.155357743 1.236983540

t Value 6.94 16.60

Pr > |t| <.0001 <.0001

We may now compute XL and XU . A summary the quantities needed is

Group 1 n1 = 15 ¯ X1 = 5.8 SXX1 = 130.4 α1 = 9.857 ˆ ˆ β1 = 0.209 s2 = 0.289 1

t26 (.025) = 4.225

Group 2 n2 = 15 ¯ X2 = 5.533 SXX2 = 99.23 α2 = 3.155 ˆ ˆ β2 = 1.237 s2 = 0.551 2

2 Multiple Groups.4 99. one uses the formulægiven above. XL (r.8 5.675 15 15 130. One Covariate Two procedures extending the Johnson-Neyman strategy to incorporate several groups were proposed by Potthoﬀ(1964).23 A = −(1. A.5. Method 1 appears to be superior for subjects with sociability score below 6. .7745 ∗ + + (9. d.23 5. Let a be the number of groups under consideration.5332 1 + + + C = −(1. s) and XU (r.05.712 + 0. B.4 99.07.857 − 3. For subjects with sociability score between 6.82 5.42 = 1.42 26 d = 4. the region of non-signiﬁcance is (6.209 − 1. The idea is to make all pairwise comparisons using the Bonferroni technique. α ˆ nr ns SXXr SXXs + 2 −Brs − −Brs + 2 Brs − Ars Crs Ars 2 Brs − Ars Crs Ars 1 1 s2 p where s2 = p a 2 i=1 (ni − 2)si a i=1 ni − 2a . However. to determine regions of non-signiﬁcance with simultaneous conﬁdence α. and C are computed as p s2 = p (13 ∗ 0.712 − 0.07). s) = XU (r. The lower and upper limits for comparing groups r and s.0253 130. s) = where Ars = −d Brs Crs Here d = t(Pa ni −2a) i=1 α a(a − 1) ˆ ˆ + (βr − βs )2 SXXr SXXs ¯ ¯ Xr Xs ˆ =d + + (αr − αs )(βr − βs ) ˆ ˆ ˆ SXXr SXXs ¯2 ¯2 1 1 Xr Xs = −d + + + + (ˆ r − αs )2 . We shall consider the simpler of the two which uses the above technique along with a Bonferroni correction.533 = = 6.07 A 1. s). There are a = a(a − 1)/2 possible pairwise 2 comparisons.0253 √ −B + B 2 − AC 6.23 1 5. while method 2 seems to work better for subjects above the sociability score of 7.4 99.237)2 = 1.857 − 3.155)2 = 43. we fail to detect a diﬀerence between the two methods of therapy. One compares each pair of groups using a Johnson-Neyman procedure at level α/ a .289) + (13 ∗ 0.7745) ∗ These give B 2 − AC 6. INTRODUCTION TO THE ANALYSIS OF COVARIANCE The values of s2 .0253 XL = −B − Thus.7745 1 1 + + (0.533 XU = = = 7.07 √ 7. for 1 ≤ r < s ≤ a are: XL (r.03.155) ∗ (0.551) = 0.225 ∗ 0.712 130.209 − 1.180 CHAPTER 7.03 and 7. this may be done at once using the BONFERRONI adjustment of LSMEANS in SAS.7745) ∗ + (9.03 A 1. 7.533 B = 1.237) = −6. using α = 0.03 . If the 2 speciﬁc value of X is known.

Another instance of nesting involves two or more factors in which one or more of the factors are nested within the other structurally. 8.Chapter 8 Nested Designs We have already seen some examples of nesting in the analysis of repeated measurement models as well as split-plot designs. or hierarchical designs. These involve smaller sampling units obtained via subsampling. Assume there are r subjects available for each level of A. Nested designs. The experimental error now consists of the variability among the blood samples per individual and the variability among the individuals themselves. are used in experiments where it is diﬃcult or impossible to measure response on the experimental unit.1 Nesting in the Design Structure The type of nesting where there are two or more sizes of experimental units is known as nesting in the design structure. consider a situation where a the treatments are several diets that are taken by humans in a completely randomized manner. Instead. Responses are measured on n subsamples for each subject. Suppose we are interested in comparing the a levels of factor A. we take several blood samples from each individual and measure the hormone in each sample. As an illustration. The response is the level of a certain hormone in the blood of a subject’s body. Since it is diﬃcult to measure the level of the hormone in all of the blood of an individual. This subsampling process introduces a new source of variability due to the subsamples within the experimental units in our model in addition to the variation among the experimental units. smaller sampling units (subsamples) are selected from each experimental unit on which the response is measured. The data layout looks like the following: 181 .

. . . n . We assume that the random errors follow normal distributions with mean 0 and constant variances. . k = 1. . NESTED DESIGNS Unit 1 Sample 1 2 . . . i. . . . . . τi is the eﬀect of level i of the treatment. He obtained 12 panels and sprayed 4 of the panels with each of the three chemicals. . . y2rn ··· ··· ··· ··· ··· ··· ··· CHAPTER 8. y22n . . ya2n . 1 2 . . r The statistical model for a one-way CRD with subsampling is yijk = µ + τi + j(i) + δk(ij) where i = 1. . . . y11n y121 y122 . j = 1. . and δk(ij) is the random variation of the kth sampling unit within the jth unit on the ith treatment. . . . . ya1n ya21 ya22 . y2r1 y2r2 . . .e. yarn 2 . y21n y221 y222 . . Here µ is the overall mean. . One can show that the expected mean squares are those given in the following table: Source Treatment Experimental Error Sampling Error MS M SA M SE M SS E(MS) 2 σδ + nσ 2 + (rn 2 σδ + nσ 2 2 σδ τi2 )/(a − 1) Using the expected MS. . . we ﬁnd the following ANOVA table Source Treatment Experimental Error Sampling Error Total df a−1 a(r − 1) ar(n − 1) prn − 1 MS M SA M SE F FA = M SA /M SE FE = M SE /M SS The following example is taken from Peterson: Design and Analysis of Experiments. . Example A chemist wanted to measure the ability of three chemicals to retard the spread of ﬁre when used to treat plywood panels.182 Treatment 2 ··· y211 · · · y212 · · · . . n. j(i) is the random variation of the jth experimental unit on the ith treatment. . n 1 y111 y112 . y1r1 y1r2 . . He then cut two small pieces from each panel and measured the time required for each to be completely consumed in a standard ﬂame. n 1 2 . . . y12n . . σδ ) . The data is given as follows: . . . . . . j(i) ∼ N (0. . σ 2 ) and 2 δk(ij) ∼ N (0. . . yar1 yar2 . r. y1rn a ya11 ya12 . . a.

INPUT TRT CARDS. RUN.4 3.1 9. DATA NEW.3 5. PANEL SUB RESP.6 9.7 4.6 5.8 2.1 10.3 4.6 5. -----------------------------------------------------------------------Dependent Variable: RESP .2 PROC GLM.8.1.5 5.6 5.4 3.8 4.4 4.6 5. NESTING IN THE DESIGN STRUCTURE Chemical A B C 10.2 183 Panel 1 2 3 4 Sample 1 2 1 2 1 2 1 2 The SAS analysis of the data is given as follows.4 3.0 4.4 4.4 8.6 5. 1 1 1 2 1 1 3 1 1 1 1 2 2 1 2 3 1 2 1 2 1 2 2 1 3 2 1 1 2 2 2 2 2 3 2 2 1 3 1 2 3 1 3 3 1 1 3 2 2 3 2 3 3 2 1 4 1 2 4 1 3 4 1 1 4 2 2 4 2 3 4 2 .4 1. TEST H=TRT E=PANEL(TRT).7 3.7 6.3 5.8 4.5 5.5 8.7 4.9 5.4 8.4 1.6 9.0 4.6 5.5 8. QUIT.3 4. MODEL RESP=TRT PANEL(TRT).4 3.8 2.7 3.9 5. CLASS TRT PANEL SUB.1 10.0 7.7 6. 10.0 7. It is followed by the output.1 9.

0022 0.0057).0120 2 0.81541667 F Value 9.2988 Thus there is a signiﬁcant diﬀerence between treatment A and the remaining two.0001 0.184 Sum of Squares 137.4693939 0.0019).68 Pr > F 0.81541667 4.0022 0. There is also a signiﬁcant diﬀerence among the panels treated alike (P = 0. NESTED DESIGNS Source Model Error Corrected Total DF 11 12 23 Mean Square 12.0001 Source TRT PANEL(TRT) DF 2 9 Type III SS 93.63083333 43.79 Pr > F <.05 6. Thus.83694444 F Value 63. Mean comparisons can be made by using SAS’ LSMEANS with the appropriate error term.0733333 CHAPTER 8.0057 Thus. there were highly signiﬁcant diﬀerences among the three chemicals in their ability to retard the spread of ﬁre (P = 0.9100000 146.2988 3 0. treatment A appears to be the most eﬀective ﬁre retardant.51 Pr > F <.53250000 Mean Square 46.7425000 F Value 16.63083333 Mean Square 46. LSMEANS TRT / PDIFF E=PANEL(TRT).53750000 3.87500000 5.1633333 8.08750000 LSMEAN Number 1 2 3 i/j 1 2 3 1 0. The corresponding output is TRT 1 2 3 RESP LSMEAN 8.0019 Tests of Hypotheses Using the Type III MS for PANEL(TRT) as an Error Term Source TRT DF 2 Type III SS 93. .0120 0.

. )2 . The data has the following format: A 1 2 3 1 X 2 X 3 X B 4 X 5 X X X 6 7 Assuming that factor A has a levels and the nested factor B has a total of b levels with mi levels appearing with level i if A. Source A B(A) Error Total df a−1 b−a N −b N −1 MS M SA M SB(A) M SE F FA = M SA /M SB(A) FB(A) = M SB(A) /M SE The following example is taken from Milliken and Johnson :The Analysis of Messy Data. ¯ df = i=1 mi (ni − 1) = N − b The following ANOVA table may be used to test if there are any treatment diﬀerences. ¯ mi ni (¯i. 2...8. Further assume that there are ni replications for each level of B nested within level i of A. . mi k = 1. and ijk ∼ N (0. m3 = 2. )2 . 2. In this case B needs to have more levels than A. Let N = a i=1 mi ni be the total number of observations. )2 . A and B. m2 = 3. the sampling unit and the experimental unit coincide. NESTING IN THE TREATMENT STRUCTURE 185 8. σ 2 ) are random errors. − y. )2 . a yijk = µ + τi + βj(i) + ijk . The total sum of squares decomposes into component sum of squares in a natural way. This is left as an exercise. . − yi. 2. .2. . ¯ i=1 j=1 a mi nj(i) df = i=1 (mi − 1) = b − a a SSE = i=1 j=1 k=1 (yijk − yij. Vol I. . . . y ¯ i=1 a mi df = N − 1 SSA = SSB(A) = df = a − 1 a ni (yij. The statistical model for this case is i = 1. Then we have SST = SSA + SSB(A) + SSE where (given with the associated df) a mi ni SST = i=1 j=1 k=1 a (yijk − y. there is only one error term.2 Nesting in the Treatment Structure Consider two factors. . βj(i) is the eﬀect of the jth level of B contained within level i of A. . . j = 1. ni where yijk is the observed response from the kth replication of level j of B within level i of A. Note that. The levels of factor B are nested within the levels of factor A if each level of B occurs with only one level of A.. In the above representation a = 3. τi is the eﬀect of the ith level of A. The correct F is derived using the expected mean squares.. µ is the overall mean. In the current case. .. in contrast to nesting in the design structure. m1 = 2. A more general case that will not be considered here is the case of unequal replications for each B nested within A. .. b = mi = 7.

NESTED DESIGNS 8 9 10 11 B 140 152 133 151 132 139 96 108 94 84 87 82 79 74 73 67 78 63 90 81 96 83 89 94 C D Thus in this example a = 4. input A $ B Y @@. m2 = m3 = 2. 131 137 121 90 81 96 D 11 83 D 11 89 D 11 94 .186 Example Four chemical companies produce insecticides in the following manner: • Company A produces three insecticides. A 1 151 A 2 118 A 3 A 1 135 A 2 132 A 3 A 1 137 A 2 135 A 3 B 4 140 B 5 151 B 4 152 B 5 132 B 4 133 B 5 139 C 6 96 C 7 84 C 6 108 C 7 87 C 6 94 C 7 82 D 8 79 D 9 67 D 10 D 8 74 D 9 78 D 10 D 8 73 D 9 63 D 10 . • Companies B and C produce two insecticides each. • Company D produces four insecticides. cards. m4 = 4. n1 = n2 = n3 = n4 = 3 The following SAS code (given with edited output) may be used to analyze the data: data insect. The treatment structure is a two-way with the data given as follows: Company A 1 151 135 137 2 118 132 135 3 131 137 121 4 Insecticide 5 6 7 CHAPTER 8. proc glm. and • no company produces an insecticide exactly like that of another. m1 = 3. class A B. test H=A E=B(A). model y = A B(A). b = 11.

0001 0. NESTING IN THE TREATMENT STRUCTURE run.47 3.0081 187 Thus there are signiﬁcant diﬀerences among the companies (P = 0.0001) and there are signiﬁcant diﬀerences among the insecticides produced by the same company (P = 0.43182 214.2.36905 57.58333 1260. quit.74 0.87879 7604.29545 1500. .0081).00000 25573.8.27273 35. -------------------------------------------------------------------------Sum of Source DF Squares Mean Square F Value Pr > F A B(A) Error Total 3 7 22 32 22813.

- crd fasf
- SAS Code for Some Experimental Design
- Mima Tutorial
- Mult Reg
- Intermediate Statistics Sample Test
- 17 Chapter 8
- A Guide to Analyzing Biodiversity Experiments
- Tut Two Way Anova
- JADUAL 1.docx
- Private Label
- 2013 Thumbtack.com Small Business Friendliness Survey
- St 205_lecturer Notes
- 06-Okendro
- Response Surface Models for CFD Predictions of Air Diffusion Performance Index in a Displacement Ventilated Office
- ANALISIS VARIANSI
- Modelling the Removal of Mercaptans From Liquid Hydrocarbon Streams in Structured Packing Columns (PDF Download Available).pdf
- 2nd Assignment Managerial Economics XJTU
- Application of Taguchi Method In Health And Safety (Fire Extinguishing Experiment)
- Survey Census
- statistics
- Assig Beta
- Aldous Et Al
- IJEMS 17(4) 265-274
- A Critique of Present Procedures Used to Compare
- The Business Outsourcing in Telecommunication Industry Case of Pakistan
- 0_1862_1
- qt6
- Stat 473-573 Notes
- GJESM123511435692600.pdf
- 2014 Lab 9 A

Sign up to vote on this title

UsefulNot usefulClose Dialog## Are you sure?

This action might not be possible to undo. Are you sure you want to continue?

Loading