Attribution Non-Commercial (BY-NC)


This chapter discusses the analysis of variance model for two categorical explanatory variables. In

particular, it discusses the case where one of the factors is a blocking variable. Two-way ANOVA

can be analyzed as a regression model with two categorical explanatory variables. Each categorical

variable is represented by a set of indicator variables.

generally measured by collecting data at several levels of the factor.

• Each combination of levels from a set of factors comprises a treatment. If only one factor is

present, then the treatments are just the levels of that factor.

Blocking

Suppose we are measuring corn yield in a field experiment for 4 varieties (A, B, C, D). A square

field will be divided up into 16 plots and varieties randomly assigned to plots. Suppose also there is a

moisture gradient running East-West across the field. Random assignment of varieties to plots might,

by chance, end up assigning more of the plots for variety A to the East side of the field than the West

side and vice-versa for variety B. More importantly, a moisture gradient that is ignored in the analysis

produces large variation in yields within each variety; if we can adjust for the moisture gradient,

then we can more easily detect any difference between varieties.

• One way to adjust for the moisture gradient is to divide the field into 4 blocks of 4 plots each

from east to west and then randomly assign all 4 treatments within each block. The block factor

is then included in the analysis.

• If treatments are assigned randomly to plots within each block, then this is called a randomized

complete block design. (Note: the “complete” refers to the fact that every treatment appears in

every block).

• A design is balanced if there is the same number of observations in every treatment-block

combination. The corn experiment is balanced because there is exactly one plot in every one of

the 16 variety-block combinations. How would the design look if we wanted two plots in every

combination?
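The randomization step of a randomized complete block design can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `randomized_complete_block` and the `reps` and `seed` parameters are assumptions introduced here. Setting `reps=2` gives one possible layout with two plots per variety-block combination.

```python
import random

def randomized_complete_block(treatments, n_blocks, reps=1, seed=None):
    """Return {block number: list of treatments in plot order}.

    Every treatment appears `reps` times in every block, and the order of
    plots is randomized independently within each block.
    """
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        plots = list(treatments) * reps  # complete: all treatments in each block
        rng.shuffle(plots)               # random assignment within the block
        layout[block] = plots
    return layout

# Corn example: 4 varieties, 4 blocks of 4 plots each (seed chosen arbitrarily).
layout = randomized_complete_block(["A", "B", "C", "D"], n_blocks=4, seed=1)
for block, plots in layout.items():
    print(f"Block {block}: {' '.join(plots)}")

# Hypothetical variant with two plots per variety-block combination.
layout_two_reps = randomized_complete_block(["A", "B", "C", "D"], n_blocks=4, reps=2, seed=1)
```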


• Blocking is very similar to the idea behind doing matched pairs for comparing two treatments.

In fact, matched pairs is a special case of blocking where there are only two treatments and the

pairs are the blocks. The analysis of matched pairs took advantage of the blocking by analyzing

the difference within each pair. The analysis of a block design in ANOVA does so by including

Block as a factor in the ANOVA.

• We are primarily interested in making inferences about the treatment variable in a randomized

block design. The blocking variable is included simply to make us better able to detect a

treatment effect – we’re not usually interested in testing for a block effect, because we assume

there is one – that’s why we blocked.

• Both Variety and Block can be modeled with three indicator variables, say V1, V2 and V3 and

B1, B2 and B3. How might we define these indicator variables?

• Write the regression model for mean yield with only the main effects of Variety and Block.

How many coefficient parameters are there in the model?

• According to this model, what is the mean yield for each of the 16 Block-Variety combinations?

                                      Treatment differences
            A     B     C     D       A-D    B-D    C-D
Block 1
Block 2
Block 3
Block 4
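One way to work through the questions above is to code the indicator variables directly. The sketch below assumes variety D and Block 4 as the reference levels (only one of several valid choices), and the coefficient values are made up purely for illustration; they are not estimates from any data.

```python
varieties = ["A", "B", "C", "D"]
blocks = [1, 2, 3, 4]

def indicators(variety, block):
    """Design-matrix row: intercept, V1-V3 (A, B, C vs. D), B1-B3 (Blocks 1-3 vs. Block 4)."""
    v = [1 if variety == level else 0 for level in ["A", "B", "C"]]
    b = [1 if block == level else 0 for level in [1, 2, 3]]
    return [1] + v + b

# Hypothetical coefficients (intercept, V1, V2, V3, B1, B2, B3), for illustration only.
beta = [100.0, 8.0, 5.0, -3.0, 12.0, 7.0, 2.0]

def mean_yield(variety, block):
    """Mean yield under the main-effects (additive) model."""
    return sum(x * c for x, c in zip(indicators(variety, block), beta))

n_params = len(beta)  # 1 intercept + 3 variety + 3 block coefficients

for block in blocks:
    means = [mean_yield(v, block) for v in varieties]
    diffs = [m - means[3] for m in means[:3]]  # A-D, B-D, C-D
    print(f"Block {block}: means={means} differences={diffs}")
```

Under the additive model the printed treatment differences are identical in every block, which is why the profiles in a profile plot are parallel when there is no interaction.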


What would a plot of mean yield versus variety look like, with each block having a separate line (see the

example on p. 383)? This is sometimes called a “profile plot.”

• What would the regression model be if we included the Block by Variety interaction? How

many coefficient parameters are there in the model?

• According to this model, what is the mean yield for each of the 16 Block-Variety combinations?

                                      Treatment differences
            A     B     C     D       A-D    B-D    C-D
Block 1
Block 2
Block 3
Block 4


• What would the profile plot look like?

• Note: we can’t estimate σ² in the model with Variety by Block interaction because there are no

within cell replicates. This model has 16 coefficient parameters plus σ² and only 16

observations. Without an estimate of σ², we can’t carry out statistical inferences. Our choices

would be: a) don’t include the interaction, b) include replicates within each Variety by Block

combination. What are the advantages/disadvantages of each?
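The parameter count in the note above can be checked with a small helper. This is a sketch; the function name `two_way_df` is an assumption introduced here. It counts the regression coefficients of a two-way model and the degrees of freedom left over for estimating σ².

```python
def two_way_df(levels_a, levels_b, reps, interaction):
    """Return (number of coefficients, residual d.f.) for a balanced two-way layout."""
    n_obs = levels_a * levels_b * reps
    n_coef = 1 + (levels_a - 1) + (levels_b - 1)      # intercept + main effects
    if interaction:
        n_coef += (levels_a - 1) * (levels_b - 1)     # products of indicator variables
    return n_coef, n_obs - n_coef

corn_interaction = two_way_df(4, 4, reps=1, interaction=True)   # no d.f. left for sigma^2
corn_additive = two_way_df(4, 4, reps=1, interaction=False)
corn_two_reps = two_way_df(4, 4, reps=2, interaction=True)

print("corn, interaction, 1 rep :", corn_interaction)
print("corn, additive, 1 rep    :", corn_additive)
print("corn, interaction, 2 reps:", corn_two_reps)
```

With one plot per cell, the interaction model uses all 16 observations to fit 16 coefficients, so no residual degrees of freedom remain; either dropping the interaction or adding replicates restores them.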

• Observations within each cell (a “cell” means a particular combination of levels of the two

factors) are independent observations from a normal distribution.

• The cell samples are drawn independently of each other (or there is random assignment to cells).


Case study 13.1: Intertidal Seaweed Grazers

8 blocks, 6 treatments, 2 replicates per block by treatment combination.

A convenient graphical representation for the data in a two-way classification is the one in Display

13.7 on p. 383, sometimes called a “profile plot.” In SPSS, such a plot can be obtained with

Graphs…Line…Multiple; choose “Other summary function.” The default function is “Mean” which

is what is desired. You can also obtain the same plot from Graphs…Interactive…Line. This plot

illustrates differences between treatments, between blocks, and treatment by block interaction (if

there is no interaction the profiles are parallel).

The plot below has the blocks in numerical order. The plot on p. 383 has the blocks ordered from

smallest mean to largest. An advantage of the latter plot is that it makes it clear that there is more

variability in the means as the means increase (left to right). This suggests that nonconstant variance

might be a problem. To get a plot with the blocks in a different order in SPSS, we would create a

new variable with values 1 to 8 which indicates the desired order of the blocks (for example, 1 would

be for the block with the smallest mean and 8 for the largest). Then use value labels to indicate that

“1” is really “Block 1” and “8” is really “Block 4.”

[Figure: profile plot of mean Cover (vertical axis, ticks at 0, 25, 50 and 75) against Block (BLOCK 1 to BLOCK 8 in numerical order), with one line per treatment (CONTROL, f, fF, L, Lf, LfF). Dots/lines show means.]
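The block-ordering variable described above can be sketched outside SPSS as follows. As an assumption, the block means used here are the logit-scale block totals from the Descriptive Statistics table in these notes; the plot in the text orders blocks by mean Cover, so its ordering need not match this one exactly. The names `ordered` and `order_code` are hypothetical.

```python
# Block means on the logit scale, copied from the Descriptive Statistics table.
block_means = {
    "BLOCK 1": -2.6357, "BLOCK 2": -2.1758, "BLOCK 3": -0.5311, "BLOCK 4": 0.3450,
    "BLOCK 5": -1.4197, "BLOCK 6": -0.6106, "BLOCK 7": -1.5272, "BLOCK 8": -1.3057,
}

# Sort block labels from smallest to largest mean.
ordered = sorted(block_means, key=block_means.get)

# New ordering variable: 1 for the block with the smallest mean, 8 for the largest.
order_code = {block: rank for rank, block in enumerate(ordered, start=1)}

for block in ordered:
    print(order_code[block], block, block_means[block])
```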

We should also examine the model assumptions through a residual analysis. To fit a two-way model,

you could create all the indicator variables necessary (7 for block and 5 for treatment plus the 35

products for interaction), and use the Regression procedure, but it’s much easier to use

Analyze…General Linear Model…Univariate. Block and Treatment are entered as “Fixed

factors.” Residuals can be saved under “Save.” The default model includes the interaction; the

model is specified under “Model.”

A residual plot (p. 384) confirms the suspicion of nonconstant variance. Since the responses are

percentages between 0 and 100, which can be converted to proportions between 0 and 1, it’s not too

surprising that the variance is not constant since the variance of a binomial proportion is

p(1 − p)/n, which is not constant and is greatest for p = .5. The logit transformation is often

useful for proportions. If Y is a proportion, then

\[ \operatorname{Logit}(Y) = \ln\left(\frac{Y}{1 - Y}\right) \]

The quantity Y/(1 − Y) is called the odds, since it represents the odds of an event whose probability

is Y (an odds ratio is the ratio of two such odds). The logit is therefore the log odds.
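The transformation can be sketched as a small Python function. The function name `logit_from_percent` is an assumption introduced here; it divides a percentage by 100 before taking the logit and rejects 0 and 100, where the logit is undefined.

```python
import math

def logit_from_percent(cover_percent):
    """Convert a percentage (strictly between 0 and 100) to a logit (log odds)."""
    p = cover_percent / 100.0          # percentage -> proportion
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined for proportions of 0 or 1")
    return math.log(p / (1.0 - p))     # ln(odds)

for cover in (4, 25, 50, 75):
    print(cover, round(logit_from_percent(cover), 4))
```

A cover of 50% maps to 0, and values equally far above and below 50% map to logits of equal size and opposite sign.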

Remembering to divide Cover by 100 before taking the logit, the profile plot (p. 385) is much

“improved” and there is less evidence of interaction. A residual plot also indicates fewer problems:

Tests of Between-Subjects Effects

Source            Type III Sum of Squares    df    Mean Square    F          Sig.
Corrected Model   188.462 (a)                47    4.010          13.241     .000
Intercept         145.854                     1    145.854        481.618    .000
Treat             96.993                      5    19.399         64.055     .000
Block             76.239                      7    10.891         35.963     .000
Treat * Block     15.230                     35    .435           1.437      .121
Error             14.536                     48    .303
Total             348.853                    96
Corrected Total   202.999                    95

a. R Squared = .928 (Adjusted R Squared = .858)
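The arithmetic behind the table can be reproduced from the sums of squares and degrees of freedom alone: each mean square is SS/df, and each F is the effect mean square divided by the Error mean square. The sketch below uses only the figures printed in the table; small discrepancies in the last digit come from rounding in the printed values. The variable names are assumptions.

```python
# (sum of squares, d.f.) copied from the full-model table.
effects = {"Treat": (96.993, 5), "Block": (76.239, 7), "Treat*Block": (15.230, 35)}
ss_error, df_error = 14.536, 48
ss_model, ss_corrected_total = 188.462, 202.999

mse = ss_error / df_error                                   # Error mean square
mean_squares = {name: ss / df for name, (ss, df) in effects.items()}
f_stats = {name: ms / mse for name, ms in mean_squares.items()}
r_squared = ss_model / ss_corrected_total

print(f"MSE = {mse:.3f}")
for name in effects:
    print(f"{name:12s} MS = {mean_squares[name]:.3f}  F = {f_stats[name]:.3f}")
print(f"R squared = {r_squared:.3f}")
```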

Type III sum of squares (the default) indicates that the sum-of-squares for each effect is obtained by

comparing the full model (Treat + Block +Treat*Block) to the model without that effect in it; thus,

• for Treat*Block, compare the full model to the model Treat + Block


• for Treat, compare the full model to the model Block + Treat*Block

• for Block, compare the full model to the model Treat + Treat*Block

The latter two tests really make no sense since a model with Treat*Block, but not Treat, doesn’t make

sense. Therefore, only the test for the Treat*Block interaction makes sense. Unfortunately, some

people naively use these tests to test for the main effects. There are other options:

If the Treat*Block interaction is not significant, leave it out and refit the model Block + Treat. Then

the test of Treat makes sense (we’re not generally interested in the test of Block). We might also

carefully examine the profile plot, to make sure it’s reasonable to leave out Block*Treat even if it’s

not significant (not significant does not necessarily mean it’s zero).

If the Treat*Block interaction is significant, or if we simply want to be conservative and not assume it’s

zero, we can:

1. Test Treat by comparing the model Block to the full model Block+Treat+Block*Treat (this is

not discussed in the text but is advocated by some authors)

2. Realize that an interaction means that the effect of Treat is different in different blocks, and

examine the effect in each block separately. This makes more sense than number 1.
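Both the interaction test and option 1 above are comparisons of a reduced model with a full model, which can be written as a generic extra-sum-of-squares F statistic. The sketch below is an illustration under stated assumptions: the function name `extra_ss_f` is hypothetical, the residual sums of squares are taken from the SPSS tables in these notes, and the residual SS for the Block-only model is obtained as Corrected Total minus SS(Block), which relies on the design being balanced.

```python
def extra_ss_f(sse_reduced, df_reduced, sse_full, df_full):
    """F statistic for comparing a reduced model with a full model that contains it."""
    extra_ss = sse_reduced - sse_full
    extra_df = df_reduced - df_full
    return (extra_ss / extra_df) / (sse_full / df_full)

# Residual SS and d.f. for the full model Block + Treat + Block*Treat.
sse_full, df_full = 14.536, 48

# Interaction test: reduced model Block + Treat (residual SS 29.767 on 83 d.f.).
f_interaction = extra_ss_f(29.767, 83, sse_full, df_full)

# Option 1: reduced model Block only.
# Residual SS = Corrected Total - SS(Block); valid here because the design is balanced.
sse_block_only = 202.999 - 76.239
df_block_only = 95 - 7
f_treat_option1 = extra_ss_f(sse_block_only, df_block_only, sse_full, df_full)

print(f"Interaction F on (35, 48) d.f.: {f_interaction:.3f}")
print(f"Option 1 F for Treat on (40, 48) d.f.: {f_treat_option1:.3f}")
```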

In the Seaweed Grazers example, since the interaction is not significant, and the profile plot indicates

treatment effects that are somewhat consistent across blocks, we might fit the model Block + Treat:

Tests of Between-Subjects Effects

Source            Type III Sum of Squares    df    Mean Square    F          Sig.
Corrected Model   173.232 (a)                12    14.436         40.252     .0000
Intercept         145.854                     1    145.854        406.691    .0000
Block             76.239                      7    10.891         30.368     .0000
Treat             96.993                      5    19.399         54.090     .0000
Error             29.767                     83    .3586
Total             348.853                    96
Corrected Total   202.999                    95

a. R Squared = .853 (Adjusted R Squared = .832)

• Note that the SS for Treat and Block have not changed, but that the SS for Error has. It has

become the SS for Error in the full model plus the SS for Treat*Block. The reason the SS for

Block and Treat are unchanged is that the design is balanced (equal sample sizes in all cells).

In unbalanced designs, the SS do change which makes the choice of an appropriate analysis

more important.


• There is very strong evidence (P<.0001) that there is a difference in the mean log regeneration

ratios among the treatments. There is also very strong evidence of a block effect, but that is

of less interest; we expected a block effect; that’s why blocking was used.

• The Block main effect should always be in the model even if it’s not statistically significant.

This is because we believe there is a block effect (that’s why we blocked) even if we don’t

find strong evidence of it in our particular experiment. It’s just as with paired data: we would

always use a paired t-test and wouldn’t ever use the two-sample t even if there didn’t appear

to be differences between the pairs of subjects.
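The relation between the two ANOVA tables noted above (the additive model's Error SS is the full model's Error SS plus the Treat*Block SS) can be checked numerically. The sketch uses only figures printed in the two tables; the variable names are assumptions, and last-digit differences reflect rounding in the printed output.

```python
# From the full-model table.
ss_error_full, df_error_full = 14.536, 48
ss_interaction, df_interaction = 15.230, 35

# Pool the interaction into the error term for the additive model Block + Treat.
ss_error_additive = ss_error_full + ss_interaction
df_error_additive = df_error_full + df_interaction
mse_additive = ss_error_additive / df_error_additive

# Main-effect sums of squares are unchanged because the design is balanced.
f_treat = (96.993 / 5) / mse_additive
f_block = (76.239 / 7) / mse_additive

print(f"Error SS = {ss_error_additive:.3f} on {df_error_additive} d.f., MSE = {mse_additive:.4f}")
print(f"F(Treat) = {f_treat:.3f}, F(Block) = {f_block:.3f}")
```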

First, to get a table of cell means as in Display 13.12 on p. 388, in General Linear Model…

Univariate use Options…Descriptive statistics. This will also give the Block averages and

Treatment averages.

Descriptive Statistics

Block      Treat       Mean       Std. Deviation    N
BLOCK 1    CONTROL     -1.5118    .42920             2
           f           -1.6217    .66331             2
           fF          -2.0491    .20949             2
           L           -3.1781    .00000             2
           Lf          -3.2103    .37594             2
           LfF         -4.2435    .49731             2
           Total       -2.6357    1.07644           12
BLOCK 2    CONTROL     -.9424     .45723             2
           f           -1.3077    .71783             2
           fF          -1.9659    .32712             2
           L           -2.5145    .10207             2
           Lf          -3.1138    .51234             2
           LfF         -3.2103    .37594             2
           Total       -2.1758    .95409            12
BLOCK 3    CONTROL     1.1123     .57146             2
           f           .2220      .20076             2
           fF          -.1206     .17053             2
           L           -.3108     .89607             2
           Lf          -1.5569    1.07022            2
           LfF         -2.5326    .30964             2
           Total       -.5311     1.33233           12
BLOCK 4    CONTROL     2.8480     .13640             2
           f           1.8382     .35717             2
           fF          .6382      .50401             2
           L           -.8068     .26558             2
           Lf          -.5215     1.13616            2
           LfF         -1.9262    .93410             2
           Total       .3450      1.76499           12
BLOCK 5    CONTROL     -.2716     .55397             2
           f           -.6857     .03174             2
           fF          -.6844     .51138             2
           L           -1.3995    .97761             2
           Lf          -2.6290    .44605             2
           LfF         -2.8480    .13640             2
           Total       -1.4197    1.10962           12
BLOCK 6    CONTROL     .7107      .54860             2
           f           -.1836     .37290             2
           fF          -.4062     .11793             2
           L           -1.2292    .60677             2
           Lf          -.6639     .54031             2
           LfF         -1.8914    .43246             2
           Total       -.6106     .91913            12
BLOCK 7    CONTROL     -.7851     .94036             2
           f           -.0809     .28425             2
           fF          -.7354     .22629             2
           L           -2.5969    .21863             2
           Lf          -2.5852    .83836             2
           LfF         -2.3799    .79843             2
           Total       -1.5272    1.16484           12
BLOCK 8    CONTROL     .2837      .23134             2
           f           -.6898     .22280             2
           fF          -1.2481    1.19167            2
           L           -1.6601    .10534             2
           Lf          -1.7544    .33664             2
           LfF         -2.7656    .25297             2
           Total       -1.3057    1.06369           12
Total      CONTROL     .1805      1.39899           16
           f           -.3137     1.07482           16
           fF          -.8214     .95985            16
           L           -1.7120    1.02149           16
           Lf          -2.0044    1.13986           16
           LfF         -2.7247    .83100            16
           Total       -1.2326    1.46179           96
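In a balanced design the main-effect sums of squares in the ANOVA tables can be recovered from the marginal means in this table: each is the number of observations per level times the sum of squared deviations of the level means from the grand mean. The sketch below is an illustration; the function name `main_effect_ss` is an assumption, and the results agree with the SPSS values only up to rounding of the printed means.

```python
# Marginal means copied from the "Total" rows of the Descriptive Statistics table.
treat_means = {"CONTROL": 0.1805, "f": -0.3137, "fF": -0.8214,
               "L": -1.7120, "Lf": -2.0044, "LfF": -2.7247}
block_means = {"BLOCK 1": -2.6357, "BLOCK 2": -2.1758, "BLOCK 3": -0.5311, "BLOCK 4": 0.3450,
               "BLOCK 5": -1.4197, "BLOCK 6": -0.6106, "BLOCK 7": -1.5272, "BLOCK 8": -1.3057}

def main_effect_ss(marginal_means, n_per_level):
    """Sum of squares for a main effect in a balanced design."""
    values = list(marginal_means.values())
    grand_mean = sum(values) / len(values)   # equals the overall mean when balanced
    return n_per_level * sum((m - grand_mean) ** 2 for m in values)

ss_treat = main_effect_ss(treat_means, n_per_level=16)   # 8 blocks x 2 replicates
ss_block = main_effect_ss(block_means, n_per_level=12)   # 6 treatments x 2 replicates

print(f"SS(Treat) = {ss_treat:.3f}")
print(f"SS(Block) = {ss_block:.3f}")
```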


• The General Linear Model procedure will give you estimates of and standard errors for some

specific types of linear combinations of cell means, but not for arbitrary linear combinations as

was possible in the One-way ANOVA procedure.

• Pairwise comparisons of means for each factor, along with some specific types of contrasts, can

be obtained in a couple of different ways which are discussed further down.

To obtain SE’s for arbitrary linear combinations of cell means, as on p. 389, you must do the work by

hand using the formulas we learned in Chapter 6, pp. 154-7. The key is that the estimate of the

common cell standard deviation σ is √MSE from your final model. In the Seaweed Grazers

example, the final model is BLOCK + TREAT and √MSE = √.3586 = .599 with 83 d.f. This is used

for s_p in the formulas of Chapter 6.

Example: In number 1, p. 389, the text examines the effect of large fish on the regeneration ratio

through the following contrast in the treatment means (why is it a contrast?):

\[ \gamma_1 = \frac{\mu_{fF} - \mu_{f}}{2} + \frac{\mu_{LfF} - \mu_{Lf}}{2} \]

This is estimated by the same function of the sample means. Note that the sample means are pooled

over blocks so the sample size for each mean is 16. Therefore,

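\[ g = \frac{\bar{Y}_{fF} - \bar{Y}_{f}}{2} + \frac{\bar{Y}_{LfF} - \bar{Y}_{Lf}}{2} = \frac{-.8214 - (-.3137)}{2} + \frac{-2.7247 - (-2.0044)}{2} = -0.6140 \]

\[ \mathrm{SE}(g) = \sqrt{\mathrm{MSE}}\,\sqrt{\frac{(1/2)^2}{16} + \frac{(-1/2)^2}{16} + \frac{(1/2)^2}{16} + \frac{(-1/2)^2}{16}} = .599\sqrt{\frac{1}{16}} = .1497 \]

Thus a 95% confidence interval for the contrast is -.614 ± t_83(.975)(.1497) = -.614 ± 1.989(.1497) =

-.614 ± .298 = -.912 to -.316.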

Can you reproduce the estimates and standard errors for the remaining contrasts on p. 389?
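The hand calculation above can be sketched in Python as a check, and the same pattern can be reused for the remaining contrasts by changing the coefficients. The treatment means, MSE and the t multiplier 1.989 are copied from these notes (the multiplier is hard-coded rather than computed); the variable names are assumptions.

```python
import math

# Treatment means pooled over blocks (n = 16 each), from the Descriptive Statistics table.
means = {"f": -0.3137, "fF": -0.8214, "Lf": -2.0044, "LfF": -2.7247}

# Coefficients of the large-fish contrast; they sum to zero, so this is a contrast.
coefs = {"fF": 0.5, "f": -0.5, "LfF": 0.5, "Lf": -0.5}

n_per_mean = 16
mse = 0.3586      # Error mean square from the Block + Treat model (83 d.f.)
t_crit = 1.989    # t_83(.975), taken from the text

g = sum(coefs[k] * means[k] for k in coefs)
se_g = math.sqrt(mse) * math.sqrt(sum(c ** 2 / n_per_mean for c in coefs.values()))
ci = (g - t_crit * se_g, g + t_crit * se_g)

print(f"g = {g:.4f}, SE(g) = {se_g:.4f}")
print(f"95% CI: {ci[0]:.3f} to {ci[1]:.3f}")
```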


• Choose Post Hoc on the General Linear Model window. You can get comparisons of Block

means and/or Treatment means. The text recommends (Sec. 13.5.5, p. 401) the Tukey-Kramer

procedure, as in one-way ANOVA, which is listed as “Tukey” in SPSS. Post Hoc will not give

results if there are only two levels of a factor; I have no idea why (there is only one comparison,

but no reason it can’t be done). You can use the Contrasts option (described below) with

Simple chosen, to get the SE and confidence interval for the difference in the two means.

• Some pairwise comparisons are also available by choosing Options, putting Treatment under

Display Means for: and checking Compare main effects. However, only “LSD”,

“Bonferroni” and “Sidak” are available; “Tukey” is not. In addition, while this procedure will

give the exact same confidence intervals as Post Hoc (if the same procedure is chosen) for

balanced designs (same number of observations in every cell), it won’t when the design isn’t

balanced. Stick with Post Hoc and don’t mess with this procedure.

• Some other types of contrasts can be obtained through the Contrasts option on the General

Linear Models window. We can define a set of contrasts to be evaluated for each factor. The

choices are limited to the ones listed. Simple gives contrasts that compare each level with a

reference level (either the first or last level). Deviation gives the deviation of the mean of each

level from the overall mean. The deviations are what are displayed as the “Block effect” and

“Treatment effect” in the last column and row of Display 13.12. In the Seaweed experiment for

the Treatment variable, the deviation contrasts have the form

\( \mu_i - \bar{\mu} \), where \( \bar{\mu} = \frac{\mu_1 + \mu_2 + \dots + \mu_6}{6} \) is the mean of all the treatment means.

• Note on profile plots: the General Linear Models window also has a Plots.. option which gives

“Profile” plots. However, it doesn’t plot the observed means, but the estimated means under the

fitted model. While it will give the “right” plot when a full factorial model has been selected

and the design is balanced, I recommend using Graphs…Line…Multiple as described earlier.


Pairwise comparisons in the presence of an interaction

• If the Block*Treat interaction is significant and appears important, comparison of treatment

means over all blocks may not be meaningful, since the treatment means are averages over all

blocks. A negative effect in one block could be offset by a positive effect in another block so

that there appears to be no effect when averaging over all blocks.

• The presence of an interaction doesn’t mean that averaging over blocks is necessarily

meaningless. If the treatment effects are in the same direction in all blocks, but simply differ

somewhat in size, then averaging over blocks may still be useful. Also, with large sample sizes,

the interaction may be statistically significant but small in size.

• If an interaction is present and is important, then you should compare means within blocks.

You can use the procedure for arbitrary linear combinations of cell means to do this (Section

13.3.4; also described on a previous page of these notes). Just remember that the sample sizes

are the sample sizes for the cells involved in the comparison (for example, in the Seaweed

experiment, to compare two treatments within block 1, the sample sizes are both 2). You can

see that the resulting SE’s will be much larger than if we can pool across blocks.

• It’s possible that some intermediate model may describe what’s going on with an interaction

present. For example, perhaps the treatment effects are very similar for all blocks but one, so

we might analyze those blocks together.
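The loss of precision from comparing treatments within a single block can be illustrated with the standard error of a difference of two means. This sketch assumes the full model (with interaction) is the final model, so it uses that model's Error mean square of .303; the function name `se_difference` is hypothetical.

```python
import math

def se_difference(mse, n1, n2):
    """Standard error of the difference between two means with sample sizes n1 and n2."""
    return math.sqrt(mse * (1.0 / n1 + 1.0 / n2))

mse_full = 0.303  # Error mean square from the full model (48 d.f.)

se_within_block = se_difference(mse_full, 2, 2)    # two cell means inside one block
se_pooled = se_difference(mse_full, 16, 16)        # two treatment means pooled over 8 blocks

print(f"SE within one block : {se_within_block:.4f}")
print(f"SE pooled over blocks: {se_pooled:.4f}")
print(f"ratio               : {se_within_block / se_pooled:.3f}")
```

With 2 observations per cell instead of 16 per treatment mean, the standard error is larger by a factor of √8, about 2.8.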

The General Linear Models procedure uses the linear regression approach to estimating the

parameters of the model. Therefore, it works properly for unbalanced designs. It is not necessary to

use the Regression procedure in SPSS with user-defined indicator variables to analyze the Pygmalion

data, as described in the text (Section 13.4, p. 392). For example, the General Linear Model procedure will

give the regression estimate for treatment effect discussed in Section 13.4.3.
