
ANOVA (SPSS 10.0)

SPSS Inc.
233 S Wacker Drive, 11th Floor
Chicago, Illinois 60606
312.651.3300

Training Department
800.543.2185

v10.0 Revised 1/17/00 hc/ss

SPSS Neural Connection, SPSS QI Analyst, SPSS for Windows, SPSS Data Entry II, SPSS-X, SCSS, SPSS/PC, SPSS/PC+, SPSS Categories, SPSS Graphics, SPSS Professional Models, SPSS Advanced Models, SPSS Tables, SPSS Trends and SPSS Exact Tests are the trademarks of SPSS Inc. for its proprietary computer software. CHAID for Windows is the trademark of SPSS Inc. and Statistical Innovations Inc. for its proprietary computer software. Excel for Windows and Word for Windows are trademarks of Microsoft; dBase is a trademark of Borland; Lotus 1-2-3 is a trademark of Lotus Development Corp. No material describing such software may be produced or distributed without the written permission of the owners of the trademark and license rights in the software and the copyrights in the published materials.

General notice: Other product names mentioned herein are used for identification purposes only and may be trademarks of their respective companies.

Copyright (c) 2000 by SPSS Inc.
All rights reserved.
Printed in the United States of America.

No part of this publication may be reproduced or distributed in any form or by any means, or stored on a database or retrieval system, without the prior written permission of the publisher, except as permitted under the United States Copyright Act of 1976.

ADVANCED TECHNIQUES:
ANOVA (SPSS 10.0)

TABLE OF CONTENTS

Chapter 1  Introduction
    Why Do Analysis of Variance
    Visualizing Analysis of Variance
    What is Analysis of Variance?
    Variance of Means
    Basic Principle of ANOVA
    A Formal Statement of ANOVA Assumptions

Chapter 2  Examining Data and Testing Assumptions
    Why Examine the Data?
    Exploratory Data Analysis
    A Look at the Variable Cost
    A Look at the Subgroups
    Normality
    Comparing the Groups
    Homogeneity of Variance
    Effects of Violations of Assumptions in ANOVA

Chapter 3  One-Factor ANOVA
    Logic of Testing for Mean Differences
    Factors
    Running One-Factor ANOVA
    One-Factor ANOVA Results
    Post Hoc Testing
    Why So Many Tests?
    Planned Comparisons
    How Planned Comparisons are Done
    Graphing the Results
    Appendix: Group Differences on Ranks

Chapter 4  Multi-Way Univariate ANOVA
    The Logic of Testing, and Assumptions
    How Many Factors?
    Interactions
    Exploring the Data
    Two-Factor ANOVA
    The ANOVA Table
    Predicted Means
    Ecological Significance
    Residual Analysis
    Post Hoc Tests of ANOVA Results
    Unequal Samples and Unbalanced Designs
    Sums of Squares
    Equivalence and Recommendations
    Empty Cells and Nested Designs

Chapter 5  Multivariate Analysis of Variance
    Why Perform MANOVA?
    How MANOVA Differs from ANOVA
    Assumptions of MANOVA
    What to Look for in MANOVA
    Significance Testing
    Checking the Assumptions
    The Multivariate Analysis
    Examining Results
    What if Homogeneity Failed
    Multivariate Tests
    Checking the Residuals
    Conclusion
    Post Hoc Tests

Chapter 6  Within-Subject Designs: Repeated Measures
    Why Do a Repeated Measures Study?
    The Logic of Repeated Measures
    Assumptions
    Proposed Analysis
    Key Concept
    Comparing the Grade Levels
    Examining Results
    Planned Comparisons

Chapter 7  Between and Within-Subject ANOVA: (Split-Plot)
    Assumptions of Mixed Model ANOVA
    Proposed Analysis
    A Look at the Data
    Summary of Explore
    Split-Plot Analysis
    Examining Results
    Tests of Assumptions
    Sphericity
    Multivariate Tests Involving Time
    Tests of Between-Subject Factors
    Averaged F Tests Involving Time
    Additional Within-Subject Factors and Sphericity
    Exploring the Interaction - Simple Effects
    Graphing the Interaction

Chapter 8  More Split-Plot Design
    Introduction: Ad Viewing with Pre-Post Brand Ratings
    Setting Up the Analysis
    Examining Results
    Tests of Assumptions
    ANOVA Results
    Profile Plots
    Summary of Results

Chapter 9  Analysis of Covariance
    How is Analysis of Covariance Done?
    Assumptions of ANCOVA
    Checking the Assumptions
    Baseline ANOVA
    ANCOVA - Homogeneity of Slopes
    Standard ANCOVA
    Describing the Relationship
    Fitting Non-Parallel Slopes
    Repeated Measures ANCOVA with a Single Covariate
    Repeated Measures ANCOVA with a Varying Covariate
    Further Variations

Chapter 10  Special Topics
    Latin Square Designs
    An Example
    Complex Designs
    Random Effects Models

References

Exercises

SPSS Training

Chapter 1

Introduction

WHY DO ANALYSIS OF VARIANCE?

Analysis of variance is performed in order to determine whether there are differences in the means between groups or across different conditions. From a simple two-group experiment to a complex study involving many factors and covariates, the same core principle applies. Why this technique is called analysis of variance (ANOVA) and not analysis of means has to do with the methodology used to determine if the means are far enough apart to be considered “significantly” different.

VISUALIZING ANALYSIS OF VARIANCE

To examine the basic principle of ANOVA, imagine a simple experiment in which subjects are randomly assigned to one of three treatment groups, the treatments are applied, then subjects are tested on some performance measure. One possible outcome appears below. Performance scores are plotted along the vertical axis and each box represents the distribution of scores within a treatment group.

Figure 1.1 Performance Scores: Distinct Populations

Here a formal testing of the differences is almost unnecessary. The groups show no overlap in performance scores and the group means (medians are the dark bar at the center of each box) are well spaced relative to the standard deviation of each group. Think of the variation, or distances going from group mean to group mean, and compare this to the variation of the individual scores within each group.

Let us take another example. Suppose the same experiment described above results in the performance scores having little or no difference. We picture this below.

Figure 1.2 Performance Scores: Identical Populations

Here the group means are all but identical, so there is little variation or distance going from group mean to group mean compared to the variation of performance scores within the groups. A formal ANOVA analysis would merely confirm this.

A more realistic example involves groups with overlapping scores and group means that differ. This is shown in the plot below.

Figure 1.3 Performance Scores: Overlapping Groups

The formal ANOVA analysis needs to be done to determine if the group means do indeed differ in the population, that is, with what confidence we can claim that the group means are not the same. Once again, think of how the variation of the group means (distances between pairs of groups, or variation of the group means around the grand mean) relates to the variation of the performance scores within each group.

WHAT IS ANALYSIS OF VARIANCE?

Stripped of technical adjustments and distributional assumptions, comparing the variation of group means to the variation of individual scores within the groups constitutes the basis for analysis of variance. To the extent that the differences or variation between groups is large relative to the variation of individual scores within the groups, we speak of the groups showing significant differences. Another way of reasoning about the experiment we described is to say that if the treatments applied to the three groups had no effect (no group differences), then the variation in group means should be due to the same sources and be of the same magnitude (after technical adjustments) as the variation among individuals within the groups.

VARIANCE OF MEANS

The technical adjustment just mentioned is required when comparing variation in mean scores to variation in individual scores. This is because the variance of means will be less than the variance of the individual scores on which the means are based. The basic mathematical relation is that the variance of the means based on a sample size of “n” will be equal to the variance of the individual scores in the sample divided by “n”. The standard deviation of the mean is called the standard error or the standard error of the mean. We will illustrate this law with a little under 10,000 observations produced by a pseudo-random number generator in SPSS, based on a normal distribution with a mean of zero and a standard deviation of one. The results appear in Figure 1.4.

The first histogram shows the distribution of the original 9,600 data points. Notice that almost all of the points fall between the values of –3 and +3.

The second histogram contains the mean scores of samples of size 4 drawn from the original 9,600 data points. Each point is a mean score for a sample of size 4, for a total of 2,400 data points. The distribution of means is narrower than that of the first histogram; almost all the points fall between –1.5 and +1.5.

In the final histogram each point is a mean of 16 observations from the original sample. The variation of these 600 points is less than that of the previous histograms, with most points between –.9 and +.9. Despite the decrease in variance, the means (or centers of the distributions) remain at zero.

This relation is relevant to analysis of variance. In ANOVA, when comparing the variation between group mean scores to variation of individuals within groups, the sample sizes upon which the means are based are explicitly taken into account.

Figure 1.4 Variation in Means as a Function of Sample Size
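
The relation behind Figure 1.4 (variance of means equals variance of scores divided by n) can be checked with a small simulation. The sketch below is in Python rather than SPSS and is only an illustration: the seed, the function name variance_of_means, and the use of consecutive non-overlapping samples are assumptions, not the procedure used to build the figure.

```python
import random
import statistics


def variance_of_means(scores, n):
    """Variance of the means of consecutive, non-overlapping samples of size n.

    Returns the variance together with the number of means it is based on.
    """
    means = [statistics.fmean(scores[i:i + n]) for i in range(0, len(scores), n)]
    return statistics.pvariance(means), len(means)


# 9,600 pseudo-random normal scores with mean 0 and standard deviation 1.
rng = random.Random(2000)  # arbitrary seed, used only for reproducibility
scores = [rng.gauss(0.0, 1.0) for _ in range(9600)]

for n in (1, 4, 16):
    var, count = variance_of_means(scores, n)
    print(f"n={n:2d}  points={count:4d}  variance of means={var:.3f}  1/n={1 / n:.3f}")
```

With samples of size 4 and 16 the printed variances should sit close to 1/4 and 1/16, which corresponds to the narrowing seen across the three histograms.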

BASIC PRINCIPLE OF ANOVA

While we will give a formal statement of the assumptions of ANOVA and proceed with complex variations, this basic principle of comparing the variation of group or treatment means to the variation of individuals within groups (or some other grouping) will be the underlying theme.

A FORMAL STATEMENT OF ANOVA ASSUMPTIONS

The term “factor” denotes a categorical predictor variable. “Dependent variables” are interval level outcome variables, and “covariates” are interval level predictor variables. ANOVA is considered a form of the general linear model and most of the assumptions follow from that and are listed below:

• All variables must exhibit independent variance. In other words, a variable must vary, and it must not be a one-to-one function of any other variable. Though it is the dream of any data analyst to have a dependent variable that is perfectly predicted, if such were the case, the “F-ratio” for an analysis of variance could not be formed (Note: as a practical matter, if you find such a perfect prediction, lack of an “F-ratio” should not result in any lost sleep).

• Dependent variables and covariates must be measured on an interval or ratio scale. Factors may be nominal or categorized from ordinal or interval variables. However, ordinal hypotheses can only be tested in a pairwise fashion. Interval hypotheses can be tested by imposing the desired metric through the appropriate set of contrasts.

• For fixed effect models, all levels of predictor variables that are of interest must be included in the analysis.

• The linear model specified is the correct one; it includes all the relevant sources of variation, excludes all irrelevant ones, and is correct in its functional form (Note: in the words of the Sgt. in Hill Street Blues “so, be careful out there”).

• Errors of measurement must be unbiased (have a zero mean).

• Errors must be independent of each other and of the predictor variables.

• Error variances must be homogeneous.

• Errors must be normally distributed. This final assumption is not required for estimation, but must be met in order for an “F-ratio” to be accurately compared with an “F-distribution” (Note: that is, it is required for testing, which is why you are doing the analysis).

We will examine some of these assumptions in the data sets used in the rest of this course.

We use an “analysis of variance” to test for differences between means for the following formal reason:

    The formulation of the analysis of variance approach as a test of equality of means follows a deductive format. We can show that if it is true that two (or more) means are equal, then certain properties must hold for other functions of the data, such as between group and within group variation. The idea behind the formulation of the familiar “F-ratio” is that if the means being compared are equal, then the numerator and denominator of the “F-ratio” represent independent estimates of the same quantity (error variance) and their ratio must then follow a known distribution. This allows us to place a distinct probability on the occurrence of sample means as different as those observed under the hypothesis of zero difference among population means.
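
To make the two estimates in the “F-ratio” concrete, here is a minimal Python sketch (not SPSS output) for a one-factor layout. The three small groups of scores are made up for illustration, and the function name f_ratio is an assumption; the numerator is the between-groups mean square and the denominator is the within-groups mean square.

```python
from statistics import fmean


def f_ratio(groups):
    """Return (F, df_between, df_within) for a one-factor layout."""
    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)

    # Between groups: squared distance of each group mean from the grand
    # mean, weighted by group size (this is where sample size enters).
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: squared distance of each score from its own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between  # estimate from variation among means
    ms_within = ss_within / df_within     # estimate from variation among individuals
    return ms_between / ms_within, df_between, df_within


# Hypothetical performance scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f, df_b, df_w = f_ratio(groups)
print(f"F({df_b}, {df_w}) = {f:.2f}")
```

If the population means were equal, both mean squares would estimate the same error variance and the ratio would tend to be near 1; turning F into a probability requires the F distribution, which this sketch leaves out.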

SUMMARY

In this chapter we discussed the basic principle of analysis of variance and gave a formal statement of the assumptions of the model. We turn next to examining these assumptions and the implications if the assumptions are not met (Note: life as it really is).

Chapter 2

Examining Data and Testing Assumptions

DESCRIPTION OF THE DATA

The data set comes from Cox and Snell (1981). They obtained it from a report (Mooz, 1978) and reproduced it with the permission of the Rand Corporation. Only a subset of the original variables is used in the data set we will use.

The data set we will be using contains information for 32 light water nuclear power plants. Four variables are included: the capacity and cost of the plant; time to completion from start of construction; and experience of the architect-engineer who built the plant. These variables are described in more detail below.

We will use only a subset of all the variables that were in the original data set, and have created categories from the variables capacity and experience in order to use them as factors in an analysis of variance.

In order of the variables in the data file, they are:

Capacity      Generating capacity
                1   Less than 800 MW (megawatts)
                2   800-1000 MW
                3   Greater than 1000 MW

Experience    Experience of the architect-engineer in building power plants
                1   1-3 plants
                2   4-9 plants
                3   10 or more plants

Time          Time in months between issuing of construction permit and issuing of operating license

Cost          Cost in millions of dollars, adjusted to a 1976 base (in 1976 dollars)

Note About the Analyses That Follow

The analyst should choose the analysis that best conforms to the type of information collected in the data and to the research or analysis question(s) to be answered. We feel that in a short course there is an advantage in describing the various types of analyses that can be done. However, in practice you would run only the most appropriate analysis. In other words, if there were two factors in your study, you would run a two-factor analysis and not begin with a one-factor analysis as we do here.

Research Question(s)

The researcher should state the research questions clearly and concisely, and refer to these questions regularly as the design and implementation of the study progresses. Without this statement of questions, it is easy to deviate from them when engrossed in the details of planning or to make decisions that are at variance with the questions when involved in a complex study. Translating study objectives into questions serves as a check on whether the study has met the objectives.

The next task is to analyze the researchable question(s). In doing this one must:

• Identify and define key terms
• Identify sub-questions, which must also be answered
• Identify the scope and time frame imposed by the researchable question

Data to be Collected

One of the most important decisions, and one that should not be overlooked, is to set down in terms of utmost clarity exactly what information is needed. It is usually good procedure to verify that all the data are relevant to the purposes of the study and that no essential data are omitted. Unless this is specified, the reporting forms may yield information that is quite different from what is needed, since there is a tendency to request too much data, some of which is subsequently never analyzed.

Know the Data

It is critical that the researcher be familiar with the data being analyzed, whether it is primary (data you collected) or secondary (data someone else collected). Not only is knowing your data important to defining your population, but it can (1) help to spot trends on which to focus, and (2) provide assurance that you are measuring what you want to measure.

Scan the Data

Visually review the data for several cases (or the entire data set if it is relatively small). Be familiar with the meaning of every variable and with the codes associated with the variables of interest.

WHY EXAMINE THE DATA?

Before applying formal tests (ANOVA, for example, in this course) to your data, it is important to first examine and check the data. This is done for several reasons:

• To identify data errors
• To identify unusual points (outliers)
• To become aware of unexpected or interesting patterns
• To check on or test the assumptions of the planned analysis. For ANOVA:
    • Homogeneity of variance
    • Normality of error

EXPLORATORY DATA ANALYSIS

Bar charts and histograms, as well as such summaries as means and standard deviations, have been used in statistical work for many years. Sometimes such summaries are ends in their own right; other times they constitute a preliminary look at the data before proceeding with more formal methods. Seeing limitations in this standard set of procedures, John Tukey, a statistician at Princeton and Bell Labs, devised a collection of statistics and plots designed to reveal data features that might not be readily apparent from standard statistical summaries. In his book describing these methods, entitled Exploratory Data Analysis (1977), Tukey described the work of a data analyst as similar to that of a detective, the goal being to discover surprising, interesting, and unusual things about the data. To further this effort Tukey developed both plots and data summaries. These methods, called exploratory data analysis and abbreviated EDA, have become very popular in applied statistics and data analysis. Exploratory data analysis can be viewed either as an analysis in its own right, or as a set of data checks and investigations performed before applying inferential testing procedures.

These methods are best applied to variables that have at least ordinal (more commonly interval) scale properties and can take on many different values. The plots and summaries would be less helpful for a variable that takes on only a few values (for example, on five-point rating scales).

Plan of Analysis

We will use the SPSS EXPLORE procedure to examine the data and test some of the ANOVA assumptions. In Windows, we first open the file.

Note on Course Data Files

All files for this class are located in the c:\Train\Anova folder on your training machine. If you are not working in an SPSS Training center, the training files can be copied from the floppy disk that accompanies this course guide. If you are running SPSS Server (click File..Switch Server to check), then you should copy these files to the server or to a machine that can be accessed (mapped) from the computer running SPSS Server.

A Note About Variable Names and Labels in Dialog Boxes

SPSS can display either variable names or variable labels in dialog boxes. In this course we display the variable names in alphabetical order. In order to match the dialog boxes shown here:

    Click Edit..Options
    Within the General tab of the Options dialog:
    Click the Display names and Alphabetical option buttons in the Display Variables area
    Click OK.

To open the file and start the Explore procedure:

    Click File..Open..Data (move to the c:\Train\Anova directory)
    Select SPSS Portable file (.por) from the Files of Type list
    Double-click on Plant.por to open the file.
    Click on Analyze..Descriptive Statistics..Explore
    Move the cost variable into the Dependent List box

Figure 2.1 Explore Dialog Box

The syntax for running the Explore procedure is given below:

    EXAMINE
      VARIABLES=cost
      /PLOT BOXPLOT STEMLEAF
      /COMPARE GROUP
      /STATISTICS DESCRIPTIVES
      /CINTERVAL 95
      /MISSING LISTWISE
      /NOTOTAL.

The variable to be summarized (here cost) appears in the Dependent List box. The Factor List box can contain one or more categorical variables (for example, capacity in our data set), and if used would cause the procedure to present summaries for each subgroup based on the factor variable(s). We will use this feature later in this chapter when we want to see differences between the groups. By default, both plots and statistical summaries will appear. We can request specific statistical summaries and plots using the Statistics and Plots pushbuttons. While not discussed here, the Explore procedure can print robust mean estimates (M-estimators) and lists of extreme values, as well as normal probability and homogeneity plots.

    Click OK to run the Explore procedure.

A LOOK AT THE VARIABLE COST

In this first run, the Explore procedure provides a summary of the variable cost for all 32 plants.

Figure 2.2 Descriptives for the Variable Cost

Explore first displays information about missing data. The Case Processing Summary pivot table (not shown) displays the number of valid and missing observations; this information appears at the beginning of the statistical summary. Here we have data for the variable cost for all 32 observations. (Typically an analyst does not have all the data.)

Measures of Central Tendency

Next, several measures of central tendency appear. Such statistics attempt to describe, with a single number, where the data values are typically found, or the center of the distribution. The mean is the arithmetic average. The median is the value at the center of the distribution when it is ordered (either lowest to highest or highest to lowest); that is, half the data values are greater than, and half the data values are less than, the median. Medians are resistant to extreme scores, and so are considered to be a robust measure of central tendency. The 5% trimmed mean is the mean calculated after the extreme upper 5% and the extreme lower 5% of the data values are dropped from the calculation. Such a measure would be resistant to small numbers of extreme or wild scores. In this case the three measures of central tendency are similar (461.56, 448.11, and 455.67), and we can say that the typical plant costs about $450 million. If the mean were considerably above or below the median and the trimmed mean, it would suggest a skewed or asymmetric distribution. A perfectly symmetric distribution, for example the normal, would produce identical expected means, medians, and trimmed means.
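
The following Python sketch illustrates why the median and trimmed mean resist extreme scores while the mean does not. The data are made up (they are not the plant costs), and the trimming rule used here, dropping a whole number of cases from each tail, is a simplifying assumption; the text does not spell out how SPSS handles fractional cases.

```python
from statistics import fmean, median


def trimmed_mean(values, proportion=0.05):
    """Mean after dropping the given proportion of cases from each tail.

    Simplification: only whole cases are dropped (int() truncates), so with
    fewer than 20 cases and proportion=0.05 nothing is trimmed at all.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # cases dropped from each end
    return fmean(ordered[k:len(ordered) - k])


# Hypothetical scores: nineteen well-behaved values and one wild score.
costs = list(range(1, 20)) + [1000]

print("mean        ", fmean(costs))
print("median      ", median(costs))
print("trimmed mean", trimmed_mean(costs))
```

Here the single wild score pulls the mean far above the other two measures, which is the pattern the text describes as a sign of a skewed distribution.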

Variability Measures

Explore provides several measures of the amount of variation across the plants. They indicate to what degree observations tend to cluster near the center of the distribution. Both the standard deviation and variance (standard deviation squared) appear. For example, if all the observations were located at the mean then the standard deviation would be zero. In this case the standard deviation is $170.12 (million). Another way to express the variability is that the standard deviation is 36.86% of the mean, which indicates that the data are moderately variable. The standard error ($30.07) is an estimate of the standard deviation of the mean if repeated samples of the same size were taken from the same population. It is used in calculating the 95% confidence interval for the sample mean discussed below. Also appearing is the interquartile range (321.74), which is essentially the range between the 25th and 75th percentile values. Thus the interquartile range represents the range including the middle 50 percent of the sample. It is a variability measure more resistant to extreme scores than the standard deviation. We also see the minimum and maximum dollar amounts and the range. It is useful to check the minimum and maximum to make sure no impossible data values are recorded (here a cost at zero or below).

Confidence Interval for Mean

The 95% confidence interval has a technical definition: if we were to repeatedly perform the study and compute the confidence interval for each sample drawn, on average, 95 out of each 100 such confidence intervals would contain the true population mean. It is useful in that it combines measures of both central tendency (mean) and variation (standard error) to provide information about where we should expect the population mean to fall. Here, we can say that we estimate the mean cost of the light water nuclear power plants to be $461.56 million and we are 95-percent confident that the true but unknown mean cost would be between $400.23 million and $522.90 million.

The 95% confidence interval for the mean can be easily obtained from the sample mean, standard deviation, and sample size. The confidence interval is based on the sample mean, plus or minus 1.96 times the standard error of the mean. (1.96 is used because 95% of the area under a normal curve is within 1.96 standard deviations of the mean [when doing this in my head I cheat and use 2, since it is easier to multiply by]). Strictly, for a sample as small as 32 the multiplier comes from the t distribution (about 2.04 for 31 degrees of freedom), which is what reproduces the interval of 400.23 to 522.90 reported above; 1.96 is the large-sample value. Since the sample standard error of the mean is simply the sample standard deviation divided by the square root of the sample size, the 95% confidence interval is equal to the sample mean plus or minus 1.96 times (sample standard deviation divided by the square root of the sample size). Thus if you have the sample mean, sample standard deviation, and the sample size, you can easily compute the 95-percent confidence interval.
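
The arithmetic for the standard error and the confidence interval can be worked through with the figures reported for cost (mean 461.56, standard deviation 170.12, 32 plants). This is a Python sketch, not SPSS output. The function name is hypothetical, and the t multiplier of 2.0395 for 31 degrees of freedom is a tabled value entered by hand as an assumption, because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, multiplier):
    """Return (lower, upper, standard_error) for mean +/- multiplier * SE."""
    se = sd / sqrt(n)  # standard error = standard deviation / sqrt(sample size)
    half_width = multiplier * se
    return mean - half_width, mean + half_width, se


mean, sd, n = 461.56, 170.12, 32      # figures reported for cost, $ millions

z = NormalDist().inv_cdf(0.975)       # about 1.96, the large-sample multiplier
t_31 = 2.0395                         # assumption: tabled t value, 31 df, 95% two-sided

lo_z, hi_z, se = mean_confidence_interval(mean, sd, n, z)
lo_t, hi_t, _ = mean_confidence_interval(mean, sd, n, t_31)

print(f"standard error          {se:.2f}")
print(f"sd as % of mean         {100 * sd / mean:.2f}")
print(f"95% CI using 1.96       {lo_z:.2f} to {hi_z:.2f}")
print(f"95% CI using t (31 df)  {lo_t:.2f} to {hi_t:.2f}")
```

The t-based interval matches the reported 400.23 to 522.90 to within rounding, while the 1.96 version is slightly narrower; the gap shrinks as the sample size grows.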

Shape of the Distribution

Skewness and kurtosis provide numeric summaries about the shape of the distribution of the data. While many analysts are content to view histograms in order to make judgments regarding the distribution of a variable, these measures quantify the shape. Skewness is a measure of the symmetry of a distribution. It is normed so that a symmetric distribution has zero skewness. Positive skewness indicates bunching of the data on the left and a longer tail on the right (for example, income distribution in the U.S.); negative skewness follows the reverse pattern (long tail on the left and bunching of the data on the right). The standard error of skewness also appears, and we can use it to determine if the data are significantly skewed. In our case, the skewness is .5 with a standard error of .414. Thus, using the formula above (the estimate plus or minus 1.96 times its standard error), the 95-percent confidence interval for skewness is between –0.311 and +1.311. Since the interval contains zero, the data are not significantly skewed. (As a quick and dirty rule of thumb, however, if the skewness is over 3 in either direction you might want to consider a different approach in your study.)

Kurtosis also has to do with the shape of a distribution and is a measure of how peaked the distribution is. It is normed to the normal curve (for which kurtosis is zero). A curve that is more peaked than the normal has a positive value and one that is flatter than the normal has negative kurtosis. Again our data are not significantly peaked. (Again the same rule of thumb can be applied, although some say that the value should be larger.) The shape of the distribution can be of interest in its own right. Also, assumptions are made about the shape of the data distribution within each group when performing significance tests on mean differences between groups. (As a quick rule of thumb, however, if the kurtosis is over 3 in either direction you might want to consider a different approach in your study.)

Stem & Leaf Plot

The stem & leaf plot is modeled after the histogram, but is designed to provide more information. Instead of using a standard symbol (for example, an asterisk “*” or block character) to display a case or group of cases, the stem & leaf plot uses data values as the plot symbols. Thus the shape of the distribution is shown and the plot can be read to obtain specific data values. The stem & leaf plot for cost appears below:

Figure 2.3 Stem & Leaf Plot for Cost

In a stem & leaf plot the stem is the vertical axis and the leaves
branch horizontally from the stem (Tukey devised the stem & leaf). The
stem width indicates how to interpret the units in the stem; in this case a
stem unit represents one hundred dollars on the cost scale. The actual
numbers in the chart (leaves) provide an extra decimal place of
information about the data values. For example, a stem of 5 and a leaf
of 6 would indicate a cost of $560 to $569. Thus besides viewing the shape
of the distribution we can pick out individual scores. Below the diagram a
note indicates that each leaf represents one case. For large samples a leaf
may represent two or more cases, and in such situations an ampersand
(&) represents two or more cases that have different data values.
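To make the stem and leaf idea concrete, here is a minimal Python sketch that builds the rows of such a plot with a stem width of 100, so that each leaf is the tens digit. The cost values are made up for illustration and are not the plant data; the function name is ours, and the sketch ignores the outlier line and the ampersand convention that SPSS adds.

```python
from collections import defaultdict


def stem_and_leaf(values, stem_width=100):
    """Map each stem to a string of leaves (the next digit after the stem)."""
    leaf_unit = stem_width // 10
    rows = defaultdict(list)
    for v in sorted(values):
        stem = int(v // stem_width)
        leaf = int((v % stem_width) // leaf_unit)
        rows[stem].append(leaf)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(rows.items())}


# Hypothetical costs, assumed non-negative.
costs = [289, 345, 412, 460, 465, 560, 568, 621, 881]
plot = stem_and_leaf(costs)
for stem, leaves in plot.items():
    print(f"{stem:>3} . {leaves}")
```

A stem of 5 with leaves "66" tells us there are two plants costing between 560 and 569, which is exactly how the SPSS plot is read.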

The last line identifies outliers. These are data points far enough
from the center of the distribution (defined more exactly under Box &
Whisker plots below) that they might merit more careful checking:
extreme points might be data errors or possibly represent a separate
subgroup. If the stem & leaf plot were extended to include these outliers,
the skewness would be apparent.

The stem & leaf plot attempts to describe data by showing every
observation. In comparison, the box & whisker plot displays only a few
summaries and identifies outliers (data values far from the center of the
distribution). Below we see the box & whisker plot (also called a box plot)
for cost.

Figure 2.4 Box & Whisker Plot for Cost

Box & Whisker Plot

The vertical axis is the cost of the plants. In the plot, the solid line
inside the box represents the median. The “hinges” provide the top and
bottom borders to the box; they correspond to the 75th and 25th percentile
values of cost, and thus define the interquartile range (IQR). In other
words, the middle 50% of the data values fall within the box. The
“whiskers” extend to the last data values that lie within 1.5 box lengths
(or IQRs) of the respective hinge (edge of box). Tukey considers data points
more than 1.5 box lengths from the hinges to be far enough from the
center to be noted as outliers. Such points are marked with a circle.
Points more than 3 box lengths from the hinges are viewed by Tukey as
“far out” points and are marked with an asterisk-type symbol. This
plot has no outliers or far-out points. If a single outlier appears at a given
data value, the case sequence number prints out beside it (an id variable
can be substituted), which aids data checking.
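The rules above translate directly into code. The following Python sketch classifies each point by its distance from the nearer hinge, measured in box lengths. Two caveats: the data are invented, and we use the quartiles from Python's statistics module as the hinges, which may differ slightly from the Tukey hinges SPSS computes.

```python
import statistics


def box_plot_summary(values):
    """Classify points by their distance from the box, in box lengths (IQRs)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    box_length = q3 - q1
    outliers, far_out, inside = [], [], []
    for v in sorted(values):
        # Distance beyond the nearer hinge; zero for points inside the box.
        distance = max(q1 - v, v - q3, 0)
        if distance > 3 * box_length:
            far_out.append(v)
        elif distance > 1.5 * box_length:
            outliers.append(v)
        else:
            inside.append(v)
    return {
        "hinges": (q1, q3),
        "median": median,
        # Whiskers end at the last values that are not flagged.
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
        "far_out": far_out,
    }


# Hypothetical values with one outlier (30) and one far-out point (60).
costs = [10, 12, 13, 14, 15, 16, 17, 18, 20, 30, 60]
summary = box_plot_summary(costs)
print(summary)
```

Here the hinges are 13.5 and 19, so the box length is 5.5; the value 30 lies 11 units above the upper hinge (between 1.5 and 3 box lengths) and 60 lies more than 3 box lengths away.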

If the distribution were symmetric, then the median would be
centered within the hinges and the whiskers. In the plot above, the
different lengths of the whiskers show the skewness. Such plots are also
useful when comparing several groups, as we will see shortly.

A LOOK AT THE SUBGROUPS

We now produce the same summaries and plots for each subgroup (here
based on plant capacity).

Click on the Dialog Recall tool on the toolbar.
Click on the Explore procedure.
When the dialog box opens, move the variable capacity to the
Factor List box.

Figure 2.5 Explore Dialog Box

We also request normality plots and homogeneity tests.

Click the Plots pushbutton
Click the Normality plots with tests check box
Click the Power estimation option button

Figure 2.6 Plots Sub-Dialog Box

Click Continue
Click OK

The command below will run the analysis:

EXAMINE
  VARIABLES=cost BY capacity
  /PLOT BOXPLOT STEMLEAF NPPLOT SPREADLEVEL
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

The NPPLOT keyword on the /PLOT subcommand requests the normal
probability plots, while the SPREADLEVEL keyword produces the spread
& level plots and the homogeneity of variance tests.

Below we see the statistics and the stem & leaf plot for the first
capacity group (under 800 MW). Notice that relative to the group (not the
entire set of plants as in the previous plots) there is an extreme score.


Figure 2.7 Descriptives for the First Group

Figure 2.8 Stem & leaf Plot for the First Group

NORMALITY

The next pair of plots provides some specific information about the
normality of data points within the group. This is equivalent to
examining the normality of the residuals in ANOVA; normality is one of the
assumptions made when the “F” tests of significance are performed.


Figure 2.9 Q-Q Plot of the First Group

Figure 2.10 Detrended Q-Q Plot of the First Group

The first plot is called a normal probability plot. Each point is plotted
with its actual value on the horizontal axis and its expected normal
deviate value (based on the point’s rank-order within the group) on the
vertical axis. If the data follow a normal distribution, the points form a
straight line.

The second plot is a detrended normal plot. Here the deviations of
each point from a straight line (normal distribution) in the previous plot
are plotted against the actual values. Ideally, they would be distributed
randomly around zero.
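The two plots can be imitated numerically. The Python sketch below computes an expected normal deviate for each point from its rank, and then the detrended deviations. It assumes Blom's proportion formula, (rank - 3/8) / (n + 1/4), which is one common choice; the text does not say which formula Explore uses. The data are hypothetical, and ties are not handled.

```python
from statistics import NormalDist, mean, stdev


def normal_scores(values):
    """Expected normal deviate for each value, based on its rank (no ties)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        proportion = (rank - 0.375) / (n + 0.25)  # Blom's formula (assumed)
        scores[i] = NormalDist().inv_cdf(proportion)
    return scores


def detrended(values):
    """Standardized value minus expected normal deviate, one per point."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s - e for v, e in zip(values, normal_scores(values))]


# Hypothetical costs for one group.
costs = [410, 455, 470, 520, 690]
scores = normal_scores(costs)
deviations = detrended(costs)
for cost, score, dev in zip(costs, scores, deviations):
    print(f"{cost:>4}  expected normal {score:6.3f}  deviation {dev:6.3f}")
```

Plotting cost against the expected normal value gives the normal probability plot; plotting cost against the deviation gives the detrended plot, where a random scatter around zero is the ideal.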

Next we l ook at the second group.


Figure 2.11 Descriptives for the Second Group

Figure 2.12 Stem & Leaf Plot for the Second Group

For the second group the stem & leaf plot shows a concentration of
costs at the low end.

Figure 2.13 Q-Q Plot for the Second Group


Figure 2.14 Detrended Q-Q Plot for the Second Group

The pattern from the stem & leaf plot carries over to the normal
probability plot, where the cluster of low cost values shows in the lower
left corner of the plot.

Let us examine the results for the third group.

Figure 2.15 Descriptives for the Third Group


Figure 2.16 Stem & Leaf Plot for the Third Group

Figure 2.17 Q-Q Plot for the Third Group


Figure 2.18 Detrended Q-Q Plot for the Third Group

In addition to a visual inspection, two tests of normality of the data
are provided. The test labeled Kolmogorov-Smirnov is a modification of
that test using the Lilliefors Significance Correction (for the case in
which means and variances must be estimated from the data); it compares
the distribution of the data values within the group to the normal
distribution. The Shapiro-Wilk test also compares the observed data to the
normal distribution and has been found to have good power in many
situations when compared to other tests of normality (see Conover, 1980).
For the first group there seem to be no problems regarding normality, nor
any strikingly odd data values. Notice also that for the second group the
tests of normality reject the null hypothesis that the data come from a
normal distribution, while for the third group the null hypothesis is not
rejected.

Figure 2.19 Tests of Normality

COMPARING THE GROUPS

The box and whisker plot allows visual comparison of the groups.

Figure 2.20 Box and Whiskers Plot

The third group appears to contain higher cost plants than the first
and second groups. The variation within each group as gauged by the
whiskers seems fairly uniform. Notice that the outlier in group one is
identified by its case sequence number. There does not seem to be any
increase in variation or spread as the median cost rises from the first to
the third group.

HOMOGENEITY OF VARIANCE

Homogeneity of variance within each population group is one of the
assumptions in ANOVA. This can be tested by any of several statistics,
and if the variance is systematically related to the level of the group
(mean, median), data transformations can be performed to relieve this (we
will say more on this later in this chapter). The spread and level plot
below provides a display of this by plotting the natural log of the spread
(interquartile range) of the group against the natural log of the group
median. If you can overcome a seemingly inborn aversion to logs and view
the plot, what we want to see is relatively little variation in the log
spread going across the groups, which would suggest that the variances are
stable across groups. The reason for taking logs is technical. If there is
a systematic relation between the spread and the level (or variances and
means), the slope of the best fitting line indicates what data
transformation (within the class of power transformations) will best
stabilize the variances across the different groups. We will say more
about such transformations later.
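As a rough illustration of how the spread and level plot leads to a transformation, the Python sketch below fits the least-squares slope of log spread on log level. It assumes the usual convention that the suggested power is one minus the slope (so a slope near one points to a log transformation and a slope near zero to no transformation); the group medians and interquartile ranges are invented.

```python
import math


def spread_level_slope(medians, iqrs):
    """Least-squares slope of ln(IQR) on ln(median) across groups."""
    xs = [math.log(m) for m in medians]
    ys = [math.log(s) for s in iqrs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical groups whose spread grows in proportion to their level.
medians = [100.0, 200.0, 400.0]
iqrs = [20.0, 40.0, 80.0]

slope = spread_level_slope(medians, iqrs)
power = 1 - slope  # assumed convention: power of the suggested transformation
print(f"slope = {slope:.3f}, suggested power = {power:.3f}")
```

In this made-up case the slope is one, giving a power of zero, which by convention stands for the log transformation. For the plant data the plot shows little relation between spread and level, so no transformation is suggested.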


Figure 2.21 Spread and Level Plot

Figure 2.22 Test of Homogeneity of Variance

A number of tests are available for testing homogeneity of variance,
such as the Bartlett-Box and Cochran’s C tests of homogeneity of
variance. However, these are sensitive to departures from normality as
well. The Levene tests appearing above are less sensitive to departures
from normality and might be preferred for that reason. Some statisticians
consider the former tests too powerful in general; that is, they tend to
reject the homogeneity of variance assumption when the differences are
too small to influence the analysis. Above, the Levene test suggests no
problem with the homogeneity assumption.
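For readers who want to see what the Levene test computes, here is a minimal Python sketch. It takes the absolute deviation of each observation from its group mean and runs a one-way analysis of variance on those deviations; the resulting F ratio is the Levene statistic. We assume the mean-centered form of the test; the data are invented, and the sketch stops at the statistic because the standard library has no F distribution for the p-value.

```python
from statistics import mean


def one_way_f(groups):
    """F ratio (between-group / within-group mean squares) for a one-way layout."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def levene_statistic(groups):
    """One-way F computed on absolute deviations from each group's mean."""
    abs_dev = [[abs(v - mean(g)) for v in g] for g in groups]
    return one_way_f(abs_dev)


# Hypothetical groups: equal spreads versus clearly unequal spreads.
equal_spread = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
unequal_spread = [[9, 10, 11], [5, 10, 15], [0, 10, 20]]

w_equal = levene_statistic(equal_spread)
w_unequal = levene_statistic(unequal_spread)
print(w_equal, w_unequal)
```

Because the test works on absolute deviations rather than on variances directly, it depends less on the normality assumption, which is the property the text relies on.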

Summary of the Plant Data

Overall the data fared fairly well in terms of the ANOVA assumptions.
The only problem was normality of group 2. Had inequality of variances
been a problem, a data transformation might have relieved the
difficulty, but no such transformation is called for here. Since two of the
three groups seem fine we will proceed with the analysis.

EFFECTS OF VIOLATIONS OF ASSUMPTIONS IN ANOVA

Below we state in more detailed and formal terms the implications of
violations of the assumptions and the general conditions under which they
constitute a serious problem.

Normality of Errors in the Population

In the fixed effects model this assumption is equivalent to assuming that
the dependent variable is normally distributed in the population, since all
other terms in the model are considered fixed effects. “F” and “t”
tests used to test for differences among means in the analysis of variance
are unaffected by non-normality in large samples (this has led to the
common practice of referring to the analysis of variance as robust with
respect to violations of the normality assumption). Less is known about
small sample behavior, but the current belief among most statisticians is
that normality violations are generally not a cause for concern in fixed
effect models.

While inferences about means are generally not heavily affected by
non-normality, inferences about variances and about ratios of variances
are quite dependent on the normality assumption. Thus random effects
models are vulnerable to violations of normality where fixed effects
models are not. More important in the general case, since most analyses
of variance involve fixed effects models, is the fact that many standard
tests of the homogeneity of error variance depend on inferences about
variances, and are therefore vulnerable to violations of the normality
assumption.

Tests of the homogeneity of variance assumption such as the Bartlett-Box
F, Cochran’s C and the F-max criterion all assume normality and are
inaccurate in the presence of nonzero population kurtosis. If the
population kurtosis is positive (signifying a peaked or leptokurtic
distribution), these tests will tend to reject the homogeneity assumption
too often, while a negative population kurtosis (indicative of a flat or
platykurtic distribution) will lead to too many failures to recognize
violations of the homogeneity assumption. For this reason the Levene test
for homogeneity of variance (included in the Explore procedure) is
strongly recommended, as it is robust to violations of the normality
assumption.

Homogeneity of Population Error Variances Among Groups

Violations of the homogeneity of variance assumption are in general more
troublesome than violations of the normality assumption. In general, the
smaller the sample sizes of the groups and the more dissimilar the sizes
of the groups, the more problematic violations of this assumption become.
Thus in a large sample with equal group sizes, even moderate to severe
departures from homogeneity may not have large effects on inferences,
while in small samples with unequal group sizes, even slight to moderate
departures can be troublesome. This is one reason that statisticians
recommend large samples and equal group sizes whenever possible.

The magnitude of the effect of violations of the homogeneity assumption
on the actual Type I error level depends on how dissimilar the variances
are, how large the sample is, and how dissimilar the group sizes are, as
mentioned above. The direction of the distortion of the actual Type I error
level depends on the relationship between variances and group sizes.
Smaller samples from populations with larger variances lead to inflation
of the actual Type I error level, while smaller samples from populations
with smaller variances result in actual Type I error levels smaller than
the nominal test alpha levels.
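The direction of the distortion described above can be checked with a small simulation. The Python sketch below is an illustration under our own assumptions, not part of the course analysis: it uses two groups of sizes 5 and 20 with equal population means, a pooled-variance t test (equivalent to the ANOVA F test for two groups), and the tabled two-sided 5% critical value of t with 23 degrees of freedom.

```python
import random

T_CRIT_DF23 = 2.069  # two-sided 5% critical value of t with 23 degrees of freedom


def sample_mean(xs):
    return sum(xs) / len(xs)


def sample_var(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pooled_t(a, b):
    """Independent-groups t statistic using the pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_var(a) + (nb - 1) * sample_var(b)) / (na + nb - 2)
    return (sample_mean(a) - sample_mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def type1_rate(sd_small_group, sd_large_group, reps=2000, seed=1):
    """Share of replications rejecting a true null (n = 5 versus n = 20)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        small = [rng.gauss(0, sd_small_group) for _ in range(5)]
        large = [rng.gauss(0, sd_large_group) for _ in range(20)]
        if abs(pooled_t(small, large)) > T_CRIT_DF23:
            rejections += 1
    return rejections / reps


# Smaller sample drawn from the more variable population, then the reverse.
inflated = type1_rate(sd_small_group=3.0, sd_large_group=1.0)
deflated = type1_rate(sd_small_group=1.0, sd_large_group=3.0)
print(f"small n, large variance: {inflated:.3f}")
print(f"small n, small variance: {deflated:.3f}")
```

With the nominal level at .05, the first rate should come out well above .05 and the second well below it, which is the pattern stated in the text.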

Population Errors Uncorrelated with Predictors and with Each Other

Violations of the independence assumption can be serious even with large
samples and equal group sizes. Methods such as generalized least
squares should be used with autocorrelated data.

Two further points should be considered here. First, our discussion
has centered on the impact of violations of assumptions on the actual
Type I (alpha) error level. When considerations such as the power of a
particular test are introduced, the situation can quickly become much
more complicated. In addition, most of the work on the effects of
assumption violations has considered each assumption in isolation. The
effects of violations of two or more assumptions simultaneously are less
well known. For more detailed discussions of these topics, see Scheffe
(1959) or Kirk (1982). Also, see Wilcox (1996, 1997), who discusses the
effects of ANOVA assumption violation and presents robust alternatives.

A Note on Transformations

Many researchers deal with violations of normality or homogeneity of
variance assumptions by transforming their dependent variable in a
nonlinear manner. Such transformations include natural logarithms,
square roots, etc. These types of transformations are also employed to
achieve additivity of effects in factorial designs with non-crossover
interactions. There are, however, serious potential problems with such an
approach.

While statistical procedures such as those employed by SPSS are not
concerned with the sources of the numbers they are used to analyze, and
will produce valid probabilities assuming only that distributional
assumptions are met, the interpretation of analyses of transformed data
can be quite problematic if the transformation employed is nonlinear.

If data are originally measured on an interval scale, which the
calculation of means assumes, then nonlinearly transforming the
dependent variable and running a standard analysis results in a very
different set of questions being asked than with the dependent variable in
the original metric. Aside from the fact that a nonlinear transformation of
an interval scale destroys the interval properties assumed in the
calculation of means, the test of equality of a set of means of nonlinearly
transformed data does not test the hypothesis that the means of the
original data are equal, and there is no one-to-one relationship between
the two tests. Attempts to back-transform parameter estimates by
applying the inverse of the original transformation in order to apply the
results to the original research hypothesis do not work. The bias
introduced is a complicated one that actually increases with increasing
sample size. For further information on this bias, see Kendall & Stuart
(1968).
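A tiny numerical example shows why a test on transformed data answers a different question. In the Python sketch below (all numbers invented), two groups have identical means in the original metric but different means after a log transformation, and back-transforming the mean of the logs returns the geometric mean rather than the original mean.

```python
import math
from statistics import mean


def mean_log(values):
    """Mean of the natural logs of the values."""
    return mean(math.log(v) for v in values)


# Two hypothetical groups with the same arithmetic mean (50).
group_a = [50.0, 50.0]
group_b = [10.0, 90.0]
print(mean(group_a), mean(group_b))          # equal in the original metric
print(mean_log(group_a), mean_log(group_b))  # unequal after taking logs

# Back-transforming the mean of the logs does not recover the mean.
costs = [1.0, 10.0, 100.0]
original_mean = mean(costs)                   # 37.0
back_transformed = math.exp(mean_log(costs))  # geometric mean, about 10
print(original_mean, back_transformed)
```

So equality of means on the log scale is a statement about geometric means, not about the means of the original cost values.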

The practical implications of this point are that studies should be
designed so that the variables which are of interest are measured, that
care should be taken to see that they meet the assumptions required to make
the computation of basic descriptive statistics meaningful, and that
commonly applied transformations in cases where ANOVA model
assumptions are violated may cause more trouble than they avert.
Accurate probabilities attached to significance tests of the equality of
meaningless quantities are of even less use than distorted probabilities
attached to tests concerning meaningful variables, especially when the
direction and magnitude of the distortions are to some degree estimable and
can be taken into account when interpreting research results.

SUMMARY

In this chapter we discussed the implications of violation of some of the
assumptions of ANOVA: homogeneity of variance and normality of error.
We used exploratory data analysis techniques on the data set prior to
formal analysis in order to view the data and check on the assumptions.
In the next chapter we will proceed with the actual one-factor ANOVA
analysis and consider planned and post-hoc comparisons.

One-Factor Anova 3 - 1

Chapter 3

One-Factor ANOVA

Objective

Apply the principles of testing for population mean differences to
situations involving more than two comparison groups. Understand the
concept behind and the practical use of post-hoc tests applied to a set of
sample means.

Method

We will run a one-factor (Oneway procedure) analysis of variance
comparing the different capacity groups on the cost of building a nuclear
power plant. Then, we will rerun the analysis requesting multiple
comparison (post hoc) tests to see specifically which population groups
differ. We will then plot the results using an error bar chart. The
appendix contains a nonparametric analysis of the same data.

Data

We use the light water nuclear power plant data used in the last chapter.

Scenario

We wish to investigate the relationship between the level of capacity of
these plants and the cost associated with building the plants. One way to
approach this is to group the plants according to their generating
capacity and compare these groups on their average cost. In our data set
we have the plants grouped into three capacity categories. Assuming we
retain these categories we might first ask if there are any population
differences in cost among these groups. If there are significant mean
differences overall, we next want to know specifically which groups differ
from which others.

INTRODUCTION

Analysis of variance (ANOVA) is a general method of drawing
conclusions regarding differences in population means when two
or more comparison groups are involved. The independent-groups
t test applies only to the simplest instance (two groups), while ANOVA
can accommodate more complex situations. It is worth mentioning that
the t test can be viewed as a special case of ANOVA and they yield the
same result in the two-group situation (same significance value, and the t
statistic squared is equal to the ANOVA’s F statistic).

We will compare three groups of plants based on their capacity and
determine whether the populations they represent differ in the cost of
being built.
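The equivalence of the t test and ANOVA in the two-group case, noted above, is easy to verify by hand. The following Python sketch computes the pooled-variance t statistic and the one-way F statistic for two small made-up groups and shows that t squared equals F. The function names are ours.

```python
def sample_mean(xs):
    return sum(xs) / len(xs)


def sum_sq_dev(xs):
    """Sum of squared deviations of the values around their own mean."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs)


def pooled_t(a, b):
    """Independent-groups t statistic with pooled variance."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_sq_dev(a) + sum_sq_dev(b)) / df
    std_error = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (sample_mean(a) - sample_mean(b)) / std_error


def one_way_f(groups):
    """Between-group mean square divided by within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = sample_mean(all_values)
    ss_between = sum(len(g) * (sample_mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum_sq_dev(g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data for two groups.
group_a = [4.0, 6.0, 8.0]
group_b = [7.0, 9.0, 14.0]

t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(f"t = {t:.4f}, t squared = {t * t:.4f}, F = {f:.4f}")
```

Both routes give the same value (about 2.82 here), so with two groups the choice between the t test and ANOVA does not affect the conclusion.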


The basic logic of significance testing is that we assume that the
population groups have the same mean (null hypothesis), then determine
the probability of obtaining a sample with group mean differences as
large as (or larger than) what we find in our data. To make this
assessment the amount of variation among the group means (between-group
variation) is compared to the amount of variation among the observations
within each group (within-group variation). Assuming that in the
population the group means are equal (null hypothesis), the only source
of variation among the sample means would be the fact that the groups
are composed of different individual observations. Thus the ratio of the
two sources of variation (between-group/within-group) should be about
one when there are no population differences. When the distribution of
the individual observations within each group follows the normal curve,
the statistical distribution of this ratio is known (F distribution) and we
can make a probability statement about the consistency of our data with
the null hypothesis. The final result is the probability of obtaining sample
differences as large as (or larger than) what we found, if there were no
population differences. If this probability is sufficiently small (usually
less than .05, i.e., less than 5 chances in 100) we conclude the population
groups differ.
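The ratio described above can be computed step by step. The Python sketch below builds the pieces of a one-way ANOVA (sums of squares, degrees of freedom, mean squares and the F ratio) for three small invented groups; it is an illustration of the logic, not the plant data, and it stops short of the probability value, which requires the F distribution.

```python
def sample_mean(xs):
    return sum(xs) / len(xs)


def anova_table(groups):
    """Return the between/within decomposition and F ratio for a one-way layout."""
    all_values = [v for g in groups for v in g]
    grand = sample_mean(all_values)
    # Between-group: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sample_mean(g) - grand) ** 2 for g in groups)
    # Within-group: observations around their own group mean.
    ss_within = sum((v - sample_mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


# Hypothetical scores for three groups.
groups = [[3, 5, 7], [6, 8, 10], [11, 13, 15]]
table = anova_table(groups)
for name, value in table.items():
    print(f"{name:>11}: {value:.3f}")
```

For these numbers the between-group mean square is 49 and the within-group mean square is 4, so F is 12.25, far from the value of about one expected when the population means are equal.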

When performing a t test comparing two groups there is only one
comparison that can be made: group one versus group two. For this
reason the groups are constructed so their members systematically vary
in only one aspect: for example, males versus females, or drug A versus
drug B. If the two groups differed on more than one characteristic (for
example, males given drug A versus females given drug B) it would be
impossible to differentiate between the two effects (gender and drug).

Why couldn’t a series of t tests be used to make comparisons among
three groups? Couldn’t we simply use t tests to compare group one versus
group two, group one versus group three, and group two versus group
three? One problem with this approach is that when multiple
comparisons are made among a set of group means, the probability of at
least one test showing significance even when the null hypothesis is
true is higher than the significance level at which each test is performed
(usually 0.05 or 0.01). In fact, if there is a large array of group means, the
probability of at least one test showing significance is close to one
(certainty)! It is sometimes asserted that an unplanned multiple
comparison procedure can only be carried out if the ANOVA F test has
shown significance. This is not necessarily true, as it depends on what the
research question(s) are.
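The growth of this error probability is easy to tabulate. The Python sketch below counts the pairwise comparisons among k groups and computes the chance of at least one false rejection if the tests were independent. That independence is an assumption made for simplicity: pairwise t tests on the same groups are correlated, so the figures are approximations.

```python
from math import comb


def pairwise_count(k):
    """Number of distinct pairs among k group means."""
    return comb(k, 2)


def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false rejection across independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k in (3, 5, 10, 20):
    n_tests = pairwise_count(k)
    print(f"{k:>2} groups: {n_tests:>3} tests, "
          f"P(at least one significant) = {familywise_error(n_tests):.3f}")
```

With three groups there are three pairwise tests and the probability is already about .14 rather than .05; with 20 groups there are 190 tests and the probability is essentially one.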

There remains a problem, however. If the null hypothesis is that all
the means are equal, the alternative hypothesis is that at least one of the
means is different. If the ANOVA F test gives significance, we know there
is a difference somewhere among the means, but that does not justify us
in saying that any particular comparison is significant. The ANOVA F
test, in fact, is an omnibus test, and further analysis is necessary to
localize whatever differences there may be among the individual group
means.

LOGIC OF TESTING FOR MEAN DIFFERENCES

FACTORS


The question of exactly how one should proceed with further analysis
after making the omnibus F test in ANOVA is not a simple one. It is
important to distinguish between those comparisons that were planned
before the data were actually gathered, and those that are made as part
of the inevitable process of unplanned data-snooping that takes place
after the results have been obtained. Planned comparisons are often
known as a-priori comparisons. Unplanned comparisons should be
termed a-posteriori comparisons, but unfortunately the misnomer post
hoc is more often used.

When the data can be partitioned into more than two groups,
additional comparisons can be made. This might involve one aspect or
dimension, for example four groups each representing a region of the
country. Or the groups might vary along several dimensions, for example
eight groups each composed of a gender (two categories) by region (four
categories) combination. In this latter case, we can ask additional
questions: (1) is there a gender difference? (2) is there a region difference?
(3) do gender and region interact? Each aspect or dimension the groups
differ on is called a factor. Thus one might discuss a study or experiment
involving one, two, even three or more factors. A factor is represented in
the data set as a categorical variable and would be considered an
independent variable. SPSS allows analysis of multiple factors, and has
different procedures available based on how many factors are involved
and their degree of complexity. If only one factor is to be studied, use the
Oneway (or One Factor ANOVA) procedure. When two or more factors
are involved, simply shift to the general factorial procedure (General
Linear Model..General Factorial). In this chapter we consider a one-factor
study (capacity relating to the cost of the plants), but we will discuss
multiple factor ANOVA in later chapters.

First we need to open our data set.

Click File..Open..Data (move to the c:\Train\Anova directory)
Select SPSS Portable (.por) from the Files of Type drop-down list
Double-click on plant.por to open the file.

To run the analysis using SPSS for Windows:

Click Analyze..Compare Means..One-Way ANOVA.
Move cost into the Dependent List box
Move capacity into the Factor list box.

RUNNING ONE-FACTOR ANOVA

Figure 3.1 One-Way ANOVA Dialog Box

Enough information has been provided to run the basic analysis. The
Contrasts pushbutton allows users to request statistical tests for planned
group comparisons of interest to them. The Post Hoc pushbutton will
produce multiple comparison tests that can test each group mean against
every other one. Such tests facilitate determination of just which groups
differ from which others and are usually performed after the overall
analysis establishes that some significant differences exist. Finally, the
Options pushbutton controls such features as missing value inclusion and
whether descriptive statistics and homogeneity tests are desired.

Click on the Options pushbutton
Click to select both the Descriptive and Homogeneity-of-variance check boxes
Click the Exclude cases analysis by analysis option button

Figure 3.2 One-way ANOVA Options Dialog Box

Click Continue
Click OK

The missing value choices deal with how missing data are to be
handled when several dependent variables are given. By default, cases
with missing values on a particular dependent variable are dropped only
for the specific analysis involving that variable. Since we are looking at a
single dependent variable, the choice has no relevance to our analysis.

The following syntax will run the analysis:

ONEWAY
  cost BY capacity
  /STATISTICS DESCRIPTIVES HOMOGENEITY
  /MISSING ANALYSIS.

The ONEWAY procedure performs a one-factor analysis of variance.
Cost is the dependent measure and the keyword BY separates the
dependent variable from the factor variable. We request descriptive
statistics and a homogeneity of variance test. We also told SPSS to
exclude cases with missing data on an analysis by analysis basis.

ONE-FACTOR ANOVA RESULTS

Descriptive Statistics

Information about the groups appears in the figure below. We see that
costs increase with the increase in capacity. 95-percent confidence
intervals for the capacity groups are presented in the table. One should
note that the standard deviations for the three groups appear to be fairly
close.

Figure 3.3 Descriptive Statistics

Homogeneity of Variance

We also requested the Levene test of homogeneity of variance.

Figure 3.4 Levene Test of Homogeneity of Variance

This assumption of equality of variance for all groups was tested in
Chapter 2 using the EXPLORE (Examine) procedure. The Levene test
here also shows that with this particular data set the assumption of
homogeneity of variance is not rejected, giving no evidence that the
variances differ across groups.

What do we do if the assumption of equal variances is not met? If the
sample sizes are close to the same size and sufficiently large, we could
count on the robustness of the F test to allow the analysis to
continue. However, there is no general adjustment for the F test in the
case of unequal variances, as there was for the t test. A statistically
sophisticated analyst might attempt to apply transformations to the
dependent variable in order to stabilize the within-group variances
(variance stabilizing transforms). These are beyond the scope of this
course. Interested readers might turn to Emerson’s chapter in Hoaglin,
Mosteller, and Tukey (1991) for a discussion from the perspective of
exploratory data analysis, and note that the spread & level plot in
EXPLORE will suggest a variance stabilizing transform. A second and
conservative approach would be to perform the analysis using a
statistical method that does not assume homogeneity of variance. A
one-factor analysis of group differences assuming that the dependent
variable is only an ordinal (rank) variable is available as a nonparametric
procedure within SPSS. This analysis is provided in the appendix to this
chapter. However, one should note that corresponding nonparametric
tests are not available for all analysis of variance models.

The ANOVA Table

Figure 3.5 ANOVA Summary Table

The output includes the analysis of variance summary table and the
probability value we will use to judge statistical significance.

Most of the information in the ANOVA table is technical in nature and is not directly interpreted. Rather, the summaries are used to obtain the F statistic and, more importantly, the probability value we use in evaluating the population differences. Notice that in the first column there is a row for the between-group and a row for the within-group variation. The df column contains information about the degrees of freedom, related to the number of groups and the number of individual observations within each group. The degrees of freedom are not interpreted directly, but are used in estimating the between-group and within-group variation (variances). Similarly, the sums of squares are intermediate summary numbers used in calculating the between and within-group variances. Technically they represent the sum of the squared deviations of the individual group means around the total grand mean, weighted by group size (between), and the sum of the squared deviations of the individual observations around their respective sample group mean (within). These numbers are never interpreted and are reported because it is traditional to do so. The mean squares, each a sum of squares divided by its degrees of freedom, are measures of between and within group variances. Recall in our discussion of the logic of testing that under the null hypothesis both variances should have the same source and the ratio of between to within would be about one. This ratio, the sample F statistic, is 4.05 and we need to decide if it is far enough from one to say that the group means are not equal. The significance (Sig.) column indicates that under the null hypothesis of no group differences, the probability of getting mean costs this far (or more) apart by chance is under three percent (.028). If we were testing at the .05 level, we would conclude the capacity groups differ in average cost. In the language of statistical testing, the null hypothesis that power plants of these different capacities do not differ in cost is rejected at the 5% level.
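The arithmetic behind the summary table can be sketched outside SPSS. The short Python example below is only an illustration: the three groups and their values are hypothetical (they are not the nuclear plant data), and the function name one_way_anova is our own. It computes the sums of squares, degrees of freedom, mean squares, and F ratio in the way described above.

```python
# Hypothetical costs for three capacity groups; these are NOT the course data.
groups = {
    "low": [4.0, 5.0, 6.0],
    "middle": [5.0, 6.0, 7.0],
    "high": [8.0, 9.0, 10.0],
}


def one_way_anova(groups):
    """Return the entries of a one-way ANOVA summary table (without Sig.)."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        n = len(values)
        group_mean = sum(values) / n
        # Between: squared distance of the group mean from the grand mean,
        # weighted by the number of observations in the group.
        ss_between += n * (group_mean - grand_mean) ** 2
        # Within: squared distances of observations from their own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name:>11}: {value:.4f}")
```

Converting the F ratio into the Sig. value requires the F distribution with the between and within degrees of freedom; this sketch leaves that step to SPSS.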

Conclusion

From this analysis we conclude that the capacity groups differ in terms of cost. In addition, we would like to know which groups differ from which others (Are they all different? Does the high capacity group differ from each of the other two?). This secondary examination of pairwise differences is done via procedures called multiple comparison testing (also called post hoc testing and multiple range testing). We turn to this issue next.

POST-HOC TESTING

The purpose of post hoc testing is to determine exactly which groups differ from which others in terms of mean differences. This is usually done after the original ANOVA F test indicates that not all groups are identical. Special methods are employed because of concern with excessive Type I error.

In statistical testing, a Type I error is made if one falsely concludes that differences exist when in fact the null hypothesis of no differences is correct (sometimes called a false positive). When we test at a given level of significance, say 5% (.05), we implicitly accept a five percent chance of a Type I error occurring. The more tests we perform, the greater the overall chances of one or more Type I errors cropping up.

This is of particular concern in our examination of which groups differ from which others, since the more groups we have the more tests we make. If we consider pairwise tests (all pairings of groups), the number of tests for K groups is K(K-1)/2. Thus for three groups, three tests are made, but for 10 groups, 45 tests would apply. The purpose of the post hoc methodology is to allow such testing, since we have interest in knowing which groups differ, yet apply some degree of control over the Type I error.
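To see how quickly the number of tests and the overall chance of a false positive grow, consider the small Python sketch below. The function names are our own. The error calculation assumes the tests are independent, which pairwise comparisons sharing the same groups are not, so the result should be read as a rough guide rather than an exact rate.

```python
from math import comb


def pairwise_tests(k):
    """Number of distinct pairs among k groups: K(K-1)/2."""
    return comb(k, 2)


def chance_of_any_false_positive(alpha, tests):
    """Chance of at least one Type I error, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** tests


for k in (3, 10):
    m = pairwise_tests(k)
    rate = chance_of_any_false_positive(0.05, m)
    print(f"{k} groups: {m} pairwise tests, overall Type I error about {rate:.3f}")
```

With each test run at the .05 level and no adjustment, three tests already carry roughly a 14 percent chance of at least one false positive under this assumption, and 45 tests carry roughly a 90 percent chance.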

There are different schemes of controlling for Type I error in post hoc testing. SPSS makes many of them available. We will briefly discuss the different post hoc tests, and then apply some of them to the nuclear plant data. We will apply several post hoc methods for comparison purposes; in practice, usually only one would be run.

WHY SO MANY TESTS?

The ideal post hoc test would demonstrate tight control of Type I error, have good statistical power (probability of detecting true population differences), and be robust over assumption violations (failure of homogeneity of variance, nonnormal error distributions). Unfortunately, there are implicit tradeoffs involving some of these desired features (Type I error and power) and no one current post hoc procedure is best in all areas. Couple to this the facts that there are different statistical distributions on which pairwise tests can be based (t, F, studentized range, and others) and that there are different levels at which Type I error can be controlled (per individual test, per family of tests, variations in between), and you have a huge collection of post hoc tests.

We will briefly compare post hoc tests from the perspective of being liberal or conservative regarding the control of the false positive rate and apply several to our data. There is a full literature (including several books) devoted to the study of post hoc tests (also called multiple comparison or multiple range tests, although there is a technical distinction between the two). More recent books (Toothaker, 1991) summarize simulation studies that compare post hoc tests on their power (probability of detecting true population differences) as well as their performance under different patterns of group means and under assumption violations (homogeneity of variance).

The existence of numerous post hoc tests suggests that there is no single approach that statisticians agree will be optimal in all situations. In some research areas, publication reviewers require a particular post hoc method, which simplifies the researcher’s decision.

Below we present some tests roughly ordered from the most liberal (greater statistical power and greater false positive rate) to the most conservative (smaller false positive rate, less statistical power), and mention some designed to adjust for the lack of homogeneity of variance.

LSD

The LSD or least significant difference method simply applies the standard t tests to all possible pairs of group means. No adjustment is made based on the number of tests performed. The argument is that since an overall difference in group means has already been established at the selected criterion level (say .05), no additional control is necessary. This is the most liberal of the post hoc tests.

SNK, REGWF, REGWQ, and Duncan

The SNK (Student-Newman-Keuls), REGWF (Ryan-Einot-Gabriel-Welsch F), REGWQ (Ryan-Einot-Gabriel-Welsch Q [based on studentized range statistic]), and Duncan methods involve sequential testing. After ordering the group means from lowest to highest, the two most extreme means are tested for a significant difference using a critical value adjusted for the fact that these are extremes from a larger set of means. If these means are found not to be significantly different, the testing stops; if they are different then the testing continues with the next most extreme pairs, and so on. All are more conservative than the LSD. REGWF and REGWQ improve on the traditionally used SNK in that they adjust for the slightly elevated false positive rate (Type I error) that SNK has when the set of means tested is much smaller than the full set.

Bonferroni & Sidak

The Bonferroni (also called the Dunn procedure) and Sidak (also called Dunn-Sidak) perform each test at a stringent significance level to ensure that the overall (experiment wide) false positive rate does not exceed the specified value. They are based on inequalities relating the probability of one or more false positives for a set of independent tests. For example, the Bonferroni is based on an additive inequality, so the criterion level for each pairwise test is obtained by dividing the original criterion level (say .05) by the number of pairwise comparisons made. Thus with three means and therefore 3 pairwise comparisons, each Bonferroni test will be performed at the .05/3 or .016667 level.

Tukey(b)

The Tukey(b) test is a compromise test, combining the Tukey (see below) and the SNK criterion, producing a test that falls between the two.

Tukey

Tukey (also called Tukey HSD, WSD, or Tukey(a) test): Tukey’s HSD (Honestly Significant Difference) controls the false positive rate experiment wide. This means that if you are testing at the .05 level, then when performing all pairwise comparisons the probability of obtaining one or more false positives is .05. It is more conservative than the Duncan and SNK. If all pairwise comparisons are of interest, which is usually the case, Tukey’s test is more powerful than the Bonferroni and Sidak.

Scheffe

Scheffe’s method also controls the overall (or experiment wide) error rate. It adjusts not only for the pairwise comparisons, but for any possible comparison the researcher might ask. As such it is the most conservative of the available methods (false positive rate is least), but has less statistical power.
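The per-test criterion levels used by the Bonferroni and Sidak adjustments described above are simple to compute. The Python sketch below is only an illustration with function names of our own choosing; the Sidak formula shown is the standard one derived from the multiplicative inequality for independent tests, and is stated here as background rather than taken from the SPSS output.

```python
def bonferroni_level(alpha, comparisons):
    """Per-test level from the additive inequality: alpha / m."""
    return alpha / comparisons


def sidak_level(alpha, comparisons):
    """Per-test level from the multiplicative inequality: 1 - (1 - alpha)^(1/m)."""
    return 1 - (1 - alpha) ** (1 / comparisons)


alpha = 0.05
comparisons = 3  # three group means give three pairwise comparisons
bonferroni = bonferroni_level(alpha, comparisons)
sidak = sidak_level(alpha, comparisons)
print(f"Bonferroni per-test level: {bonferroni:.6f}")
print(f"Sidak per-test level:      {sidak:.6f}")
```

The Sidak level is always slightly larger than the Bonferroni level, so the Sidak test is very slightly less conservative.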

Specialized Post Hocs

Unequal Ns: Hochberg’s GT2 & Gabriel

Most post hoc procedures mentioned earlier (excepting LSD, Bonferroni, and Sidak) were derived assuming equal sample sizes in addition to homogeneity of variance and normality of error. When subgroup sample sizes are unequal, SPSS substitutes a compromise value (the harmonic mean) for the sample sizes. Hochberg’s GT2 and Gabriel’s post hoc test explicitly allow for unequal sample sizes.

Waller-Duncan

The Waller-Duncan takes an interesting approach (Bayesian) that adjusts the criterion value based on the size of the overall F statistic in order to be sensitive to the types of group differences associated with the F (for example, large or small). Also, you can specify the ratio of Type I (false positive) to Type II (false negative) error in the test. This feature allows for adjustments if there are differential costs to the two types of error.

Unequal Variances and Unequal Ns: Tamhane T2, Dunnett’s T3, Games-Howell, Dunnett’s C

Each of these post hoc tests adjusts for unequal variances and sample sizes in the groups. Simulation studies suggest that although Games-Howell can be too liberal when the group variances are equal and sample sizes are unequal, it is more powerful than the others.

An approach some analysts take is to run both a liberal (say LSD) and a conservative (Scheffe or Tukey HSD) post hoc test. Group differences that show up under both criteria are considered solid findings, while those found different only under the liberal criterion are viewed as tentative results.

To illustrate the differences among the post hoc tests we will request six different post hoc tests: (1) LSD, (2) Duncan, (3) SNK, (4) Tukey’s HSD, (5) Bonferroni, and (6) Scheffe.

Click Dialog Recall button

Select One-Way ANOVA

Within the One-Way ANOVA Dialog Box

Click on the Post Hoc pushbutton.

Select the following types of post hoc tests: LSD, Duncan, SNK, Tukey, Bonferroni, and Scheffe.

Figure 3.6 Post Hoc Dialog Box

By default, statistical tests will be done at the .05 level. For some tests you may supply your preferred criterion level. The command to run the post hoc analysis appears below.

ONEWAY
  cost BY capacity
  /MISSING ANALYSIS
  /POSTHOC = SNK TUKEY DUNCAN SCHEFFE LSD BONFERRONI ALPHA(.05).

Post hoc tests are requested using the POSTHOC subcommand. The STATISTICS subcommand need not be included here since we have already viewed the means and discussed the homogeneity test.

Click Continue

Click OK

The beginning part of the output contains the ANOVA table, descriptive statistics, and the homogeneity test, which we have already reviewed. We will move directly to the post hoc test results.

Figure 3.7 LSD Post Hoc Results

Note

All tests appear in one table. However, the Post Hoc Tests and Homogeneous Subsets pivot tables were edited in the Pivot Table Editor so that each test can be viewed and discussed separately. (To do so, double-click on the pivot table to invoke the Pivot Table Editor, then click Pivot..Pivot Trays so that the Pivot Trays option is checked and the Pivot Trays window is visible. Next click and drag the pivot tray icon for Test (to see an icon's label, just click on the icon) from the Row dimension tray into the Layer dimension tray. Now test results for any single post hoc test can be viewed by selecting the desired test from the Test drop-down list located just above the table.)

The rows are constructed from every possible pairing of groups. For example, the less than 800 Mwe group is paired against the other two groups, then the 800-1000 Mwe group is paired against the other two groups, etc. The column labeled “Mean Difference (I-J)” contains the mean difference between each pairing of groups. We see that the <800 group has a mean cost difference of -$35.7866 with the 800-1000 group and a difference of -$193.5885 with the >1000 group. If a difference is statistically significant at the specified level after applying any post hoc adjustments (none for LSD), then an asterisk (*) appears beside the mean difference. Notice the actual significance value for the test appears in the column labeled “Sig.”

The first LSD block indicates that in the population those plants with a capacity of less than 800 Mwe differ significantly in cost from the plants with a capacity of greater than 1000 Mwe. In addition, the standard errors and 95% confidence intervals for each mean difference are displayed. These provide information on the precision with which we have estimated the mean differences. Note that, as you would expect, if a mean difference is not significant, the confidence interval contains zero. Using LSD, the high capacity group differs from each of the other two, but the lower capacity groups do not differ from each other.

Figure 3.8 Duncan Results

SPSS does not present the Duncan results in the same format as we saw for the LSD. This is because for some of the post hoc test methods standard errors and 95-percent confidence intervals are not defined (for multiple-range tests, recall that testing stops once the remaining most extreme means are not found different). Rather than display results with empty columns in such situations, a different format, homogeneous subsets, is used. A homogeneous subset is a set of groups for which no pair of group means differs significantly. Depending on the post hoc test requested, SPSS will display a multiple comparison table, a homogeneous subset table, or both. In this data set, the homogeneous subset table shows that the two lower capacity groups do not differ in cost, but both differ from the highest capacity group.

Figure 3.9 SNK Results


The SNK results display the same pattern as the Duncan tests.

Figure 3.10 Tukey Results for Multiple Comparisons

The Tukey multiple comparison tests show that the less than 800 Mwe group is significantly different from the greater than 1000 Mwe group, but this is the only pairwise difference.

Figure 3.11 Tukey Results for Homogeneous Subsets

The Tukey homogeneous subset table is consistent with the multiple comparison table. The first homogeneous subset contains the two lower capacity plants (they do not differ). The second homogeneous subset is made up of the second and third groups (they do not differ). Thus the only difference is between the less than 800 Mwe group and the greater than 1000 Mwe group. It should be pointed out that the difference between the second and third groups just misses significance (.07), and had the sample sizes been larger their difference might have been significant.

Figure 3.12 Bonferroni Results

The Bonferroni test shows that the less than 800 Mwe group differs significantly in cost from the greater than 1000 Mwe group.

Figure 3.13 Scheffe Results for Multiple Comparisons


Figure 3.14 Scheffe Results for Homogeneous Subsets

From these results we can see that, similar to the Bonferroni test, only the high and low capacity groups differ.

Conclusion

As discussed before, the different post hoc procedures offer different trade-offs between Type I error (falsely claiming a significant difference) and power (ability to detect a real difference). Your choice in the matter depends on how you want to balance the two. In this analysis it appears that the high and low capacity groups do differ in cost, while the low and middle groups do not. The middle to high capacity difference might be usefully considered as a tentative finding.

PLANNED COMPARISONS

Post hoc tests compare all pairs of groups and most of the methods discussed apply a penalty function (adjusting the critical value) because so many tests are being made. In some experiments and studies, the researcher has in mind some specific comparisons to be made between group means. Compared to post hoc tests, planned comparisons are fewer in number and are to be formulated before viewing the data. Because they are limited in number (based on the between-group degrees of freedom, that is, the number of groups minus one) and specified beforehand, the adjustments made for post hoc tests are not required.

A broad variety of planned comparisons (sometimes called a priori comparisons) can be requested: all treatment groups might be compared to a control group; a linear trend line could be fit; step comparisons could be made to detect a threshold.

To demonstrate this method, let us suppose that there is interest in making some specific comparisons between capacity groups. The idea is that at some point the change in capacity would result in a large change in cost. To see if and where this occurs, we can compare the low to middle capacity plants, then the middle to high capacity plants. If either of these comparisons is significant, we have an idea of where the big cost increase will occur.

HOW PLANNED COMPARISONS ARE DONE

Planned comparisons between groups are done by applying a set of coefficients to the group means and testing whether the result is zero. For example, to compare the low and middle plant groups, multiply the mean of the low plants by one, the mean of the middle plants by negative one, the mean of the large plants by zero, and sum the result. Thus, we compare the means, and if this difference is significantly different from zero, then the low capacity plants differ from the middle capacity plants. In ONEWAY you can request planned comparisons by providing sets of coefficients.

To request tests of the low versus middle, and the middle versus high capacity groups we use the Contrasts pushbutton and apply the necessary coefficients.

Click Dialog Recall tool

Select One-Way ANOVA

Click the Contrasts pushbutton

Type 1 in the Coefficients text box and click Add pushbutton

Type -1 in the Coefficients text box and click Add pushbutton

Type 0 in the Coefficients text box and click Add pushbutton

Click Next pushbutton

Type 0 in the Coefficients text box and click Add pushbutton

Type 1 in the Coefficients text box and click Add pushbutton

Type -1 in the Coefficients text box and click Add pushbutton

Figure 3.15 Contrasts Dialog Box

Each set of contrast coefficients is assigned a number (1, 2, …) and appears as a column in the Coefficients list box.
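The way a set of coefficients is applied to the group means can be sketched in a few lines of Python. The group means below are hypothetical (they are not the nuclear plant means) and the function name contrast_value is our own. The check that the coefficients sum to zero reflects the usual definition of a contrast.

```python
# Hypothetical group means in the order low, middle, high (NOT the course data).
means = [5.0, 6.0, 9.0]

# The two coefficient sets entered in the Contrasts dialog.
contrasts = {
    "low vs middle": [1, -1, 0],
    "middle vs high": [0, 1, -1],
}


def contrast_value(coefficients, means):
    """Multiply each group mean by its coefficient and sum the results."""
    if len(coefficients) != len(means):
        raise ValueError("one coefficient per group is required")
    if sum(coefficients) != 0:
        raise ValueError("contrast coefficients should sum to zero")
    return sum(c * m for c, m in zip(coefficients, means))


values = {name: contrast_value(c, means) for name, c in contrasts.items()}
for name, value in values.items():
    print(f"{name}: {value:+.1f}")
```

With coefficients of 1 and -1 the contrast value is simply the difference between two group means; the significance test then asks whether that difference is reliably different from zero.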


Click Continue to process the Contrasts

Click OK to run the analysis

This leads to the syntax below (note the POSTHOC subcommand is not included, although our previous post hoc requests are still stored in the Post Hoc dialog).

ONEWAY
  cost BY capacity
  /CONTRAST = 1 -1 0
  /CONTRAST = 0 1 -1
  /MISSING ANALYSIS.

The first contrast requests the difference between the low and middle groups; the second compares the middle and high groups. We are limited to two planned comparisons because with three groups we have but two between-group degrees of freedom.

Scroll to the Contrast Coefficients pivot table in the Viewer window.

Figure 3.16 Contrast Coefficients

The requested comparisons are first reproduced along with the group labels. We verify that the first compares the low to middle group, and the second compares the middle to high group.

Figure 3.17 Contrast Results

Notice that there are two sets of results, one labeled “assume equal variances” and the other “does not assume equal variances”. Results labeled “does not assume equal variances” are adjusted results that can be used if the homogeneity of variance assumption is not met. We previously determined the variances are equal and will use the “assume equal variances” statistics.

The column labeled "Value of Contrast" contains the result of applying the contrast coefficients to the sample means, which here represents the mean difference between pairs of groups. This can be verified by checking the group means appearing earlier. The first comparison (between the low and middle groups) is not significant, but the second one (comparing the middle and high capacity groups) is. This suggests that the big cost increase comes when shifting from the middle to high capacity plants. A t test is used since each comparison involves one degree of freedom; it is equivalent to using an F test (the t statistic squared would equal the F).
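The equivalence between the t test and the one degree of freedom F test can be checked numerically. The Python sketch below uses hypothetical data (not the nuclear plant data) and our own function names, and it follows the "assume equal variances" case by using the pooled within-group mean square as the error variance.

```python
from math import sqrt

# Hypothetical observations for the low, middle, and high groups.
groups = [
    [4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0],
    [8.0, 9.0, 10.0],
]
coefficients = [0, 1, -1]  # middle versus high


def pooled_within_variance(groups):
    """Within-group mean square: pooled sum of squares over pooled df."""
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss_within += sum((x - mean) ** 2 for x in g)
    df_within = sum(len(g) for g in groups) - len(groups)
    return ss_within / df_within


def contrast_test(coefficients, groups):
    """Return the contrast value, its t statistic, and the 1-df F statistic."""
    means = [sum(g) / len(g) for g in groups]
    value = sum(c * m for c, m in zip(coefficients, means))
    ms_within = pooled_within_variance(groups)
    # Sum of c^2 / n scales the error variance for this particular contrast.
    scale = sum(c ** 2 / len(g) for c, g in zip(coefficients, groups))
    t = value / sqrt(ms_within * scale)
    f = (value ** 2 / scale) / ms_within
    return value, t, f


value, t, f = contrast_test(coefficients, groups)
print(f"value = {value:.2f}, t = {t:.4f}, t squared = {t * t:.4f}, F = {f:.4f}")
```

The sign of t carries the direction of the difference, which the F statistic discards.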

Thus a limited number of planned comparisons between group means can be specified as part of the general analysis. Performing planned comparisons does not preclude running post hoc analyses later.

GRAPHING THE RESULTS

For presentations it is useful to display the sample group means along with their 95-percent confidence intervals. In SPSS for Windows:

Click on Graphs..Error Bar

Verify that Simple is selected, then click Define pushbutton

Move cost into the Variable list box

Move capacity into the Category Axis list box.

Figure 3.18 Error Bar Dialog Box

Click on OK

The command below will produce the error bar chart using a standard graph (an Interactive graph can also produce an error bar chart).

GRAPH
  /ERRORBAR( CI 95 )=cost BY capacity
  /MISSING=REPORT.

Figure 3.19 Error Bar Chart of Cost by Capacity Group

The chart provides a visual sense of how far the groups are separated. The confidence bands are determined for each group separately (thus inspection of the confidence band overlap is not formally equivalent to testing for group differences) and no adjustment is made based on the number of groups that are compared. However, from the graph we have a clearer sense of the relation between capacity and cost.

SUMMARY

In this chapter we tested for population mean differences with more than two groups when these groups constitute a single factor. We examined the data to check for assumption violations, discussed alternatives, and interpreted the ANOVA results. Having found significant differences we performed post hoc tests to determine which specific groups differed from which others, and summarized the analysis with an error bar graph. The appendix contains a nonparametric analysis of the same data.

APPENDIX: GROUP DIFFERENCES ON RANKS

Analysis of variance assumes that the dependent measure is interval scale, that its distribution within each group follows a normal curve, and that the within-group variation is homogeneous across groups. If any of these assumptions fail in a gross way, one may be able to apply techniques that make fewer assumptions about the data. Such tests fall under the class of nonparametric statistics (they do not assume specific data distributions described by parameters such as the mean and standard deviation). Since these methods make few if any distributional assumptions, they can often be applied when the usual assumptions are not met. If you are tempted to think that something is obtained for nothing, note that the downside of such methods is that if the stronger data assumptions hold, the nonparametric tests are generally less powerful (probability of finding true differences) than the appropriate parametric method. Also, there are some parametric statistical analyses that currently have no corresponding nonparametric method. It is fair to say that the boundaries concerning when to use parametric versus nonparametric methods are in practice somewhat vague, and statisticians can and often do disagree about which approach is optimal in a specific situation.

For the purposes of this appendix let us assume that we needed to run the test using nonparametric methods. We will perform a nonparametric procedure that only assumes that the dependent measure has ordinal properties. The basic logic behind this test, the Kruskal-Wallis test, is as follows. If we rank order the dependent measure throughout the entire sample, we would expect under the null hypothesis (of no population differences) that the average rank (technically the sum of the ranks adjusted for sample size) should be about the same for each group. The Kruskal-Wallis test calculates the ranks, each sample group’s mean rank, and the probability of obtaining group average ranks (weighted summed ranks) as far apart (or further) as what is observed in the sample, if the population groups were identical.
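The ranking logic can be sketched in Python. The data below are hypothetical (not the nuclear plant data) and the function names are our own. The statistic computed is the standard Kruskal-Wallis H without any correction for ties, so it is a simplified illustration rather than a reproduction of the SPSS output; tied values, if present, receive the average of their ranks.

```python
# Hypothetical costs for the low, middle, and high capacity groups.
groups = [
    [4.1, 5.3, 6.0],
    [5.0, 6.2, 7.4],
    [8.1, 9.5, 10.2],
]


def midranks(values):
    """Rank values from 1 to N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return each group's mean rank and the H statistic (no tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    n_total = len(pooled)
    mean_ranks = []
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        start += len(g)
        mean_ranks.append(rank_sum / len(g))
        weighted += rank_sum ** 2 / len(g)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return mean_ranks, h


mean_ranks, h = kruskal_wallis(groups)
print("mean ranks:", [round(r, 3) for r in mean_ranks])
print(f"H = {h:.4f}")
```

H is referred to a chi-square distribution with the number of groups minus one degrees of freedom to obtain the significance value.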

To run the Kruskal-Wallis test in SPSS we would

Click Analyze..Nonparametric Tests..K Independent Samples

Move cost into the Test Variable List

Move capacity into the Grouping Variable box

Click the Define Range button and enter a Minimum of 1 and Maximum of 3.

Click Continue

Figure 3.20 Analysis of Ranks Dialog Box


By default, the Kruskal-Wallis test will be performed. The organization of this dialog box closely resembles that of the One-Way ANOVA. The command to run this analysis using SPSS follows.

NPAR TESTS
  /K-W=cost BY capacity(1 3).

The K-W subcommand instructs the nonparametric testing routine to perform the Kruskal-Wallis analysis of variance of ranks on the dependent variable cost with capacity as the independent or grouping variable.

Click OK to run the analysis

Figure 3.21 Results of Kruskal-Wallis Nonparametric Analysis

We see the pattern of mean ranks (remember smaller ranks imply lower cost) follows that of the original means of cost, increasing as the capacity increases. The chi-square statistic used in the Kruskal-Wallis test indicates that it is very unlikely (fewer than 4 chances in 100) to obtain samples with average ranks so far apart if the null hypothesis (no cost differences between groups) were true. This is consistent with our conclusion from the initial one-way ANOVA analysis.

Chapter 4

Multi-Way Univariate ANOVA

Objective

We will apply the principles of testing for differences in population means to situations involving more than one factor. Also we will show how the two-factor ANOVA is a generalization of the one-factor design that we covered in the last chapter. We will develop some understanding of the new features of the analysis. We will then discuss the implications of unequal sample sizes and empty cells.

Method

We wish to test whether there are any differences in the cost of the nuclear power plants based on capacity or the experience of the architect/engineer. First we use the EXPLORE procedure to explore the subgroups involved in the analysis. Next, we make use of the Univariate procedure to run the two-factor ANOVA, specifying cost as the dependent variable with capacity and experience as factors. We display the results using an error bar chart. In the appendix we perform post hoc tests based on the results of our analysis.

Data

We continue to use the nuclear plant data. The data set is an SPSS portable file (plant.por) containing information about 32 light water nuclear power plants. Four variables are included: the capacity and cost of the plant; time to completion; and experience of the architect-engineer who built the plant.

INTRODUCTION

Analysis of variance (ANOVA) is a general method for drawing conclusions about differences in population means when two or more comparison groups are involved. In an introductory statistics class you have seen how a “t” test is used to contrast two groups, and in the last chapter we saw how one-way ANOVA compares more than two groups which differ along a single factor. In this chapter, we expand our consideration of ANOVA to allow multiple factors in a single analysis. Such an approach is efficient in that several questions are addressed within one study. The assumptions and issues considered in the last chapter (normality of the dependent variable within each group, homogeneity of variance, and the importance of both) apply to general ANOVA and will not be repeated here.

We will investigate whether there are differences in the average cost of a plant for the different plant capacities and levels of experience of the designer/engineer. Since two factors, capacity and experience, are under consideration, we can ask three different questions: (1) Are there cost differences based on capacity? (2) Are there differences based on experience? (3) Do capacity and experience interact?

A multi-factor analysis involves the same approach and principles as did a one-way ANOVA. The between-groups variation can now be partitioned into pieces attributable to main effects and interaction components, but the method is much the same. Some complications arise with unequal cell sizes and empty cells that were not a problem when we tested a single factor. We will discuss these issues and illustrate the analysis.

As in earlier chapters, we begin by running an exploratory data analysis, then proceed with more formal testing.

LOGIC OF TESTING, AND ASSUMPTIONS

As before, we wish to draw conclusions about the populations from which we sample. The main difference in moving from a one-way ANOVA to the general ANOVA is that more questions can be asked about the populations. However, the results will be stated in the same terms: how likely is it that we would obtain means as far apart as what we observe in our sample, if there were no mean differences in the populations? Comparisons are again framed as a ratio of the variation among the group means (between-group variation) to the variation among observations within each group (within-group variation). When statistical tests are performed, homogeneity of variance and normality of the dependent variable within each group are assumed. Comments made earlier regarding robustness of the means analysis when these assumptions are violated apply directly.

The new aspect we consider is how to include several factors, or ask several different questions of the data, within a single analysis of variance. We will test whether there are differences in cost based on capacity, whether there are differences based on the experience of the engineer, and finally, whether capacity and experience interact concerning the cost of the plants. The interpretation of an interaction is discussed in the next section.

HOW MANY FACTORS?

Although our example involves only two factors (capacity and experience), ANOVA can accommodate more. Usually, the number of factors is limited by either the interests of the researcher, who might wish to examine a few specific issues, or by sample size considerations. Sample size plays a role in that the greater the number of factors, the greater the number of cell means that must be computed, and the smaller the sample for each mean. For example, suppose we have a sample of 800 plants and wish to look at cost differences due to whether the plant was light water or heavy water (2 levels), capacity (3 levels), experience (3 levels), region of the country (9 levels), and age of the plant (4 levels). There are 2*3*3*9*4 or 648 subgroup means involved. If the data were distributed evenly across the levels, each subgroup mean would be based on only one or two observations, and this would not produce a very powerful analysis. Such analyses can be performed, and technically, questions involving single effects like capacity or experience would be based on means involving fairly large samples. Also, some planned experiments permit many subgroups to be dropped (for example, incomplete designs). Yet the fact remains that with smaller samples, there are practical limitations in the number of questions you can ask of the data.
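The cell arithmetic in the 800-plant example can be checked with a few lines of Python. This is only a sketch run outside SPSS; the factor labels are taken from the hypothetical example in the text, and the even spread of cases across cells is the same simplifying assumption the text makes.

```python
from math import prod

# Levels per factor in the hypothetical 800-plant example from the text.
levels = {
    "reactor type": 2,  # light water or heavy water
    "capacity": 3,
    "experience": 3,
    "region": 9,
    "age": 4,
}
n_plants = 800

n_cells = prod(levels.values())   # number of subgroup (cell) means
per_cell = n_plants / n_cells     # average cases per cell if spread evenly

print(f"{n_cells} cells, about {per_cell:.2f} observations per cell")

# Main-effect means pool over all other factors, so each rests on many more cases.
for name, k in levels.items():
    print(f"{name}: about {n_plants / k:.0f} observations per level")
```

The contrast between the per-cell and per-level counts is the point of the paragraph: tests of single effects remain well supported even when individual cells are nearly empty.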


INTERACTIONS

When moving beyond one-factor ANOVA, the distinction between main effects and interactions becomes relevant. A main effect is an effect (or group difference) attributable to a single factor (independent variable). For example, when we study cost differences across capacity groups and experience groups, the effect of capacity alone, and the effect of experience alone, would be considered main effects. The two-way interaction would test whether the effect of one factor is the same at each level of the other factor.

In our example, this can be phrased in either of two ways. We could say the interaction tests whether the cost difference due to capacity (which could be zero) is the same for each level of experience. Alternatively, we can say that the two-way interaction tests whether the experience difference in cost is the same for each capacity group. While these two phrasings are mathematically equivalent, it can sometimes be simpler (based on the number of levels in each factor) for you to present the information from one perspective instead of the other. The presence of a two-way interaction is important to report, since it qualifies our interpretation of a main effect. For example, a capacity by experience interaction implies that the magnitude of the capacity difference varies across levels of experience. In fact, there may be no difference or a reversal in the pattern of the capacity means for some experience levels. Thus statements about capacity differences must be qualified by experience information.

Since we are studying two factors, there can be only one interaction. If we expand our analysis to three factors (say capacity, experience, and age of plant) we can ask both two-way (capacity by experience, capacity by age, experience by age) and three-way (capacity by experience by age) interaction questions. As the number of factors increases, so does the possible complexity of the interactions. In practice, significant high-order (three, four, five-way, etc.) interactions are relatively rare compared to the number of significant main effects.

Interpretation of an interaction can be done directly from a table of relevant subgroup means, but it is more convenient and common to view a multiple-line chart of the means. We illustrate this below under several scenarios.

Suppose that we have four levels of one independent variable (say location) and two levels for the second independent variable (say gender). In our scenario, suppose that women are more highly educated than men, there are regional differences in education, and that the gender differences are the same across regions. The line chart below plots a set of means consistent with this pattern.


Illustration 4.1 Main Effects, No Interaction

In the illustration we see that the mean line for women is above that of the men. In addition, there are differences among the four locations. However, note that the gender differences are nearly identical across the four locations. This equal distance between the lines (parallelism of lines) indicates that there is no interaction present.

Illustration 4.2 No Main Effects, Strong Interaction

Here the overall means for men and women are about the same, as are the means for each location (pooling the two gender groups). However, the gender differences vary dramatically across the different locations: in location B women have higher education, in locations A and D there is no gender difference, and in location C males have higher education. We cannot make a statement about gender differences without qualifying it with location information, nor can we make location claims without mentioning gender. Strong interactions are marked by this crossover pattern in a multiple-line chart.

Illustration 4.3 One Main Effect, Weak Interaction

We see a gender difference for each of the four locations, but the magnitude of this difference varies across locations (substantially greater for location D). This difference in magnitude of the gender effect would constitute an interaction between gender and location. It would be termed a weak interaction because there is no crossover of the mean lines.
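The three patterns can also be read off a table of cell means by comparing the gender gap at each location. The Python sketch below does this outside SPSS; the mean values are invented to mimic the illustrations, and the helper names gender_gaps and describe are hypothetical. It classifies the pattern in the sample means only and is not a significance test.

```python
# Hypothetical mean years of education at locations A-D, by gender.
scenarios = {
    "main effects, no interaction": {
        "women": [14.0, 15.0, 13.0, 16.0],
        "men":   [12.0, 13.0, 11.0, 14.0],
    },
    "no main effects, strong interaction": {
        "women": [13.0, 15.0, 11.0, 13.0],
        "men":   [13.0, 11.0, 15.0, 13.0],
    },
    "one main effect, weak interaction": {
        "women": [14.0, 14.0, 14.0, 15.5],
        "men":   [13.0, 13.0, 13.0, 11.5],
    },
}

def gender_gaps(means):
    """Women-minus-men difference in mean education at each location."""
    return [w - m for w, m in zip(means["women"], means["men"])]

def describe(means):
    """Classify the pattern of the gaps across locations."""
    gaps = gender_gaps(means)
    if max(gaps) == min(gaps):
        return "no interaction (parallel lines)"
    if max(gaps) > 0 > min(gaps):
        return "interaction with crossover"
    return "interaction without crossover"

for label, means in scenarios.items():
    print(f"{label}: gaps {gender_gaps(means)} -> {describe(means)}")
```

Equal gaps correspond to parallel lines, gaps that change sign correspond to the crossover pattern, and gaps that differ in size but keep the same sign correspond to the weak interaction.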

Additional scenarios can be charted, and we have not mentioned three-way and higher interactions. Such topics are discussed in introductory statistics and analysis of variance books (see the reference page for suggestions). We will now proceed to analyze our data set.

EXPLORING THE DATA

We begin by applying exploratory data analysis to the cost of the plants within subgroups defined by combinations of capacity and experience. In practice, you would check each group’s summaries, look for patterns in the data, and note any unusual points. Also, we will request that the Explore procedure perform a homogeneity of variance test.

    Click on File..Open..Data (move to the c:\Train\Anova folder)
    Select SPSS Portable (*.por) from the Files of Type drop-down list
    Double-click on plant.por


    Click on Analyze..Descriptive Statistics..Explore
    Move cost into the Dependent List box
    Move exper and capacity into the Factor List box.

Figure 4.1 Explore Dialog Box

    Click on the Plots pushbutton
    Click the Power estimation option button in the “Spread vs. Level with Levene Test” area

Figure 4.2 Plots Dialog Box

By default, no homogeneity test is performed (“None” option button). Each of the remaining choices will lead to homogeneity being tested. The second (Power estimation) and third (Transformed) choices are used by more technical analysts to investigate power transformations of the dependent measure that would yield greater homogeneity of variance. These issues are of interest to serious practitioners of ANOVA, but are beyond the scope of this course (see Emerson in Hoaglin, Mosteller, and Tukey (1991), also a brief discussion in Box, Hunter, and Hunter (1978), and the original (technical) paper by Box and Cox (1964)). The Untransformed choice builds a plot without transforming the scale of the dependent measure.

    Click on the Continue button to return to the Explore dialog box

Since we are comparing capacity by experience subgroups, we designate exper (experience) and capacity as the factors (or nominal independent variables). However, if we were to run this analysis, it would produce a set of summaries for each capacity group, then a set for each experience group. In other words, each of the two factors would be treated separately, instead of being combined as we desire. To instruct SPSS to treat each capacity by experience combination as a subgroup we must use SPSS syntax. The easiest way to accomplish this is to click the Paste pushbutton, which opens a syntax window and builds an Examine command that will perform an analysis for each factor.

    Click on the Paste pushbutton to paste the syntax into a Syntax window

Figure 4.3 Examine Command in Syntax Window

The Examine command requires only the Variables subcommand in order to run. We also include the Plot subcommand since we desire the homogeneity test (controlled by the SPREADLEVEL keyword). The other subcommands specify default values and appear in order to make it simple for the analyst to modify the command when necessary. Note the keyword BY separates the dependent variable cost from the exper and capacity factors. Currently, both exper and capacity follow the BY keyword and thus have the same status: an analysis will be run for each separately. To indicate we wish a joint analysis, we insert an additional BY keyword between exper and capacity on the VARIABLES subcommand.

Figure 4.4 Examine Command Requesting Subgroup Analysis

SPSS now interprets the factor groupings to be based on each capacity by experience combination (exper BY capacity).

    Click Run..Current or click the Run button.

Looking at the descriptives for each of our subgroups we see the following information.

Note: All descriptive statistics appear in a single pivot table; the figures present separate sections of the table.


Figure 4.5 Descriptives for 1-3 Plants and <800 MWe

We find in this subgroup that the mean cost is 404.0829 and other descriptive information is available to us.

Figure 4.6 Descriptives for 1-3 Plants and >1000 MWe

Figure 4.7 Descriptives for 4-9 Plants and <800 MWe


Figure 4.8 Descriptives for 4-9 Plants and 800-1000 MWe

Figure 4.9 Descriptives for 4-9 Plants and >1000 MWe

Figure 4.10 Descriptives for 10 or more Plants and <800 MWe


Figure 4.11 Descriptives for 10 or more Plants and 800-1000 MWe

Figure 4.12 Descriptives for 10 or more Plants and >1000 MWe

In the Viewer window we find a warning message concerning the spread and level plot and the test for homogeneity of variance. This message tells us that, because some of our subgroups contain only a small number of cases, the median and/or the interquartile range was not defined. Thus the test and plot are not produced.

Figure 4.13 Test of Homogeneity of Variance


Figure 4.14 Warning Messages

Now we move to view the Box and Whiskers Plot.

Figure 4.15 Box and Whisker Plot of Cost

We see variation in the lengths of the boxes that suggests that the variation of cost within the subgroups is not homogeneous. We can see that there are differences in the median cost across our subgroups, but with the small sample sizes are they different enough to be statistically significant? An outlier is visible at the high end. Does it seem so extreme as to suggest a data error?


TWO-FACTOR ANOVA

To run the analysis in SPSS we choose the Analyze..General Linear Model menu. Please note that the General Linear Model menu choices will vary depending on your version of SPSS and whether you have the SPSS Advanced Models option installed. We will use the Univariate procedure.

The Univariate choice permits the analyst to handle designs from the simple to the more complex (incomplete block, Latin square, etc.) and also provides the user with control over various aspects of the analyses. The Multivariate menu choice performs multivariate (multiple dependent measures) analysis of variance, while the Repeated Measures menu choice is used for studies in which an observation contributes to several factor levels (these are commonly called split-plot or repeated measure designs). Finally, the Variance Components menu choice performs an analysis that estimates the variation in the dependent variable attributable to each random effect in a model (see discussion of random and fixed effects below). Thus it assesses the relative influence of each random effect in a model containing multiple random effects.

    Click on Analyze..General Linear Model..Univariate
    Move cost to the Dependent Variable list box
    Move exper and capacity to the Fixed Factor(s) list box.

Figure 4.16 Univariate Dialog Box

Our analysis does not include random factors (other than the plant to plant variation that is already accounted for as the within-group variation). Briefly, fixed factors have a limited (finite) number of levels and we wish to draw population conclusions about only those levels. Random factors are those in which a random sample of a few levels from all possible ones are included in the study, but population conclusions are to be applied to all levels. For example, an institutional researcher might randomly select schools from a large school district to be included in a study investigating sex differences in learning mathematics. Here sex is a fixed factor while school is a random factor. It is important to distinguish between fixed and random factors since error terms differ.

Our analysis also does not include covariates. Covariates are interval scale independent variables whose relationships with the dependent measure you wish to statistically control before performing the ANOVA itself.

The OK button is active, so we can run the analysis. However, we will request some additional information.

    Click on the Model pushbutton.

Figure 4.17 Model dialog box

Within the Model dialog you can specify the model you want applied to the data. By default a model containing all main effects and interactions is run. Analysts who analyze data based on incomplete designs (some combinations of factors are not evaluated in order to reduce the sample size requirements) would use this dialog to indicate which effects should be evaluated. Also, if your sample sizes are unequal across subgroups you can choose among several sums of squares adjustments. This issue is discussed later in the chapter.

    Click the Cancel button.

The next pushbutton we will look at is the Options button. We ask Univariate to provide us with means for the main effects and the two-way interaction. Also we will request a test of homogeneity of variance.

    Click the Options pushbutton
    Move (Overall), exper, capacity, and exper*capacity into the Display Means for list box
    Click the Homogeneity tests check box

Figure 4.18 Univariate: Options Dialog Box

    Click on Continue
    Click the Save pushbutton
    Click the check boxes for Unstandardized Predicted Values and both the Unstandardized and Standardized Residuals.

Figure 4.19 Univariate: Save Dialog Box

    Click on Continue
    Click on OK.

The following syntax will also run the analysis:

    UNIANOVA
      cost BY exper capacity
      /METHOD = SSTYPE(3)
      /INTERCEPT=INCLUDE
      /SAVE=PRED RESID ZRESID
      /EMMEANS=TABLES(OVERALL)
      /EMMEANS=TABLES(exper)
      /EMMEANS=TABLES(capacity)
      /EMMEANS=TABLES(exper*capacity)
      /PRINT=HOMOGENEITY
      /CRITERIA=ALPHA(.05)
      /DESIGN=exper capacity exper*capacity.

Now we will look at the output from our analysis. The first result is a listing of the Between-Subjects Factors.

Figure 4.20 Between Subjects Factors

Next we see the result of Levene’s test of equality of error variances (homogeneity of variance test).

Figure 4.21 Levene’s Test of Homogeneity of Variance

The significance level is .014, which means that if the error variances were equal in the population, we would get an “F” statistic this large only 14 times in one thousand. Thus the homogeneity of variance assumption does not hold. A technical analyst might move to the spread and level plots to see if the dependent variable can be transformed such that homogeneity of variance holds. (A nonparametric analysis could be done, although SPSS currently does not contain a two-factor nonparametric ANOVA procedure.) We will proceed with the analysis, realizing that the test results may not be completely accurate.
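For readers curious about what a Levene-type test computes, the Python sketch below (run outside SPSS, on invented numbers) applies a one-way ANOVA “F” ratio to the absolute deviations of each observation from its own group mean. Taking deviations from the group mean and the function name levene_statistic are assumptions made for illustration; the SPSS documentation should be consulted for the exact variant the Univariate procedure uses.

```python
def levene_statistic(groups):
    """Return (W, df1, df2): an F ratio computed on |y - group mean|."""
    # Absolute deviation of each observation from its own group mean.
    devs = []
    for g in groups:
        m = sum(g) / len(g)
        devs.append([abs(y - m) for y in g])

    n_total = sum(len(d) for d in devs)
    k = len(devs)
    grand = sum(sum(d) for d in devs) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(d) * (sum(d) / len(d) - grand) ** 2 for d in devs)
    within = sum((z - sum(d) / len(d)) ** 2 for d in devs for z in d)

    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2

# Two invented groups with the same spread, then two with very different spread.
same_spread = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
different_spread = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]

print(levene_statistic(same_spread))
print(levene_statistic(different_spread))
```

When the groups are equally variable the statistic is near zero, and it grows as the spreads diverge. Converting the statistic to a significance level requires the F distribution with the two degrees of freedom returned, which is omitted here.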


THE ANOVA TABLE

The ANOVA table contains the information, much of it technical, necessary to evaluate whether there are significant differences in cost across capacity groups, across experience groups, and whether the two factors interact.

Figure 4.22 The ANOVA Table

The first column lists the different sources of variation. We are interested in the capacity and experience main effects, as well as the capacity by experience interaction. The source labeled “Error” contains summaries of the within-group variation (or Residual term) which will be used when calculating the “F” ratios (ratios of between-group to within-group variation). The remaining sources in the list are simply totals involving the sources already described, and as such are generally not of interest. The Sum of Squares column contains a technical summary (sum of the squared deviations of group means around the overall mean, or of individual observations around their group mean) that is not interpreted directly, but is used in calculating the later column values. The “df” (degrees of freedom) column contains values that are functions of the number of levels of the factors (for capacity, experience, and capacity by experience) or the number of observations (for residual). Although this is a gross oversimplification, you might think of degrees of freedom as measuring the number of independent values (whether means or observations) that contribute to the sum of squares in the previous column. As with sums of squares, degrees of freedom are technical measures, not interpreted themselves, but used in later calculations.

Mean Square values are variance measures attributable to the various effects (capacity, experience, capacity by experience) and to the variation of individuals within groups (error). The ratio of an effect mean square to the mean square of the error provides the between-group to within-group variance ratio, or “F” statistic. If there were no group differences in the population, then the ratio of the between-group variation to the within-group variation should be about one. The column “Sig” contains the most interpretable numbers in the table: the probabilities that one can obtain “F” ratios as large or larger (or group means as far or farther apart) as what we find in our sample, if there were no mean differences in the population.
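The way the columns of the table feed one another can be sketched in Python, outside SPSS. The sums of squares below are invented and are not the values in Figure 4.22; the function names two_way_df and anova_rows are hypothetical, and the degrees of freedom formula assumes a two-factor design in which every cell contains data.

```python
def two_way_df(a, b, n_total):
    """Degrees of freedom for factors with a and b levels, all a*b cells filled."""
    return {
        "A": a - 1,
        "B": b - 1,
        "A*B": (a - 1) * (b - 1),
        "Error": n_total - a * b,
    }

def anova_rows(ss, df, error="Error"):
    """Return {source: (mean square, F)} for every source except the error term."""
    ms = {src: ss[src] / df[src] for src in ss}
    return {src: (ms[src], ms[src] / ms[error]) for src in ss if src != error}

# Invented sums of squares for a 3 x 3 design with 45 cases.
df = two_way_df(3, 3, 45)
ss = {"A": 200.0, "B": 50.0, "A*B": 40.0, "Error": 360.0}

for source, (mean_square, f_ratio) in anova_rows(ss, df).items():
    print(f"{source}: df={df[source]}, MS={mean_square:.1f}, F={f_ratio:.2f}")
```

Each mean square is a sum of squares divided by its degrees of freedom, and each “F” is an effect mean square divided by the error mean square. The “Sig” column would then come from the F distribution with the effect and error degrees of freedom, a step this sketch leaves out.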

The ANOVA table summarizes the statistical testing. Both experience and capacity are marginally significant, at the .046 and .050 levels respectively. The result for capacity is similar but not identical to its result in the one factor ANOVA for several reasons. First, the within-groups error term is now based on nine cells and not only three as before. Also, since the sample sizes are neither equal nor proportional, the effects of capacity and experience are not independent of each other and the test for capacity adjusts for the experience factor. In the one factor analysis the second factor was ignored. The interaction is not significant, indicating that the capacity differences do not change across different levels of experience.

Conclusion

Average cost shows significant differences across levels of plant capacity and levels of building experience. The two factors do not seem to interact.

PREDICTED MEANS

In the Options dialog box we asked for the means to be displayed for each main effect and the interaction. The following figures provide those requested means.

Figure 4.23 Grand Mean and Means for Experience Levels


Note that, surprisingly, the mean cost is lowest for the middle level of experience.

Figure 4.24 Means for Capacity Levels

Figure 4.25 Means for Capacity*Experience Levels

ECOLOGICAL SIGNIFICANCE

We have found that both main effects are statistically significant, although the assumption of homogeneity of the variances is not met and may compromise the results. Also the analyst must ask him or herself if any differences are significant in a practical sense. It is again important to recall that a statistically significant mean difference implies that the population difference is not zero. Differences can be small yet statistically significant when the sample size is large. This effect of large samples is certainly not a problem in this study.


RESIDUAL ANALYSIS

To view the predicted values and residuals we turn to the case summary procedure, although we could simply examine them in the Data Editor window.

    Click Analyze..Reports..Case Summaries
    Move cost, pre_1, res_1, and zre_1 to the Variables list box
    Click on OK

The following syntax will also produce the case summary report.

    SUMMARIZE
      /TABLES=cost pre_1 res_1 zre_1
      /FORMAT=VALIDLIST NOCASENUM TOTAL LIMIT=100
      /TITLE='Case Summaries' /FOOTNOTE ' '
      /MISSING=VARIABLE
      /CELLS=COUNT.

Figure 4.26 Case Summary Report
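The relationship among the observed cost, the predicted value, and the two kinds of residual can be sketched in Python, outside SPSS, with invented records (these are not the plant.por values). The sketch assumes a model containing both main effects and the interaction, so that each predicted value is simply its cell mean, and it assumes the standardized residual is the raw residual divided by the square root of the error mean square.

```python
from collections import defaultdict
from math import sqrt

# Invented (experience, capacity, cost) records for illustration only.
cases = [
    ("1-3", "<800", 400.0), ("1-3", "<800", 420.0),
    ("1-3", ">1000", 600.0), ("1-3", ">1000", 640.0), ("1-3", ">1000", 590.0),
    ("4-9", "<800", 380.0), ("4-9", "<800", 360.0),
]

# Group the costs by cell (experience by capacity combination).
cells = defaultdict(list)
for exper, capacity, cost in cases:
    cells[(exper, capacity)].append(cost)
cell_mean = {key: sum(v) / len(v) for key, v in cells.items()}

# With the interaction in the model, the predicted value is the cell mean.
predicted = [cell_mean[(e, c)] for e, c, _ in cases]
residual = [cost - p for (_, _, cost), p in zip(cases, predicted)]

# Error mean square: residual sum of squares over (cases minus filled cells).
df_error = len(cases) - len(cells)
mse = sum(r * r for r in residual) / df_error
z_residual = [r / sqrt(mse) for r in residual]

for (e, c, cost), p, r, z in zip(cases, predicted, residual, z_residual):
    print(f"{e:>4} {c:>6} cost={cost:6.1f} pred={p:6.1f} res={r:6.1f} zres={z:5.2f}")
```

Cases that share a cell share a predicted value, and the residuals within each cell sum to zero.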


Notice the predicted values are identical for all cases in the same cell; that is, group membership determines the predicted value. The standardized residuals are in standard deviation units; do you see any surprisingly large residuals?

POST HOC TESTS OF ANOVA RESULTS

To run post hoc tests on our results we will need to re-open the Univariate dialog box.

    Click the Dialog Recall Tool, then click Univariate
    Click the Post Hoc pushbutton
    Move exper and capacity into the Post Hoc Tests for box
    Select the LSD, Games-Howell, and Scheffe post hoc tests (click their check boxes)

Figure 4.27 Post Hoc Test Dialog Box

    Click Continue
    Click OK.

As shown below, the Posthoc subcommand requests the post hoc tests.


    UNIANOVA
      cost BY exper capacity
      /METHOD = SSTYPE(3)
      /INTERCEPT=INCLUDE
      /SAVE=PRED RESID ZRESID
      /POSTHOC = capacity exper ( SCHEFFE LSD GH )
      /EMMEANS=TABLES(OVERALL)
      /EMMEANS=TABLES(exper)
      /EMMEANS=TABLES(capacity)
      /EMMEANS=TABLES(exper*capacity)
      /PRINT=HOMOGENEITY
      /CRITERIA=ALPHA(.05)
      /DESIGN=exper capacity exper*capacity.

We have selected the LSD, Games-Howell (because of the failure of the homogeneity of variance assumption), and Scheffe tests. We will examine the post hoc tests only for capacity (since this chapter is lengthy as it is). In practice you would examine the results for both capacity and experience if they were found to be significant. If time permits, review the post hoc test results for experience. What do you find?

Figure 4.28 Post Hoc Tests For Capacity

We can see from the post hoc results with the LSD testing that both the less than 800 MWe and 800-1000 MWe plants were different from the over 1000 MWe plants. This, however, is the most liberal test. The two other tests find only that the less than 800 MWe plants are different from the over 1000 MWe plants.

Figure 4.29 Homogeneous Subsets for Capacity

As we would expect given the post hoc results, the homogeneous subsets produced by the Scheffe test confirm that only the lowest and highest capacity groups differ.

UNEQUAL SAMPLES AND UNBALANCED DESIGNS

Up to now we have not discussed the implications of unequal sample sizes. The basic problem arises when the sample sizes are not equal across groups (or not proportional if you are mainly interested in main effects). When this occurs, or if cells are missing entirely, the effects in the analysis become correlated, that is, they overlap. As the cell size imbalance increases, it becomes increasingly difficult to speak of independent effects. For example, if almost all high-capacity plants were built by people with experience building 10 or more plants, how can we speak of separate effects? The same problem, high correlation among predictor variables, is frequently discussed in the literature on regression. There are different methods for adjusting for such overlap of effects and we discuss some of these approaches and their implications below.


From the Model di al og box you can choose a type of sums of squares. Type

I I I i s the most commonl y used and i s the defaul t. Each type adjusts for

unequal sampl e si zes i n a di fferent way. When al l subgroup sampl e si zes

are the same, the vari ous sums of squares’ cal cul ati ons yi el d the i denti cal

resul t.

• Type I . Thi s method i s al so known as hi erarchi cal

decomposi ti on of the sum-of-squares. Each term i s adjusted

for onl y the terms that precedes i t i n the model . Type I sums

of squares are commonl y used i n si tuati ons i n whi ch the

researcher has a pri or orderi ng of effects i n mi nd. For

exampl e, i f previ ous research has al ways found a factor to be

si gni fi cant there mi ght be i nterest i n determi ni ng i f a second

factors makes a substanti al contri buti on. I n thi s si tuati on,

the known factor mi ght be entered fi rst i n the model (not

adjusti ng for the second factor), whi l e the new factor fol l ows

i n the model (so i t i s tested after adjusti ng for the fi rst factor).

• Type I I . Thi s method cal cul ates the sums of squares of an

effect i n the model adjusted for al l other “appropri ate” effects.

An appropri ate effect i s one that corresponds to al l effects

that do not contai n the effect bei ng exami ned. Thus a mai n

effect woul d adjust for al l other mai n effects but i nteracti ons.

A two-way i nteracti on woul d adjust for al l mai n effects and

other two-way i nteracti ons, but i gnore three-way and hi gher

effects.

• Type I I I . Thi s i s the defaul t. Thi s method cal cul ates the sums

of squares of an effect i n the desi gn as the sums of squares

adjusted for any other effects that do not contai n i t and

orthogonal to any effects (i f any) that contai n i t. Essenti al l y,

each effect i s adjusted for al l other effects (mai n effects, same

order i nteracti ons, hi gher order i nteracti ons) i n the model .

Thus you can speak of an effect i ndependent of al l other

effects. The Type I I I sums of squares have a major advantage

i n that they are i nvari ant wi th respect to the cel l frequenci es

as l ong as the general form of esti mabi l i ty remai ns constant.

I n practi ce, thi s means that Type I I I sums of squares i s often

consi dered useful for an unbal anced model wi th no mi ssi ng

cel l s. I n a factori al desi gn wi th no mi ssi ng cel l s, thi s method

i s equi val ent to the Yates’ wei ghted-squares-of-means

techni que. Type I I I i s recommended on the strength of the

fact that a stati sti cal test for an effect adjusts for al l other

effects i n the model . However, i f there are mi ssi ng data cel l s,

Type I V i s preferred.

• Type IV. This method is designed for situations in which there are missing cells. The technical description of Type IV sums of squares follows. For any effect F in the design, if F is not contained in any other effect, then Type IV = Type III = Type II. When F is contained in other effects, Type IV distributes the contrasts being made among the parameters in F to all higher-level effects equitably. To give a practical example, suppose we were testing salary differences due to two factors: experience (in three categories) programming in a computer language, and the computer language itself (two categories). If there were no programmers with the highest experience level for one of the languages (say Java), then that experience category would not be used when evaluating the computer language main effect. Thus the computer language effect would be evaluated from only those experience categories containing programmers of both languages. This is the source of the equity mentioned above. The Type IV sum-of-squares method is commonly used for an unbalanced model with empty cells.
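As a rough illustration of the salary example (a sketch only, not the computation SPSS performs), the Python fragment below tabulates which cells of a two-factor design are empty and which experience categories contain programmers of both languages. The level names, the second language ("C"), the case list, and the function names are all made up for illustration; SSTYPE(3) and SSTYPE(4) are the keyword values of the GLM METHOD subcommand.

```python
from itertools import product


def empty_cells(cases, a_levels, b_levels):
    """Return every (a, b) level combination that has no observations."""
    observed = set(cases)
    return [cell for cell in product(a_levels, b_levels) if cell not in observed]


def levels_with_all_b(cases, a_levels, b_levels):
    """Return the levels of factor A that were observed with every level of B."""
    observed = set(cases)
    return [a for a in a_levels if all((a, b) in observed for b in b_levels)]


# Hypothetical design: no highly experienced Java programmers were sampled.
experience = ["low", "medium", "high"]
language = ["Java", "C"]
programmers = [
    ("low", "Java"), ("low", "C"),
    ("medium", "Java"), ("medium", "C"),
    ("high", "C"),
]

missing = empty_cells(programmers, experience, language)
usable = levels_with_all_b(programmers, experience, language)

# Rule of thumb from the text: Type IV when cells are empty, otherwise Type III.
suggested_method = "SSTYPE(4)" if missing else "SSTYPE(3)"

print("Empty cells:", missing)
print("Experience levels usable for the language effect:", usable)
print("Suggested METHOD keyword:", suggested_method)
```

Here only the low and medium experience categories would contribute to the language main effect, which mirrors the verbal description of Type IV above.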

EQUIVALENCE AND RECOMMENDATIONS

All of these types of sums of squares are equivalent when there is only one effect to be tested. Thus the one-way ANOVA procedure does not offer any options in terms of sums of squares. Also, as mentioned above, they give identical results for a balanced design. In practice today, the Type III sums of squares method is usually used if there are no missing cells. If cells are missing, the Type IV method is generally chosen. When a researcher wants to test effects after adjusting for certain effects, but ignoring others, then the Type I or Type II methods are employed.

EMPTY CELLS AND NESTED DESIGNS

Any time that the between-subject portion of an analysis cannot be laid out in a full factorial setup with all cells filled (having at least one observation), matters can become quite complicated, and knowledge of the theory of estimable functions is required in order to determine just what hypotheses can be tested. The best advice that can be given here is to consult a statistician knowledgeable in experimental design in order to determine the appropriate, testable hypotheses in a particular case.

Virtually any testable hypothesis can be tested using the General Linear Model - Univariate procedure with its flexible DESIGN subcommand, but determining the appropriate hypothesis to test when there are missing cells can be extremely difficult. It should be noted in particular that simply applying standard sets of commands to such data can produce results that are uninterpretable, since the particular hypotheses tested have not been identified.

For further information on the analysis of such data, see Searle (1987) or Milliken and Johnson (1984). Of the two, Searle’s book is more complete but rather technical. Milliken and Johnson’s book is more accessible.

SUMMARY

In this chapter we generalized ANOVA to the case with two or more factors and discussed post hoc comparisons in that context. In addition, the effects of unequal sample size and missing cells were presented. We turn next to another generalization: ANOVA with multiple dependent measures, that is, multivariate analysis of variance.


Chapter 5

Multivariate Analysis Of Variance

Objective

The purpose of this chapter is to understand the properties of multivariate analysis of variance, drawing on our previous discussions of univariate ANOVA.

Method

We run the EXPLORE procedure to check on some of the assumptions of the multivariate ANOVA. We will use the General Linear Model - Multivariate procedure to run a two-factor multivariate analysis of variance with two dependent variables.

Data

We use the same data set as in the prior chapters, i.e., the nuclear power plant data set (plant.por).

Design

A two-factor, two-dependent-variable multivariate analysis of variance: experience and plant capacity are the two fixed factors (3 levels each); cost and time (time before the plant was licensed) are the dependent measures.

INTRODUCTION

Multivariate analysis of variance (MANOVA) is a generalization of analysis of variance that permits testing for mean differences on several dependent measures simultaneously. In this chapter we will explore the rationale and assumptions of multivariate analysis of variance, review the key summaries to examine in the results, and then step through an analysis looking at group differences on two measures in our data set.

Multivariate analysis of variance is used when there is an interest in testing for mean differences between groups on several dependent variables simultaneously. ANOVA will test whether the mean of a single variable (scalar) differs across groups. MANOVA covers the broader case of testing for mean differences in several variables (vector) across groups.

WHY PERFORM MANOVA?

Multivariate analysis of variance (MANOVA) tests for population group differences on several dependent measures simultaneously. Instead of examining differences for a single outcome variable (as analysis of variance does), MANOVA tests for differences on a set or vector of means. The outcome measures (dependent variables) are typically related; for example, a set of ratings of employee performance, multiple physiological measures of stress, several scales assessing an attitude, a collection of fitness measures, multiple scales measuring a product's appearance, or several measures of the fiscal health of a company.

MANOVA is typically performed for two reasons: statistical power and control of false positive results (also known as Type I error).

First, there can be greater statistical power, that is, the ability to detect true differences, in a multivariate analysis. The argument is that if you have several imperfect measures of an outcome, for example, several physiological measures of stress, the joint analysis will be more likely to show a true difference in stress than any individual analysis. A multivariate analysis compares mean differences across several variables and takes formal account of their intercorrelations. In this way a small difference appearing in several related outcome variables may result in a significant multivariate test, although no single outcome measure shows a significant difference. This is not to say there is a power advantage in throwing 20 or so unrelated variables into a multivariate analysis of variance, since a true difference in a single outcome measure can be diluted in a joint test involving many variables that display no effect. However, if you are interested in studying group differences in outcomes for which various measures exist (this occurs in marketing, social science, medical, ecological, and engineering studies), then MANOVA probably carries greater statistical power.

The second argument for running MANOVA in place of separate univariate (single outcome variable) analyses concerns controlling the false positive rate when multiple tests are done. If a separate ANOVA is run for every outcome variable, each tested at the 0.05 level, then the overall (or experiment-wise) false positive rate (chance of obtaining one or more false positive test results) is well above 5 in 100 (0.05 or 5%) because of the multiple tests. A MANOVA applied to a study with seven outcome measures would result in a single test performed at the 0.05 level. Although there are certainly alternative methods for controlling the false positive rate when multiple tests are performed (for example, Bonferroni adjustments), using a multivariate test accomplishes this as well. Some researchers follow the procedure of first performing a multivariate test and examining the individual univariate test results only if the multivariate result is significant. This provides some control over the false positive rate. It is not a perfect solution (it is similar to the argument for the LSD multiple comparison procedure) and has received some criticism (see Huberty (1989)).
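To put a number on the experiment-wise false positive rate, the Python sketch below computes the chance of at least one false positive when several tests are each run at the 0.05 level. It assumes the tests are independent, which separate ANOVAs on related outcome measures generally are not, so the figures are only a rough guide; the function name is ours.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of one or more false positives across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Seven outcome measures, each tested separately at the 0.05 level.
rate_seven = familywise_error_rate(0.05, 7)
print(f"7 separate tests: {rate_seven:.3f}")

# How the rate grows with the number of separate tests.
for n_tests in (1, 2, 5, 7, 20):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```

Under the independence assumption, seven separate tests carry roughly a 30% chance of at least one false positive, compared with 5% for the single multivariate test.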

HOW MANOVA DIFFERS FROM ANOVA

First, the good news: MANOVA is similar to ANOVA in that variation between group means is compared to variation of individuals within groups. Since this variation is measured on several variables, MANOVA computes a matrix containing the variation and covariation (there are several variables!) of the vector of group means and a second matrix containing within-group variances and covariances. When testing in MANOVA, a ratio is taken not of two variances (two numbers), but of two matrices. Instead of the usual “F” test, the multivariate form, called a generalized “F”, is used. The summary table will contain some unfamiliar statistics, but in the end will report the probability of obtaining means as far (or farther) apart as you did by chance alone, just as ANOVA did.

In short, while the required matrix notation used while deriving or describing MANOVA is a bit intimidating, the same principles that have guided us so far in analysis using ANOVA (variation between group means compared to variation within groups) still hold true in MANOVA. The statistics change because we are now talking about vectors of means (a set of means) being tested jointly.

ASSUMPTIONS OF MANOVA

The assumptions made when performing multivariate analysis of variance are largely extensions of those made under ordinary analysis of variance. In addition to the usual assumptions for a linear model (additivity, independence between the error and model effects, independence of the errors), MANOVA testing assumes that the residual errors follow a multivariate normal distribution in the population; this is a generalization of the normality assumption made in ANOVA. In SPSS you can examine and test individual variables for normality within each group. This is not equivalent to testing for multivariate normality, but is still quite useful in evaluating the assumption. In addition, homogeneity of variance, familiar from ANOVA, has a multivariate extension concerning homogeneity of the within-group variance-covariance matrices. A multivariate test of homogeneity of variance (Box’s M test) is available to check this assumption.

For large samples, we expect departures from normality to make little difference. This is due to the central limit theorem argument combined with the fact that in MANOVA we are generally testing simple functions of group means. If the samples are small and multivariate normality is violated, the results of the analysis may be affected. Data transformations (for example, logs) on the dependent measure(s) may alleviate the problem, but have potential problems of their own (interpretation, incomplete equivalence between tests in the transformed and untransformed scales). Unfortunately, a general class of multivariate nonparametric tests is not currently available; developments in this area would help provide a solution.

Concerning homogeneity of variance, in practice if the sample size is similar across groups then moderate departures from homogeneity of the within-group variance-covariance matrices do not affect the analysis. If homogeneity does not hold and the sample size varies substantially across groups, then test results can be affected. In the simplest scenarios, the direction of the effect depends on which sized group has the larger variances, but specific situations can be far more complex, in which case little is known.

WHAT TO LOOK FOR IN MANOVA

After investigating whether the assumptions are met, primary interest would be in the multivariate statistical tests. If significant effects are found you might then examine univariate results. Additionally, you can perform post hoc comparisons to discover just where the differences reside.

SIGNIFICANCE TESTING

When testing for mean differences between groups on a single dependent variable, the test comes down to a ratio of between-group to within-group variation: a single number. In multivariate analysis, we are left with two matrices containing the between-group and within-group variation and covariation. There are different test statistics that can apply. Most of them involve computing the latent roots of some function of the ratio of the two matrices. In-depth discussion of these test measures is beyond the scope of this course, but some comments about their characteristics will be made when we review the results.

Proposed Analysis

We will perform MANOVA with experience and plant capacity as the factors, with time to completion (licensing) and cost as dependent variables. This pairing of dependent variables makes sense if you believe the adage that time is money. We expect that plants that require more time to build should also be more costly. By using both variables we hope to tap a more general measure of cost.

CHECKING THE ASSUMPTIONS

The analysis we conducted in Chapter 4 investigated the properties of the cost measure. There was some evidence for heterogeneity of variance, although the tests were not in complete agreement. The normal plot of the errors suggested some skewness. We will now proceed to look at some of the information from the EXPLORE procedure for the time variable.

One caution: these plots look at each variable separately, while the assumptions for MANOVA involve the joint distribution of the variables. As a practical matter, if the assumptions are met for the variables singly, things look good for the multivariate assumptions, but if the assumptions fail for the single variables, they should fail for the multivariate situation as well.

Click File..Open..Data
Move to the c:\Train\Anova directory (if necessary)
Select SPSS Portable (*.por) from the Files of Type drop-down list
Double-click on plant.por

Click on Analyze..Descriptive Statistics..Explore
Move time into the Dependent List box
Move capacity and exper into the Factor List box

Figure 5.1 Explore Dialog Box

Click the Plots pushbutton
Click the Normality plots with tests checkbox
Click the Untransformed option button in the Spread vs. Level with Levene Test area

Figure 5.2 Plots Dialog Box

Click Continue
Click Paste to paste the syntax into a Syntax Editor window

Figure 5.3 Syntax Editor Before Change

Since we want to analyze the results for time in all the combinations of capacity and experience, we must insert the keyword BY between capacity and exper in the Examine syntax command.

Type BY between capacity and exper in the Examine syntax command

Figure 5.4 Syntax Editor After Change

Click Run..Current to run the Examine command

We show the normal plots for the time variable below. In most groups there are too few observations to perform a normality test, but in the few cases where tests of normality could be made, the data were consistent with normality.

Figure 5.4A Tests of Normality

Figure 5.5 Q-Q Plot of <800 MWe and 1-3 Plants


Figure 5.6 Q-Q Plot of <800 MWe and 4-9 Plants

Figure 5.7 Q-Q Plot of <800 MWe and 10 or more Plants

Figure 5.8 Q-Q Plot of 800-1000 MWe and 4-9 Plants


Figure 5.9 Q-Q Plot of 800-1000 MWe and 10 or more Plants

Figure 5.10 Q-Q Plot of >1000 MWe and 1-3 Plants

Figure 5.11 Q-Q Plot of >1000 MWe and 4-9 Plants


Figure 5.12 Q-Q Plot of >1000 MWe and 10 or more Plants

Next we see the Box and Whisker plot for time.

Figure 5.13 Box and Whisker Plot for Time

There seems to be a fair amount of variation among the groups in time taken to license the plant. It looks as if there is more spread for groups with less experience than there is for those with more experience.

THE MULTIVARIATE ANALYSIS

The Advanced Models module within SPSS adds several General Linear Model (GLM) procedures (multivariate (GLM) and repeated measures (GLM)) to Univariate within the SPSS Base system. These procedures have several desirable features from the perspective of MANOVA: 1) Post hoc tests on marginal means (univariate only), 2) Type I through Type IV sums of squares available (greater flexibility in handling unbalanced designs/missing cells), 3) Multiple Random Effect models can be easily specified, and 4) Residuals, predicted values and influence measures can be saved as new variables to the data set. However, the MANOVA procedure (which was the original procedure within SPSS for performing MANOVA, and is still available through syntax) contains several useful advanced functions. Within the MANOVA procedure are: 1) Roy-Bargmann step-down tests (testing for mean differences on a single dependent measure while controlling for the other dependent measures), and 2) Dimension reduction analysis and discriminant coefficients. These latter functions provide information as to how the dependent variables interrelate within the context of group differences (for a single main-effect analysis, this is equivalent to a discriminant analysis).

In short, while we expect the SPSS General Linear Model procedure will be your first choice for multivariate analysis of variance, the MANOVA procedure can contribute additional information. (Please note, MANOVA can only be run from syntax.)

Click Analyze..General Linear Model..Multivariate
Move cost and time into the Dependent Variables list box
Move capacity and exper into the Fixed Factors list box

Figure 5.14 Multivariate Dialog Box

We must specify the dependent measure(s) and at least one factor. The dialog box for Multivariate contains list boxes for the dependent variables, factors and covariates. The term “Fixed Factor(s)” in the Multivariate dialog box reminds us that the factors are assumed to be fixed, that is, levels of the factor(s) used in the analysis were chosen by the researcher (not randomly sampled) and cover the range to which population conclusions will be drawn. The Multivariate dialog box also permits a weight variable to be incorporated in the analysis (performs weighted least squares). Although rarely used in multivariate analyses (when used it is typically for univariate analysis), it adjusts the analysis based on different levels of precision (or heterogeneity of variance) for different individuals or groups.

The Multivariate dialog box contains several pushbuttons. The Plots pushbutton produces, for each dependent measure, a profile plot displaying group means. The Post Hoc pushbutton performs post hoc tests on the marginal means (for multivariate analyses, each dependent variable is analyzed separately). The Contrasts pushbutton performs any planned contrasts that the researcher wants to conduct, while the Options pushbutton controls many options for the analysis. Finally, the Save pushbutton permits you to save predicted values, residuals, and influence measures for later examination.

We could run the analysis at this point, but will examine the dialog boxes within Multivariate and request some additional options.

Click Model pushbutton

Figure 5.15 Multivariate: Model Dialog Box

For most analyses the Model dialog box is not used. This is because by default a full factorial model (all main effects, interactions, covariates) is fit and the various effects tested using Type III sums of squares (each effect is tested after statistically adjusting for all other effects in the model). If there are any missing cells in your analysis, you might switch to Type IV sums of squares, which better adjusts for missing cells. If you are running specialized factorial designs that are incomplete (by plan, not every possible combination of factor levels is present), or in which there are no replicates (interaction effects are used as error terms), you would click the Custom option button in the Specify Model area and indicate which main effects and interactions are to be included in the model. A custom model is sometimes used if there is no interest in testing high-order interaction effects. Since we are interested in both main effects and the one interaction, there is no need to modify this dialog box.

Click Cancel to exit the Model dialog box
Click Contrasts pushbutton

Figure 5.16 Multivariate: Contrasts Dialog Box

The Contrasts dialog box is identical for multivariate and univariate analyses. You would use it to specify main effect group comparisons of interest, for which parameter estimates can be displayed and tests performed. In statistical literature, these contrasts are sometimes called planned comparisons. For example, in an experiment in which there are three treatment groups and a control group there is a very specific interest in testing each experimental group against the control. One of the contrast choices (Simple) allows this. Several types of contrasts are available within the dialog box, and using syntax you can specify your own (Special). To request a set of contrasts, select the factor from the Factor(s) list box, select the desired contrast from the Contrast drop-down list, and click the Change pushbutton. Since we have no specific planned contrasts that we wish to apply to the main effects, we will exit the Contrasts dialog box.
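To make the treatment-versus-control idea concrete, the Python sketch below builds the comparison weights for a simple contrast (each group against a reference group) and applies them to a set of group means. The group means are invented for illustration, the function names are ours, and the sketch shows only the arithmetic of the comparison, not how SPSS parameterizes or tests the contrast.

```python
def simple_contrasts(n_levels, reference):
    """One row of weights per non-reference level: +1 for that level, -1 for the reference."""
    rows = []
    for level in range(n_levels):
        if level == reference:
            continue
        row = [0] * n_levels
        row[level] = 1
        row[reference] = -1
        rows.append(row)
    return rows


def contrast_estimate(weights, means):
    """Weighted sum of group means for one contrast."""
    return sum(w * m for w, m in zip(weights, means))


# Hypothetical means for three treatment groups followed by a control group.
group_means = [12.0, 15.0, 11.0, 10.0]
contrasts = simple_contrasts(n_levels=4, reference=3)
estimates = [contrast_estimate(row, group_means) for row in contrasts]

for row, estimate in zip(contrasts, estimates):
    print(row, estimate)
```

Each estimate is simply a treatment mean minus the control mean, which is the comparison a simple contrast with the control as reference category is meant to capture.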

Click Cancel to exit the Contrasts dialog box
Click Post Hoc pushbutton

Figure 5.17 Multivariate: Post Hoc Dialog Box

The Post Hoc dialog box is used to request post hoc comparisons on the observed subgroup means. Post hocs test for significant differences between every possible pairing of levels of a factor. Since many tests may be involved, most post hocs adjust the significance criterion based on the number of tests in order to control the false positive error rate (Type I error). Usually post hocs are performed after a significant main effect is found (in the initial analysis), and we will visit this dialog box later in this chapter.
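The number of pairings grows quickly with the number of factor levels, which is why the adjustment matters. The Python sketch below lists every pairing for the three capacity groups named in the Q-Q plot captions and counts pairings for larger factors; the function name is ours.

```python
from itertools import combinations


def n_pairwise(n_levels):
    """Number of distinct pairs among n_levels groups: k(k - 1) / 2."""
    return n_levels * (n_levels - 1) // 2


capacity_levels = ["<800 MWe", "800-1000 MWe", ">1000 MWe"]
pairs = list(combinations(capacity_levels, 2))

for first, second in pairs:
    print(f"{first}  vs  {second}")

print("Pairs for 3 levels:", n_pairwise(3))
print("Pairs for 6 levels:", n_pairwise(6))
```

With three levels there are only three pairwise tests, but a six-level factor would already involve fifteen.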

Click Cancel pushbutton to exit the Post Hoc dialog box
Click Save pushbutton
Click the Unstandardized Predicted Values, Unstandardized Residual, and Standardized Residual check boxes

Figure 5.18 Multivariate: Save Dialog Box

The Save dialog box allows you to save predicted values, and various types of residuals and influence measures, as new variables in the data file. Examining them might identify outliers and influential data points (data points whose exclusion substantially affects the analysis). Such analyses are standard for serious practitioners of regression and can be applied in this context. In addition, the coefficient statistics (coefficient estimates, standard errors, etc.) can be saved to an SPSS data file (in matrix format) and manipulated later (for example, to apply the coefficients to generate predictions for future cases). Although we strongly recommend an examination of the residuals, with the limited amount of time available in this class, we will skip this step.

Click Continue to process the residual requests
Click Options pushbutton
Select capacity, exper, and the capacity*exper interaction, and move them into the Display Means for list box
Click the Homogeneity tests check box in the Display area

Figure 5.19 Multivariate: Options Dialog Box

Click Continue to process our option requests
Click OK to run the analysis

Moving these factor variables and their interaction term into the Display Means for list box will result in estimated means, predicted from the chosen model, appearing for the subgroups. These means can differ from the observed means if covariates are specified or if an incomplete model (one not containing all main effects and interactions) is used. If no covariates are included (our situation), then post hoc analyses can be applied to the observed marginal means using the Post Hoc pushbutton. The Compare main effects checkbox can be used to have SPSS test for significant differences between every pair of estimated marginal means for each of the main effects in the Display Means for list box. Note that by default, a significance level of .05 (see Significance level text box) is applied to each test. Also notice the confidence intervals for the mean differences have no adjustment (LSD (none)) based on the number of tests made, although Bonferroni and Sidak adjustments can be requested.
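For readers who want to see what such adjustments do to a single comparison, the Python sketch below applies the textbook Bonferroni and Sidak formulas to an unadjusted significance value. The raw value of .02 and the count of three comparisons are invented for illustration, and we are assuming the usual definitions of the two adjustments rather than quoting the SPSS algorithms.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni: multiply the p-value by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


def sidak_adjust(p_value, n_tests):
    """Sidak: 1 - (1 - p)^m, slightly less conservative than Bonferroni."""
    return 1.0 - (1.0 - p_value) ** n_tests


raw_p = 0.02        # hypothetical unadjusted (LSD) significance value
n_comparisons = 3   # pairwise comparisons among three factor levels

bonferroni_p = bonferroni_adjust(raw_p, n_comparisons)
sidak_p = sidak_adjust(raw_p, n_comparisons)

print(f"Unadjusted: {raw_p:.4f}")
print(f"Bonferroni: {bonferroni_p:.4f}")
print(f"Sidak:      {sidak_p:.4f}")
```

In this made-up case a difference that looks significant at .05 without adjustment is no longer significant under either adjustment.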

In the Display area, we requested that homogeneity of variance tests be performed. The Display choices allow you to view supplemental information. Checking Descriptive Statistics will display means, standard deviations, and counts for each cell (subgroup) in the analysis. If effect size is checked, then partial eta-square values will be presented for each effect (main effects, interactions). Eta-square is equivalent to the r-square in regression; the partial eta-square measures the proportion of variation in the dependent measure that can be attributed to each effect in the model after adjusting for the other effects. Parameter estimates are the estimates for the coefficients in the model. Typically, they would be requested if you wanted to construct a prediction equation. The various sums of square matrices are computational summaries and not interpreted directly.
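The partial eta-square for an effect is commonly computed as the effect sum of squares divided by the sum of the effect and error sums of squares. The Python sketch below shows that arithmetic with invented sums of squares; the numbers do not come from the plant data and the function name is ours.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Proportion of (effect + error) variation attributable to the effect."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical sums of squares for one dependent measure.
ss_capacity = 30.0
ss_error = 90.0

eta_capacity = partial_eta_squared(ss_capacity, ss_error)
print(f"Partial eta-square for capacity: {eta_capacity:.2f}")
```

Because each effect is compared only with the error term, the partial eta-square values for the different effects in a model need not sum to 1.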

The Significance level text box allows you to specify the significance level used to test for differences in the estimated marginal means (default .05), and the confidence intervals around parameter estimates (default .95).

We are now ready to proceed. The SPSS command below will run the analysis.

GLM
  cost time BY capacity exper
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /SAVE = PRED RESID ZRESID
  /EMMEANS = TABLES(capacity)
  /EMMEANS = TABLES(exper)
  /EMMEANS = TABLES(capacity*exper)
  /PRINT = HOMOGENEITY
  /CRITERIA = ALPHA(.05)
  /DESIGN = capacity exper capacity*exper.

In the GLM command the dependent variables (cost, time) precede the BY keyword while the factor variables (capacity, exper) follow it. Type III sums of squares (each effect is evaluated after adjusting for all other effects) are requested (the default). The EMMEANS subcommand will print a table of estimated marginal means for the factor variables. Homogeneity tests are obtained from the PRINT subcommand, and the alpha value (used for confidence intervals and significance tests of the estimated marginal means) is set to .05 (the default). The DESIGN subcommand is used to specify the model to be applied to the data; if nothing were specified, a full factorial model would be fit.
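A full factorial model contains every main effect and every interaction among the factors. The Python sketch below enumerates those effects for a list of factor names, writing interactions with the same asterisk notation used on the DESIGN subcommand. It is only an illustration of what "full factorial" means; the function name and the third factor in the second call are hypothetical.

```python
from itertools import combinations


def full_factorial_effects(factors):
    """List all main effects, then two-way interactions, and so on."""
    effects = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            effects.append("*".join(combo))
    return effects


two_factor = full_factorial_effects(["capacity", "exper"])
print(two_factor)

# With a hypothetical third factor the list grows to seven effects.
three_factor = full_factorial_effects(["capacity", "exper", "region"])
print(len(three_factor), three_factor)
```

For our two factors the result matches the effects written out explicitly on the DESIGN subcommand above.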

EXAMINING THE RESULTS

The first piece of Multivariate output describes the factors involved in the analysis. They are labeled between-subject factors; this is appropriate because the three capacity groups and the three experience groups were composed of different plants. We will see within-subject analysis of variance (repeated measures) in a later chapter.

Figure 5.20 Between-Subject Factor Summary

The next two pivot tables provide information about the homogeneity of variance assumption. Box’s M tests for equality of covariance matrices (since there is more than a single dependent measure) across the different subgroups. Levene’s homogeneity test is a univariate test and is applied separately to each dependent variable.

Figure 5.21 Box’s Test of Equality of Covariance Matrices


Figure 5.22 Levene’s Test of Equality of Error Variances

As mentioned above, Box’s M statistic can be used to test for equality of variance-covariance matrices in the population. This generalizes the homogeneity of variance test to a multivariate situation, testing for equality of group variances (as univariate homogeneity tests would) and covariances (which univariate tests cannot) for all dependent measures in one test. Box’s test is not significant (.541), indicating no group differences in the covariance matrices made up of the dependent measures. As a univariate statistic, Levene’s test is applied to each dependent measure. The time measure is consistent with the homogeneity assumption (sig. = .427), while the cost measure does show group differences in variance (sig. = .014). Given that Box’s test is not significant, we will proceed to view the multivariate test results.
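The reading of these three significance values can be summarized in a few lines of Python. The values are the ones reported above; the .05 cutoff is the conventional choice and is our assumption, as are the labels used for the tests.

```python
ALPHA = 0.05  # conventional cutoff, assumed here

# Significance values reported in the homogeneity output above.
homogeneity_tests = {
    "Box's M (cost and time jointly)": 0.541,
    "Levene (time)": 0.427,
    "Levene (cost)": 0.014,
}

# A value below the cutoff signals evidence against homogeneity.
rejected = [name for name, sig in homogeneity_tests.items() if sig < ALPHA]

for name, sig in homogeneity_tests.items():
    verdict = "homogeneity rejected" if sig < ALPHA else "consistent with homogeneity"
    print(f"{name}: sig. = {sig:.3f} -> {verdict}")
```

Only the univariate Levene test for cost falls below the cutoff, which is why the multivariate results are still examined.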

WHAT IF HOMOGENEITY FAILED?

As with ANOVA, MANOVA is robust under failure of homogeneity if the sample sizes in the cells are large and roughly equal. If the sample sizes are unequal, and larger variances are associated with larger cells, the MANOVA tests are conservative, so you can be confident of significant findings. If smaller cells have larger variances, the MANOVA tests are liberal, so the Type I error rate is greater than it should be (see Hakstian, Roed, and Lind (1979)). If variance is related to the mean level of the group, a variance stabilizing transform is a possibility.

MULTIVARIATE TESTS

There are four multivariate test statistics commonly applied: Pillai’s criterion, Hotelling’s Trace criterion, Wilks’ Lambda, and Roy’s largest root. The first three give identical results in a two-group analysis, but can differ when there are more groups. They all test the null hypothesis of no group mean differences in the population. Results of Monte Carlo simulations focusing on robustness and statistical power suggest that under general circumstances Pillai’s test is preferred. However, there are specific situations, for example when the dependent measures are highly related (forming a strong core), in which one of the others is the most powerful test. As a general rule, if different multivariate tests give you markedly different results, it suggests something about the dimensionality and type of group differences. For an accessible discussion of this see Olsen (1976).


The distribution of the first three multivariate statistics follows the generalized “F” distribution. While more complicated than the simple “F”, and having 3 sets of degrees of freedom, it yields a probability value just as the regular “F” does. This generalized “F” test assumes a multivariate normal distribution of the errors.
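As a rough illustration of how the four multivariate statistics relate to one another, the sketch below computes each from the eigenvalues of the hypothesis matrix relative to the error matrix. The eigenvalues are hypothetical, the function name is invented for this example, and Roy's largest root is taken here as the largest eigenvalue itself (some texts rescale it), so treat this as a sketch under those assumptions rather than a description of the SPSS computations.

```python
import math


def multivariate_statistics(eigenvalues):
    """Compute the four MANOVA statistics from eigenvalues of inv(E) * H.

    H is the hypothesis sums-of-squares-and-cross-products matrix and
    E is the error matrix; only their eigenvalues are needed here.
    """
    pillai = sum(ev / (1.0 + ev) for ev in eigenvalues)
    wilks = math.prod(1.0 / (1.0 + ev) for ev in eigenvalues)
    hotelling = sum(eigenvalues)
    roy = max(eigenvalues)  # assumption: reported as the largest eigenvalue
    return {"pillai": pillai, "wilks": wilks, "hotelling": hotelling, "roy": roy}


# Hypothetical eigenvalues for an effect tested on two dependent measures.
stats = multivariate_statistics([0.5, 0.1])
for name, value in stats.items():
    print(f"{name:10s} {value:.4f}")
```

With a single nonzero eigenvalue (as in a two-group analysis) every statistic is a function of that one value, which is why the tests agree in that case.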

Figure 5.23 Multivariate Analysis of Variance Table

The upper part of the table tests whether the overall mean (Intercept) differs from zero in the population. It is not interesting since all it shows is that overall, cost and time were not both zero.

The Value column displays the sample value of each of the four multivariate test statistics. They are converted to “F” statistics (“F” column), and the associated hypothesis (Hypothesis df) and error (Error df) degrees of freedom follow. These four columns are technical summaries; we are primarily interested in the significance values that appear under the “Sig.” heading. Here we see that for the capacity factor, three of the four tests show that there are differences in the dependent measures (significance values of .055, .043, .034, .007). Notice the tests are not in agreement if you test at the .05 level. While Pillai’s is often the recommended test, it would be safe to conclude that there is at least a marginal effect, perhaps something worth looking at with a larger sample. We also see that for the experience factor, all four tests show that there are group differences in the means (significance values of .035, .028, .023, and .005). The test of an interaction between capacity and experience was not significant (significance values of .307, .294, .286, and .073). Given these findings, we are next interested in looking at whether both cost and time show differences (univariate tests), and knowing which groups differ from which others (post hocs).


Two additional columns can appear in the multivariate (or univariate) analysis of variance table, but do not do so by default. The noncentrality parameter is a technical summary that describes the magnitude of the mean group differences in the form of a parameter for the “F” distribution. It can be used to calculate the appropriate sample size (statistical power analysis) if this study were to be repeated while expecting to find the same group differences. The Observed Power indicates how likely you are to obtain a significant group difference (testing at the .05 level) if the population group means matched the means in the sample. This can be useful in conducting postmortems of your analysis, that is, exploring why you failed to find significant differences.

We now examine the test results for each dependent measure.

Figure 5.24 Univariate Test Results

Although both dependent measures appear in this table, the results are calculated independently, and are identical to what you would obtain if separate analyses were run on each dependent measure (univariate ANOVA). Thus we find whether both of the dependent measures showed significant group differences. The sums of squares, df (degrees of freedom), mean square, and “F” columns are what we would expect in an ordinary ANOVA table. We described and disregarded the Intercept information in the multivariate summary. Moving to the capacity section, we find cost is right on the border of significance (.05) while time is not significant (.101). From the experience summary we find that again cost is significant (.046) while time is not (.066). In the interaction area we find that neither cost nor time is significant (.701 and .089). The Error section summarizes the within-group variation. The Corrected Model summary pools together all model effects (excluding the intercept), and is equal to the Corrected Total minus the Error sum of squares. Some analysts turn to this overall test first to see if any effects are significant, and then proceed to examine individual effects. However, most researchers move directly to the tests of specific main effects and interactions. The Total summary pools together everything in the analysis (including the error). It should be noted that if the sample sizes are not equal when multiple factors are included in the analysis, then under Type III sums of squares (the default), the sums of squares for the totals will not generally be equal to the sums of their component sums of squares.

Finally, r-square values (based on the corrected model) for each variable appear as footnotes. Notice that the adjusted r-square for time (.320) is higher than that of cost (.222). This is consistent with time having a higher “F” statistic in the corrected model section.
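For readers who want to see how an adjusted r-square relates to the unadjusted value, here is a minimal sketch of the usual adjustment formula. The case count (32) and the nine-cell design come from this chapter; the r-square of 0.50 is hypothetical, and the function name is invented for the example.

```python
def adjusted_r_squared(r_squared, n_cases, model_df):
    """Shrink r-square to allow for the number of model parameters.

    model_df counts the model degrees of freedom excluding the intercept.
    """
    return 1.0 - (1.0 - r_squared) * (n_cases - 1) / (n_cases - model_df - 1)


# A full 3 x 3 factorial has 2 + 2 + 4 = 8 model df (excluding the intercept).
# The r-square value of 0.50 is hypothetical, not taken from the output.
adj = adjusted_r_squared(0.50, n_cases=32, model_df=8)
print(round(adj, 3))
```

With few cases per cell the adjustment is substantial, which is one reason the adjusted value is the one usually quoted for small designs.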

Figure 5.25 Estimated Marginal Means for Capacity

Figure 5.26 Estimated Marginal Means for Experience

Figure 5.27 Estimated Marginal Means for Capacity by Experience Subgroups

Estimated marginal means are means estimated for each level of a factor averaging across all levels of other factors (marginals), based on the specified model (estimated). By default, SPSS fits a complete model (all main effects and interactions), and in such cases these estimated means are identical to the (unweighted) observed means. However, if a partial model were fit (for example, if all main effects were included but higher order interactions were not) then the estimated means will differ from the (unweighted) observed means. We see in the tables above that the average time and cost increase with the plant capacity. Interestingly, regarding experience, time and cost have their lowest means in the middle experience group.
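The distinction between an unweighted marginal mean and an ordinary weighted mean can be seen in a small sketch. The cell means and cell counts below are hypothetical, and the example assumes a complete (full factorial) model so that each estimated cell mean equals the observed cell mean.

```python
# Hypothetical cell means and cell sizes for one level of a factor,
# crossed with three levels of a second factor.
cell_means = [10.0, 14.0, 18.0]
cell_counts = [2, 3, 5]

# Estimated marginal mean under a complete model:
# every cell mean counts equally, regardless of how many cases it holds.
estimated_marginal_mean = sum(cell_means) / len(cell_means)

# Ordinary weighted mean: every case counts equally,
# so larger cells pull the mean toward their value.
total_cases = sum(cell_counts)
weighted_mean = sum(m * n for m, n in zip(cell_means, cell_counts)) / total_cases

print(estimated_marginal_mean, weighted_mean)
```

The two values agree only when the cell sizes are equal, which is why the distinction matters in unbalanced designs such as this one.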

CHECKING THE RESIDUALS

We first view the casewise listing of residuals for time. We will skip the listing for cost since it is identical to that seen in Chapter 4, when we ran the same model on cost alone.

Click Analyze..Reports..Case Summary
Move time, pre_2, res_2, and zre_2 into the Variables list box.

Figure 5.28 Case Summary Dialog Box

Click on OK

The following syntax will also produce the case summary table.

SUMMARIZE
  /TABLE=time pre_2 res_2 zre_2
  /FORMAT=VALIDLIST NOCASENUM TOTAL LIMIT=100
  /TITLE='Case Summaries' /FOOTNOTE ' '
  /MISSING=VARIABLE
  /CELLS=COUNT.


Figure 5.29 Casewise Listing of Residuals

There seem to be no especially large standardized residuals. Once again the predicted values are identical for all members of the same group.
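To see why every member of a group shares one predicted value, the following sketch computes predictions, residuals, and standardized residuals for a tiny hypothetical data set. It assumes a complete factorial model (so the prediction is the cell mean) and assumes the standardized residual is the residual divided by the square root of the error mean square; the data and group labels are invented.

```python
import math
from collections import defaultdict

# Hypothetical (cell label, observed value) pairs.
cases = [("a", 10.0), ("a", 14.0), ("b", 20.0), ("b", 26.0), ("b", 23.0)]

# In a complete factorial model the predicted value is the cell mean,
# so all cases in the same cell receive the same prediction.
by_cell = defaultdict(list)
for cell, value in cases:
    by_cell[cell].append(value)
cell_mean = {cell: sum(vals) / len(vals) for cell, vals in by_cell.items()}

predicted = [cell_mean[cell] for cell, _ in cases]
residuals = [value - cell_mean[cell] for cell, value in cases]

# Error mean square: residual sum of squares over (cases - cells).
sse = sum(r * r for r in residuals)
mse = sse / (len(cases) - len(by_cell))

# Assumed definition: residual divided by the root error mean square.
standardized = [r / math.sqrt(mse) for r in residuals]

for (cell, value), p, r, z in zip(cases, predicted, residuals, standardized):
    print(cell, value, p, r, round(z, 3))
```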

CONCLUSION

From the multivariate analysis of variance we conclude that the dependent variables show significant mean differences across experience groups, although not in a strictly increasing fashion. There is a modest effect across capacity groups and no sign of an interaction. Of the two measures, cost seems more sensitive to the group differences. What might qualify the result? You could argue that the groupings of experience and capacity levels are arbitrary and different groupings could yield different results. Also, with only 32 observations over a nine-cell design with two dependent measures, we expect very little power to detect differences.


POST HOC TESTS

At this point of the analysis it is natural to ask just which groups differ from which others. The GLM procedure in SPSS will perform separate post hoc tests on each dependent variable in order to determine this. Post hoc tests are usually performed to investigate which pairs of levels within a factor differ after an overall (main effect) difference has been established. SPSS offers many post hoc tests, and their characteristics were reviewed in Chapter 3. Recall the basic idea behind post hoc testing is that some adjustment of the Type I (false positive or alpha) error rate must be made due to the number of pairwise comparisons made. In our example, only three tests need to be performed within each factor (group 1 vs. 2, 1 vs. 3, and 2 vs. 3). However, if there were ten levels of experience (or capacity or both), then there would be [10*9]/2 or 45 pairwise tests, and the probability of one or more false positive results would be quite substantial. We asked for the following types of post hoc tests to be performed: LSD (the most liberal), Scheffe (the most conservative), and Games-Howell (does not assume equal variances; recall the Levene test indicated there might be a homogeneity of variance problem with cost). Although both experience and capacity were found significant, below we request post hocs only for experience. In practice you would view post hoc results for each significant main effect.

Click the Dialog Recall tool, then select Multivariate
Click Post Hoc pushbutton
Move exper into the Post Hoc Tests for list box
Click LSD, Scheffe, and Games-Howell checkboxes

Figure 5.30 Post Hoc Dialog Box


Click Continue to process the post hoc requests
Click OK to run

The SPSS syntax below will produce the post hoc analysis.

GLM
  cost time BY capacity exper
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /SAVE = PRED RESID ZRESID
  /POSTHOC = exper ( SCHEFFE LSD GH )
  /EMMEANS = TABLES(capacity)
  /EMMEANS = TABLES(exper)
  /EMMEANS = TABLES(capacity*exper)
  /PRINT = HOMOGENEITY
  /CRITERIA = ALPHA(.05)
  /DESIGN = capacity exper capacity*exper.

The Posthoc subcommand instructs GLM to apply Scheffe, LSD and Games-Howell (GH) multiple comparison tests to the experience (exper) factor.
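The reason some adjustment is needed can be made concrete with a short calculation of how many pairwise tests a factor generates and how quickly the chance of at least one false positive grows. The sketch assumes the tests are independent, which pairwise comparisons within one factor are not, so the probabilities are only a rough guide; the function names are invented for this example.

```python
from math import comb


def pairwise_count(levels):
    """Number of distinct pairs among the levels of a factor: k(k-1)/2."""
    return comb(levels, 2)


def familywise_error(n_tests, alpha=0.05):
    """Chance of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Three levels (as in the plant data) versus a hypothetical ten-level factor.
for levels in (3, 10):
    tests = pairwise_count(levels)
    print(levels, tests, round(familywise_error(tests), 3))
```

Under this simplification, three unadjusted tests already carry roughly a 14% chance of a false positive, and 45 tests make one close to certain.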

Although both multiple comparison and homogeneous subset tables will be produced, we present only the former. Also note that for ease of reading, the post hoc results, which appear in a single pivot table, are displayed below as three figures (within the pivot table editor, the pivot tray window was opened and the post hoc test (Test) icon was moved into the layer dimension).

Figure 5.31 LSD Post Hoc Test for Experience

Note

The multiple comparison table was edited in the Pivot Table Editor and the tests (TEST icon) were placed in the layer dimension (see Chapter 4 for instructions) so we can separately view the results from each post hoc.


Figure 5.32 Scheffe Post Hoc Test for Experience

Figure 5.33 Games-Howell Post Hoc Test for Experience

We can see that for both cost and time, every possible group pairing appears for the factor. The “Mean Difference” column contains the difference in sample means between the two groups, and the “Standard Error” column contains the standard error of the difference between the means. The “Sig” column contains the significance value when the particular test is applied to the group differences. These post hoc test results provide detail concerning significant main effects.

Not surprisingly, the Scheffe results show fewer significant group differences than LSD. Notice that there are no group differences on time using the Games-Howell tests, although both Scheffe and LSD show differences. This is probably due to the Games-Howell being less powerful when homogeneity of variance holds, as it does for time.

SUMMARY

In this chapter we discussed multivariate analysis of variance and applied it to the plant data. We examined residuals from the analysis and performed post hoc tests.


Chapter 6

Within-Subject Designs: Repeated Measures

Objectives

The objective of this chapter is to understand the distinguishing characteristics, assumptions, and methods of approaching within-subject (repeated measures) ANOVA, and to see how SPSS implements such analyses. We will discuss the univariate and multivariate approaches to the repeated measures analysis.

Method

We discuss the logic and assumptions of repeated measures and use the Explore procedure to examine the data. We then use the GLM Repeated Measures procedure to run a repeated-measures ANOVA with a single within-subject factor. Pairwise comparisons are run and planned comparisons are set up.

Data

The data set contains vocabulary test scores obtained from the same children over four years (grades 8 through 11). The sex of each child is also recorded, but not used in this analysis.

INTRODUCTION

In this chapter, we discuss yet another species of ANOVA, the special case where each subject (or unit of analysis) appears in several conditions. We will see that this repeated measurement feature requires some additional assumptions and a more complicated approach to computing error terms. The variation within each group, our constant companion to this point, must undergo some revision to accommodate the fact that the same subject is tested in multiple conditions. We will discuss the general features and assumptions of within-subject ANOVA, then anchor the discussion with an actual analysis.


WHY DO A REPEATED MEASURES STUDY?

Repeated measures (also called within-subject) studies are used for several reasons. First, by using a subject as her own control, a more powerful (greater likelihood of finding a real difference) analysis is possible. For example, consider testing me under two drug conditions compared to testing two individuals, each under a single condition. By testing me twice instead of different people each time, the variability due to person-to-person differences is reduced when comparing the two means, which should provide a more sensitive analysis. A second reason in practice is cost reduction; recruitment costs are less if an individual can contribute data to multiple conditions.

However, repeated measures analyses have potential problems. Since an individual appears in multiple conditions there may be practice, fatigue, or carryover effects. Counterbalancing the order of conditions addresses the carryover problem, and the different trials or conditions are often well spaced to reduce the practice and fatigue issues.

Examples of Repeated Measures Analysis:

1. Marketing – Compare customers’ ratings of four different brands, or different products, for example four different perfume fragrances.
2. Medicine – Compare test results before, immediately after, and six months after a procedure.
3. Education – Compare performance test scores before and after an intervention program.
4. Engineering – Compare output from different machines after running 1 hour, 8 hours, 16 hours, and 24 hours.
5. Agriculture – The original research area for which these methods were developed. Different chemical treatments are applied to different areas within a plot of land (split plots).
6. Human Factors – Compare performance (reaction time, accuracy) under different environmental conditions. For example, examine pilot accuracy in reading different types of dials under varying lighting conditions.

For an accessible introduction to repeated measures with a number of worked examples, see Hand and Taylor (1987). For more technical and broad (beyond ANOVA) discussions of repeated measures analysis see Lindsey (1993) or Crowder and Hand (1990).

THE LOGIC OF REPEATED MEASURES

In the simplest case of repeated measures analysis two values are compared for each subject. For example, suppose that for each individual we record a physiological measure under two drug conditions. We can obtain sample means for each drug and want to determine whether there are significant differences between the drugs in the larger population. One direct way to approach this would be to compute a difference or change score for each individual, obtained by subtracting the two drug measures, and test whether the mean difference score is different from zero. We illustrate this in the spreadsheet below.


Table 6.1 Difference Scores with Two Conditions

We see a difference score is calculated for every individual and these scores are averaged together. If there were no drug differences then we would expect the average difference score to be about zero. To determine if the population mean difference score is different from zero, we need some measure of the variability of sample mean difference scores. We can obtain such a variability measure by calculating the variation of individual difference scores around the sample mean difference score. If the sample mean difference score is far enough from zero that it cannot be accounted for by the variation of individual difference scores, we say there is a significant population difference. This is what a paired t test does.
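The difference-score logic just described can be sketched in a few lines. The scores below are hypothetical (they are not the values in Table 6.1), and the calculation stops at the t statistic and its degrees of freedom rather than looking up a significance value.

```python
import math
import statistics

# Hypothetical scores for five subjects, each measured under two drugs.
drug1 = [12.0, 15.0, 11.0, 14.0, 13.0]
drug2 = [10.0, 14.0, 10.0, 11.0, 12.0]

# One difference score per subject.
differences = [a - b for a, b in zip(drug1, drug2)]

# Mean difference and the variation of the individual differences around it.
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)

# Standard error of the mean difference, then the paired t statistic.
standard_error = sd_diff / math.sqrt(len(differences))
t_statistic = mean_diff / standard_error
df = len(differences) - 1

print(differences, mean_diff, round(t_statistic, 3), df)
```

The person-to-person differences in overall level drop out when the two scores are subtracted, which is the source of the extra sensitivity of the within-subject design.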

The analysis becomes a bit more complex when each subject (unit of analysis) appears in more than two levels (conditions) of a repeated measure factor. Now no single difference score can summarize the differences. We illustrate this below.

Table 6.2 Difference Scores with Four Conditions


Although no one difference score can summarize all drug differences here, we can compute additional difference scores, and thus account for drug effects. As you would imagine the number of these differences, or contrasts, is equal to the degrees of freedom available (one less than the number of levels in the factor). For two conditions, only one contrast is possible; for four conditions, there are three; for k conditions, k-1 contrasts are required. If the assumptions of repeated measures ANOVA are met then these differences, or contrasts between conditions, can be pooled together to provide a significance test for an overall effect.

We used simple differences to compare the drug conditions (drug 1 minus drug 2, etc.). There are many other contrasts that could be applied. For example, we could have calculated drug 1 minus the mean of drugs 2, 3, and 4; then drug 2 versus the mean of drugs 3 and 4; and finally drug 3 versus drug 4. As long as the assumptions of repeated measures are met, the specific choice of contrasts doesn’t matter when the overall test is calculated. However, if you have planned comparisons you want tested, then you would request those.
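The two sets of contrasts described above can be written out as coefficient rows and applied to one subject's scores. This is only a sketch: the scores are hypothetical, and the helper function is invented for the example.

```python
def apply_contrasts(scores, contrasts):
    """Return one contrast score per coefficient row for a single subject."""
    return [sum(c * s for c, s in zip(row, scores)) for row in contrasts]


# Simple differences between adjacent drug conditions.
simple_differences = [
    [1, -1, 0, 0],   # drug 1 minus drug 2
    [0, 1, -1, 0],   # drug 2 minus drug 3
    [0, 0, 1, -1],   # drug 3 minus drug 4
]

# Each drug versus the mean of the drugs that follow it.
versus_later_means = [
    [1, -1 / 3, -1 / 3, -1 / 3],  # drug 1 minus mean of drugs 2, 3, 4
    [0, 1, -1 / 2, -1 / 2],       # drug 2 minus mean of drugs 3, 4
    [0, 0, 1, -1],                # drug 3 minus drug 4
]

# Hypothetical scores for one subject under the four drugs.
subject_scores = [8.0, 6.0, 5.0, 3.0]

simple_scores = apply_contrasts(subject_scores, simple_differences)
later_mean_scores = apply_contrasts(subject_scores, versus_later_means)
print(simple_scores)
print([round(v, 3) for v in later_mean_scores])
```

In both sets there are three rows for four conditions, and the coefficients in every row sum to zero, so a subject whose scores never change produces contrast scores of zero.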

In each of the two above examples, we wound up with one fewer difference variable than the original number of conditions. There is another variable that is calculated in repeated measures, which represents the mean across all conditions. It is used when testing effects of between-group factors, having averaged across all levels of the repeated measure factor(s). This mean effect is shown in the illustration below:

Table 6.3 Mean and Difference Scores with Four Conditions

The mean score across drug conditions for each subject is recorded in the mean column. As mentioned above any tests involving only between-group factors (for example, sex, age group) would use this variable.

This idea of computing difference scores or contrasts across conditions for each subject, then using the means and subject to subject variation as the basis of testing whether the average contrast value is different from zero in the population, is the core concept of repeated measures ANOVA. Once you become comfortable with it, the rest falls into place. SPSS performs repeated measures ANOVA by computing contrasts across the repeated measures factor levels for each subject, and then testing whether the means of the contrasts are significantly different from zero. A matrix of coefficients detailing these contrasts can be displayed and is called the transformation matrix.

ASSUMPTIONS

A repeated measures ANOVA has several assumptions common to all ANOVA. First, the model is correctly specified and additive. Second, the errors follow a normal distribution and are independent of the effects in the model. This latter assumption implies homogeneity of variance when more than a single group is involved. As with general ANOVA, moderate departures from normality do not have a substantial effect on the analysis, especially if the sample sizes are large and the shape of the distribution is similar from group to group (if multiple groups are involved). In multi-group studies, failure of homogeneity of variance is a problem unless the sample sizes are about equal.

In addition to standard ANOVA assumptions, there is one specific to repeated measures when there are more than two levels to a repeated measures factor. If a repeated measures factor contains only two levels, there is only one difference variable that can be calculated, and you need not be concerned about the assumption. However, if a repeated measures factor has more than two levels, you generally want an overall test of differences (main effect). Pooling the results of the contrasts (described above) between conditions creates the test statistic (F). The assumption called sphericity deals with when such pooling is appropriate. The basic idea is that if the results of two or more contrasts (the sums of squares) are to be pooled, then they should be equally weighted and uncorrelated. To illustrate why this is important, view the spreadsheet below:

Table 6.4 Scale Differences and Redundancies in Contrasts

The first contrast variable represents the difference between drug 1 and drug 2 (Drug 1 – Drug 2). However, the second is 100 times the difference between Drug 2 and Drug 3. It is clear from the mean and standard deviation values of the second difference variable that this variable would dominate the other difference variables if the results were pooled. In order to protect against this, normalization is applied to the coefficients used in creating the contrasts (each coefficient is divided by the square root of the sum of the squared coefficients).


Also, notice that the third contrast is a duplicate of the first. Admittedly, this is an extreme example, but it serves to make the point that since the results from each contrast are pooled (summed), any correlation among the contrast variables will yield incorrect test statistics. In order to provide the best chance of uncorrelated contrast variables, the contrasts or transformations are forced to be orthogonal (uncorrelated) before applying them to the data.

This combination of normalization and forcing the original contrasts to be orthogonal (uncorrelated) is called orthonormalization. Again, when actually applied to the data, these properties may not hold, and that is where the test of sphericity plays an important role.
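Normalization and orthogonality of contrast coefficients can be checked directly. The sketch below uses one hypothetical set of orthogonal contrasts for four conditions (integer multiples of "each condition versus the mean of the later ones"); it is not the transformation matrix SPSS would display, and the helper names are invented.

```python
import math


def normalize(row):
    """Divide each coefficient by the square root of the sum of squares."""
    length = math.sqrt(sum(c * c for c in row))
    return [c / length for c in row]


def dot(a, b):
    """Sum of products of two coefficient rows; zero means orthogonal."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical orthogonal contrasts for four conditions.
contrasts = [
    [3, -1, -1, -1],
    [0, 2, -1, -1],
    [0, 0, 1, -1],
]
orthonormal = [normalize(row) for row in contrasts]

# After normalization every row has unit length ...
lengths = [dot(row, row) for row in orthonormal]
# ... and every pair of rows has a zero sum of products.
cross_products = [
    dot(orthonormal[i], orthonormal[j])
    for i in range(len(orthonormal))
    for j in range(i + 1, len(orthonormal))
]

# Scaling a contrast by 100 has no effect once it is normalized.
scaled = normalize([0, 100, -100, 0])
unscaled = normalize([0, 1, -1, 0])

# Simple adjacent differences, by contrast, are not orthogonal.
adjacent = dot([1, -1, 0, 0], [0, 1, -1, 0])

print(lengths, cross_products, adjacent)
```

Normalization removes the scale problem shown by the second contrast in Table 6.4, while orthogonality of the coefficients removes built-in redundancy; neither guarantees that the resulting contrast variables are uncorrelated in a particular data set.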

This combination of assumptions, equal variances of the contrast variables and zero correlation among them, is called the sphericity assumption. It is called sphericity because a sphere in multidimensional space would be defined by an equal radius value along each perpendicular (uncorrelated) axis. Although contrasts are chosen so that sphericity will be maintained, when applied to a particular data set, sphericity may be violated. The variance-covariance matrix of a group of contrast variables that maintain sphericity would exhibit the pattern shown below.

Table 6.5 Covariance Matrix of Contrast Variables when Sphericity Holds

The diagonal elements represent the variance of each contrast when applied to the data and the off-diagonal elements are the covariances. If the sphericity assumption holds in the population, the variances will have the same value (represented by the V) and the covariances will be zero.

A test of the sphericity assumption is available. If the sphericity assumption is met then the usual “F” test (pooling the results from each contrast) is the most powerful test. When sphericity does not hold, there are several choices available. Technical corrections (Greenhouse-Geisser, Huynh-Feldt) can be made to the “F” tests (adjusting the number of the degrees of freedom) that modify the results based on the degree of sphericity violation. Another alternative is to take a multivariate approach in which contrasts are tested simultaneously while taking explicit account of the correlation and variance differences. The difficulty in choosing between these approaches is that no single method has been found (in Monte Carlo studies) to be best under all conditions examined. Also, the test for sphericity itself is not all that sensitive. For a summary of the various approaches and a suggested strategy for testing, see Looney and Stanley (1989).
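A degrees-of-freedom correction of the Greenhouse-Geisser kind can be sketched from the covariance matrix of the orthonormalized contrast variables. The matrices below are hypothetical, the function name is invented, and the degrees of freedom use four conditions and 64 subjects as in the vocabulary data; this is an illustration of the standard epsilon formula, not a reproduction of SPSS output.

```python
def greenhouse_geisser_epsilon(cov):
    """Epsilon from the covariance matrix of orthonormal contrast variables.

    epsilon = trace(S)**2 / ((k - 1) * trace(S * S)); for a symmetric S,
    trace(S * S) equals the sum of all squared elements.
    """
    k_minus_1 = len(cov)
    trace = sum(cov[i][i] for i in range(k_minus_1))
    sum_of_squares = sum(v * v for row in cov for v in row)
    return trace ** 2 / (k_minus_1 * sum_of_squares)


# Pattern under sphericity: equal variances, zero covariances.
spherical = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
# Hypothetical violation: unequal variances and a nonzero covariance.
violated = [[4.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]

eps_spherical = greenhouse_geisser_epsilon(spherical)
eps_violated = greenhouse_geisser_epsilon(violated)

# Four conditions and 64 subjects give 3 and 3 * 63 = 189 degrees of freedom;
# the correction multiplies both by epsilon.
df_effect, df_error = 3, 3 * 63
adjusted_df = (eps_violated * df_effect, eps_violated * df_error)

print(eps_spherical, round(eps_violated, 3), adjusted_df)
```

Epsilon equals 1 when the matrix has the sphericity pattern and falls toward 1/(k-1) as the violation grows, so the corrected test uses fewer degrees of freedom and is harder to pass.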

Data Set

The data are reported in Bock (1975, p.454) and consist of vocabulary scores obtained from a cohort of pupils at the eighth through eleventh grade level. Alternative forms of the vocabulary section of the Cooperative Reading Tests were administered and rescaled to an arbitrary origin. Interest is in the growth rate of vocabulary at a time when physical growth is slowing. Sixty-four subjects were studied.

PROPOSED ANALYSIS

We will perform a repeated measures analysis on the vocabulary growth data. There is specific interest in the trend over time – is it linear? The data will be examined, then repeated measures ANOVA applied with attention paid to the assumptions mentioned above.

KEY CONCEPT

The key concept to repeated measures analysis is that the contrasts (which are data transformations) will be applied across conditions of the within-subject factors, and if we conclude the contrasts are non-zero in the population, there are significant differences between the conditions.

Click File..Open..Data (move to the c:\Train\Anova directory)
Select SPSS Portable (.por) on the Files of Type drop-down list
Double-Click on Vocab
Click on Analyze..Descriptive Statistics..Explore
Select the variables Grade8, Grade9, Grade10, and Grade11 and move them to the Dependent List box.


Figure 6.1 Explore Dialog Box

Click on the Plots pushbutton
Click Dependents together option button in the Boxplots area
Click the Normality tests with plots check box

Figure 6.2 Plots Dialog Box

Placing the dependent variables together in a single boxplot, instead of separate plots (Factor levels together), permits direct comparison of the variables. Normal probability plots and tests are also requested.

Click Continue
Click OK

The command below will run the analysis.

EXAMINE
  VARIABLES=grade8 grade9 grade10 grade11
  /PLOT BOXPLOT STEMLEAF NPPLOT
  /COMPARE VARIABLES
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

We request summaries of the four variables. A normal probability plot with a normality test will appear for each variable (/PLOT NPPLOT). Also, the four variables will appear in a single boxplot (/COMPARE VARIABLES).

Figure 6.3 Descriptives for Grade 8


Figure 6.4 Descriptives for Grade 9

Figure 6.5 Descriptives for Grade 10

Figure 6.6 Descriptives for Grade 11


As we can see from the descriptive statistics, the mean score for the reading tests is going up in each year (from 1.1372 in Grade 8 to 3.4716 in Grade 11), but the variances are fairly constant across the years (from 3.568 to 4.704).

Figure 6.7 Normality Tests

The normality tests show that there is no problem with the assumption of normality for grades 8 and 10. However, grades 9 and 11 show some deviation from normality. Although the assumption of normality is violated for those grades, the sample size is large enough that we can probably ignore that violation.

Figure 6.8 Q-Q Plot for Grade 8


Figure 6.9 Q-Q Plot for Grade 9

Figure 6.10 Q-Q Plot for Grade 10

Figure 6.11 Q-Q Plot for Grade 11


These Q-Q plots also give us some indication of the degree to which the normality assumption is violated. Although the normality tests showed that grades 9 and 11 had some deviation from the normal, the Q-Q plots are similar for all the grades. Again, we should note that the sample size is somewhat large and that we can probably not worry about these violations of normality.

Figure 6.12 Box and Whiskers Plot

The box plot indicates that the variation of scores within a test year is fairly constant. There are a few outliers; the case id information indicates that for the most part, the same few individuals stand out. From the medians we see that vocabulary scores grow over the several year period, and this growth seems to be slowing.

COMPARING THE GRADE LEVELS

We have only one group of subjects. Each subject has a vocabulary score at each of the four grade levels. Notice that all four vocabulary scores are attached to a single case (examine the data in the Data Editor window; not shown). If the four measures for a subject were spread throughout the file, the analysis could still be run within SPSS, but only by using the General Linear Model Univariate dialog box.

Click Analyze..General Linear Model..Repeated Measures

Here we provide names for any repeated measures factors and indicate the number of levels for each. Unlike a between-group factor, which would be a single variable (for example, region), a repeated measures factor is expressed as a set of variables.

Replace factor1 with Time in the Within-Subject Factor Name text box

Press Tab key to move to the Number of Levels text box

Type 4

Click Add pushbutton

Figure 6.13 Define Repeated Measures (Within-Subject) Factor

We have defined one factor with four levels. In a more complex study (we will see one later in this chapter) additional repeated measures can be added. The Measure pushbutton is used to provide two pieces of information. First, if there are multiple dependent measures involved in the analysis (for example, suppose we also took four measures of mathematical skills for each of our 64 subjects), this is declared in the Measure area. Second, you can use the Measure area to provide a label for the dependent measure in the results. Recall we named our four variables Time1 to Time4 so there would be no ambiguity about which factor level each represented. However, this choice of names does not indicate that these variables all measure vocabulary scores. You can supply such labeling information in the Measure area.

Click Define pushbutton

In this dialog box we link the repeated measures factor levels to variable names, and declare any between-subject factors and covariates. Notice the Within-Subjects Variables box lists Time as the factor and provides four lines labeled with level numbers 1 through 4. We must match the proper variable to each of the factor levels. This step should be done very carefully, since incorrect matching of names and levels will generally produce an incorrect analysis (especially if more than one repeated measure factor is involved). We can move the variables one by one, but since they are in the correct order we will move them as a group.

Move grade8, grade9, grade10, and grade11 into the Within-Subjects Variables box (maintain this variable order)

Click grade11 to select it.

Figure 6.14 Main Repeated Measures Dialog Box

The variable corresponding to each grade level is matched with the proper time level. Since grade11 is selected, the up arrow button is active. These up and down buttons will move variables up and down the list, so you can easily make changes if the original ordering is incorrect. We have neither between-subject factors nor covariates and can proceed with the analysis, but first let us examine some of the available features.

In the Model dialog box (not shown), a complete model (all factors and interactions) is fit by default. As with procedures we saw earlier in the course, a customized model can be fit for either between or within-subject factors. This is usually done when specialty designs (Latin squares, incomplete designs) are run. The Contrasts pushbutton is used to request that particular contrasts be applied to a factor (recall our discussion of difference or contrast variables earlier).

Click Contrasts pushbutton

Figure 6.15 Contrasts Dialog Box

Check to see which contrast is selected

If it is not polynomial then change it to Polynomial

Click Continue

Click Plots pushbutton

The Plots pushbutton generates profile plots that graph means at factor level combinations for up to three factors at a time. Such plots are powerful tools in understanding interaction effects. We will only request a plot for time, our repeated measure factor.

Click on Time and move it to the Horizontal Axis list box

Click Add

Figure 6.16 Plots Dialog Box

Click Continue

The Post Hoc dialog box was discussed earlier in the class; it performs post hoc tests of means for between-subject factors. We will look at the available tests for repeated measures factors shortly. The Save dialog box allows you to save predicted values, various residuals and influential point measures. Also, you can save the estimated coefficients to a file for later manipulation (perhaps in a prediction model, or to compare results from different data sets). We will look at the Options dialog box more closely.

Click Options pushbutton

Move Time into the Display Means for list box

Click to check the Compare Main Effects checkbox

Select Bonferroni on the Confidence interval adjustment drop-down list

Click on Descriptive statistics check box

Click Transformation Matrix check box

Figure 6.17 Options Dialog Box

We request descriptive statistics. Estimated marginal means can be produced for any factors in the model (here time). Since we are fitting a complete model, the estimated marginal means are identical to the observed means. We request pairwise comparisons for the time factor using Bonferroni adjustments (the available adjustments for repeated measure factors are LSD, Bonferroni and Sidak). In addition, we have asked to see the transformation matrix. The transformation matrix contains the contrast coefficients that are applied to the repeated measures factor(s) to create the difference or contrast variables used in the analysis. Here we display it only to reinforce our earlier discussion of this topic. Diagnostic residual plots are available and there is a control to modify the confidence limits (default is 95%). The SSCP (sums of squares and cross products) matrices are not ordinarily viewed. However, they do contain the sums of squares for each of the contrast variables. By viewing them you can see that the overall test simply sums up the individual contrast sums of squares, which is why sphericity is necessary.

Click Continue to process the option requests

Click OK to run the analysis

The SPSS syntax below will run the repeated measures analysis.

GLM
  grade8 grade9 grade10 grade11
  /WSFACTOR = time 4 Polynomial
  /METHOD = SSTYPE(3)
  /PLOT = PROFILE( time )
  /EMMEANS = TABLES(time) COMPARE ADJ(BONFERRONI)
  /PRINT = DESCRIPTIVE TEST(MMATRIX)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time .

First the variables that constitute the repeated measures factor are listed. The WSFACTOR (within-subject factor) subcommand declares time to be a within-subject factor with four levels. In addition, polynomial contrasts will be applied when creating the contrast variables. Polynomial contrasts will perform linear, quadratic, and cubic contrasts on the time factor. If there are significant changes in vocabulary over time, as we expect, these contrasts will allow us to examine their specific form. The PRINT TEST(MMATRIX) specification displays the transformation matrix (called the M Matrix). METHOD declares the sums of squares type.

EXAMINING RESULTS

The first summary displays information about the factors in the model.

Figure 6.18 Factor Summary

There is only a single within-subject (repeated measures) factor and no between-subject factors.

Figure 6.19 Descriptive Statistics

Means, standard deviations, and sample sizes appear for each factor level. If you were unsure whether you matched the variable names to factor levels correctly in the Define Repeated Measures Factors dialog box, you can compare these means to those you would obtain from the Descriptives, Means, or Explore procedures to ensure the proper variables are matched with the proper factor levels. Again, we see the increase in the mean scores as grade level increases.

Multivariate test results appear next. Since they would typically be used only if the sphericity assumption fails, we will skip these results for now and examine the sphericity test.

Figure 6.20 Mauchly’s Sphericity Test

We see from the Significance (Sig.) information that the data are consistent with the sphericity assumption. The significance value (.277) is above .05, indicating that the covariance matrix of orthonormalized transformation variables is consistent with sphericity (diagonal elements identical and off-diagonal elements zero in the population). Since sphericity has been maintained we can use the standard (pooled) ANOVA results, and need not resort to alternative (multivariate) or adjusted (degree of freedom adjustment) tests. The Epsilon section of the pivot table provides the degree of freedom modification factor that should be applied if the sphericity result were significant. Let us take a brief look at the multivariate results.
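To make the role of epsilon concrete, the short Python sketch below (Python is used here only as a calculator; it is not part of SPSS) shows how an epsilon value rescales the degrees of freedom of the averaged F test. It assumes the 64 subjects mentioned earlier in this chapter and four grade levels. The function names and the epsilon value of 0.9 are illustrative assumptions, not values taken from the pivot table.

```python
def adjusted_df(n_subjects, n_levels, epsilon=1.0):
    """Degrees of freedom (effect, error) for the averaged F test, scaled by epsilon."""
    df_effect = n_levels - 1
    df_error = (n_subjects - 1) * (n_levels - 1)
    return epsilon * df_effect, epsilon * df_error


def lower_bound_epsilon(n_levels):
    """Smallest value epsilon can take for a factor with n_levels levels."""
    return 1.0 / (n_levels - 1)


# Assumed design: 64 subjects measured at 4 grade levels.
print("sphericity assumed:", adjusted_df(64, 4))
print("lower bound:", adjusted_df(64, 4, lower_bound_epsilon(4)))
# A hypothetical epsilon of 0.9, chosen only to show the scaling.
print("epsilon = 0.9:", adjusted_df(64, 4, 0.9))
```

With epsilon equal to one the degrees of freedom are unchanged, which is the sphericity-assumed case; smaller values of epsilon shrink both degrees of freedom and so make the test more conservative.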

Figure 6.21 Multivariate Tests

Remember these results need not be viewed since sphericity has been maintained. Here the test is whether all of the contrast variables (representing vocabulary score differences) are zero in the population, while explicitly taking into account any correlation and variance differences in the contrast variables. So if sphericity were violated these results could be used. Explanations about the various multivariate tests were given in Chapter 5. The multivariate tests indicate there are vocabulary score differences by grade.

Figure 6.22 Within-Subject Effects

This table contains the standard repeated measures output based on summing the results from each contrast, as well as sphericity corrected results. It shows the results for (1) sphericity assumed, and then (2) Greenhouse-Geisser, (3) Huynh-Feldt, and (4) Lower Bound adjustments. The test result (sphericity assumed) is highly significant, more so than the multivariate test, which is what we expect if sphericity holds: the pooled test is more powerful. Thus, we conclude there are significant differences in vocabulary across grade levels.

Figure 6.23 Test of Contrasts

Significance tests are performed on each of the contrast variables used to construct a repeated measure factor. Recall that by default, polynomial contrasts are used. Since the repeated measure factor is time, these contrasts test whether there are significant linear, quadratic and cubic trends in vocabulary growth over time. Note that linear and quadratic trends are significant (the linear contrast has a very large F value), while cubic is not. This is consistent with the earlier comment that vocabulary scores increase over time, but the growth seemed to be slowing down.

Figure 6.24 Test of Between-Subjects Effects

There were no between-subject factors in this study. If there were, the test results for them would appear in this section. There is a test of the intercept, or grand mean; this simply tests whether the average of all vocabulary scores is equal to zero in the population, which is not an interesting hypothesis to test.

Figure 6.25 Transformation Matrix

The transformation variables are split into two groups: one corresponding to the average across the repeated measures factor, the others defining the repeated measures factor. The coefficients for the Average variable are all .5, meaning each variable is weighted equally in creating the Average transformation variable. If you wonder why the weights are not .25, recall that normalization requires the sum of the squared weights to equal one. Turning to the transformed variables that represent the time effect, the three sets of coefficients are orthogonal polynomials corresponding to linear, quadratic, and cubic terms. Looking at the first, we see that there is a constant increase (of about .447) in the value of the coefficients across the four grade levels. In a similar way, the second transformation variable has two sign changes over the grade levels (the two outer coefficients share one sign and the two inner coefficients the other); this constitutes a quadratic effect. The SPSS Advanced Models manual has additional information about the commonly used transformations.
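As a numerical check on the normalization just described, the sketch below (Python, used only as a calculator outside SPSS) rescales integer contrast weights so that their squared values sum to one. The integer weights are the usual textbook orthogonal polynomial weights for four equally spaced levels and, like the helper name normalize, are assumptions introduced for illustration; the figure remains the authority for what SPSS displays.

```python
import math


def normalize(weights):
    """Scale a set of contrast weights so that their squares sum to one."""
    length = math.sqrt(sum(w * w for w in weights))
    return [w / length for w in weights]


# Textbook integer weights for four equally spaced levels (assumed here).
average = normalize([1, 1, 1, 1])
linear = normalize([-3, -1, 1, 3])
quadratic = normalize([1, -1, -1, 1])
cubic = normalize([-1, 3, -3, 1])

# Step between adjacent linear coefficients across the four grade levels.
linear_steps = [b - a for a, b in zip(linear, linear[1:])]

print("average:  ", [round(w, 3) for w in average])
print("linear:   ", [round(w, 3) for w in linear])
print("quadratic:", [round(w, 3) for w in quadratic])
print("cubic:    ", [round(w, 3) for w in cubic])
print("linear steps:", [round(s, 3) for s in linear_steps])
```

Equal weights of .25 would have squares summing to .25 rather than one, which is why the Average coefficients are .5.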

Recall that the transformations are orthogonal; you can verify this for any pair by multiplying their coefficients at each level of the factor and summing these products. The sum should be zero. For linear and quadratic we can calculate (-.671*.5 + .224*.5 - .224*.5 + .671*.5), which is indeed zero.
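The same check can be repeated for every pair of transformation variables. The Python sketch below (used only as a calculator outside SPSS) does so with coefficients rounded to three decimals. The linear values and the .5 magnitudes come from the discussion above; the cubic row and the sign pattern of the quadratic row are the textbook values for four equally spaced levels and are assumptions here, since the figure is not reproduced.

```python
from itertools import combinations

# Rounded coefficients; cubic row and quadratic signs are assumed textbook values.
contrasts = {
    "average": [0.5, 0.5, 0.5, 0.5],
    "linear": [-0.671, -0.224, 0.224, 0.671],
    "quadratic": [0.5, -0.5, -0.5, 0.5],
    "cubic": [-0.224, 0.671, -0.671, 0.224],
}


def dot(a, b):
    """Sum of products of two coefficient sets across the factor levels."""
    return sum(x * y for x, y in zip(a, b))


# Every pair should give a sum of products of zero (up to rounding).
pair_sums = {
    (name_a, name_b): dot(contrasts[name_a], contrasts[name_b])
    for name_a, name_b in combinations(contrasts, 2)
}

for pair, value in pair_sums.items():
    print(pair, round(value, 6))
```

Reversing the sign of every coefficient in a row leaves all of these sums at zero, so the check does not depend on the sign convention used for a contrast.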

Figure 6.26 Transformation Matrix (M Matrix)

Since we asked for analyses to compare main effects in the Options dialog box, a new transformation matrix is used to create four variables equivalent to the four grade levels: the identity transformation. Notice this is a separate analysis, run after the others have been completed using the original transformation matrix.

Also, note the estimated marginal means match the observed means (this pivot table is not shown).

Figure 6.27 Pairwise Test Results

Each grade level mean is tested against every other; essentially we are performing all pairwise tests with a Bonferroni correction. The footnotes indicate that each test is performed at an adjusted level of significance using a Bonferroni correction. Thus the probability of obtaining one or more false positive results is held to at most .05. We find in our study that the grade 8 (time 1) scores are significantly different from all the others; grade 9 (time 2) is different from grade 8 and grade 11 (time 4); grade 10 (time 3) is different from grade 8 and 11; while grade 11 is significantly different from grades 8, 9, and 10. Substituting Bonferroni corrected paired t tests for post hoc comparisons provides a means to investigate differences within a repeated measures factor.

The program will also run a multivariate ANOVA attempting to test the pairwise comparisons simultaneously; this is of no interest to us.

Figure 6.28 Profile Plot of Means

This plot (not really necessary since with one factor there can be no interaction) shows us how the mean of the vocabulary scores is increasing with grade level.

PLANNED COMPARISON

Suppose we had some specific hypothesis about the grade levels that we wished to test. For example, if we thought that the grade to grade promotion made a difference in the student’s vocabulary score, we might want to test grade 8 versus grade 9; grade 9 versus grade 10; and grade 10 versus grade 11. The Contrasts pushbutton provides a variety of planned comparisons, and customized contrasts can be input using syntax.

Click the Dialog Recall tool, then click Repeated Measures

Click Define pushbutton

Click Contrasts pushbutton

Click Contrast drop-down arrow and select Repeated

Click Change pushbutton

Figure 6.29 Requesting Planned Comparisons

Repeated contrasts will compare each category to the one adjacent. Right click on any contrast on the list to obtain a brief description of it. The SPSS Advanced Models manual contains more details. Also be aware that you can provide custom contrasts using the Special keyword in syntax.

Click Continue to process the contrasts

Click OK to run the analysis

The command below will run the analysis.

GLM
  grade8 grade9 grade10 grade11
  /WSFACTOR = time 4 Repeated
  /METHOD = SSTYPE(3)
  /PLOT = PROFILE( time )
  /EMMEANS = TABLES(time) COMPARE ADJ(BONFERRONI)
  /PRINT = DESCRIPTIVE TEST(MMATRIX)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time .

The WSFACTOR subcommand now requests that repeated contrasts be used in place of the default polynomials.

Again, most of the output is identical to the previous run; we focus on the contrast tests and the transformation matrix.

Figure 6.30 Tests of Contrasts

We see that all three contrasts are significant at the .05 level. The first contrast, comparing grade 8 to grade 9, has by far the greatest F value. The second compares grade 9 to grade 10, and the third compares grade 10 to grade 11. These seem inconsistent with the pairwise tests we just ran, in which the grade 9 scores were not different from the grade 10 scores. However, recall that we performed Bonferroni corrections on those tests, and the second contrast (here with a significance level of .013) would not be significant were we testing at the adjusted Bonferroni level (about .008). If you return to the pairwise analysis you will see the results are quite close.
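The adjusted level quoted above can be reproduced by hand. The Python sketch below (used only as a calculator outside SPSS) counts the pairwise comparisons among four grade levels and divides the overall .05 level among them; the function name is an assumption introduced for illustration, and the .013 value is the significance level quoted above for the grade 9 versus grade 10 contrast.

```python
from math import comb


def bonferroni_alpha(n_levels, family_alpha=0.05):
    """Return the number of pairwise tests and the per-test significance level."""
    n_tests = comb(n_levels, 2)
    return n_tests, family_alpha / n_tests


n_tests, per_test_alpha = bonferroni_alpha(4)
p_grade9_vs_grade10 = 0.013  # significance of the second repeated contrast

print("pairwise tests:", n_tests)
print("per-test level:", round(per_test_alpha, 4))
print("significant at .05:", p_grade9_vs_grade10 < 0.05)
print("significant at adjusted level:", p_grade9_vs_grade10 < per_test_alpha)
```

The contrast passes the unadjusted .05 criterion but not the stricter per-test criterion, which accounts for the apparent disagreement between the two sets of results.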

To confirm our understanding of the contrasts, we view the transformation matrix.

Figure 6.31 Transformation Matrix

We see the transformed variables do compare each grade level to the adjacent one. The transformation matrix is very useful in understanding and verifying which contrasts are being performed. These contrasts are not orthogonal, and would not be used without modification (orthonormalization) in the sphericity and pooled significance tests appearing earlier.
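The lack of orthogonality is easy to confirm with the sum-of-products check used earlier for the polynomial contrasts. In the Python sketch below (used only as a calculator outside SPSS), the coefficient rows are written from the description of repeated contrasts (each level minus the next) and are an assumption; the figure may display them in a different layout or scaling.

```python
# Repeated contrasts across grades 8, 9, 10, 11 (assumed from the description).
repeated = {
    "grade8 vs grade9": [1, -1, 0, 0],
    "grade9 vs grade10": [0, 1, -1, 0],
    "grade10 vs grade11": [0, 0, 1, -1],
}


def dot(a, b):
    """Sum of products of two coefficient sets across the factor levels."""
    return sum(x * y for x, y in zip(a, b))


names = list(repeated)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        print(first, "|", second, "->", dot(repeated[first], repeated[second]))
```

Contrasts that share a grade level give a nonzero sum of products, so the set as a whole is not orthogonal.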

SUMMARY

In this chapter we reviewed how repeated measures ANOVA differs from between-group ANOVA and why it is used. Assumptions were discussed and an analysis was run based on student vocabulary scores measured over time. A second analysis applied planned comparisons (a priori contrasts) to a repeated measure analysis.

Between and Within-Subject ANOVA: (Split-Plot) 7 - 1

SPSS Training

Chapter 7

Between and Within-Subject ANOVA: (Split-Plot)

Objective

In this chapter we will expand upon the last chapter to include both between and within-subject factors in one analysis. We will discuss the assumptions of this design and show an example. We will also explore the interactions using simple effects.

Method

We will first use the Explore command to examine the data and then run the repeated measures ANOVA to do the basic mixed-model analysis (split-plot) and look at information regarding the assumptions.

Data

The data set we will use is the same data as we used in Chapter 6, containing vocabulary test scores obtained from the same children over four years (grades 8 through 11). However, in this analysis we will use the sex of the subject as a between-subject factor.

Technical Note

The term “mixed model” technically refers to ANOVA models containing fixed and random factors. The designs we discuss, where subject is a random effect, are a special case of the mixed model. The common usage of “mixed model” refers to designs with between and within-subject factors.

INTRODUCTION

Many studies, especially experimental work, incorporate both between and within-subject factors. Within-subject factors will hopefully lead to a more sensitive analysis, while between-subject factors are necessary if any demographic characteristics are included or if there is reason to believe there would be strong carry-over effects. Mixed model refers to a mixture of between and within factors and is a direct generalization of the within-subjects analysis. These designs are also called split-plot designs, the term taken from agricultural experiments in which a given plot of land would receive a single level of one treatment factor, but would be split into subplots that would receive all treatment levels of a second factor. This would yield between-plot and within-plot factors equivalent to the between and within-subject effects we have covered. We will discuss the features and assumptions of such analysis and run an example.

ASSUMPTIONS OF MIXED MODEL ANOVA

If we take the assumptions of within-subject analyses as a starting point (normality of the variables, and sphericity when there are more than two levels of a within-subject factor), mixed model analyses involve little more. Since there are multiple groups, the normality of the variables now applies to the variation within each group. Also, homogeneity of covariance matrices is assumed (this can be applied to the original variables or the transformed variables; homogeneity of one implies homogeneity of the other). This combination of assumptions, homogeneity and sphericity, is sometimes called compound symmetry.

PROPOSED ANALYSIS

We will fit a model with one between-subject factor (Sex) and one within-subject factor (Time) with four levels. Thus for this analysis the sphericity issue is relevant and will be approached just as it was in Chapter 6. While we deal with a single between and a single within-subject factor, no additional assumptions are required to expand the analysis to handle multiple factors of each type.

A LOOK AT THE DATA

As before, we will use the Explore procedure to examine the distribution of vocabulary scores across grades and sex groups. Since we know from Chapter 6 that there are changes in vocabulary scores over time (grades), we will focus on the comparison of the two sex groups. This will provide some indication of normality and homogeneity of the vocabulary scores.

Click File..Open..Data

Move to the c:\Train\Anova directory

Select SPSS Portable (.por) from the Files of Type drop-down list

Double-click on vocab

Figure 7.1 Data from Vocabulary Study

Click Analyze..Descriptive Statistics..Explore

Move Grade8, Grade9, Grade10, and Grade11 into the Dependent list box.

Move Sex into the Factors box

Figure 7.2 Explore Dialog Box

Click on the Statistics pushbutton

Make sure that the Descriptives checkbox is the only one selected

Figure 7.3 Explore Statistics Dialog Box

Click on Continue to process the Statistics choices

Click on the Plots pushbutton

Verify that the Factor levels together option is selected.

Verify that the Stem-and-leaf checkbox is checked

Click Normality plots with tests checkbox

Select Power Estimation option button in Spread vs. Level with Levene Test area

Figure 7.4 Explore Plots Dialog Box

Click on Continue to process the Plots choices

Click on OK to run the EXPLORE procedure.

The syntax command below will run the analysis.

EXAMINE
  VARIABLES=grade8 grade9 grade10 grade11 BY sex
  /PLOT BOXPLOT STEMLEAF NPPLOT SPREADLEVEL
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

Normality tests and plots are generated by the NPPLOT keyword, and homogeneity tests are due to the SPREADLEVEL keyword on the PLOT subcommand.

Figure 7.5 Descriptives for Grade 8 Males

Figure 7.6 Stem and Leaf for Grade 8 Males

Figure 7.7 Descriptives for Grade 8 Females

Figure 7.8 Stem and Leaf for Grade 8 Females

Notice that the range for 8th grade males is about half the range for females, but the interquartile ranges are about the same. This is due in part to an outlier among the females. While not shown, the vocabulary scores of the females were consistent with the normal distribution using the Shapiro-Wilk criterion, while the males showed a significant departure from normality.

Figure 7.9 Box Plots for Grade 8 Scores

The medians for the sex groups are very similar and the variation in the female group seems greater. Despite appearances in the plot, the 8th grade sex groups do not show significant differences in variation of test scores as evidenced by the Levene homogeneity test (not shown).

Figure 7.10 Descriptives for Grade 11 Males

Figure 7.11 Stem and Leaf for Grade 11 Males

Figure 7.12 Descriptives for Grade 11 Females

Figure 7.13 Stem and Leaf for Grade 11 Females

Most of the summary statistics are similar for males and females in the 11th grade. The normality tests (not shown) indicate that the scores for males, but not females, are consistent with a normal distribution.

Figure 7.14 Box Plots for Grade 11

Once again, the medians are very close and there are a few outliers. The Levene test (not shown) indicated that the sex populations do not differ in variance on 11th grade vocabulary scores.

9th and 10th Grades

The distribution of vocabulary scores within sex group was consistent with the normal for 9th and 10th grade, the only exception being 10th grade females. Neither grade departed from homogeneity of variance between sex groups. (These results are not shown.)

SUMMARY OF EXPLORE

Overall, the data look good as far as homogeneity is concerned, and the departures from normality are not dramatic. If we had access to the original test sheets, we might want to check the accuracy of the scores for the outliers. We will proceed with the mixed-model ANOVA.

SPLIT-PLOT ANALYSIS

Click Analyze..General Linear Model..Repeated Measures

Replace factor1 with Time in the Within-Subject Factor Name text box

Press Tab and type 4 in the Number of Levels text box

Click Add pushbutton

Figure 7.15 Define Factors Dialog Box

Click Define pushbutton

In the Repeated Measures dialog box, click and drag Grade8, Grade9, Grade10, and Grade11 to the Within-Subject Variables list box.

Move Sex into the Between-Subjects Factors list box

Figure 7.16 Between and Within-Subject Factors Defined

Click Contrasts pushbutton

Select time in the Factors: list box

Select Repeated from the Contrast drop-down list

Click Change button

Verify Sex is set to none for the contrast

Figure 7.17 Contrasts Dialog Box

Repeated contrasts will compare each factor level with the one following it. Thus with four time levels (grades 8, 9, 10 and 11), the three repeated contrasts compare 8th to 9th, 9th to 10th, and 10th to 11th grades, respectively.

Click Continue to process the Contrast changes

Click Options pushbutton

Individually move Sex and Time into the Display Means for list box

Click the Compare Main Effects checkbox

Click Descriptive statistics, Transformation matrix, and Homogeneity tests check boxes

Figure 7.18 Options Dialog Box

Click on Continue to process the Options requests

Click on OK to run the analysis

Besides descriptive statistics, we request estimated marginal means (which equal the observed means since we are fitting a full model) for each of the factors. Since there are several groups involved in the analysis, we ask for homogeneity of variance tests. We also request pairwise comparisons for sex and time with no adjustment (LSD (none)). Since sex has only two levels, pairwise tests are not needed for it.

We will proceed with the analysis. The GLM command shown below will produce this analysis (obtained by clicking the Dialog Recall tool, then Repeated Measures, and the Paste pushbutton).

Figure 7.19 Syntax for This Analysis

The four vocabulary variables form the basis of the time factor. Estimated marginal means will be computed for the sex and time main effects. The PRINT subcommand requests that descriptive statistics, the transformation matrix (TEST(MMATRIX)) and homogeneity test summaries appear. The WSDESIGN subcommand declares time as the only repeated measure factor in the model; similarly sex (see the DESIGN subcommand) is the only between-subject factor.

EXAMINING RESULTS

Figure 7.20 Factors in the Analysis

The factors in the analysis are listed along with the sample sizes for the between-subject factor.

Figure 7.21 Descriptive Statistics

Subgroup means appear separately for each of the repeated measure variables.

TESTS OF ASSUMPTIONS

Although they do not appear together in the output, we will first examine results pertaining to the assumptions of the analysis. Concerning homogeneity of variance, the program provides Box’s M statistic and Levene’s test. Box’s M is a multivariate statistic testing whether the variance-covariance matrices composed of the four repeated measures variables are equal across the between-subject factor subgroup populations (multivariate homogeneity). Levene’s test is univariate and tests homogeneity of variance for each of the four repeated measure variables separately (univariate homogeneity).

Figure 7.22 Box’s M Test of Homogeneity

Box’s M is not significant (significance value is .158), indicating that the data are consistent with the hypothesis of homogeneity of covariance matrices (based on the four repeated measures variables) across the population subgroups.

Figure 7.23 Levene’s Test of Homogeneity

Not surprisingly, the results of Levene’s test are consistent with Box’s M. Box’s M test has the advantage of being a single multivariate test. However, Box’s M test is sensitive to both homogeneity and normality violations, while Levene’s is relatively insensitive to lack of normality. Since homogeneity of variance violations are generally more problematic for ANOVA, Levene’s test is useful.

SPHERICITY

Since the within-subject factor (Time) has more than two levels, we will test for the sphericity assumption. As discussed in Chapter 6, if the assumption is met the usual averaged F tests are correct and are the test of choice. If sphericity conditions are not met, several choices are available: multivariate tests may be used, corrections to the averaged F test can be made (Greenhouse-Geisser, Huynh-Feldt, etc.), or more complicated decision rules may be applied (Looney & Stanley, 1989). We now view the sphericity test results.

Figure 7.24 Mauchly’s Sphericity Test

The Mauchly test shows no evidence of sphericity violations and the Greenhouse-Geisser and Huynh-Feldt degree of freedom adjustments are close to or equal to one. This result indicates we can proceed directly to the averaged F tests for effects involving Time. However, for comparison purposes, we will also view the multivariate tests.
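As a brief reminder of why an adjustment near one changes nothing, the epsilon corrections simply rescale both degrees of freedom of the averaged F test. The notation below is ours and is not taken from the SPSS output labels: k is the number of levels of the within-subject factor, g the number of between-subject groups, N the total number of subjects, and \hat{\varepsilon} the estimated epsilon.

```latex
\[
\frac{1}{k-1} \;\le\; \hat{\varepsilon} \;\le\; 1,
\qquad
df_{\text{Time}} = \hat{\varepsilon}\,(k-1),
\qquad
df_{\text{Error(Time)}} = \hat{\varepsilon}\,(k-1)(N-g)
\]
```

With the four grade levels used here (k = 4) the lower bound is 1/3. When \hat{\varepsilon} = 1 the adjusted degrees of freedom equal the unadjusted ones, so the corrected test and the "sphericity assumed" test coincide.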

MULTIVARIATE TESTS INVOLVING TIME

Figure 7.25 Multivariate Tests

As expected from the analysis in Chapter 6, there are significant differences in vocabulary scores over time. In addition, there is a significant interaction between Sex and Time. This can be phrased in two ways: the population sex difference is not uniform across grades, or the trend over time is not identical for the two sex populations.

TESTS OF BETWEEN-SUBJECT FACTORS

Figure 7.26 Between-Subjects Tests

The Repeated Measures procedure also presents the tests for the between-subject factors, in this case Sex. There is no significant difference in overall vocabulary score between the females and males (significance value is .101).


AVERAGED F TESTS INVOLVING TIME

Figure 7.27 F Tests

The averaged F tests indicate a significant effect of time and a sex by time interaction. Here we view only the test results labeled “sphericity assumed” since the sphericity assumption was met.

Figure 7.28 Repeated Measures Contrasts

As we saw in Chapter 6, the contrasts show that there are significant differences between each pair of grades on the vocabulary scores. However, the only significant sex by time interaction term involves grades 8 and 9. Thus the interaction between sex and time centers on these two grades. The means involving both sex and time (see Figure 7.21) can be examined for more detail.


Figure 7.29 Transformation Matrix

This is shown only to verify that the repeated contrasts were used.

Figure 7.30 Pairwise Comparisons Involving Time

Pairwise comparisons appear for both sex and time. Since there are only two sex groups, the pairwise comparisons tell us no more than the overall main effect, and so are not of interest (not shown). The pairwise comparisons involving time (with no adjustment due to the number of tests performed) are all significant.
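If an adjustment for the number of tests is wanted, the ADJ keyword on the EMMEANS subcommand can request one. The sketch below is illustrative only: the variable names vocab8 to vocab11 are placeholders for the four grade scores (the actual names in the course data file may differ), and only the subcommands needed for the adjusted comparisons are shown.

```spss
* Sketch only, vocab8 to vocab11 are placeholder variable names.
GLM vocab8 vocab9 vocab10 vocab11 BY sex
  /WSFACTOR = time 4 Repeated
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(time) COMPARE ADJ(BONFERRONI)
  /WSDESIGN = time
  /DESIGN = sex.
```

Replacing ADJ(BONFERRONI) with ADJ(SIDAK) requests the Sidak adjustment, and ADJ(LSD) corresponds to the unadjusted comparisons described above.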

ADDITIONAL WITHIN-SUBJECT FACTORS AND SPHERICITY

The sphericity assumption applies to all within-subject factors with more than two levels. In such designs Repeated Measures will perform sphericity tests for the appropriate within-subject factors and relevant interactions (effects involving within-subject interactions). The approach taken above applies to these situations as well.

EXPLORING THE INTERACTION - SIMPLE EFFECTS

A technique that can be used to explore interactions involves simple effects, that is, looking at the differences in one factor within a single level of a second factor. For example, an interaction might be clarified by a factor showing a significant difference at one level of a second factor while showing no difference at a second level. Below we run simple effects examining sex differences within each grade and also examine time differences within each sex group. Typically, both analyses would not be run, but we wish to demonstrate how to set them up.

Within the SPSS Univariate (Unianova) or Repeated Measures (GLM) procedures, the method to obtain simple effects involves requesting the estimated marginal means table for the two factors involved, and then obtaining tests on the factor of interest, applied to the means table. For example, if we want tests performed on the time factor within each sex group, we need to request tests on the time factor, based on the sex-by-time table of estimated marginal means. Currently, this analysis cannot be run directly from the Univariate and Repeated Measures dialog boxes, but involves only a minor change to syntax pasted from the dialogs. To demonstrate, we first return to the Repeated Measures dialog box.

Click the Dialog Recall tool, and then click Repeated Measures

Click the Define pushbutton

Click the Options pushbutton

Move sex*time into the Display Means for list box


Figure 7.31 Requesting the Sex by Time Table

We have requested estimated marginal means for the sex*time table and must later indicate the tests we want performed. We could have dropped the means display and main effects comparison for the individual factors, sex and time, but will keep them to better illustrate the changes we must make concerning the sex*time table.

Click Continue

Click Paste pushbutton


Figure 7.32 Syntax for Repeated Measures Analysis

There are three EMMEANS (estimated marginal means) subcommands. The first two, involving the sex and time tables, contain the COMPARE keyword. It requests that main- or simple main-effect tests (depending on how many factors are specified under TABLES) and pairwise comparisons be performed. Pairwise tests can be adjusted using Bonferroni or Sidak adjustments, but, by default, no adjustment (LSD) is made. Our task is to obtain these tests for time within the sex*time table. We must add the COMPARE keyword referencing one of the factors to the /EMMEANS subcommand that contains the sex*time table.

Type COMPARE (SEX) at the end of the /EMMEANS = TABLES (sex*time) line

Copy and paste the modified /EMMEANS = TABLES (sex*time) line just below the original

Change COMPARE (SEX) to COMPARE (TIME) in the second /EMMEANS = TABLES (sex*time) subcommand
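After these three steps, the edited command should look roughly like the sketch below. This is a reconstruction from the description in the text, not a copy of the pasted syntax: the dependent variable names vocab8 to vocab11 are placeholders, and the other subcommands are inferred from the options described earlier in the chapter.

```spss
* Sketch only, vocab8 to vocab11 are placeholder variable names.
GLM vocab8 vocab9 vocab10 vocab11 BY sex
  /WSFACTOR = time 4 Repeated
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(sex) COMPARE ADJ(LSD)
  /EMMEANS = TABLES(time) COMPARE ADJ(LSD)
  /EMMEANS = TABLES(sex*time) COMPARE(sex)
  /EMMEANS = TABLES(sex*time) COMPARE(time)
  /PRINT = DESCRIPTIVE TEST(MMATRIX) HOMOGENEITY
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time
  /DESIGN = sex.
```

The two sex*time lines differ only in the factor named in COMPARE: the first tests sex within each level of time, the second tests time within each level of sex.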


Figure 7.33 Syntax Requesting Tests for Simple Effects

Now significance tests will be applied to the sex factor and then to the time factor, each performed within every level of the other factor, based on the sex-by-time table of estimated marginal means. Since the table involves more than one factor, the tests will be run separately at each level of the other factor(s). This logic can be extended to additional factors, so you can perform simple effect tests on one factor within a table involving more than two factors.

Note that we do not need the estimated marginal means and tests for the individual factors sex and time (/EMMEANS subcommands containing TABLES(SEX) and TABLES(TIME)) and these subcommands could be removed. They were left in the “Display means for” list box in the Repeated Measures Options dialog so we could see the syntax needed to request the tests.

Click Run..Current to run the analysis

Most of the results are identical to those viewed earlier. Here we focus on the summaries involving simple effects.

Scroll down to the first Sex * Time section under Estimated Marginal Means heading


Figure 7.34 Estimated Marginal Means

The simple effects analysis will be based on this table. Even if a more complex model were being analyzed, say with three or four between-subject factors, the simple effects analysis of a two-factor interaction would be based on the estimated means table involving the two factors of interest.

Figure 7.35 Pairwise Comparisons of Sex within Grade Levels (Time)

Recall that the COMPARE keyword on the /EMMEANS subcommand will produce both overall tests and pairwise comparisons for the specified factor. The Pairwise Comparisons table presents, for each level of the time factor (which represent grades 8, 9, 10 and 11), the female-male comparison. Because there are only two levels to sex, there is only one unique comparison at each grade level. However, for factors with more than two levels, all pairwise comparisons would appear.

Examining the comparisons, we see that only at time 2 (9th grade) was there a significant difference between males and females. Thus we can describe the nature of the sex by time interaction: there are no significant differences between males and females in vocabulary scores except in the 9th grade.

Figure 7.36 Univariate Tests (Simple Effects)

In addition to the simple pairwise comparisons, overall simple effect tests are presented. As the caption indicates, each F test represents a test of the simple effect of sex within a grade level (time). Since sex has only two levels, these tests match the pairwise results viewed above, which were already discussed. However, for a factor with more than two levels, this summary would present an overall test of a factor within each level of the second factor.

Now we examine the interaction question using simple effects of time within each sex group.

Scroll down to the second Sex * Time section under Estimated Marginal Means heading


Figure 7.37 Pairwise Comparisons for Time Performed Separately for Males and Females (Complete Table Not Shown)

Multiple comparisons of the grade levels are done separately for the females and the males (levels of the sex factor). A caption appears below the table (not shown) indicating that the comparisons are based on the estimated marginal means. All grade (time) comparisons are significant for the males, while all but two (9th versus 10th, and 10th versus 11th) are significant for females. Thus males show a significant increase in vocabulary scores at each grade level, while females didn’t show significant change from 9th to 10th or 10th to 11th grades. Females and males thus show different patterns of vocabulary change over time, which is the basis of the interaction.

In this way, understanding of a two-way interaction can be improved by examining the simple effects of either factor. A plot is also helpful (shown later).


Figure 7.38 Multivariate Tests (Simple Effects)

When examining the simple effects of sex within grade, overall F tests were presented in the “Univariate Tests” table. This will be the case for simple effects of any between-subject factor. Multivariate tests are used to perform overall tests of simple effects for within-subject factors. Multivariate tests are used to avoid complications that would occur if Bonferroni or Sidak corrections were requested and sphericity were violated. However, this means that when the sphericity assumption holds, which is the case here, the simple effects test used (multivariate test) is not the most powerful test. We find that there are significant overall grade differences in vocabulary scores for both males and females. In this instance the overall test sheds less light on the nature of the interaction than did the pairwise comparisons. For this reason, it is useful to examine both results.

GRAPHING THE INTERACTION

Profile plots, seen earlier in the course, provide a means of visualizing a two- or three-factor interaction. We will request a plot of vocabulary means for the grade and sex groups.

Click the Dialog Recall tool, and then click Repeated Measures

Click the Define pushbutton

Click Plots pushbutton

Move time into the Horizontal Axis box

Move sex into the Separate Lines box

Click Add pushbutton


Figure 7.39 Requesting a Profile Plot

Click Continue, and then OK
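In syntax, the plot request made in the dialog corresponds to a PLOT subcommand in which the first factor named goes on the horizontal axis and the second defines the separate lines. The following is a minimal sketch only; the variable names vocab8 to vocab11 are placeholders, and the remaining subcommands from the earlier runs are omitted for brevity.

```spss
* Sketch only, vocab8 to vocab11 are placeholder variable names.
GLM vocab8 vocab9 vocab10 vocab11 BY sex
  /WSFACTOR = time 4 Repeated
  /METHOD = SSTYPE(3)
  /PLOT = PROFILE(time*sex)
  /WSDESIGN = time
  /DESIGN = sex.
```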

Figure 7.40 Profile Plot of Vocabulary Scores Across Grades for Males and Females

The plot of the estimated marginal means (here identical to the observed means) shows the steady increase in vocabulary score over time for the males. In comparison, the females show a sharper increase from grade 8 to grade 9, and more gradual increases in the later grades. The greatest difference between males and females occurs in grade 9. These patterns, as you would expect, are consistent with the simple effects tests we performed earlier.


Chapter 8

More Split-Plot Design

Objective

Understand the issues involved with more complex split-plot analyses.

Method

Use GLM to run a between- and within-subject analysis (split-plot) involving multiple between- and multiple within-subject factors.

Data

A marketing study in which different groups of subjects (groups based on sex and current brand used) rated different brands before and after viewing a commercial. The aim of the analysis was to determine if ratings improved for a specific brand and whether this related to sex or brand used.

INTRODUCTION: AD VIEWING WITH PRE-POST BRAND RATINGS

The example in this chapter will involve a more complex analysis, but will be done with fewer variations. A marketing experiment was devised to evaluate whether viewing a commercial produces improved ratings for a specific brand. Ratings on three brands (on a 1 to 10 scale, where 10 is the highest rating) were obtained from subjects before and after viewing the commercial. Since the hope was that the commercial would improve ratings of only one brand (A), researchers expected a significant brand by pre-post commercial interaction (only brand A ratings would change). In addition, there were two between-group factors: sex and brand used by subject. Thus the study had four factors overall: sex, brand used, brand rated, and pre-post commercial. We view the data below.


Click File..Open..Data (move to the c:\Train\Anova directory if necessary)

Click SPSS Portable (*.por) in the Files of Type drop-down list

Double-click on brand
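The same file can be opened from a syntax window with the IMPORT command, which reads SPSS portable files. The file name brand.por is an assumption here, inferred from the file type and the name shown in the dialog steps.

```spss
* Assumes the portable file is named brand.por in the course directory.
IMPORT FILE='c:\Train\Anova\brand.por'.
```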

Figure 8.1 Data from the Brand Study

Sex and user are the between-subject factors. The next six variables, pre_a to post_c, contain the three brand ratings before and after viewing the commercial.

SETTING UP THE ANALYSIS

Click Analyze..General Linear Model..Repeated Measures

Replace factor1 with prepost in the Within-Subject Factor Name text box

Press Tab and type 2 in the Number of Levels text box

Click Add pushbutton

Type brand in the Within-Subject Factor Name text box

Press Tab and type 3 in the Number of Levels text box

Click the Add pushbutton

Click the Measure pushbutton

Type rating in the Measure Name text box

Click the Add pushbutton in the Measure Name area


Figure 8.2 Two Within-Subject Factors Declared

SPSS now expects variables that comprise two within-subject factors. The order in which you name the factors matters only in that SPSS will order the factor levels list in the next dialog so that the last factor named here has its levels change most rapidly. Therefore, depending on how your variables are ordered in the data, some factor orders make the later declarations easier.

Click the Define pushbutton


Figure 8.3 Repeated Measures Dialog with Two Factors

Both prepost and brand are listed as within-subject factors. There are six rows, so every possible combination of levels between the two factors is represented. Notice that the brand level changes first going down the list. This was due to defining brand last in the Repeated Measures Define Factor dialog. Defining the factors in an order consistent with the order of variables in your data file makes this step easier.

Move the following variables into the Within-Subjects Variables list in the order given: pre_a pre_b pre_c post_a post_b post_c

Move sex and user into the Between-Subjects Factors list box


Figure 8.4 Between and Within-Subject Factors Defined

We can proceed with the analysis, but first let us request some options.

Click the Options pushbutton

Click the Descriptives checkbox

Individually move sex, user, prepost, and brand into the Display Means for list box

Click the Homogeneity check box


Figure 8.5 Options Dialog Box

Besides the descriptive statistics, we request estimated marginal means (which equal observed means since we are fitting a full model) for each of the factors. Since there are several groups involved in the analysis, we ask for homogeneity of variance tests.

We will proceed with the analysis. Contrasts can be applied to any factors in the same way as we have done earlier.

Click Continue to process the options

Click OK to run the analysis

The command syntax below will produce this analysis.


Figure 8.6 Syntax to Run Analysis

Variables which comprise the repeated-measures factors precede the BY keyword and the between-subject factors follow it. Notice the repeated measure variables are ordered so that brand levels change first and brand is mentioned last in the WSFACTOR subcommand. This order is critical for the analysis, so care must be taken when running from syntax. The levels of each repeated measures factor are given and polynomial contrasts (here uninteresting) are used. We requested estimated marginal means for each of the factors. The PRINT subcommand will display the descriptive statistics and the homogeneity tests.
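Because the variable order matters so much, it may help to see the command with the implied cell assignment written out. The sketch below is reconstructed from the description above rather than copied from Figure 8.6; the comment lines are our annotation of how the six variables map onto the factor levels when brand is named last.

```spss
* Cell order implied by naming prepost first and brand last.
* pre_a  is prepost 1, brand 1 and pre_b  is prepost 1, brand 2.
* pre_c  is prepost 1, brand 3 and post_a is prepost 2, brand 1.
* post_b is prepost 2, brand 2 and post_c is prepost 2, brand 3.
GLM pre_a pre_b pre_c post_a post_b post_c BY sex user
  /WSFACTOR = prepost 2 Polynomial brand 3 Polynomial
  /MEASURE = rating
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(sex)
  /EMMEANS = TABLES(user)
  /EMMEANS = TABLES(prepost)
  /EMMEANS = TABLES(brand)
  /PRINT = DESCRIPTIVE HOMOGENEITY
  /CRITERIA = ALPHA(.05)
  /WSDESIGN
  /DESIGN.
```

If brand were named first on WSFACTOR instead, the variable list would have to be reordered (pre_a post_a pre_b post_b pre_c post_c) for the same cells to be assigned correctly.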

EXAMINING RESULTS

Figure 8.7 Factors in the Analysis

The factors in the analysis are listed along with the sample sizes for the between-subject factor groups.

Figure 8.8 Descriptive Statistics (Beginning)

Subgroup means appear separately for each repeated measure variable. Means for the repeated measures factors can be seen in the estimated marginal means pivot tables, or viewed in profile plots.

TESTS OF ASSUMPTIONS

Although they do not appear together in the output, we first examine some assumptions of the analysis. Concerning homogeneity of variance, the program provides Box’s M statistic and Levene’s test. Box’s M is a multivariate statistic testing whether the variance-covariance matrices composed of the six repeated measures variables are equal across the between-subject factor subgroup populations. Levene’s test is univariate and tests for homogeneity across subgroup populations for each of the six repeated measure variables separately.


Figure 8.9 Box’s M Test of Homogeneity

Box’s M test is not significant, indicating that the data are consistent with the assumption of homogeneity of covariance matrices (based on the six repeated measures variables) across the population subgroups.

Figure 8.10 Levene’s Test of Homogeneity

Not surprisingly, the results of Levene’s test are consistent with Box’s M. Box’s test is sensitive to both homogeneity and normality violations, while Levene’s is relatively insensitive to lack of normality. Since homogeneity of variance violations are generally more problematic for ANOVA, the Levene’s test is useful.


Now let us examine the sphericity assumption since this determines whether we simply view the pooled ANOVA results, or move to multivariate or degree of freedom adjusted results.

Figure 8.11 Sphericity Tests

Notice no sphericity test is applied to the prepost factor. This is because it has only two levels, so only one difference variable is created, and there is no pooling of effects. The sphericity test for brand is not significant (Sig. = .832), nor is the sphericity test for the brand by prepost interaction (Sig. = .975). Thus the data are consistent with sphericity. As a result we will not view the multivariate test results or the adjusted pooled results (Huynh-Feldt, etc.), and instead focus on the standard (averaged) results.


ANOVA RESULTS

Figure 8.12 Within-Subject Tests

Note

Note that this table has been edited in Pivot Table Editor (epsilon corrected results with sphericity assumed were placed in the top layer) to display only these results.

This table contains all tests that involve a within-subject factor; those involving only between-subject effects appear later. Looking at the significance (Sig.) column, we see a highly significant difference for pre-post commercial and a brand by user interaction. The brand by pre-post commercial effect is not significant, indicating that although the commercial may have shifted ratings (pre-post commercial is significant) it did not differentially improve the rating of brand A, which was the aim of the commercial. We will view the means and profile plots to understand the significant effects.

We will not view the multivariate results or the degree of freedom corrected (appropriate if sphericity is violated) results. Nor will we examine the tests of specific contrasts since we had no planned contrasts and the polynomial contrasts over brand categories make no conceptual sense.


Figure 8.13 Between-Subjects Tests

Of the between-subjects effects, only sex shows a significant difference. Let us take a look at some of the means.

Figure 8.14 Means for Sex and Pre-Post Commercial

We see males give higher ratings than females and the post-commercial ratings are higher than the pre-commercial ratings. It seems that the commercial was a success, but a success for all brands, not just brand A as hoped.


PROFILE PLOTS

To better view the interaction between user (brand used) and brand (brand rated) we request a profile plot.

Click the Dialog Recall tool, then select Repeated Measures

Click Define pushbutton

Click Plots pushbutton

Move user into the Horizontal Axis list box

Move brand into the Separate Lines list box

Click Add pushbutton

Figure 8.15 Requesting a Profile Plot

As many as three factors can be displayed in a profile plot, and so up to a three-way interaction can be examined. Note that multiple profile plots can be requested, which allows for many views of your data.

Click Continue to process the plot request

Click OK to run the analysis

The command below will produce the profile plot.


GLM
  Pre_a pre_b pre_c post_a post_b post_c BY sex user
  /WSFACTOR = prepost 2 Polynomial brand 3 Polynomial
  /MEASURE = rating
  /METHOD = SSTYPE(3)
  /PLOT = PROFILE(user*brand)
  /EMMEANS = TABLES(sex)
  /EMMEANS = TABLES(user)
  /EMMEANS = TABLES(prepost)
  /EMMEANS = TABLES(brand)
  /PRINT = DESCRIPTIVE HOMOGENEITY
  /CRITERIA = ALPHA(.05)
  /WSDESIGN
  /DESIGN .

The Plot subcommand requests a profile plot of user by brand.

Figure 8.16 Profile Plot of Brand Used by Brand Rating

In the plot, brand levels 1, 2, and 3 correspond to brands A, B, and C, respectively. The Brand used by Brand interaction shows (as we surely would expect) that those who regularly use a particular brand rate it higher than the other brands. Especially when there are many factor levels, or several factors involved, profile plots can be very helpful in practice.


SUMMARY OF RESULTS

The homogeneity and sphericity assumptions were met. We did not examine normality, but could do so by requesting residual plots in the Options dialog box. We found that men gave higher brand ratings than women, that the post-commercial ratings were higher than pre-commercial ratings, and that respondents rated their own brand highest. The expected brand by pre-post commercial interaction was not evident.

SUMMARY

In this chapter we examined a more complex split-plot ANOVA involving two between and two within-subject factors. We also used a profile plot to describe an interaction effect.


Chapter 9

Analysis of Covariance

Objective

In this chapter we will discuss the purpose, assumptions, and interpretation of analysis of covariance. In addition, we will demonstrate an approach if the parallelism assumption is not met. We will then extend the analysis to include within-subject designs while using constant and varying covariates.

Method

We will use the General Linear Model Univariate and Repeated Measures procedures to perform the various runs to do the analyses and check the assumptions.

Data and Scenario

The data presented here are taken from page 806 of Winer (1971). However, we provide a different scenario that will influence the interpretation of the results. It should be noted that the data file is very small and is only for illustrative purposes. Suppose a study was done to evaluate the effectiveness of three treatment drugs on pain-reduction of ankle injuries. There are three types of treatment drugs (variable Drug with labels A, B, and C) and each patient is in one of the three drug groups (between-subjects factor). There is a within-subject factor, which involves measures taken during the early and later stages of the drug intervention (time periods 1 and 2). The dependent measure is a pain rating scale. Also, physical therapy was performed throughout the study, and the amount of physical therapy varied from patient to patient. Since physical therapy may influence the level of pain reported, it is treated as a covariate in the study. There are two measures of the hours of physical therapy a patient experienced, one taken from the period just after the drug treatment was initiated (time period 1) and one taken later in the course of treatment (time period 2).

The main question concerns whether the drugs are effective in pain reduction after controlling for the amount of physical therapy.

Design

We will run various designs using the same data set. There will be a fixed between-subject factor Drug with 3 levels and a within-subject factor (time) with two levels. The dependent variable is reflected in PAIN1 and PAIN2. The covariate (hours of physical therapy) was measured at the same time points as the dependent variable and is stored in PT1 and PT2.

Note

We will run many different analyses on the same data set to demonstrate the flexibility of the technique and reduce possible confusion due to constantly switching data. In practice, you would be interested in a specific set of models.


A

nal ysi s of covari ance can be vi ewed as an attempt to provi de some

stati sti cal control i n pl ace of l ack of experi mental control .

I ncl usi on of a covari ate al l ows the researcher to run the usual

ANOVAs whi l e control l i ng for some other vari abl e. Thi s i s not control i n

the experi mental sense, but control i n the sense of maki ng a stati sti cal

adjustment to equate al l groups on the covari ate. Covari ates are i nterval

scal e vari abl es; i f they were categori cal then they woul d be i ncl uded as

addi ti onal factors i n the desi gn.

One purpose of anal ysi s of covari ance i s to obtai n a more sensi ti ve

ANOVA by reduci ng the wi thi n-group vari abi l i ty. I f the covari ate i s

rel ated to the dependent vari abl e the same way i n each group, the wi thi n

group vari ati on can be reduced by removi ng the effect of the covari ate.

The cl assi c case i s an experi ment i n whi ch subjects are randoml y

assi gned to groups, but vary on some background measure; anal ysi s of

covari ance (ANCOVA) wi l l control for thi s source of vari ati on.

Anal ysi s of covari ance i s often spoken of as a condi ti onal anal ysi s.

Removi ng the effect of the covari ate essenti al l y equates al l subjects on

the covari ate, so i nstead of speaki ng of factor A havi ng an effect we speak

of factor A havi ng an effect i f subjects had i denti cal val ues on the

covari ate. A common exampl e of anal ysi s of covari ance i s the adjusti ng

for body wei ght i n medi cal experi ments. I n thi s context, anal ysi s of

covari ance adjusts the anal ysi s as i f each subject began at the same

wei ght.

ANCOVA i s al so used i n non-experi mental studi es to substi tute

stati sti cal control for factors beyond the control of the researcher. Care

must be taken si nce i f the covari ate rel ates to factors i n the study,

control l i ng for covari ate modi fi es the esti mated effects of the factors

themsel ves.

HOW IS ANALYSIS OF COVARIANCE DONE?

Basically, the dependent variable is regressed on the covariate, but the relevant variation of the dependent variable is not its variation around the grand mean; instead it is based on the pooled within-group variation. Thus a within-group regression with the covariate(s) is run and the analysis of variance is performed on the residuals from the regression.

ASSUMPTIONS OF ANCOVA

The major assumptions specific to ANCOVA are: 1) The relationship between the covariate and the dependent variable (within groups) is linear; 2) The within-group distribution of the residuals is normal; and 3) The relationship between the covariate and the dependent variable (the slopes in the within-group regressions) is the same across all groups.

Assumption (1) need not hold, but the routines available in most software are based on a linear relationship. Assumption (2) is the usual normality assumption, this time after the covariate has been applied. The last assumption is important and can be tested. The degree of adjustment made is based on the pooled within-groups regression. If the slope relating the covariate to the dependent variable varies across groups, then the common slope used to adjust each group does not reflect the true relationship for that group. We will see that a different slope can be fit to each group, but this requires rethinking just what we hope to accomplish with the analysis.
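The adjustment described in the two sections above can be summarized in symbols. This is a sketch in standard textbook notation for a one-factor design, not notation taken from this course: $y_{ij}$ and $x_{ij}$ (assumed symbols) are the dependent variable and covariate for subject $j$ in group $i$, $\bar{y}_{i\cdot}$ and $\bar{x}_{i\cdot}$ are group means, and $\bar{x}_{\cdot\cdot}$ is the grand mean of the covariate.

```latex
\begin{align*}
y_{ij} &= \mu + \alpha_i + \beta\,(x_{ij} - \bar{x}_{\cdot\cdot}) + \varepsilon_{ij}
  && \text{(ANCOVA model, one common slope } \beta\text{)} \\
b_w &= \frac{\sum_i \sum_j (x_{ij} - \bar{x}_{i\cdot})(y_{ij} - \bar{y}_{i\cdot})}
            {\sum_i \sum_j (x_{ij} - \bar{x}_{i\cdot})^2}
  && \text{(pooled within-group estimate of } \beta\text{)} \\
\bar{y}_{i}^{\,\mathrm{adj}} &= \bar{y}_{i\cdot} - b_w\,(\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})
  && \text{(group mean adjusted to the overall covariate mean)}
\end{align*}
```

Assumption (3) is what justifies using the single slope $b_w$ for every group; if the group slopes differ, the adjusted means depend on the covariate value at which the groups are compared.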

CHECKING THE ASSUMPTIONS

Plots of the dependent variable and the covariate can be made separately for each group (if there are relatively few cells in the analysis) to take an informal look at the homogeneity of slopes. The residuals can be displayed in normal plots. The homogeneity of slopes assumption can be formally tested.

BASELINE ANOVA

We first run a one-factor ANOVA to provide a baseline. We use the Univariate procedure instead of a One-Way ANOVA in order to use the same procedure throughout.

    Click File..Open..Data
    Move to the c:\Train\Anova directory (if necessary)
    Select SPSS Portable (.por) from the Files of Type drop-down list
    Double-click on PainTreat.por

Figure 9.1 Data for Analysis of Covariance Example

    Click Analyze..General Linear Model..Univariate
    Move PAIN1 into the Dependent Variable list box
    Move Drug into the Fixed Factors list box

Figure 9.2 Univariate Dialog Box

    Click on OK to run the analysis.

PAIN1 (pain during period 1) is the dependent variable and there is one between-subjects factor (Drug) with three levels. The following command will run this analysis.

    UNIANOVA
      pain1 BY drug
      /METHOD = SSTYPE(3)
      /INTERCEPT = INCLUDE
      /CRITERIA = ALPHA(.05)
      /DESIGN = drug .

Figure 9.3 ANOVA Table

This table shows no suggestion of a main effect of Drug in this analysis (significance level is .376). Thus during the early treatment period, no differences attributable to drug were found in the pain measure.

ANCOVA: HOMOGENEITY OF SLOPES

In the second run we will include the covariate and the interaction term of the covariate and the between-subject factor. The assumption of equality (homogeneity) of regression slopes can be tested by fitting a model containing the main effects of Drug and PT1, as well as the Drug*PT1 interaction. The interaction term provides the test of the null hypothesis of equal slopes. If the slopes relating the covariate to the dependent variable are identical (parallel) across the different groups, this interaction will not be significant.

    Click the Dialog Recall tool, then click Univariate
    (Verify that PAIN1 is in the Dependent Variable box and Drug is in the Fixed Factor(s) list box)
    Move PT1 into the Covariate(s) list box
    Click the Model pushbutton
    Select Custom model option button
    Click Drug, then click the Build Term arrow to add Drug to the model
    Click PT1, and then click the Build Term arrow to add PT1 to the model
    Select both Drug and PT1 in the Factors & Covariates list box (use Ctrl-click), then click the Build Term arrow to add the Drug*PT1 interaction term to the model

Figure 9.4 Univariate Dialog Box

    Click Continue to process the model requests
    Click on OK to run the analysis

The following command will also run the analysis:

    UNIANOVA
      pain1 BY drug WITH pt1
      /METHOD = SSTYPE(3)
      /INTERCEPT = INCLUDE
      /CRITERIA = ALPHA(.05)
      /DESIGN = drug pt1 drug*pt1 .

The keyword WITH precedes covariates in UNIANOVA, just as BY precedes factors. From the first command line alone, UNIANOVA would run a standard analysis of covariance, not testing the interaction term. In the DESIGN subcommand we include the between-subjects factor (Drug), the covariate (PT1), and the factor by covariate interaction (Drug BY PT1). This interaction effect is the main focus of our interest.
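In model terms, the Drug*PT1 test compares a model with one slope per group against the model with a common slope. The following is a sketch in generic notation (the symbols $g$, $N$ and $SSE$ are assumptions for illustration, not output labels): with $g$ groups and $N$ cases in total,

```latex
\begin{align*}
\text{Full model:}    \quad & y_{ij} = \mu + \alpha_i + \beta_i\,x_{ij} + \varepsilon_{ij} \\
\text{Reduced model:} \quad & y_{ij} = \mu + \alpha_i + \beta\,x_{ij} + \varepsilon_{ij} \\[4pt]
F &= \frac{(SSE_{\text{reduced}} - SSE_{\text{full}})/(g-1)}{SSE_{\text{full}}/(N-2g)}
\end{align*}
```

With the three drug groups used here, the interaction has $g - 1 = 2$ degrees of freedom, and a non-significant $F$ gives no evidence against parallel slopes.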

Figure 9.5 ANCOVA Table with Homogeneity of Slopes Test

The summary table indicates that the interaction is not significant (Sig. = .390), so the homogeneity of slopes assumption seems to be met. Thus there is no evidence that the linear relationship between hours of physical therapy and pain level differs across drug treatment groups. There is a suggestion of an effect of the covariate (Sig. = .067), but no effect due to Drug. To repeat, with such a small sample there is little power to detect assumption violations, but we wish to demonstrate the method.
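The informal checks described earlier (group-wise plots and a normal plot of the residuals) can also be requested with syntax. The commands below are a sketch rather than part of the original course material; they assume the variable names pain1, pt1 and drug from the PainTreat file, and that the saved residual receives the default name RES_1, which may differ if residuals were already saved earlier in the session.

```spss
* Scatterplot of pain against physical therapy, with markers set by drug group.
GRAPH
  /SCATTERPLOT(BIVAR) = pt1 WITH pain1 BY drug .

* Fit the standard ANCOVA and save the residuals (assumed to be named RES_1).
UNIANOVA
  pain1 BY drug WITH pt1
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /SAVE = RESID
  /CRITERIA = ALPHA(.05)
  /DESIGN = pt1 drug .

* Normal probability plot of the saved residuals.
EXAMINE VARIABLES = res_1
  /PLOT = NPPLOT .
```

Roughly parallel point clouds in the scatterplot and an approximately straight normal plot are consistent with assumptions (3) and (2), respectively, though with a sample this small neither display is decisive.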

STANDARD ANCOVA

Having checked the parallelism of slopes assumption, we proceed with the standard ANCOVA.

    Click the Dialog Recall tool, then click Univariate
    Click the Model pushbutton
    Select the Full factorial model option button
    Click Continue to process the change
    Click Options pushbutton
    Click Parameter Estimates checkbox
    Click Continue
    Click OK to run the analysis.

The following command will run the analysis using syntax.

    UNIANOVA
      pain1 BY drug WITH pt1
      /METHOD = SSTYPE(3)
      /INTERCEPT = INCLUDE
      /PRINT = DESCRIPTIVE PARAMETER
      /CRITERIA = ALPHA(.05)
      /DESIGN = pt1 drug .

Figure 9.6 ANCOVA Summary Table

The results are similar to the previous analysis: no effect due to factor Drug, and a significant relationship between the covariate and the dependent measure. The reason for the covariate now being significant probably has to do with the extra degrees of freedom added to the error term; going from 3 to 5 degrees of freedom is a big jump.

DESCRIBING THE RELATIONSHIP

The Univariate procedure also presents some information to characterize the relation between the covariate and dependent variable.

Figure 9.7 Parameter Estimates

In the GLM parameterization, the intercept parameter estimate gives the estimated value for the last category of Drug (Drug = 3) when the covariate is equal to 0. The Drug = 1 and Drug = 2 coefficients are the level 1 and level 2 predicted values minus the level 3 predicted value, respectively. Adding one of these coefficients to the intercept estimate gives the estimated value for that level of Drug when the covariate is equal to 0.

The B coefficient for PT1 is the regression coefficient used to predict the dependent variable based on the covariate. Its positive sign indicates that higher levels of physical therapy are associated with greater pain levels. This is not the expected relationship and, if this were real data, it should be examined more carefully (perhaps patients with more serious and painful injuries received more physical therapy). The 95% confidence interval for the regression coefficient is rather wide.
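The way the estimates in the parameter table combine can be written as prediction equations. This is a sketch with assumed symbols ($b_0$ for the intercept, $b_1$ and $b_2$ for the Drug = 1 and Drug = 2 coefficients, $b_{PT1}$ for the covariate slope); the numeric values are those shown in Figure 9.7 and are not repeated here.

```latex
\begin{align*}
\hat{y}_{\text{Drug}=1} &= b_0 + b_1 + b_{PT1}\cdot PT1 \\
\hat{y}_{\text{Drug}=2} &= b_0 + b_2 + b_{PT1}\cdot PT1 \\
\hat{y}_{\text{Drug}=3} &= b_0 + b_{PT1}\cdot PT1
\end{align*}
```

The three lines are parallel and differ only in their intercepts, which is the common-slope structure that the homogeneity of slopes test was meant to justify.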

FITTING NON-PARALLEL SLOPES

If an interaction between a covariate and a factor in the model is significant, it indicates that the slopes relating the covariate to the dependent variable vary across groups. If there is interest in modeling this, that is, fitting different slopes to each group, this can be specified in the Univariate procedure. It is no longer the standard analysis of covariance since the degree of adjustment varies with the group, but the analysis may be of interest in its own right.

    Click the Dialog Recall tool, then click Univariate
    Click the Model pushbutton
    Select the Custom Model option button
    If Drug and Drug*PT1 are not already in the Model list box (from our earlier analysis) then move Drug and Drug*PT1 (Ctrl-click to select both) into the Model box
    Remove PT1 from the Model list box (if necessary)

Figure 9.8 Model for Separate Slope Analysis

Since PT1 is removed from the model, Univariate will assign three degrees of freedom to the PT1 by Drug interaction. Thus it will fit a separate slope (between PT1 and the dependent measure) for each level of Drug. If we left PT1 in the model, as we did when testing slope homogeneity, then the PT1 effect would represent the overall slope between PT1 and the dependent measure, while the two degrees of freedom PT1 by Drug effect would test the interaction of PT1 and Drug.

    Click Continue to process the change
    Click OK to run the analysis

The following syntax will run the analysis.

    UNIANOVA
      pain1 BY drug WITH pt1
      /METHOD = SSTYPE(3)
      /INTERCEPT = INCLUDE
      /PRINT = PARAMETER
      /CRITERIA = ALPHA(.05)
      /DESIGN = drug drug*pt1 .

Figure 9.9 Between-Subject Tests

We notice that neither the main effect of Drug nor the combined covariate main effect and factor by covariate interaction (bundled together in Drug*PT1) is significant (Sig. = .577 and .109, respectively).

Figure 9.10 Parameter Estimates

The values for the intercept and Drug = 1, Drug = 2, and Drug = 3 are the same as explained earlier. Notice that there are 3 degrees of freedom for Drug*PT1: one for each of the three slopes. The parameter estimates for Drug*PT1 provide the slope estimates, relating hours of physical therapy to reported pain level, for each of the three groups.
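The two ways of specifying the design fit the same model and differ only in how the slope parameters are arranged. As a sketch in assumed notation ($\beta_i$ for the slope in Drug group $i$, $\beta$ for an overall slope, $\gamma_i$ for group deviations from it):

```latex
\begin{align*}
\texttt{/DESIGN = drug drug*pt1}: \quad
  & y_{ij} = \mu + \alpha_i + \beta_i\,x_{ij} + \varepsilon_{ij}
  && (\beta_1, \beta_2, \beta_3:\ 3\ \text{df}) \\
\texttt{/DESIGN = drug pt1 drug*pt1}: \quad
  & y_{ij} = \mu + \alpha_i + (\beta + \gamma_i)\,x_{ij} + \varepsilon_{ij}
  && (\beta:\ 1\ \text{df};\ \gamma_i:\ 2\ \text{df})
\end{align*}
```

Since $\beta_i = \beta + \gamma_i$, the fitted values and the error sum of squares are identical under both specifications; only the hypotheses tested change. The 3 df effect asks whether all three slopes are zero, while the 2 df effect asks whether they are equal.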

REPEATED MEASURES ANCOVA WITH A SINGLE COVARIATE

To illustrate analysis of covariance in the context of repeated measures we will first run a split-plot analysis (between-subject factor Drug, within-subject factor time with two levels), using only the first measurement of the covariate, PT1. This is also termed repeated measures with a constant covariate since a single covariate value applies across levels of the repeated measure factors.

    Click Analyze..General Linear Model..Repeated Measures
    Replace factor1 with time
    Enter 2 in the number of levels box
    Click the Add pushbutton
    Click the Measure pushbutton
    Type pain in the Measure Name text box
    Click the Add pushbutton in the Measure Name area

Figure 9.11 Repeated Measures Define Factors Dialog Box

    Click the Define pushbutton
    Move PAIN1 and PAIN2 into the Within-Subject Variables list box in that order
    Move Drug into the Between-subject Factor(s) list box
    Move PT1 into the Covariates list box

Figure 9.12 Repeated Measures Dialog Box

    Click the Options pushbutton
    Select Descriptives, Parameter Estimates, Transformation Matrix, and Homogeneity Tests

Figure 9.13 Options Dialog Box

    Click Continue to process the request
    Click OK to run the analysis

The following command will also run the analysis.

    GLM
      pain1 pain2 BY drug WITH pt1
      /WSFACTOR = time 2 Polynomial
      /MEASURE = pain
      /METHOD = SSTYPE(3)
      /PRINT = DESCRIPTIVE PARAMETER TEST(MMATRIX) HOMOGENEITY
      /CRITERIA = ALPHA(.05)
      /WSDESIGN = time
      /DESIGN = pt1 drug .

    Scroll down to the Transformation section of the results

Figure 9.14 Transformation Matrix

The first transformation is the average of PAIN1 and PAIN2. Thus the covariate will be applied to the effects that involve the average of PAIN1 and PAIN2, that is, only between-subject effects.

    Scroll up to the Tests of Between-Subjects Effects pivot table

Figure 9.15 Tests of Between-Subject Effects

The covariate is not quite significant (Sig. = .066). If it were significant, this would indicate that the amount of physical therapy during the early phase of drug treatment is related to overall (period 1 and period 2 measures, averaged together) pain ratings. The effect of the Drug factor, adjusted for the covariate, is not significant.

Figure 9.16 Tests of Within-Subjects Effects (Sphericity Assumed)

(Note: the pivot table above was edited in the Pivot Table Editor so only the sphericity assumed results appear. Since there are only two levels of the repeated measure factor, the sphericity test and corrections are not relevant.)

We find no significant effects: neither the main effect of time, nor the interaction between time and factor Drug, nor the interaction between the covariate and time. This latter effect (Time by PT1) tests whether the slope relating the covariate (physical therapy) to the dependent measure (pain) is the same (parallel) for each of the two time periods.
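Why the Time by PT1 effect amounts to a test of parallel slopes can be seen by writing out the two transformed variables. This is a sketch; $c_1$ and $c_2$ stand for the scaling constants shown in the transformation matrix and are assumed symbols, as are $\beta_1$ and $\beta_2$ for the slopes of PAIN1 and PAIN2 on PT1.

```latex
\begin{align*}
T_{\text{average}} &= c_1\,(PAIN1 + PAIN2)
  && \text{slope on } PT1:\ c_1(\beta_1 + \beta_2) \\
T_{\text{time}}    &= c_2\,(PAIN2 - PAIN1)
  && \text{slope on } PT1:\ c_2(\beta_2 - \beta_1)
\end{align*}
```

The between-subjects tests use $T_{\text{average}}$, so the covariate there acts on the averaged pain score. The within-subjects tests use $T_{\text{time}}$, whose slope on PT1 is zero exactly when $\beta_1 = \beta_2$.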

Figure 9.17 Parameter Estimates

This is the standard ANCOVA table of parameters under the general linear model. Note that a separate slope coefficient (for covariate PT1) is calculated for PAIN1 and PAIN2; the model effects are adjusted for both.

REPEATED MEASURES ANCOVA WITH A VARYING COVARIATE

Since covariates are not always fixed measures at one time point, covariates that are measured under each condition can be used in the analysis. We will analyze the same data using PT1 and PT2, measures of the covariate at the two time points. Note that GLM will adjust each level of the repeated measure factor (PAIN1, PAIN2) for every covariate. Thus a covariate that varies over time is treated identically to the situation in which multiple covariates are recorded at a single time point.

    Click on the Dialog Recall tool, then click Repeated Measures
    Click on the Define button
    Add PT2 to the Covariates box
    Click on OK

The following command will run this analysis.

    GLM
      pain1 pain2 BY drug WITH pt1 pt2
      /WSFACTOR = time 2 Polynomial
      /MEASURE = pain
      /METHOD = SSTYPE(3)
      /PRINT = PARAMETER TEST(MMATRIX) HOMOGENEITY
      /CRITERIA = ALPHA(.05)
      /WSDESIGN = time
      /DESIGN = pt1 pt2 drug .

Figure 9.18 Test of Within-Subjects Factors

As before, the pivot table has been edited so only the results that assume sphericity appear.

As we can see, Time, the interaction of Time and PT1, and the interaction of Time and PT2 are all significant, but the Time by Drug interaction is not significant. This suggests that there is a change in pain level over time and that this change is related to the amount of physical therapy in both the early and late stages of drug treatment. Although it is not significant here, a Drug by Time interaction would suggest that the effect of Drug is not uniform across the two time periods.

Figure 9.19 Test of Between-Subjects Factors

The only significant effect is the covariate PT2 (the amount of physical therapy during the second time period). The results of the other effects are consistent with the previous analysis.

Figure 9.20 Parameter Estimates

The interpretation of the parameter estimates is identical to our earlier discussion (see Figure 9.17); each of the covariates is summarized separately. As we found earlier with PT1 (physical therapy during the first period), the estimates for the covariate coefficients of PT2 (physical therapy during the second period) indicate that pain level increases with more physical therapy. Oddly, higher levels of physical therapy during the first period relate to lower pain levels during the second period. Notice that a coefficient relates PT2 (amount of physical therapy during the second period) to the first period pain measure; it might be argued on logical grounds that this coefficient should not be included in the model.

The fact that the intercept for the second period is greater (11.583) than that for the first period (7.342) suggests that pain levels increased over time! This could be shown more clearly by requesting the estimated marginal means for the Time factor using the Options dialog.
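In syntax, the estimated marginal means can be requested by adding an EMMEANS subcommand to the GLM command used for this analysis. This is a sketch of what the Options dialog would be expected to paste, not syntax copied from the course; the covariates are evaluated at their means by default.

```spss
* Same repeated measures ANCOVA, with estimated marginal means for the Time factor.
GLM
  pain1 pain2 BY drug WITH pt1 pt2
  /WSFACTOR = time 2 Polynomial
  /MEASURE = pain
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(time)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time
  /DESIGN = pt1 pt2 drug .
```

The resulting table gives the adjusted mean pain level at each time point, which is a more direct comparison than the two intercepts, since each intercept refers to Drug = 3 with both covariates equal to 0.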

FURTHER VARIATIONS

This process can be generalized to additional covariates and repeated measures factors, and even more complicated variations. If you attempt these analyses, make sure you display the transformation matrix, which will inform you of the actual analysis that the General Linear Model procedures are performing.

Chapter 10

Special Topics

Objective

We will discuss the setups for some specialty statistical models: Latin Square Designs and Random Effects Designs. We will not discuss substantive interpretation of the results.

INTRODUCTION

In addition to the more or less standard analyses discussed in the previous chapters, there are a number of more specialized ANOVA applications. The family of incomplete designs, which includes Latin Squares, allows experimenters to control for nuisance factors, or to study a number of factors without including all possible combinations of the factor levels in the analysis. The price is giving up the opportunity to test for interaction effects. In this chapter we demonstrate that the GLM procedure can perform such analyses with experimental data collected from a Latin Square design. A second application we will explore involves ANOVA designs that contain more than a single random factor. Again, such models can be run using the GLM procedure.

LATIN SQUARE DESIGNS

Latin Square designs are useful when there is interest in performing a multiple factor ANOVA, but it is impossible or undesirable to represent all combinations of levels of factors in the analysis. For example, a three-factor design with each factor containing five levels implies 125 groups! Another common use of Latin Square designs involves controlling for nuisance factors, that is, controlling for the effects of factors that may influence the outcome, but are not themselves of experimental interest.

The basic idea is that not all combinations of levels of factors are included, but those included are counterbalanced so that independent main effects can be tested. The counterbalancing is designed to confound main effects with certain interaction terms, and an assumption is made that the interaction terms are not significant. As a result of not including all cells, at least some and possibly all interaction questions cannot be tested. When there are no replicates within cells, the variation usually attributed to higher-order interactions is used as the error term in testing main effects.

AN EXAMPLE

To illustrate a Latin Square design, Montgomery (1984) provides an example of a dynamite manufacturer interested in evaluating the effects of five chemical formulations on the explosive force of the resulting compound. In addition, two other factors have been identified as potentially influencing the compound, namely the quality of the raw materials and the person mixing the materials. These will be considered to be systematic sources of error (nuisance factors) that need to be removed from the analysis. Thus we consider three factors: formulation, batch of raw materials, and operator. Ideally, an experiment would be performed so that each operator uses each batch of raw materials in preparing each formulation (5 x 5 x 5, or 125 cells). Here we run into the practical problem that there is not enough raw material in a batch to supply each operator for each formulation (25 combinations). For this reason the researcher cannot perform the fully balanced experiment and instead a Latin Square will be used.

Below we show the formulation assignments (A-E) with five operators (1-5) and five batches of material (1-5). The equal number of levels within each factor is required for balancing and is a feature of such designs.

Notice that each formulation appears once in each row and column of the table, that is, once with each batch of materials and once with each operator. If interactions between batches, operators, and compounds are negligible, then we can test for the effects of different formulations of compounds on explosive force without the noise introduced by raw materials and operators (by adjusting for it).

    Click on File..Open..Data
    Move to the c:\Train\Anova directory
    Select SPSS Portable (.por) from the Files of Type drop-down list
    Double-click on Latinsq

Figure 10.1 Data from Latin Square Design

    Click Analyze..General Linear Model..Univariate
    Move Force into the Dependent Variable list box
    Move Batch, Operator, and Form into the Fixed Factors list box

Figure 10.2 Univariate Dialog Box

    Click the Model pushbutton
    Select Main Effects on the Build Term(s) drop-down list
    Separately move Batch, Operator, and Form into the Model list box

Figure 10.3 Univariate: Model Dialog Box

Instead of the full model (all main effects and interactions), we will fit a custom model consisting of only main effects. The residual variation from this model will be used as the error term.

    Click on Continue to process Model
    Click on OK to run the analysis.

The following syntax will also run the analysis.

    UNIANOVA
      force BY batch operator form
      /METHOD = SSTYPE(3)
      /INTERCEPT = INCLUDE
      /CRITERIA = ALPHA(.05)
      /DESIGN = batch operator form .

The DESIGN subcommand indicates that only a main effects model will be tested. The remaining effects will be pooled together into a residual term, which will be used as the error term in significance testing. This is why the assumption of no interactions is so important. Since UNIANOVA is the univariate version of the GLM procedure, the GLM command could have been used instead.
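The main-effects-only design corresponds to the usual Latin Square model. As a sketch in assumed notation (the symbols are illustrative; $p = 5$ is the number of levels of each factor in this example):

```latex
\begin{align*}
y_{ijk} &= \mu + \rho_i + \kappa_j + \tau_k + \varepsilon_{ijk}
  && \text{(batch } i,\ \text{operator } j,\ \text{formulation } k\text{)} \\[4pt]
\underbrace{p^2 - 1}_{\text{total: } 24}
  &= \underbrace{(p-1)}_{\text{batch: } 4}
   + \underbrace{(p-1)}_{\text{operator: } 4}
   + \underbrace{(p-1)}_{\text{form: } 4}
   + \underbrace{(p-1)(p-2)}_{\text{residual: } 12}
\end{align*}
```

Each main effect is therefore tested against a residual mean square on 12 degrees of freedom, which estimates error only if the interactions pooled into it are negligible.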

Figure 10.4 ANOVA Table

We see the tests of main effects performed using the residual as the error term. The main effect of Form (formulation) is of greatest interest and is highly significant. For some designs, if there were replication within each cell, the within-cell error term could be used for significance testing.

COMPLEX DESIGNS

There are additional variants along this theme of Latin Square designs (Greco-Latin Square, etc.) as well as other classes of designs (for example, fractional factorial). They can be set up in GLM in much the same way as demonstrated above. Such designs generally demand more knowledge of the user to plan the experiment, to understand which effects (main effects, main effects and some two-way interactions, etc.) can be tested, and to interpret the results. For those who need to perform such studies, SPSS Trial Run is a design of experiments program that can generate a variety of complex designs. It can then analyze the results using the GLM procedure, which is included in the program.

RANDOM EFFECTS MODELS

The preceding designs in this course contained only single random factors: plant variation within group, subject variation within group, and Y variation within levels of A. In SPSS, by default GLM assumes there is a single random factor reflected in the case to case variation. GLM can accommodate multiple random factors quite easily using dialog boxes when the random factors are crossed with the other factors, and can be run using syntax when the random factors are nested. This is because the nesting operation cannot currently be expressed in the GLM - General Factorial Model dialog box. Most applied statistics books that discuss experimental design either cover the common designs or supply rules to determine the correct error terms in the presence of multiple random effects (for example, Kirk (1982) or Milliken and Johnson (1984)).

To illustrate we will consider a simple two random-effect design. An experiment is performed in which subjects (5) inflate rubber rafts (6). The time it takes to inflate each raft is recorded. Rafts are sampled from a production line and each subject inflates each raft once. Here we have a completely crossed (each subject inflates each raft) two-factor design with both factors assumed random. In this case the interaction term is used to test each of the main effects, and if there had been replications (if each subject inflated each raft several times), the interaction itself would be tested using the within-cells error term.

    Click on File..Open..Data
    Move to the c:\Train\Anova directory (if necessary)
    Select SPSS Portable (.por) from the Files of Type drop-down list
    Double-click raft.por to open the file
    Click No when asked to save the Data Editor's contents

Figure 10.5 Raft Data

Notice the data are arranged so each subject by raft combination appears as a different case. If we structured the data as we ordinarily would for repeated measures, with each respondent on a single row of data, we would not be able to declare raft as a random factor (there is no Random Factor(s) list box in the General Linear Model - Repeated Measures dialog box).

    Click on Analyze..General Linear Model..Univariate
    Move time into the Dependent Variable list box
    Move subject and raft into the Random Factor(s) list box

Figure 10.6 Univariate Dialog Box

    Click OK to run the analysis

The following syntax will run the analysis.

Figure 10.7 Syntax for Two Random Effects Analysis

The RANDOM subcommand declares both subject and raft to be random factors.
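As a sketch of the kind of command that Figure 10.7 refers to (an assumption based on the dialog steps above and the default full factorial model, not a copy of the figure), the analysis could be written with the variable names time, subject and raft as follows:

```spss
* Two crossed random factors; assumed reconstruction, not taken from Figure 10.7.
UNIANOVA
  time BY subject raft
  /RANDOM = subject raft
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /CRITERIA = ALPHA(.05)
  /DESIGN = subject raft subject*raft .
```

Listing a factor on RANDOM changes only the error terms used in the tests; the sums of squares for each effect are computed as they would be for fixed factors.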

Figure 10.8 Results

We have tests for both subject and raft main effects. Notice that the interaction could not be tested (the error term for it has 0 degrees of freedom) because there were no replications.
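The choice of error terms follows from the expected mean squares for a crossed design with two random factors. As a sketch in assumed notation ($a = 5$ subjects, $b = 6$ rafts, $n$ replications per cell, and variance components $\sigma^2_S$, $\sigma^2_R$, $\sigma^2_{SR}$ and $\sigma^2$):

```latex
\begin{align*}
E(MS_{S})  &= \sigma^2 + n\,\sigma^2_{SR} + n b\,\sigma^2_{S} \\
E(MS_{R})  &= \sigma^2 + n\,\sigma^2_{SR} + n a\,\sigma^2_{R} \\
E(MS_{SR}) &= \sigma^2 + n\,\sigma^2_{SR} \\
E(MS_{E})  &= \sigma^2 \\[4pt]
F_{S} &= MS_{S}/MS_{SR}, \qquad F_{R} = MS_{R}/MS_{SR}, \qquad F_{SR} = MS_{SR}/MS_{E}
\end{align*}
```

With $n = 1$ there are $ab(n-1) = 0$ degrees of freedom for $MS_E$, which is why the interaction cannot be tested here, while the subject and raft effects are tested with $(a-1) = 4$ and $(b-1) = 5$ numerator degrees of freedom against $(a-1)(b-1) = 20$ denominator degrees of freedom.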

Extensions

A specific extension to the multiple random effects model is that in which random effects are nested within random effects. A common example of this involves analyzing student test scores within schools when the schools are sampled from school districts. A variation of random effect models, named hierarchical linear analysis, can be applied to such data. SPSS will perform such analyses for a balanced design, but does not currently handle the general case. Specialized programs are available to run hierarchical linear analysis.

SUMMARY

Armed with knowledge of experimental design and the appropriate error terms, you can test incomplete and random effects designs using SPSS.

References

Andrews, F. M., Klem, L., Davidson, T. N., O'Malley, P. M., and W. L. Rogers, A Guide for Selecting Statistical Techniques for Analyzing Social Science Data, Ann Arbor: Institute for Social Research, University of Michigan, 1981.

Bock, R. D., Multivariate Statistical Methods in Behavioral Research, New York: McGraw-Hill, 1975.

Conover, W. J., Practical Nonparametric Statistics, 2nd edition, New York: Wiley, 1980.

Crowder, M. J. and D. J. Hand, Analysis of Repeated Measures, London: Chapman and Hall, 1990.

Finn, J. D., A General Model for Multivariate Analysis, New York: Holt, Rinehart and Winston, 1974.

Hand, D. J. and C. C. Taylor, Multivariate Analysis of Variance and Repeated Measures, London: Chapman and Hall, 1987.

Hakstian, A. R., J. C. Roed, and J. C. Lind, Two Sample T² Procedure and the Assumption of Homogeneous Covariance Matrices, Psychological Bulletin, 86, Pgs. 1255-1263, 1979.

Huberty, C. J., Multivariate Analysis versus Multiple Univariate Analyses, Psychological Bulletin, Volume 105, Pgs. 302-308, 1989.

Kendall, M. G., and A. Stuart, The Advanced Theory of Statistics, Volume 3: Design and Analysis, and Time Series, New York: Hafner, 1968.

Kirk, R. E., Experimental Design: Procedures for the Behavioral Sciences, 2nd edition, Belmont, CA: Brooks/Cole, 1982.

Klockars, Alan J. and Sax, G., Multiple Comparisons, SAGE Quantitative Applications Series, Thousand Oaks, CA: Sage, 1986.

Lindsey, J. K., Models for Repeated Measures, Oxford: Clarendon Press, 1993.

Looney, S. W., and W. Stanley, Exploratory Repeated Measures Analysis for Two or More Groups, The American Statistician, Volume 43, No. 4, Pgs. 220-225, 1989.

McCullagh, P. and J. A. Nelder, Generalized Linear Models, 2nd edition, London: Chapman and Hall, 1989.

Milliken, G. A., and D. E. Johnson, Analysis of Messy Data, Volume 1: Designed Experiments, New York: Van Nostrand Reinhold, 1984.

Montgomery, D. C., Design and Analysis of Experiments, 2nd edition, New York: Wiley, 1984.

Morrison, D. F., Multivariate Statistical Methods, 2nd edition, New York: McGraw-Hill, 1976.

Olson, C. L., On Choosing a Test Statistic in Multivariate Analysis of Variance, Psychological Bulletin, Volume 83, No. 4, Pgs. 579-586, 1976.

Scheffe, H., The Analysis of Variance, New York: Wiley, 1959.

Searle, S. R., Linear Models for Unbalanced Data, New York: Wiley, 1987.

Tukey, J. W., Exploratory Data Analysis, Reading, MA: Addison Wesley, 1977.

Wilcox, Rand R., Statistics for the Social Sciences, New York: Academic Press, 1996.

Wilcox, Rand R., Introduction to Robust Estimation and Hypothesis Testing, New York: Academic Press, 1997.

Winer, B. J., Statistical Principles in Experimental Design, 2nd edition, New York: McGraw-Hill, 1971.

Exercises E - 1

SPSS Training

Exercises

The exerci se fi l e for thi s cl ass (Workl oad.por) i s l ocated i n the

c:\Trai n\Anova fol der on your trai ni ng machi ne. I f you are not worki ng

i n an SPSS Trai ni ng center, the trai ni ng fi l es can be copi ed from the

fl oppy di sk that accompani es thi s course gui de. I f you are runni ng SPSS

Server (cl i ck Fi l e..Swi tch Server to check), then you shoul d copy these

fi l es to the server or a machi ne that can be accessed (mapped from) the

computer runni ng SPSS Server.

Note About the Exercises

These exercises are based on a single, rich data file. You will perform a variety of analyses (for example, a one-factor and a three-factor ANOVA) on the same data. Typically, if three factors were believed to be relevant, then a one-factor ANOVA would not be run. Thus the analyses suggested here conform to the topical sequence in the training guide and are not necessarily optimal for answering a specific research question. In fact, some analyses that might be performed on this data (doubly-multivariate analysis of variance) are not discussed in the course.

Chapter 3: One-Factor ANOVA

Open the SPSS portable file Workload.por. This file contains data from an experimental investigation of the effects of a training workshop on stress and workload reduction techniques for airline pilots. The variables are in four sets: descriptive grouping information, workload measures taken before a training course on stress and workload management, workload measures taken after the course, and a follow-up set of workload measures taken three months after the course. Each pilot was measured once per flight (to avoid distraction), and the two subsequent flights were on the same route for comparative purposes. In all, two hundred pilots were measured over three time periods.

AGE        Age in years
HRSEXP     Previous flying experience in flying hours
TYPE       Type of aircraft cockpit (1=Automated, 2=Manual)
ROUTE      Designation of journey (1=Short Haul, 2=Medium Haul, 3=Long Haul)
STAGE      Stage of flight (1=Take Off, 2=Cruise, 3=Approach, 4=Landing)
FLYTIME    Length of flight (measured in seconds, presented in date/time format)

(Before Stress and Workload Training Course)

HEART      Heart Rate (Beats Per Minute)
BLOOD      Blood Pressure (mmHg)
TEMP       Core Body Temperature (deg. F)
STRESS     Stress Rating (1=Low stress up to 7=High stress)
CAPACITY   Spare Mental Capacity Rating (1=All used up to 10=None used up)
ATTEN      Percent of attention remaining (in percent)
TIRED      Tiredness Rating scale (1=Invigorated up to 10=Asleep)

(After Stress and Workload Training Course)

HEART2     Heart Rate (after training)
BLOOD2     Blood Pressure (after training)
TEMP2      Core Body Temperature (after training)
STRESS2    Stress Rating (after training)
CAPACIT2   Spare Mental Capacity (after training)
ATTEN2     Percent of attention remaining (after training)
TIRED2     Tiredness Rating (after training)

(Three Months After the Stress and Workload Training Course)

HEART3     Heart Rate (after 3 months)
BLOOD3     Blood Pressure (after 3 months)
TEMP3      Core Body Temperature (after 3 months)
STRESS3    Stress Rating (after 3 months)
CAPACIT3   Spare Mental Capacity (after 3 months)
ATTEN3     Percent of attention remaining (after 3 months)
TIRED3     Tiredness Rating (after 3 months)

Familiarize yourself with the variables and data within this dataset by using the Frequencies, Descriptives, and Explore procedures (with any associated graphical plots you choose).

Using the Means procedure and error bar graphs, compare the mean stress levels (use the stress variable, which measures stress before the training course) at different flight stages (use the variable stage). Recall that a pilot was tested at a single flight stage. Before performing a one-factor ANOVA, explore the data (asking for means, error bars, etc.) and try to predict the outcome of the analysis. Do you think there will be a significant difference between the groups?
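For readers who want to see what an error bar summary contains, the following is a minimal Python sketch (standard library only) that computes group means with bars of plus or minus two standard errors. The data values and the function name error_bars are hypothetical and are not taken from Workload.por; the exercise itself is meant to be done with the SPSS Means procedure and error bar graphs.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Hypothetical (stage, stress) records; NOT taken from Workload.por.
records = [
    (1, 5), (1, 6), (1, 4), (1, 5),
    (2, 2), (2, 3), (2, 2), (2, 3),
    (3, 4), (3, 5), (3, 4), (3, 3),
    (4, 6), (4, 5), (4, 7), (4, 6),
]

def error_bars(rows, multiplier=2.0):
    """Return {group: (mean, lower, upper)} using mean +/- multiplier * SE."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    summary = {}
    for group, values in sorted(groups.items()):
        m = mean(values)
        se = stdev(values) / sqrt(len(values))  # standard error of the mean
        summary[group] = (m, m - multiplier * se, m + multiplier * se)
    return summary

bars = error_bars(records)
for stage, (m, lower, upper) in bars.items():
    print(f"stage {stage}: mean={m:.2f}  bar=[{lower:.2f}, {upper:.2f}]")
```

Groups whose bars do not overlap are good candidates for a significant difference, which is the kind of informal prediction this step asks for.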

Perform a one-factor ANOVA, testing for stage differences in stress level. If differences are found, perform post hoc tests to explore these differences in more detail. How would you summarize the results?
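The F ratio that a one-factor ANOVA reports can be sketched by hand as the between-group mean square divided by the within-group mean square. The Python below is an illustrative sketch only: the ratings and the function name one_way_anova are hypothetical, and it does not compute the significance level or post hoc tests that SPSS provides.

```python
from statistics import mean

# Hypothetical stress ratings for the four flight stages; NOT from Workload.por.
stress_by_stage = {
    1: [5, 6, 4, 5],  # Take Off
    2: [2, 3, 2, 3],  # Cruise
    3: [4, 5, 4, 3],  # Approach
    4: [6, 5, 7, 6],  # Landing
}

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

f_ratio, df_b, df_w = one_way_anova(list(stress_by_stage.values()))
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A large F indicates that the stage means vary more than would be expected from the within-stage variation alone; the SPSS output supplies the matching significance value.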

If the assumptions of ANOVA were not met, perform a nonparametric test of group (stage) differences in stress. Are the results consistent with the ANOVA analysis?
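One common nonparametric counterpart to the one-factor ANOVA is the Kruskal-Wallis test, which replaces the observations with their ranks. The sketch below assumes that this is the test you choose; the data and the function name kruskal_wallis_h are hypothetical. It includes the usual correction for ties, which matters for a 1-to-7 rating scale, and it is undefined when every observation has the same value.

```python
from collections import Counter

# Hypothetical stress ratings per flight stage; NOT from Workload.por.
stage_samples = [
    [5, 6, 4, 5],
    [2, 3, 2, 3],
    [4, 5, 4, 3],
    [6, 5, 7, 6],
]

def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    counts = Counter(pooled)
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2
        start += t
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Divide by the tie correction factor (equals 1 when there are no ties).
    tie_correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / tie_correction

h_stat = kruskal_wallis_h(stage_samples)
print(f"Kruskal-Wallis H = {h_stat:.3f} on {len(stage_samples) - 1} df")
```

H is referred to a chi-square distribution with (number of groups - 1) degrees of freedom, so its conclusion can be compared directly with the ANOVA result.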

Chapter 4: Multi-Way Univariate ANOVA

Open the SPSS portable file Workload.por. Now we are going to examine stress differences as a function of flight stage, route, and type of aircraft. Run an exploratory analysis examining stress (use the stress variable, which reports stress before taking the training course) within subgroups based on flight stage (stage), length of route (route), and aircraft type (type). Does the stress measure conform to the ANOVA assumptions?

Perform a three-factor ANOVA of stress with type, route, and stage as the factors. Are there significant interactions among aircraft type, route length, and flight stage as they relate to stress? If there are no interactions, but there are significant main effects, then perform the appropriate post hoc tests to identify which subgroups differ from each other.

Chapter 5: Multivariate Analysis of Variance

Open the SPSS portable file Workload.por. Perform a three-factor (route, type, and stage) multivariate analysis of variance on several physiological measures of stress: blood pressure (blood), heart rate (heart), and body temperature (temp). Which effects and interactions are significant? Examine the univariate results. Are the effects consistent across the three dependent measures?

Request a profile plot to examine the three-way interaction of route by stage by type as it relates to core body temperature (temp). Describe the nature of the interaction.
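A profile plot is a line chart of means for each combination of factor levels; for a full factorial model these match the observed cell means. As a rough illustration of the numbers behind such a plot, the Python sketch below tabulates cell means by route, stage, and type. The rows and the function name cell_means are hypothetical and are not taken from Workload.por.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (route, stage, type, temp) rows; NOT taken from Workload.por.
rows = [
    (1, 1, 1, 98.4), (1, 1, 1, 98.6),
    (1, 1, 2, 98.9), (1, 1, 2, 99.1),
    (1, 4, 1, 98.5), (1, 4, 1, 98.7),
    (1, 4, 2, 99.4), (1, 4, 2, 99.6),
]

def cell_means(data):
    """Mean of the dependent measure for each (route, stage, type) cell."""
    cells = defaultdict(list)
    for route, stage, aircraft_type, value in data:
        cells[(route, stage, aircraft_type)].append(value)
    return {cell: mean(values) for cell, values in sorted(cells.items())}

means = cell_means(rows)
for (route, stage, aircraft_type), m in means.items():
    print(f"route={route} stage={stage} type={aircraft_type}: mean temp={m:.2f}")
```

When the lines connecting these means across stages are not parallel for the two aircraft types, or the pattern changes from route to route, the plot is pointing to an interaction.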

For those with extra time: Perform a multivariate analysis using the same factors, but on the subjective measures of workload (stress, capacity, atten, and tired). Are the results similar to what you found for the physiological measures?

Chapter 6: Within-Subject Designs: Repeated Measures

Open the SPSS portable file Workload.por. Perform an exploratory data analysis on stress measured at the three time points (stress, stress2, stress3). Run a one-factor repeated-measures analysis examining the stress measure (stress, stress2, stress3) at the three time points of the study.

If there is a significant main effect of the time factor, explore the nature of it with post hoc tests and plots.
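The univariate F test for a one-factor repeated-measures design removes between-subject variation from the error term before testing the time effect. The Python sketch below illustrates that partition under stated assumptions: the ratings and the function name repeated_measures_f are hypothetical, and the test assumes sphericity, which SPSS checks separately.

```python
from statistics import mean

# Hypothetical ratings: one row per pilot, columns = stress, stress2, stress3.
# NOT taken from Workload.por.
ratings = [
    [5, 4, 3],
    [6, 4, 4],
    [4, 3, 2],
    [7, 5, 3],
]

def repeated_measures_f(data):
    """Return (F, df_time, df_error) for a one-factor within-subject design."""
    n = len(data)      # number of subjects
    k = len(data[0])   # number of time points
    grand = mean([x for row in data for x in row])
    time_means = [mean([row[j] for row in data]) for j in range(k)]
    subject_means = [mean(row) for row in data]
    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # What remains after removing time and subject effects is the error term.
    ss_error = ss_total - ss_time - ss_subjects
    df_time = k - 1
    df_error = (n - 1) * (k - 1)
    return (ss_time / df_time) / (ss_error / df_error), df_time, df_error

f_time, df_t, df_e = repeated_measures_f(ratings)
print(f"F({df_t}, {df_e}) = {f_time:.2f}")
```

Because each pilot serves as his or her own control, consistent individual differences in stress do not inflate the error term, which is the main advantage of the within-subject design.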

Chapter 7: Between- and Within-Subject ANOVA: Repeated Measures

Open the SPSS portable file Workload.por. Perform a repeated measures analysis with time (three levels) as a within-subject factor and type, route, and stage as between-subject factors. The dependent measure will be stress (stress, stress2, stress3). Which effects are significant?

If there are significant interactions, explore them using simple effects and profile plots.

For those with extra time: Run the same analysis using one of the physiological measures.

Chapter 9: Analysis of Covariance

Open the SPSS portable file Workload.por. We will add a covariate to the analysis run in Chapter 4. Run an analysis of covariance on stress with type, route, and stage as between-subject factors and age as a covariate. The dependent measure will be stress.

Test for the parallelism of slopes assumption (test for a four-way interaction among age, type, route, and stage).

If the parallelism of slopes assumption is met, then run the analysis of covariance and assess the relevance of the age covariate. Ask for parameter estimates and interpret the relationship between age and stress (even if nonsignificant).
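To make the parallelism idea concrete, the sketch below fits a separate least-squares slope of stress on age within each group and then the pooled within-group slope, which is the common slope an analysis of covariance with a single grouping factor assumes. Everything here is illustrative: the ages, ratings, and the function names slope and pooled_within_slope are hypothetical, only one factor (aircraft type) is used for brevity, and the formal test remains the interaction test requested above.

```python
from statistics import mean

# Hypothetical (ages, stress ratings) per aircraft type; NOT from Workload.por.
groups = {
    1: ([30, 40, 50], [3, 4, 5]),  # Automated
    2: ([35, 45, 55], [5, 5, 7]),  # Manual
}

def slope(xs, ys):
    """Least-squares slope of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def pooled_within_slope(samples):
    """Common slope: pooled within-group cross-products over pooled Sxx."""
    sxy_total = 0.0
    sxx_total = 0.0
    for xs, ys in samples.values():
        mx, my = mean(xs), mean(ys)
        sxy_total += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx_total += sum((x - mx) ** 2 for x in xs)
    return sxy_total / sxx_total

group_slopes = {g: slope(xs, ys) for g, (xs, ys) in groups.items()}
common = pooled_within_slope(groups)
for g, b in group_slopes.items():
    print(f"type {g}: slope of stress on age = {b:.3f}")
print(f"pooled within-group slope = {common:.3f}")
```

If the group slopes are similar, a single pooled slope is a reasonable summary, and its sign and size describe how stress changes per year of age after adjusting for group.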
