edu/~faculty/singer/
http://gseacademic.harvard.edu/alda/
http://gseacademic.harvard.edu/~willetjo/
http://www.ats.ucla.edu/stat/examples/alda/
© Judith D. Singer & John B. Willett (2006)

**Individual Growth Modeling: Modern Methods for Studying Change**

Judith D. Singer & John B. Willett

Harvard Graduate School of Education

“Time is the one immaterial object which we cannot influence— neither speed up nor slow down, add to nor diminish.” Maya Angelou

© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, Workshop Overview 1, slide 1

The fundamental problem of longitudinal research: Making continuous time “stand still”

The study of TIME: Eadweard Muybridge (1830–1904)

Eadweard Muybridge Animal Locomotion (1887)


The first known longitudinal study of growth: The height of the son of Count Filibert Guéneau de Montbeillard (1720-1785)

**Recorded his son’s height approximately every six months from birth (in 1759) until age 18**

[Figure: Height (in cm, axis from 0 to 200) plotted against Age (axis from 0 to 20); one point is annotated “oops…measurement error?”]

Scammon, RE (1927) The first seriation study of human growth, Am J of Physical Anthropology, 10, 329-336.

Fast forward to the present: In most fields, the quantity of longitudinal research is exploding

[Figure: Annual searches for keyword 'longitudinal' in 9 OVID databases, between 1982 and 2005. Vertical axis ticks at 10, 100, 1,000 and 10,000; horizontal axis from '81 to '05. Series: medicine, business, biology, psychology, sociology, agriculture, education, zoology, economics]


But what about the quality? What does today’s “longitudinal research” look like?

Read 150 articles in 10 issues of APA journals published in each of 1999, 2003 and 2006

First, the good news:
• More longitudinal studies are being published
• More of these are “truly” longitudinal

Now, the bad news:
• Very few of these longitudinal studies use “modern” analytic methods

[Figure: bar charts for 1999, 2003 and 2006. One chart groups the studies by number of waves (>1 Wave, 2 Waves, 3 Waves, 4+ Waves); the other by analytic approach (Growth Modeling, Survival Analysis, Repeated Measures ANOVA, Wave-on-Wave regression, Separate but parallel analyses, Set aside waves, Combine waves, Ignore age heterogeneity)]

Part of the problem may well be reviewers’ ignorance

Comments received from two reviewers for Developmental Psychology of a paper that fit individual growth models to 3 waves of data on vocabulary size among young children:

Reviewer A: “I do not understand the statistics used in this study deeply enough to evaluate their appropriateness. I imagine this is also true of 99% of the readers of Developmental Psychology. … Previous studies in this area have used simple correlation or regression which provide easily interpretable values for the relationships among variables. … In all, while the authors are to be applauded for a detailed longitudinal study, … the statistics are difficult. … I thus think Developmental Psychology is not really the place for this paper.”

Reviewer B: “The analyses fail to live up to the promise…of the clear and cogent introduction. I will note as a caveat that I entered the field before the advent of sophisticated growth-modeling techniques, and they have always aroused my suspicion to some extent. I have tried to keep up and to maintain an open mind, but parts of my review may be naïve, if not inaccurate.”

What kinds of research questions require longitudinal methods?

Questions about systematic change over time
• Curran et al (1997) studied alcohol use
• 82 teens interviewed at ages 14, 15 & 16; alcohol use tended to increase over time
• Children of Alcoholics (COAs) drank more but had no steeper rates of increase over time
1. Within-person summary: How does a teen’s alcohol consumption change over time?
2. Between-person comparison: How do these trajectories vary by teen characteristics?
Method: Individual Growth Model / Multilevel Model for Change

Questions about whether and when events occur
• Capaldi et al (1996) studied age of 1st sex
• 180 boys interviewed annually from 7th to 12th grade (30% remained virgins at end of study)
• Boys who experienced early parental transitions were more likely to have had sex
1. Within-person summary: When are boys most at risk of having sex for the 1st time?
2. Between-person comparison: How does this risk vary by teen characteristics?
Method: Discrete- and Continuous-Time Survival Analysis

Four important advantages of modern longitudinal methods

You can identify temporal patterns in the data
• Does the outcome increase, decrease, or remain stable over time?
• Is the general pattern linear or non-linear?
• Are there abrupt shifts at substantively interesting moments?

You can include time varying predictors (those whose values vary over time)
• Participation in an intervention
• Family circumstances (employment, marital status, etc)

You can include interactions with time (to test whether a predictor’s effect varies over time)
• Some effects dissipate: they wear off
• Some effects increase: they become more important
• Some effects are especially pronounced at particular times

You have great flexibility in research design
• Not everyone needs the same rigid data collection schedule; cadence can be person specific
• Not everyone needs the same number of waves; you can use all cases, even those with just one wave!
• Design can be experimental or observational
• Designs can be single level (individuals only) or multilevel (e.g., patients within physician practices)

What we’re going to cover in this workshop

A word about programming, software and other supplemental materials

www.ats.ucla.edu/stat/examples/alda
Chapter-by-chapter table with columns for Chapter, Table of contents, Datasets, and one column per package (MLwiN, SPSS, Mplus, SPlus, Stata, HLM, SAS):
Ch 1   A framework for investigating change over time
Ch 2   Exploring longitudinal data on change
Ch 3   Introducing the multilevel model for change
Ch 4   Doing data analysis with the multilevel model for change
Ch 5   Treating time more flexibly
Ch 6   Modeling discontinuous and nonlinear change
Ch 7   Examining the multilevel model’s error covariance structure
Ch 8   Modeling change using covariance structure analysis
Ch 9   A framework for investigating event occurrence
Ch 10  Describing discrete-time event occurrence data
Ch 11  Fitting basic discrete-time hazard models
Ch 12  Extending the discrete-time hazard model
Ch 13  Describing continuous-time event occurrence data
Ch 14  Fitting the Cox regression model
Ch 15  Extending the Cox regression model

Applied Longitudinal Data Analysis website, http://gseacademic.harvard.edu/alda
• materials from past workshops
• videos of past workshops

S-077: Applied Longitudinal Data Analysis
• more fully annotated computer code
• examples of detailed computer output
• course videos

Introducing the Multilevel Model for Change: ALDA, Chapter Three

“When you’re finished changing, you’re finished” Benjamin Franklin

John B. Willett & Judith D. Singer
Harvard Graduate School of Education

© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 3, slide 1

Chapter 3: Introducing the multilevel model for change

General Approach: We’ll go through a worked example from start to finish; we’ll save practical data analytic advice for the next session

• The level-1 submodel for individual change (§3.2): examining empirical growth trajectories and asking what population model might have given rise to these observations?
• The level-2 submodels for systematic interindividual differences in change (§3.3): what kind of population model should we hypothesize to represent the behavior of the parameters from the level-1 model?
• Fitting the multilevel model for change to data (§3.4): there are now many options for model fitting, and more practically, many software options
• Interpreting the results of model fitting (§3.5 and §3.6): having fit the model, how do we sensibly interpret and display empirical results?
  • Interpreting fixed effects
  • Interpreting variance components
  • Plotting prototypical trajectories

(ALDA, Chapter 3 intro, p. 45)

Illustrative example: The effects of early intervention on children’s IQ

Data source: Peg Burchinal and colleagues (2000) Child Development

Sample: 103 African American children born to low income families
• 58 randomly assigned to an early intervention program
• 45 randomly assigned to a control group

Research design
• Each child was assessed 12 times between ages 6 and 96 months
• Here, we analyze only 3 waves of data, collected at ages 12, 18, and 24 months

Research question: What is the effect of the early intervention program on children’s cognitive performance?
• Within-individual: How does a child’s cognitive performance change between 12 and 24 months?
• Between individuals: Do the trajectories for children in the early intervention program differ from those in the control group? [And, if they do differ, how do they differ?]

(ALDA, Section 3.1, pp. 46-49)

The fundamental building block of growth modeling: The person-period data set

General structure: A person-period data set has one row of data for each period when that particular person was observed

• Fully balanced, 3 waves per child
• AGE=1.0, 1.5, and 2.0 (clocked in years, instead of months, so that we assess “annual rate of change”)
• PROGRAM is a dummy variable indicating whether the child was randomly assigned to the special early childhood program (1) or not (0)
• COG is a nationally normed scale
  • Declines within empirical growth records
  • Instead of asking whether the growth rate is higher among program participants, we’ll ask whether the rate of decline is lower

(ALDA, Section 3.1, pp. 46-49)
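The person-period layout described above can be sketched in a few lines of Python. This is only an illustration: the two records, the column names (`cog_12`, `cog_18`, `cog_24`) and the helper `to_person_period` are hypothetical, not the study's data or the authors' code.

```python
# Hypothetical person-level ("wide") records: one row per child.
# IDs, scores and column names are made up for illustration only.
person_level = [
    {"id": 1, "program": 1, "cog_12": 117, "cog_18": 113, "cog_24": 109},
    {"id": 2, "program": 0, "cog_12": 108, "cog_18": 97, "cog_24": 88},
]

# Each wave's column mapped to AGE in years (12, 18, 24 months -> 1.0, 1.5, 2.0).
WAVES = {"cog_12": 1.0, "cog_18": 1.5, "cog_24": 2.0}


def to_person_period(rows):
    """Return one record per child per wave (the person-period layout)."""
    long_rows = []
    for row in rows:
        for column, age in WAVES.items():
            long_rows.append(
                {
                    "id": row["id"],
                    "age": age,
                    "cog": row[column],
                    "program": row["program"],
                }
            )
    return long_rows


person_period = to_person_period(person_level)
for record in person_period:
    print(record)
```

Each child contributes three rows, and the time-invariant PROGRAM value is simply repeated on every row for that child.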

Examining empirical growth plots to help suggest a suitable individual growth model (by superimposing fitted OLS trajectories)

[Figure: empirical growth plots of COG (50 to 150) against AGE (1 to 2) with fitted OLS trajectories, one panel each for IDs 68, 70, 71, 72, 902, 904, 906 and 908]

• Many trajectories are smooth and systematic (70, 71, 72, 904, 908)
• Other trajectories are scattered, irregular (and could even be curvilinear???) (68, 902, 906)
• Overall impression: COG declines over time, but there’s some variation in the fit (its quality and shape)

Key question when examining empirical growth plots: What type of population individual growth model might have generated these sample data?
• Linear or curvilinear?
• Smooth or jagged?
• Continuous or disjoint?

With just 3 waves of data and many of the empirical growth plots suggesting a linear model would be fine, it makes sense to start with a simple linear individual growth model

(ALDA, Section 3.2, pp. 49-51)

Postulating a simple linear level-1 submodel for individual change: Examining its structural and stochastic portions

$$\mathrm{COG}_{ij} = \left[\pi_{0i} + \pi_{1i}(\mathrm{AGE}_{ij} - 1)\right] + \left[\varepsilon_{ij}\right]$$

• i indexes persons (i=1 to 103)
• j indexes occasions/periods (j=1 to 3)

Structural portion (the first bracket), which embodies our hypothesis about the shape of each person’s true trajectory of change over time
• Key assumption: In the population, COGij is a linear function of child i’s AGE on occasion j
• π0i is the intercept of i’s true change trajectory, his “true initial status”. Because we have “centered” AGE at 1, π0i is i’s true value of COG at AGE=1
• π1i is the slope of i’s true change trajectory, his yearly rate of change in true COG, his true “annual rate of change”

Stochastic portion (the second bracket), which allows for the effects of random error from the measurement of person i on occasion j
• Usually assume $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$
• εi1, εi2, and εi3 are deviations of i’s true change trajectory from linearity on each occasion (including the effects of measurement error & omitted time-varying predictors)

[Figure: individual i’s hypothesized true change trajectory, COG (50 to 150) against AGE (1 to 2), showing the intercept π0i at AGE=1, the slope π1i over 1 year, and the deviations εi1, εi2 and εi3]

Net result: The individual growth parameters, π0i and π1i, fully describe person i’s hypothesized true individual growth trajectory

(ALDA, Section 3.2, pp. 49-51)
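To make the fitted OLS trajectories concrete, here is a minimal sketch of a child-by-child least-squares fit with AGE centered at 1, so the intercept estimates COG at AGE=1 and the slope estimates the annual rate of change. The function name `fit_ols_line` and the three scores are hypothetical and chosen only for illustration.

```python
def fit_ols_line(ages, scores, center=1.0):
    """Least-squares intercept and slope for one child, with AGE centered at `center`."""
    x = [age - center for age in ages]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(scores) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical child observed at AGE = 1.0, 1.5 and 2.0 years.
ages = [1.0, 1.5, 2.0]
scores = [110.0, 104.0, 100.0]
intercept, slope = fit_ols_line(ages, scores)
print(f"fitted initial status = {intercept:.2f}, fitted rate of change = {slope:.2f}")
```

With only three waves per child, each fit leaves a single residual degree of freedom, which is one reason these per-child estimates are best treated as exploratory.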

Examining fitted OLS trajectories to help suggest a suitable level-2 model

• Most children decline over time (although there are a few exceptions)
• But there’s also great variation in these OLS estimates

[Figure: the collection of fitted OLS trajectories (COG, 50 to 150, against AGE, 1 to 2), with stem-and-leaf displays of the fitted initial status, the fitted rate of change, and the residual variance]

Average OLS trajectory across the full sample ≅ 110 − 10 (AGE − 1)

What does this behavior suggest about a suitable level-2 model?
• The level-2 model must capture both the averages of the individual growth parameters and variation about these averages
• And…it must also provide a way to represent systematic interindividual differences in change according to variation in predictor(s) (here, PROGRAM participation)

(ALDA, Section 3.2.3, pp. 55-56)

Further developing the level-2 submodel for interindividual differences in change

[Figure: fitted OLS trajectories (COG, 50 to 150, against AGE, 1 to 2) in two panels, PROGRAM=0 and PROGRAM=1]

Program participants tend to have:
• Higher scores at age 1 (higher initial status)
• Less steep rates of decline (shallower slopes)
• But these are only overall trends; there’s great interindividual heterogeneity

Four desired features of the level-2 submodel(s)
1. Outcomes are the level-1 individual growth parameters π0i and π1i
2. Need two level-2 submodels, one per growth parameter (one for initial status, one for change)
3. Each level-2 submodel must specify the relationship between a level-1 growth parameter and predictor(s), here PROGRAM. We need to specify a functional form for these relationships at level-2 (beginning with linear but ultimately becoming more flexible)
4. Each level-2 submodel should allow individuals with common predictor values to nevertheless have different individual change trajectories. We need stochastic variation at level-2, too. Each level-2 model will need its own error term, and we will need to allow for covariance across level-2 errors

(ALDA, Section 3.3, pp. 57-60)

Level-2 submodels for systematic interindividual differences in change

For the level-1 intercept (initial status):
$$\pi_{0i} = \gamma_{00} + \gamma_{01}\,\mathrm{PROGRAM}_i + \zeta_{0i}$$

For the level-1 slope (rate of change):
$$\pi_{1i} = \gamma_{10} + \gamma_{11}\,\mathrm{PROGRAM}_i + \zeta_{1i}$$

Key to remembering subscripts on the gammas (the γ’s)
• First subscript indicates role in level-1 model (0 for intercept, 1 for slope)
• Second subscript indicates role in level-2 model (0 for intercept, 1 for slope)

[Figure: two panels of COG (50 to 150) against AGE (1 to 2). PROGRAM=0: average population trajectory, γ00 + γ10 (AGE−1). PROGRAM=1: average population trajectory, (γ00 + γ01) + (γ10 + γ11) (AGE−1)]

Key ideas behind the level-2 models:
• Models posit the existence of an average population trajectory for each program group
• Because the level-2 models also include residuals (the zetas), each child i has his own true change trajectory (defined by π0i and π1i)
• In the figure, the shading is supposed to suggest the existence of many true population trajectories, one per child

(ALDA, Section 3.3.1, pp. 60-61)

Understanding the stochastic components of the level-2 submodels

$$\pi_{0i} = \gamma_{00} + \gamma_{01}\,\mathrm{PROGRAM}_i + \zeta_{0i}$$
$$\pi_{1i} = \gamma_{10} + \gamma_{11}\,\mathrm{PROGRAM}_i + \zeta_{1i}$$

[Figure: the same two panels, with the PROGRAM=0 panel adding the population trajectory for child i, (γ00 + ζ0i) + (γ10 + ζ1i) (AGE−1)]

What about the zetas (the ζ’s)?
• They’re level-2 residuals that permit the level-1 individual growth parameters to vary stochastically across people
• As with most residuals, we’re less interested in their values than their population variances and covariances

Assumptions about the level-2 residuals (first row: initial status; second row: rate of change):
$$\begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\ 0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$$

(ALDA, Section 3.3.2, pp. 61-63)
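One way to see how the level-2 submodels generate "many true population trajectories, one per child" is to simulate from them. The sketch below draws correlated level-2 residuals with a 2×2 Cholesky factor and builds each child's true intercept and slope. All parameter values, the function name `simulate_child` and the seed are assumptions made for illustration; they are not estimates from the study.

```python
import math
import random


def simulate_child(gammas, program, var0, var1, cov01, rng):
    """Draw one child's true (pi0, pi1) from the two level-2 submodels."""
    g00, g01, g10, g11 = gammas
    # Cholesky factor of [[var0, cov01], [cov01, var1]] gives correlated residuals.
    a = math.sqrt(var0)
    b = cov01 / a
    c = math.sqrt(var1 - b * b)
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    zeta0 = a * z0
    zeta1 = b * z0 + c * z1
    pi0 = g00 + g01 * program + zeta0  # true initial status (COG at AGE = 1)
    pi1 = g10 + g11 * program + zeta1  # true annual rate of change
    return pi0, pi1


def true_cog(pi0, pi1, age):
    """Structural part of the level-1 model, with AGE centered at 1."""
    return pi0 + pi1 * (age - 1.0)


# Hypothetical population values, chosen only to make the sketch run.
GAMMAS = (110.0, 5.0, -20.0, 5.0)  # gamma00, gamma01, gamma10, gamma11
VAR0, VAR1, COV01 = 100.0, 9.0, -15.0

rng = random.Random(0)
for program in (0, 1):
    pi0, pi1 = simulate_child(GAMMAS, program, VAR0, VAR1, COV01, rng)
    print(program, round(pi0, 1), round(pi1, 1), round(true_cog(pi0, pi1, 2.0), 1))
```

Averaging many simulated children within a program group recovers that group's average population trajectory, while the spread of the draws reflects the level-2 variances and covariance.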

Fitting the multilevel model for change to data

Three general types of software options (whose numbers are increasing over time)
• Programs expressly designed for multilevel modeling (e.g., MLwiN)
• Multipurpose packages with multilevel modeling modules
• Specialty packages originally designed for another purpose that can also fit some multilevel models
Also named on the slide: aML

Two sets of issues to consider when comparing (and selecting) packages

8 practical considerations (that affect ease of use/pedagogic value)
• Data input options: level-1/level-2 vs. person-period; raw data or xyz.dataset
• Programming options: graphical interfaces and/or scripts
• Availability of other statistical procedures
• Model specification options: level-1/level-2 vs. composite; random effects
• Automatic centering options
• Wisdom of program’s defaults
• Documentation & user support
• Quality of output: text & graphics

8 technical considerations (that affect research value)
• # of levels that can be handled
• Range of assumptions supported (for the outcomes & effects)
• Types of designs supported (e.g., cross-nested designs, latent variables)
• Estimation routines: full vs. restricted; ML vs. GLS (more on this later…)
• Ability to handle design weights
• Quality and range of diagnostics
• Speed
• Strategies for handling estimation problems (e.g., boundary constraints)

Advice: Use whatever package you’d like but be sure to invest the time and energy to learn to use it well. Visit http://www.ats.ucla.edu/stat/examples/alda for data, code in the major packages, and more

Examining estimated fixed effects

In the population from which this sample was drawn we estimate that…
• True initial status (COG at age 1) for the average non-participant is 107.84
• For the average participant, it is 6.85 higher
• True annual rate of change for the average non-participant is −21.13
• For the average participant, it is 5.27 higher

Fitted model for initial status:
$$\hat{\pi}_{0i} = 107.84 + 6.85\,\mathrm{PROGRAM}_i$$

Fitted model for rate of change:
$$\hat{\pi}_{1i} = -21.13 + 5.27\,\mathrm{PROGRAM}_i$$

Advice: As you’re learning these methods, take the time to actually write out the fitted level-1/level-2 models before interpreting computer output. It’s the best way to learn what you’re doing!

(ALDA, Section 3.5, pp. 68-71)

Plotting prototypical change trajectories

General idea: Substitute prototypical values for the level-2 predictors (here, just PROGRAM=0 or 1) into the fitted models

PROGRAM = 0:
$$\hat{\pi}_{0i} = 107.84 + 6.85(0) = 107.84$$
$$\hat{\pi}_{1i} = -21.13 + 5.27(0) = -21.13$$
$$\widehat{\mathrm{COG}} = 107.84 - 21.13(\mathrm{AGE} - 1)$$

PROGRAM = 1:
$$\hat{\pi}_{0i} = 107.84 + 6.85(1) = 114.69$$
$$\hat{\pi}_{1i} = -21.13 + 5.27(1) = -15.86$$
$$\widehat{\mathrm{COG}} = 114.69 - 15.86(\mathrm{AGE} - 1)$$

[Figure: the two prototypical trajectories, COG (50 to 150) against AGE (1 to 2)]

Tentative conclusion: Program participants appear to have higher initial status and slower rates of decline.

Question: Might these differences be due to nothing more than sampling variation?

(ALDA, Section 3.5.1, pp. 69-71)
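The substitution of prototypical PROGRAM values into the fitted models can be checked with a short script. The four fixed-effect estimates are the ones reported on the slide; the function names are hypothetical and the script is only a sketch of the arithmetic, not output from any package.

```python
# Fixed-effect estimates as reported on the slide.
GAMMA_00, GAMMA_01 = 107.84, 6.85   # initial status: intercept, PROGRAM effect
GAMMA_10, GAMMA_11 = -21.13, 5.27   # rate of change: intercept, PROGRAM effect


def prototypical_parameters(program):
    """Fitted initial status and annual rate of change for PROGRAM = 0 or 1."""
    pi0_hat = GAMMA_00 + GAMMA_01 * program
    pi1_hat = GAMMA_10 + GAMMA_11 * program
    return pi0_hat, pi1_hat


def fitted_cog(program, age):
    """Fitted COG on the prototypical trajectory, with AGE centered at 1."""
    pi0_hat, pi1_hat = prototypical_parameters(program)
    return pi0_hat + pi1_hat * (age - 1.0)


for program in (0, 1):
    pi0_hat, pi1_hat = prototypical_parameters(program)
    values = [round(fitted_cog(program, age), 2) for age in (1.0, 1.5, 2.0)]
    print(f"PROGRAM={program}: {pi0_hat:.2f} {pi1_hat:+.2f} (AGE - 1) -> {values}")
```

Evaluating each line at AGE = 1, 1.5 and 2 gives the points needed to draw the two prototypical trajectories.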

Testing hypotheses about fixed effects using single parameter tests

General formulation:
$$z = \frac{\hat{\gamma}}{ase(\hat{\gamma})}$$

For initial status:
• Average non-participant had a non-zero level of COG at age 1 (surprise!)
• Program participants had higher initial status, on average, than non-participants (probably because the intervention had already started)

For rate of change:
• Average non-participant had a non-zero rate of decline (depressing)
• Program participants had slower rates of decline, on average, than non-participants (the “program effect”)

Careful: Most programs provide appropriate tests but… different programs use different terminology. Terms like z-statistic, t-statistic, t-ratio, quasi-t statistic (which are not the same) are used interchangeably

(ALDA, Section 3.5.2, pp. 71-72)

Examining estimated variance components

General idea:
• Variance components quantify the amount of residual variation left, at either level-1 or level-2, that is potentially explainable by other predictors not yet in the model
• Interpretation is easiest when comparing different models that each have different predictors (which we will do in the next unit)

Level-1 residual variance (74.24***):
• Summarizes within-person variability in outcomes around individuals’ own trajectories (usually non-zero)
• Here, we conclude there is some within-person residual variability
• If we had time-varying predictors, they might be able to explain some of this within-person residual variability

Level-2 residual variance:
$$\begin{bmatrix}124.64^{***} & -36.41\\ -36.41 & 12.29\end{bmatrix}$$
• Summarizes between-person variability in change trajectories (here, initial status and growth rates) after controlling for predictor(s) (here, PROGRAM)
• There are still statistically significant differences in true initial status after controlling for program (124.64***)
• There is no statistically significant residual variance in rates of change to be explained, so it’s probably little use to add substantive predictors of change
• The residual covariance between initial status and rates of change is not statistically significant

(ALDA, Section 3.6, pp. 72-74)
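The single parameter test and the level-2 variance components lend themselves to two small calculations. In this sketch, `wald_z` applies the general formulation with an estimate and standard error that are both made up, since the slides do not list standard errors; `residual_correlation` converts the reported level-2 variances and covariance into a correlation. Both function names are hypothetical.

```python
import math


def wald_z(estimate, ase):
    """Single parameter test statistic: estimate divided by its asymptotic standard error."""
    return estimate / ase


def residual_correlation(var0, var1, cov01):
    """Correlation implied by the level-2 residual variances and covariance."""
    return cov01 / math.sqrt(var0 * var1)


# Hypothetical estimate and standard error, for illustration only.
z = wald_z(5.0, 2.0)
print(f"z = {z:.2f}")

# Level-2 variance components as reported on the slide.
rho = residual_correlation(124.64, 12.29, -36.41)
print(f"residual correlation between initial status and rate of change = {rho:.3f}")
```

The correlation is only a rescaled point estimate; it says nothing about whether the covariance differs significantly from zero, which the slide reports it does not.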

Doing data analysis with the multilevel model for change: ALDA, Chapter Four

“We are restless because of incessant change, but we would be frightened if change were stopped” Lyman Bryson

Judith D. Singer & John B. Willett
Harvard Graduate School of Education

© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 1

Chapter 4: Doing data analysis with the multilevel model for change

General Approach: Once again, we’ll go through a worked example, but now we’ll delve into the practical data analytic details

• Composite specification of the multilevel model for change (§4.2) and how it relates to the level-1/level-2 specification just introduced
• First steps: unconditional means model and unconditional growth model (§4.4)
  • Intraclass correlation
  • Quantifying proportion of outcome variation “explained”
• Practical model building strategies (§4.5)
  • Developing and fitting a taxonomy of models
  • Displaying prototypical change trajectories
  • Recentering to improve interpretation
• Comparing models (§4.6)
  • Using deviance statistics
  • Using information criteria (AIC and BIC)

Illustrative example: The effects of parental alcoholism on adolescent alcohol use

Data source: Pat Curran and colleagues (1997) Journal of Consulting and Clinical Psychology.

Sample: 82 adolescents
• 37 are children of an alcoholic parent (COAs)
• 45 are non-COAs

Research design
• Each was assessed 3 times, at ages 14, 15, and 16
• The outcome, ALCUSE, was computed as follows:
  – 4 items: (1) drank beer/wine, (2) hard liquor, (3) 5 or more drinks in a row, and (4) got drunk
  – Each item was scored on an 8 point scale (0=“not at all” to 7=“every day”)
  – ALCUSE is the square root of the sum of these 4 items
• At age 14, PEER, a measure of peer alcohol use, was also gathered

Research question
Do trajectories of adolescent alcohol use differ by: (1) parental alcoholism; and (2) peer alcohol use?

What’s an appropriate functional form for the level-1 submodel?
(Examining empirical growth plots with superimposed OLS trajectories)

3 features of these plots:
1. Most seem approximately linear (but not always increasing over time)
2. Some OLS trajectories fit well (23, 32, 56, 65)
3. Other OLS trajectories show more scatter (04, 14, 41, 82)

A linear model makes sense…

$$\text{ALCUSE}_{ij} = \pi_{0i} + \pi_{1i}(\text{AGE}_{ij} - 14) + \varepsilon_{ij}, \quad \text{where } \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$$

$$Y_{ij} = \pi_{0i} + \pi_{1i}\,\text{TIME}_{ij} + \varepsilon_{ij}$$

• $\pi_{0i}$: i’s true initial status (i.e., when TIME=0)
• $\pi_{1i}$: i’s true rate of change per unit of TIME
• $\varepsilon_{ij}$: portion of i’s outcome that is unexplained on occasion j

(ALDA, Section 4.1, pp. 76-80)

Specifying the level-2 submodels for individual differences in change

Examining variation in OLS-fitted level-1 trajectories by:
• COA: COAs have higher intercepts but no steeper slopes
• PEER (split at mean): Teens whose friends at age 14 drink more have higher intercepts but shallower slopes

[Figure: OLS-fitted trajectories of ALCUSE by AGE (13 to 17) in four panels: COA = 0, COA = 1, Low PEER, High PEER]

$$\pi_{0i} = \gamma_{00} + \gamma_{01}\,\text{COA}_i + \zeta_{0i} \quad \text{(for initial status)}$$

$$\pi_{1i} = \gamma_{10} + \gamma_{11}\,\text{COA}_i + \zeta_{1i} \quad \text{(for rate of change)}$$

• Level-2 intercepts ($\gamma_{00}$, $\gamma_{10}$): Population average initial status and rate of change for a non-COA
• Level-2 slopes ($\gamma_{01}$, $\gamma_{11}$): Effect of COA on initial status and rate of change
• Level-2 residuals ($\zeta_{0i}$, $\zeta_{1i}$): Deviations of individual change trajectories around predicted averages

$$\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$$

(ALDA, Section 4.1, pp. 76-80)

Developing the composite specification of the multilevel model for change by substituting the level-2 submodels into the level-1 individual growth model

$$Y_{ij} = \pi_{0i} + \pi_{1i}\,\text{TIME}_{ij} + \varepsilon_{ij}, \qquad \pi_{0i} = \gamma_{00} + \gamma_{01}\,\text{COA}_i + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \gamma_{11}\,\text{COA}_i + \zeta_{1i}$$

$$Y_{ij} = (\gamma_{00} + \gamma_{01}\,\text{COA}_i + \zeta_{0i}) + (\gamma_{10} + \gamma_{11}\,\text{COA}_i + \zeta_{1i})\,\text{TIME}_{ij} + \varepsilon_{ij}$$

$$Y_{ij} = [\gamma_{00} + \gamma_{10}\,\text{TIME}_{ij} + \gamma_{01}\,\text{COA}_i + \gamma_{11}(\text{COA}_i \times \text{TIME}_{ij})] + [\zeta_{0i} + \zeta_{1i}\,\text{TIME}_{ij} + \varepsilon_{ij}]$$

The composite specification shows how the outcome depends simultaneously on:
• the level-1 predictor TIME and the level-2 predictor COA, as well as
• the cross-level interaction, COA∗TIME. This tells us that the effect of one predictor (TIME) differs by the levels of another predictor (COA)

The composite specification also:
• Demonstrates the complexity of the composite residual: this is not regular OLS regression
• Is the specification used by most software packages for multilevel modeling
• Is the specification that maps most easily onto the person-period data set…

(ALDA, Section 4.2, pp. 80-83)
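As a rough illustration of the substitution described above, the sketch below checks numerically that the level-1/level-2 form and the composite form give the same outcome, and builds person-period rows that carry the cross-level interaction column. The function names, the column name `COA_x_TIME`, and the gamma values are hypothetical placeholders chosen for illustration, not estimates from the slides.

```python
def level_form(gammas, coa, time, zeta0=0.0, zeta1=0.0, eps=0.0):
    """Outcome from the level-2 submodels plugged into the level-1 model."""
    g00, g01, g10, g11 = gammas
    pi0 = g00 + g01 * coa + zeta0   # level-2 submodel for initial status
    pi1 = g10 + g11 * coa + zeta1   # level-2 submodel for rate of change
    return pi0 + pi1 * time + eps   # level-1 individual growth model


def composite_form(gammas, coa, time, zeta0=0.0, zeta1=0.0, eps=0.0):
    """Same outcome written as fixed part plus composite residual."""
    g00, g01, g10, g11 = gammas
    fixed = g00 + g10 * time + g01 * coa + g11 * (coa * time)
    residual = zeta0 + zeta1 * time + eps
    return fixed + residual


def person_period_rows(person_id, coa, ages, center=14):
    """One row per person per occasion, with TIME = AGE - center."""
    return [
        {"ID": person_id, "TIME": age - center, "COA": coa,
         "COA_x_TIME": coa * (age - center)}
        for age in ages
    ]


# Hypothetical parameter values, used only to exercise the algebra.
gammas = (0.3, 0.7, 0.25, -0.05)
a = level_form(gammas, coa=1, time=2, zeta0=0.1, zeta1=-0.02, eps=0.3)
b = composite_form(gammas, coa=1, time=2, zeta0=0.1, zeta1=-0.02, eps=0.3)
rows = person_period_rows(person_id=3, coa=1, ages=(14, 15, 16))
```

The two functions differ only in how the terms are grouped, which is the point of the composite specification: the interaction column in each person-period row is simply the product of the level-2 predictor and TIME.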

The person-period data set and its relationship to the composite specification

| ID | ALCUSE | AGE-14 | COA | COA*(AGE-14) |
|----|--------|--------|-----|--------------|
| 3  | 1.00 | 0 | 1 | 0 |
| 3  | 2.00 | 1 | 1 | 1 |
| 3  | 3.32 | 2 | 1 | 2 |
| 4  | 0.00 | 0 | 1 | 0 |
| 4  | 2.00 | 1 | 1 | 1 |
| 4  | 1.73 | 2 | 1 | 2 |
| 44 | 0.00 | 0 | 0 | 0 |
| 44 | 1.41 | 1 | 0 | 0 |
| 44 | 3.00 | 2 | 0 | 0 |
| 66 | 1.41 | 0 | 0 | 0 |
| 66 | 3.46 | 1 | 0 | 0 |
| 66 | 3.00 | 2 | 0 | 0 |

$$\text{ALCUSE}_{ij} = [\gamma_{00} + \gamma_{10}(\text{AGE}_{ij} - 14) + \gamma_{01}\,\text{COA}_i + \gamma_{11}(\text{COA}_i \times (\text{AGE}_{ij} - 14))] + [\zeta_{0i} + \zeta_{1i}(\text{AGE}_{ij} - 14) + \varepsilon_{ij}]$$

Words of advice before beginning data analysis

• Be sure you’ve examined empirical growth plots and fitted OLS trajectories. You don’t want to begin data analysis without being reasonably confident that you have a sound level-1 model.
• Double check (and then triple check) your person-period data set. Run simple diagnostics using statistical programs with which you’re very comfortable. Once again, you don’t want to invest too much data analytic effort in a mis-formed data set.
• Don’t jump in by fitting a range of models with substantive predictors. Yes, you want to know “the answer,” but first you need to understand how the data behave, so instead you should…

First steps: Two unconditional models
1. Unconditional means model: a model with no predictors at either level, which will help partition the total outcome variation
2. Unconditional growth model: a model with TIME as the only level-1 predictor and no substantive predictors at level 2, which will help evaluate the baseline amount of change

What these unconditional models tell us:
1. Whether there is systematic variation in the outcome worth exploring and, if so, where that variation lies (within or between people)
2. How much total variation there is both within- and between-persons, which provides a baseline for evaluating the success of subsequent model building (that includes substantive predictors)

(ALDA, Section 4.4, p. 92+)

The Unconditional Means Model (Model A)
Partitioning total outcome variation between and within persons

Level-1 Model: $Y_{ij} = \pi_{0i} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, where $\zeta_{0i} \sim N(0, \sigma_0^2)$

Composite Model: $Y_{ij} = \gamma_{00} + \zeta_{0i} + \varepsilon_{ij}$

• $\gamma_{00}$: Grand mean across individuals and occasions
• $\pi_{0i}$: Person-specific means
• $\varepsilon_{ij}$: Within-person deviations
• $\sigma_\varepsilon^2$: Within-person variance
• $\sigma_0^2$: Between-person variance

Let’s look more closely at these variances….

(ALDA, Section 4.4.1, p. 92-97)

Using the unconditional means model to estimate the Intraclass Correlation Coefficient (ICC or ρ)

Major purpose of the unconditional means model: To partition the variation in Y into two components
• Estimated within-person variance: Quantifies the amount of variation within individuals over time
• Estimated between-person variance: Quantifies the amount of variation between individuals, regardless of time

Intraclass correlation compares the relative magnitude of these VCs by estimating the proportion of total variation in Y that lies “between” people

$$\rho = \frac{\sigma_0^2}{\sigma_0^2 + \sigma_\varepsilon^2}, \qquad \hat{\rho} = \frac{0.564}{0.564 + 0.562} = 0.50$$

An estimated 50% of the total variation in alcohol use is attributable to differences between adolescents

Having partitioned the total variation into within-persons and between-persons, let’s ask: What role does TIME play?

(ALDA, Section 4.4.1, p. 92-97)
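The intraclass correlation arithmetic above can be reproduced in a couple of lines. This is only a sketch: the function name is made up for illustration, and the two variance components are the Model A estimates quoted on the slide (0.564 between persons, 0.562 within persons).

```python
def intraclass_correlation(between_var, within_var):
    """Share of total outcome variation that lies between people."""
    return between_var / (between_var + within_var)


# Variance component estimates quoted for the unconditional means model.
rho_hat = intraclass_correlation(between_var=0.564, within_var=0.562)
print(f"{rho_hat:.2f}")  # prints 0.50
```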

The Unconditional Growth Model (Model B)
A baseline model for change over time

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}\,\text{TIME}_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model:

$$\pi_{0i} = \gamma_{00} + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \zeta_{1i}, \qquad \text{where } \begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{10}\,\text{TIME}_{ij} + [\zeta_{0i} + \zeta_{1i}\,\text{TIME}_{ij} + \varepsilon_{ij}]$, where the bracketed term is the composite residual

• $\gamma_{00}$: Average true initial status at AGE 14
• $\gamma_{10}$: Average true rate of change

Fitted model: $\widehat{\text{ALCUSE}} = 0.651 + 0.271(\text{AGE} - 14)$

[Figure: fitted average trajectory of ALCUSE by AGE (13 to 17)]

What about the variance components from this unconditional growth model?

(ALDA, Section 4.4.2, pp 97-102)

The unconditional growth model: Interpreting the variance components

Level-1 (within person):
• There is still unexplained within-person residual variance

Level-2 (between-persons):
• There is between-person residual variance in initial status (but careful, because the definition of initial status has changed)
• There is between-person residual variance in rate of change (should consider adding a level-2 predictor)
• Estimated residual covariance between initial status and change is n.s.

So…what has been the effect of moving from an unconditional means model to an unconditional growth model?

(ALDA, Section 4.4.2, pp 97-102)

Quantifying the proportion of outcome variation explained

$$R_\varepsilon^2 = \text{Proportional reduction in the Level-1 variance component} = \frac{0.562 - 0.337}{0.562} = 0.40$$

40% of the within-person variation in ALCUSE is associated with linear time

$$R^2_{Y,\hat{Y}} = \left(\hat{r}_{Y,\hat{Y}}\right)^2 = (0.21)^2 = 0.043$$

4.3% of the total variation in ALCUSE is associated with linear time

For later: Extending the idea of proportional reduction in variance components to Level-2 (to estimate the percentage of between-person variation in ALCUSE associated with predictors)

$$\text{Pseudo } R_\zeta^2 = \frac{\hat{\sigma}_\zeta^2(\text{Uncond Growth Model}) - \hat{\sigma}_\zeta^2(\text{Later Growth Model})}{\hat{\sigma}_\zeta^2(\text{Uncond Growth Model})}$$

Careful: Don’t do this comparison with the unconditional means model (as you can see in this table!).

(ALDA, Section 4.4.3, pp 102-104)

Where we’ve been and where we’re going…

What these unconditional models tell us:
1. About half the total variation in ALCUSE is attributable to differences among teens
2. About 40% of the within-teen variation in ALCUSE is explained by linear TIME
3. There is significant variation in both initial status and rate of change, so it pays to explore substantive predictors (COA & PEER)

How do we build statistical models?
• Use all your intuition and skill you bring from the cross sectional world
  – Examine the effect of each predictor separately
  – Prioritize the predictors
    • Focus on your “question” predictors
    • Include interesting and important control predictors
  – Progress towards a “final model” whose interpretation addresses your research questions
• But because the data are longitudinal, we have some other options…
  – Multiple level-2 outcomes (the individual growth parameters): each can be related separately to predictors
  – Two kinds of effects being modeled: fixed effects and variance components
  – Not all effects are required in every model

(ALDA, Section 4.5.1, pp 105-106)
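The proportional-reduction idea behind the roughly 40% figure can be written as one small function. This is a sketch with a made-up function name; the inputs are the level-1 variance components quoted on the slides (0.562 for the unconditional means model, 0.337 for the unconditional growth model). The same function applies to a level-2 variance component, provided the baseline is the unconditional growth model.

```python
def proportional_reduction(baseline_var, later_var):
    """Fraction of a baseline variance component removed by a later model."""
    return (baseline_var - later_var) / baseline_var


# Level-1 (within-person) variance: means model vs. growth model.
r2_within = proportional_reduction(baseline_var=0.562, later_var=0.337)
print(f"{r2_within:.2f}")  # prints 0.40
```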

**What will our analytic strategy be?**

Because our research interest focuses on the effect of COA, essentially treating PEER as a control, we’re going to proceed as follows…

Model C: COA predicts both initial status and rate of change.

Model D: Adds PEER to both Level-2 sub-models in Model C.

Model E: Simplifies Model D by removing the non-significant effect of COA on change.

(ALDA, Section 4.5.1, pp 105-106)

© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 15

Model C: Assessing the uncontrolled effects of COA (the question predictor)

Fixed effects
• Est. initial value of ALCUSE for non-COAs is 0.316 (p<.001)
• Est. differential in initial ALCUSE between COAs and non-COAs is 0.743 (p<.001)
• Est. annual rate of change in ALCUSE for non-COAs is 0.293 (p<.001)
• Est. differential in annual rate of change between COAs and non-COAs is -0.049 (ns)

Variance components
• Within-person VC is identical to B’s because no level-1 predictors were added
• Initial status VC declines from B: COA “explains” 22% of variation in initial status (but still stat. sig., suggesting need for level-2 predictors)
• Rate of change VC unchanged from B: COA “explains” no variation in change (but also still sig., suggesting need for level-2 predictors)

Next step?

• Remove COA? Not yet: it is the question predictor
• Add PEER? Yes, to examine controlled effects of COA

(ALDA, Section 4.5.2, pp 107-108)


Model D: Assessing the controlled effects of COA (the question predictor)

Fixed effects of COA
• Est. diff in ALCUSE between COAs and non-COAs, controlling for PEER, is 0.579 (p<.001)
• No sig. difference in rate of change

Fixed effects of PEER
• Teens whose peers drink more at 14 also drink more at 14 (initial status)
• Modest neg. effect on rate of change (p<.10)

Variance components
• Within-person VC unchanged (as expected)
• Still sig. variation in both initial status and change: need other level-2 predictors
• Taken together, PEER and COA explain
  – 61.4% of the variation in initial status
  – 7.9% of the variation in rates of change

Next step?

• If we had other predictors, we’d add them because the VCs are still significant
• Simplify the model? Since COA is not associated with rate of change, why not remove this term from the model?

(ALDA, Section 4.5.2, pp 108-109)


Model E: Removing the non-significant effect of COA on rate of change

Fixed effects of COA
• Controlling for PEER, the estimated diff in ALCUSE between COAs and non-COAs is 0.571 (p<.001)

Fixed effects of PEER
• Controlling for COA, for each 1-pt difference in PEER, initial ALCUSE is 0.695 higher (p<.001) but rate of change in ALCUSE is 0.151 lower (p<.10)

Variance components
• Variance components are unchanged, suggesting little is lost by eliminating the main effect of COA on rate of change (although there is still level-2 variance left to be predicted by other variables)
• Partial covariance is indistinguishable from 0. After controlling for PEER and COA, initial status and rate of change are unrelated

(ALDA, Section 4.5.2, pp 109-110)


Where we’ve been and where we’re going…

• Let’s call Model E our tentative “final model” (based on not just these results but many other analyses not shown here)
• Controlling for the effects of PEER, the estimated differential in ALCUSE between COAs and non-COAs is 0.571 (p<.001)
• Controlling for the effects of COA, for each 1-pt difference in PEER: the average initial ALCUSE is 0.695 higher (p<.001) and average rate of change is 0.151 lower (p<.10)

Still to come:
• Displaying prototypical trajectories
• Recentering predictors to improve interpretation
• Alternative strategies for hypothesis testing: comparing models using deviance statistics and information criteria
• Additional comments about estimation

(ALDA, Section 4.5.1, pp 105-106)


Displaying analytic results: Constructing prototypical fitted plots

Key idea: Substitute prototypical values for the predictors into the fitted models to yield prototypical fitted growth trajectories

Review of the basic approach (with one dichotomous predictor)

Model C:

$$\hat{\pi}_{0i} = 0.316 + 0.743\,\text{COA}_i, \qquad \hat{\pi}_{1i} = 0.293 - 0.049\,\text{COA}_i$$

**1. Substitute observed values for COA (0 and 1)**

When $\text{COA}_i = 0$: $\hat{\pi}_{0i} = 0.316 + 0.743(0) = 0.316$ and $\hat{\pi}_{1i} = 0.293 - 0.049(0) = 0.293$

When $\text{COA}_i = 1$: $\hat{\pi}_{0i} = 0.316 + 0.743(1) = 1.059$ and $\hat{\pi}_{1i} = 0.293 - 0.049(1) = 0.244$

**2. Substitute the estimated growth parameters into the level-1 growth model**

When $\text{COA}_i = 0$: $\hat{Y}_{ij} = 0.316 + 0.293\,\text{TIME}_{ij}$

When $\text{COA}_i = 1$: $\hat{Y}_{ij} = 1.059 + 0.244\,\text{TIME}_{ij}$

[Figure: prototypical fitted trajectories of ALCUSE by AGE (13 to 17) for COA = 0 and COA = 1]

What happens when the predictors aren’t all dichotomous?

(ALDA, Section 4.5.3, pp 110-113)
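The two-step substitution for Model C can be sketched in code. The coefficients are the fitted Model C values quoted on the slide; the dictionary keys and function names are illustrative, and TIME is assumed to be AGE − 14, as in the level-1 model used throughout this chapter.

```python
# Fitted Model C coefficients as quoted on the slide.
MODEL_C = {"g00": 0.316, "g01": 0.743, "g10": 0.293, "g11": -0.049}


def growth_parameters(coa, est=MODEL_C):
    """Step 1: fitted initial status and rate of change for a COA value."""
    pi0 = est["g00"] + est["g01"] * coa
    pi1 = est["g10"] + est["g11"] * coa
    return pi0, pi1


def fitted_trajectory(coa, ages=(14, 15, 16), est=MODEL_C):
    """Step 2: plug the growth parameters into the level-1 model."""
    pi0, pi1 = growth_parameters(coa, est)
    return [pi0 + pi1 * (age - 14) for age in ages]


non_coa = fitted_trajectory(coa=0)
coa = fitted_trajectory(coa=1)
for age, y0, y1 in zip((14, 15, 16), non_coa, coa):
    print(f"AGE {age}: non-COA {y0:.3f}, COA {y1:.3f}")
```

Plotting these two lists against age would give the two prototypical lines described on the slide.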


Constructing prototypical fitted plots when some predictors are continuous

Key idea: Select “interesting” values of continuous predictors and plot prototypical trajectories by selecting:
1. Substantively interesting values. This is easiest when the predictor has inherently appealing values (e.g., 8, 12, and 16 years of education in the US)
2. A range of percentiles. When there are no well-known values, consider using a range of percentiles (either the 25th, 50th and 75th or the 10th, 50th, and 90th)
3. The sample mean ± .5 (or 1) standard deviation. Best used with predictors with a symmetric distribution
4. The sample mean (on its own). If you don’t want to display a predictor’s effect but just control for it, use just its sample mean

Remember that exposition can be easier if you select whole number values (if the scale permits) or easily communicated fractions (e.g., ¼, ½, ¾, ⅛)

PEER: mean = 1.018, sd = 0.726
• Low PEER: 1.018 − .5(0.726) = 0.655
• High PEER: 1.018 + .5(0.726) = 1.381

Model E:

$$\hat{\pi}_{0i} = -0.314 + 0.695\,\text{PEER}_i + 0.571\,\text{COA}_i, \qquad \hat{\pi}_{1i} = 0.425 - 0.151\,\text{PEER}_i$$

[Table residue: “Intercepts for plotting” and “Slopes for plotting” for each COA by PEER combination]

[Figure: prototypical fitted trajectories of ALCUSE by AGE (13 to 17) for COA = 0 and COA = 1, each at Low and High PEER]

(ALDA, Section 4.5.3, pp 110-113)

How can “centering” predictors improve the interpretation of their effects?

At level-1, re-centering TIME is usually beneficial
• Ensures that the individual intercepts are easily interpretable, corresponding to status at a specific age
• Often use “initial status,” but as we’ll see, we can center TIME on any sensible value

At level-2, you can re-center by subtracting out:
• The sample mean, which causes the level-2 intercepts to represent average fitted values (mean PEER = 1.018; mean COA = 0.451)
• Another meaningful value, e.g., 12 yrs of ed, IQ of 100

Model F centers only PEER
Model G centers PEER and COA

• Many estimates are unaffected by centering
• As expected, centering the level-2 predictors changes the level-2 intercepts
  – F’s intercepts describe an “average” non-COA
  – G’s intercepts describe an “average” teen

Our preference: Here we prefer Model F because it leaves the dichotomous question predictor COA uncentered

(ALDA, Section 4.5.4, pp 113-116)
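To make the prototypical-value and centering ideas concrete, the sketch below plugs Low and High PEER (mean ± 0.5 sd) into the Model E equations quoted above, and then evaluates the same equations at mean PEER with COA = 0, which is what the level-2 intercepts represent once PEER is centered on its sample mean and COA is left uncentered. Function names are illustrative; the centered intercepts are derived here by arithmetic from the Model E estimates rather than read from a fitted Model F.

```python
PEER_MEAN, PEER_SD = 1.018, 0.726

# Model E estimates quoted on the slide.
PI0 = {"intercept": -0.314, "peer": 0.695, "coa": 0.571}  # initial status
PI1 = {"intercept": 0.425, "peer": -0.151}                # rate of change


def prototypical_peer(k=0.5):
    """Low and high PEER values at mean -/+ k standard deviations."""
    return PEER_MEAN - k * PEER_SD, PEER_MEAN + k * PEER_SD


def growth_parameters(coa, peer):
    """Fitted initial status and rate of change for given predictor values."""
    pi0 = PI0["intercept"] + PI0["peer"] * peer + PI0["coa"] * coa
    pi1 = PI1["intercept"] + PI1["peer"] * peer
    return pi0, pi1


low, high = prototypical_peer()
for coa in (0, 1):
    for label, peer in (("Low", low), ("High", high)):
        pi0, pi1 = growth_parameters(coa, peer)
        print(f"COA={coa}, {label} PEER: intercept {pi0:.3f}, slope {pi1:.3f}")

# Intercepts implied by centering PEER at its mean (COA uncentered):
# the fitted values for a non-COA with average PEER.
centered_pi0, centered_pi1 = growth_parameters(coa=0, peer=PEER_MEAN)
```

Centering does not change any fitted value; it only moves the point at which the level-2 intercepts are evaluated, which is why many estimates are unaffected.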

Hypothesis testing: What we’ve been doing and an alternative approach

Single parameter hypothesis tests
• Simple to conduct and easy to interpret, making them very useful in hands on data analysis (as we’ve been doing)
• However, statisticians disagree about their nature, form, and effectiveness
• Disagreement is so strong that some software packages (e.g., MLwiN) won’t output them
• Their behavior is poorest for tests on variance components

Deviance based hypothesis tests
• Based on the log likelihood (LL) statistic that is maximized under Maximum Likelihood estimation
• Have superior statistical properties (compared to the single parameter tests)
• Special advantage: permit joint tests on several parameters simultaneously
• You need to do the tests “manually” because automatic tests are rarely what you want

$$\text{Deviance} = -2\,[LL_{\text{current model}} - LL_{\text{saturated model}}]$$

• Quantifies how much worse the current model is in comparison to a saturated model
• A model with a small deviance statistic is nearly as good; a model with a large deviance statistic is much worse (we obviously prefer models with smaller deviance)
• Simplification: Because a saturated model fits perfectly, its LL = 0 and the second term drops out, making $\text{Deviance} = -2\,LL_{\text{current}}$

(ALDA, Section 4.6, p 116)

Hypothesis testing using Deviance statistics

You can use deviance statistics to compare two models if two criteria are satisfied:
1. Both models are fit to the same exact data (beware missing data)
2. One model is nested within the other: we can specify the less complex model (e.g., A) by imposing constraints on one or more parameters in the more complex model (e.g., B), usually, but not always, setting them to 0

If these conditions hold, then:
• Difference in the two deviance statistics is asymptotically distributed as $\chi^2$
• df = # of independent constraints

1. We can obtain Model A from Model B by invoking 3 constraints:

$$H_0: \gamma_{10} = 0,\ \sigma_1^2 = 0,\ \sigma_{01} = 0$$

2. Compute difference in Deviance statistics and compare to appropriate $\chi^2$ distribution

Δ Deviance = 33.55 (3 df, p<.001): reject $H_0$

(ALDA, Section 4.6.1, pp 116-119)
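The comparison of a deviance difference to a chi-square distribution can be sketched with the standard library alone. The helper below is an illustrative implementation of the upper-tail chi-square probability for integer degrees of freedom (built from the recurrence for the regularized incomplete gamma function); in practice a statistics package would normally supply this. The two deviance differences are the ones quoted in these slides (33.55 on 3 df for Model A vs. Model B, and 15.41 on 2 df for Model B vs. Model C).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # df = 2 starting point
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # df = 1 starting point
    # Step the shape parameter up by 1 until it reaches df / 2.
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def deviance_test(delta_deviance, n_constraints, alpha=0.05):
    """Return the p-value and whether H0 (the constraints) is rejected."""
    p = chi2_sf(delta_deviance, n_constraints)
    return p, p < alpha


p_ab, reject_ab = deviance_test(33.55, 3)  # Model A vs. Model B
p_bc, reject_bc = deviance_test(15.41, 2)  # Model B vs. Model C
```

Both p-values fall below .001, in line with the conclusions reported on the slides.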

Using deviance statistics to test more complex hypotheses
Key idea: Deviance statistics are great for simultaneously evaluating the effects of adding predictors to both level-2 models
1: We can obtain Model B from Model C by invoking 2 constraints: $H_0: \gamma_{01} = 0,\ \gamma_{11} = 0$
2: Compute difference in Deviance statistics and compare to appropriate $\chi^2$ distribution
Δ Deviance = 15.41 (2 df, p<.001): reject $H_0$
The pooled test does not imply that each level-2 slope is on its own statistically significant
(ALDA, Section 4.6.1, pp 116-119)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 25

Comparing non-nested multilevel models using AIC and BIC
You can (supposedly) compare non-nested multilevel models using information criteria
Information Criteria: AIC and BIC
• Each information criterion “penalizes” the log-likelihood statistic for “excesses” in the structure of the current model
• The AIC penalty accounts for the number of parameters in the model
• The BIC penalty goes further and also accounts for sample size
• Smaller values of AIC & BIC indicate better fit
• Models need not be nested, but datasets must be the same
Here’s the taxonomy of multilevel models that we ended up fitting in the ALCUSE example: Model E has the lowest AIC and BIC statistics
Interpreting differences in BIC across models (Raftery, 1995):
• 0-2: Weak evidence
• 2-6: Positive evidence
• 6-10: Strong evidence
• >10: Very strong
Careful: Gelman & Rubin (1995) declare these statistics and criteria to be “off-target and only by serendipity manage to hit the target”
(ALDA, Section 4.6.4, pp 120-122)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 26
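The deviance comparison on slide 25 can be checked by hand. The following is a minimal sketch, assuming only the Python standard library: for 2 df the chi-square upper-tail probability has the closed form exp(-x/2), and for 1 df it equals erfc(sqrt(x/2)). The function names are illustrative and do not come from ALDA or any package. The AIC and BIC helpers use the usual definitions (Deviance plus 2 per parameter, and Deviance plus ln(N) per parameter); no fitted values from the ALCUSE taxonomy are reproduced here.

```python
import math


def chi2_upper_tail(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate (this sketch covers df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")


def aic(deviance: float, n_params: int) -> float:
    """AIC: deviance penalized by 2 per estimated parameter."""
    return deviance + 2 * n_params


def bic(deviance: float, n_params: int, n: int) -> float:
    """BIC: deviance penalized by ln(n) per estimated parameter."""
    return deviance + math.log(n) * n_params


# Difference in deviance between Models B and C from slide 25 (2 constraints).
p_value = chi2_upper_tail(15.41, 2)
print(f"p = {p_value:.5f}; reject H0 at .001: {p_value < 0.001}")
```

The closed forms avoid a dependency on a statistics library; for other degrees of freedom a general chi-square routine would be needed.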

A final comment about estimation and hypothesis testing
Two most common methods of estimation
• Maximum likelihood (ML): Seeks those parameter estimates that maximize the likelihood function, which assesses the joint probability of simultaneously observing all the sample data actually obtained (implemented, e.g., in HLM and SAS Proc Mixed)
• Generalized Least Squares (GLS) (& Iterative GLS): Iteratively seeks those parameter estimates that minimize the sum of squared residuals (allowing them to be autocorrelated and heteroscedastic) (implemented, e.g., in MLwiN)
A more important distinction: Full vs. Restricted (ML or GLS)
• Full: Simultaneously estimate the fixed effects and the variance components
  • Default in MLwiN & HLM
  • Goodness of fit statistics apply to the entire model (both fixed and random effects)
  • This is the method we’ve used in both the examples shown so far
• Restricted: Sequentially estimate the fixed effects and then the variance components
  • Default in SAS Proc Mixed
  • Goodness of fit statistics apply to only the random effects
  • So we can only test hypotheses about VCs (and the models being compared must have identical fixed effects)
(ALDA, Section 3.4, pp 63-68; Section 4.3, pp 85-92)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 27

Other topics covered in Chapter Four of ALDA
• Using Wald statistics to test composite hypotheses about fixed effects (§4.7): a generalization of the “parameter estimate divided by its standard error” approach that allows you to test composite hypotheses about fixed effects, even if you’ve used restricted estimation methods
• Evaluating the tenability of the model’s assumptions (§4.8)
  • Checking functional form
  • Checking normality
  • Checking homoscedasticity
• Model-Based (empirical Bayes) estimates of the individual growth parameters (§4.9): superior estimates that combine OLS estimates with population average estimates and that are usually your best bet if you would like to display individual growth trajectories for particular sample members
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 28

Extending the multilevel model for change
ALDA, Chapter Five
“Change is a measure of time” Edwin Way Teale
John B. Willett & Judith D. Singer, Harvard Graduate School of Education
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 1

Chapter 5: Treating TIME more flexibly
General idea: Although all our examples have been equally spaced, time-structured, and fully balanced, the multilevel model for change is actually far more flexible
• Variably spaced measurement occasions (§5.1): each individual can have his or her own customized data collection schedule
• Varying numbers of waves of data (§5.2): not everyone need have the same number of waves of data
  • Allows us to handle missing data
  • Can even include individuals with just one or two waves
• Including time-varying predictors (§5.3)
  • The values of some predictors vary over time
  • They’re easy to include and can have powerful interpretations
• Re-centering the effect of TIME (§5.4)
  • Initial status is not the only centering constant for TIME
  • Recentering TIME in the level-1 model improves interpretation in the level-2 model
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 2

Example for handling variably spaced waves: Reading achievement over time
Data source: Children of the National Longitudinal Survey of Youth (CNLSY)
Sample: 89 children
• Each approximately 6 years old at study start
Research design
• 3 waves of data collected in 1986, 1988, and 1990, when the children were to be “in their 6th yr,” “in their 8th yr,” and “in their 10th yr”
• Of course, not each child was tested on his/her birthday or half-birthday, which creates the variably spaced waves
• The outcome, PIAT, is the child’s unstandardized score on the reading portion of the Peabody Individual Achievement Test
  • Not standardized for age so we can see growth over time
• No substantive predictors to keep the example simple
Research question
• How do PIAT scores change over time?
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 3

What does the person-period data set look like when waves are variably spaced?
Person-period data sets are easy to construct even with variably spaced waves
Three different ways of coding TIME
• WAVE: reflects design but has no substantive meaning
• AGEGRP: child’s “expected” age on each occasion
• AGE: child’s actual age (to the day) on each occasion. Notice “occasion creep”: later waves are more likely to be even later in a child’s life
We could build models of PIAT scores over time using ANY of these 3 measures for TIME, so which should we use?
(ALDA, Section 5.1.1, pp 139-144)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 4
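The three codings of TIME on slide 4 can be sketched as a small person-period structure. This is an illustration only: the child IDs, ages, and PIAT scores are made up, the `Record` class and helper names are hypothetical, and the expected ages of 6.5, 8.5, and 10.5 are an assumption about how AGEGRP is coded (the slides only say that it is the child's "expected" age).

```python
from dataclasses import dataclass


@dataclass
class Record:
    child_id: int
    wave: int       # design wave: 1, 2, 3
    agegrp: float   # "expected" age at that wave (assumed coding)
    age: float      # actual age in years at testing
    piat: int       # outcome


# Assumed expected ages per wave; not taken from the slides.
EXPECTED_AGE = {1: 6.5, 2: 8.5, 3: 10.5}


def to_person_period(raw):
    """Turn (child_id, wave, age_in_months, piat) tuples into one record per person per wave."""
    return [
        Record(cid, wave, EXPECTED_AGE[wave], round(months / 12, 2), piat)
        for cid, wave, months, piat in raw
    ]


def mean_creep(rows, wave):
    """Average of (actual age - expected age) at a wave; it grows under 'occasion creep'."""
    diffs = [r.age - r.agegrp for r in rows if r.wave == wave]
    return sum(diffs) / len(diffs)


# Hypothetical raw observations for two children.
raw = [
    (1, 1, 72, 18), (1, 2, 100, 35), (1, 3, 126, 59),
    (2, 1, 76, 18), (2, 2, 102, 25), (2, 3, 130, 28),
]
rows = to_person_period(raw)
for r in rows:
    print(r)
```

Because each row carries its own AGE, nothing in the structure requires children to share a measurement schedule.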

Comparing OLS trajectories fit using AGEGRP and AGE
[Figure: one panel per child, PIAT (0 to 80) plotted against age (5 to 12); AGEGRP shown as +’s with solid line, AGE as •’s with dashed line]
For many children, especially those assessed near the half-years, it makes little difference
For some children, though, there’s a big difference in slope, which is our conceptual outcome (rate of change)
Why ever use rounded AGE? Note that this is what we did in the past two examples, and so do lots of researchers!!!
(ALDA, Figure 5.1, p. 143)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 5

Comparing models fit with AGEGRP and AGE
Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}\,TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$
Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, where $\begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\ 0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$
Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
By writing the level-1 model using the generic predictor TIME, the specification is identical
Some parameter estimates are virtually identical
Other est’s larger with AGEGRP
• $\hat{\gamma}_{10}$, the slope, is ½ pt larger
• cumulates to a 2 pt diff over 4 yrs
• Level-2 VCs are also larger
• AGEGRP associates the data from later waves with earlier ages than observed, making the slope steeper
• Unexplained variation for initial status is associated with real AGE
AIC and BIC better with AGE
Treating an unstructured data set as structured introduces error into the analysis
(ALDA, Section 5.1.2, pp 144-146)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 6
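Why AGEGRP steepens the slope can be seen with a single child. The sketch below fits an OLS line twice to the same three scores, once against rounded ages and once against actual ages that drift later at each wave. All numbers are hypothetical and chosen for easy arithmetic; `ols_line` is an illustrative helper, not the estimation method used for the multilevel models.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical child: same PIAT scores, two codings of TIME.
piat = [20, 40, 60]
agegrp = [6.5, 8.5, 10.5]   # rounded "expected" ages (assumed coding)
age = [6.5, 9.0, 11.5]      # actual ages, drifting later at each wave

_, slope_agegrp = ols_line(agegrp, piat)
_, slope_age = ols_line(age, piat)
print(f"slope using AGEGRP: {slope_agegrp:.1f} points per year")
print(f"slope using AGE:    {slope_age:.1f} points per year")
```

The same gain is spread over 4 years under AGEGRP but over 5 years under AGE, so the rounded coding overstates the rate of change.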

Example for handling varying numbers of waves: Wages of HS dropouts
Data source: Murnane, Boudett and Willett (1999), Evaluation Review
Sample: 888 male high school dropouts
• Based on the National Longitudinal Survey of Youth (NLSY)
• Tracked from first job since HS dropout, when the men varied in age from 14 to 17
Research design
• Each interviewed between 1 and 13 times
• Interviews were approximately annual, but some were every 2 years
• Each wave’s interview conducted at different times during the year
• Both variable number and spacing of waves
• Outcome is log(WAGES), inflation adjusted natural logarithm of hourly wage
• Covariates: Race and Highest Grade Completed
Research question
• How do log(WAGES) change over time?
• Do the wage trajectories differ by ethnicity and highest grade completed?
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 7

Examining a person-period data set with varying numbers of waves of data per person
ID 206 has 3 waves; ID 332 has 10 waves; ID 1028 has 7 waves
EXPER = specific moment (to the nearest day) in each man’s labor force history
• Varying # of waves
• Varying spacing
LNW in constant dollars seems to rise over time
# waves:  1    2    3-4   5-6   7-8   9-10   >10
N men:    38   39   82    166   226   240    97
(ALDA, Section 5.2.1, pp 146-148)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 8

Fitting multilevel models for change when data sets have varying numbers of waves
Everything remains the same: there’s really no difference!
Unconditional growth model: On average, a dropout’s hourly wage increases with work experience
• $100(e^{0.0457} - 1) = 4.7$ is the %age change in Y per annum
Fully specified growth model (both HGC & BLACK)
• HGC is associated with initial status (but not change)
• BLACK is associated with change (but not initial status)
⇒ Fit Model C, which removes non-significant parameters
Model C: an intermediate “final” model
• Almost identical Deviance as Model B
• Effect of HGC: dropouts who stay in school longer earn higher wages on labor force entry (~4% higher per yr of school)
• Effect of BLACK: in contrast to Whites and Latinos, the wages of Black males increase less rapidly with labor force experience
• Rate of change for Whites and Latinos is $100(e^{0.0489} - 1) = 5.0\%$
• Rate of change for Blacks is $100(e^{0.0489 - 0.0161} - 1) = 3.3\%$
• Significant level-2 VCs indicate that there’s still unexplained variation, so this is hardly a ‘final’ model
(ALDA, Table 5.4, p. 149)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 9

Prototypical wage trajectories of HS dropouts
[Figure: LNW (1.6 to 2.4) plotted against EXPER (0 to 10); separate lines for White/Latino and Black dropouts, each for 9th grade and 12th grade dropouts]
Highest grade completed
• Those who stay in school longer have higher initial wages
• This differential remains constant over time (lines remain parallel)
Race
• At dropout, no racial differences in wages
• Racial disparities increase over time because wages for Blacks increase at a slower rate
(ALDA, Section 5.2.1 and 5.2.2, pp 150-156)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 10
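Because the outcome is the natural log of wages, a slope coefficient converts to a percentage change per year through 100(e^slope - 1). A minimal sketch of that arithmetic, using the three slopes quoted on slide 9 (0.0457, 0.0489, and 0.0489 - 0.0161); the function name is illustrative.

```python
import math


def pct_change_per_year(log_slope: float) -> float:
    """Percentage change in the raw outcome implied by a slope on the natural-log scale."""
    return 100.0 * (math.exp(log_slope) - 1.0)


unconditional = pct_change_per_year(0.0457)           # unconditional growth model
white_latino = pct_change_per_year(0.0489)            # Model C, BLACK = 0
black = pct_change_per_year(0.0489 - 0.0161)          # Model C, BLACK = 1

print(f"unconditional: {unconditional:.1f}% per year")
print(f"White/Latino:  {white_latino:.1f}% per year")
print(f"Black:         {black:.1f}% per year")
```

For small slopes the percentage is close to 100 times the slope itself, which is why 0.0457 maps to roughly 4.7%.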

Practical advice: Problems can arise when analyzing unbalanced data sets
The multilevel model for change is designed to handle unbalanced data sets, and in most circumstances, it does its job well, however…
When imbalance is severe, or lots of people have just 1 or 2 waves of data, problems can occur
• You may not estimate some parameters (well)
• Iterative fitting algorithms may not converge
• Some estimates may hit boundary constraints
• Problem is usually manifested via VCs not fixed effects (because the fixed portion of the model is like a “regular regression model”)
Software packages may not issue clear warning signs
• If you’re lucky, you’ll get negative variance components
• Another sign is too much time to convergence (or no convergence)
• Most common problem: your model is overspecified
• Most common solution: simplify the model
Many practical strategies discussed in ALDA, Section 5.2.2
Another major advantage of the multilevel model for change: How easy it is to include time-varying predictors
(ALDA, Section 5.2.2, pp 151-156)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 11

Example for illustrating time-varying predictors: Unemployment & depression
Source: Liz Ginexi and colleagues (2000), J of Occupational Health Psychology
Sample: 254 people identified at unemployment offices
Research design: Goal was to collect 3 waves of data per person at 1, 5 and 11 months of job loss. In reality, however, data set is not time-structured:
• Interview 1 was within 1 day and 2 months of job loss
• Interview 2 was between 3 and 8 months of job loss
• Interview 3 was between 10 and 16 months of job loss
• In addition, not everyone completed the 2nd and 3rd interview
Time-varying predictor: Unemployment status (UNEMP)
• 132 remained unemployed at every interview
• 61 were always working after the 1st interview
• 41 were still unemployed at the 2nd interview, but working by the 3rd
• 19 were working at the 2nd interview, but were unemployed again by the 3rd
Outcome: CES-D scale, 20 4-pt items (score of 0 to 80)
Research question
• How does unemployment affect depression symptomatology?
(ALDA, Section 5.3.1, pp 160-161)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 12

A person-period data set with a time-varying predictor
TIME = MONTHS since job loss
UNEMP (by design, must be 1 at wave 1)
• ID 7589 has 3 waves, all unemployed
• ID 65641 has 3 waves, re-employed after 1st wave
• ID 53782 has 3 waves, re-employed at 2nd, unemployed again at 3rd
(ALDA, Table 5.6, p 161)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 13

Analytic approach: We’re going to sequentially fit 4 increasingly complex models
Model A: An individual growth model with no substantive predictors
$Y_{ij} = \pi_{0i} + \pi_{1i}\,TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$
Model B: Adding the main effect of UNEMP
$Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + \gamma_{20}\,UNEMP_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
Model C: Allowing the effect of UNEMP to vary over TIME
$Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + \gamma_{20}\,UNEMP_{ij} + \gamma_{30}\,UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
Model D: Also allows the effect of UNEMP to vary over TIME, but does so in a very particular way
$Y_{ij} = \gamma_{00} + \gamma_{20}\,UNEMP_{ij} + \gamma_{30}\,UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{2i}\,UNEMP_{ij} + \zeta_{3i}\,UNEMP_{ij} \times TIME_{ij} + \varepsilon_{ij}]$
As we go through this analysis, we will demonstrate:
• Strategies for the thoughtful inclusion of time varying predictors
• Strategies for practical data analysis more generally (you’re almost ready to fly solo!)
• How both the level-1/level-2 and composite specifications facilitate understanding
• The need to simultaneously consider the model’s structural (fixed effects) and stochastic components (variance components) and whether you want them to be parallel
(ALDA, Section 5.3.1, pp 159-164)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 14

First step: Model A: The unconditional growth model
Let’s get a sense of the data by ignoring UNEMP and fitting the usual unconditional growth model
Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}\,TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$
Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, where $\begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\ 0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$
Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
• On the first day of job loss, the average person has an estimated CES-D of 17.7
• On average, CES-D declines by 0.42/mo
• There’s significant residual within-person variation
• There’s significant variation in initial status and rates of change
How do we add the time-varying predictor UNEMP?
• How can it go at level-2??? Logical impossibility
• It seems like it can go here (in the composite model)
(ALDA, Section 5.3.1, pp 159-164)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 15

Model B: Adding time-varying UNEMP to the composite specification
$Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + \gamma_{20}\,UNEMP_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
• $\gamma_{10}$: Population average rate of change in CES-D, controlling for UNEMP
• $\gamma_{20}$: Population average difference, over time, in CES-D by UNEMP status
How can we understand this graphically? Although the magnitude of the TV predictor’s effect remains constant, the TV nature of UNEMP implies the existence of many possible population average trajectories, such as:
[Figure: four panels of CES-D (5 to 20) against Months since job loss (0 to 14), each shifted by $\gamma_{20}$ when UNEMP changes: Remains unemployed; Reemployed at 5 months; Reemployed at 10 months; Reemployed at 5 months, unemployed again at 10]
What happens when we fit Model B to data?
(ALDA, Section 5.3.1, pp 159-164)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 16

Fitting and interpreting Model B, which includes the TV predictor UNEMP
• Monthly rate of decline is cut in half by controlling for UNEMP (still sig.)
• UNEMP has a large and stat sig effect
• Model A is a much poorer fit (Δ Deviance = 25.5, 1 df, p<.001)
Consistently employed (UNEMP=0): $\hat{Y}_j = 12.6656 - 0.2020\,MONTHS_j$
Consistently unemployed (UNEMP=1): $\hat{Y}_j = (12.6656 + 5.1113) - 0.2020\,MONTHS_j = 17.7769 - 0.2020\,MONTHS_j$
[Figure: CES-D (5 to 20) against Months since job loss (0 to 14); parallel lines for UNEMP = 1 and UNEMP = 0]
What about people who get a job?
What about the variance components?
(ALDA, Section 5.3.1, pp. 162-167)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 17

Variance components behave differently when you’re working with TV predictors
When analyzing time-invariant predictors, we know which VCs will change and how:
• Level-1 VCs will remain relatively stable because time-invariant predictors cannot explain much within-person variation
• Level-2 VCs will decline if the time-invariant predictors explain some of the between person variation
When analyzing time-varying predictors, all VCs can change, but
• Although you can interpret a decrease in the magnitude of the Level-1 VCs
• Changes in Level-2 VCs may not be meaningful!
Level-1 VC, $\sigma_\varepsilon^2$
• Adding UNEMP to the unconditional growth model (A) reduces its magnitude: 68.85 to 62.39
• UNEMP “explains” 9.4% of the variation in CES-D scores
Look what happened to the Level-2 VC’s
• In this example, they’ve increased!
• Why?: Because including a TV predictor changes the meaning of the individual growth parameters (e.g., the intercept now refers to the value of the outcome when all level-1 predictors, including UNEMP, are 0)
We can clarify what’s happened by decomposing the composite specification back into a Level-1/Level-2 representation
(ALDA, Section 5.3.1, pp. 162-167)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 18
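The question "What about people who get a job?" on slide 17 can be explored by plugging a changing UNEMP value into the fitted Model B equation. The sketch below uses the three fixed-effect estimates quoted on that slide (12.6656, -0.2020, 5.1113); the function names and the re-employment scenario are illustrative, and only population-average predictions are produced (no random effects).

```python
# Fixed-effect estimates for Model B as quoted on slide 17.
G00, G10, G20 = 12.6656, -0.2020, 5.1113


def predicted_cesd(months: float, unemp: int) -> float:
    """Population-average CES-D under Model B for a given month and UNEMP status."""
    return G00 + G10 * months + G20 * unemp


def trajectory(months_grid, reemployed_at=None):
    """Predicted CES-D over time for someone unemployed until `reemployed_at` (None = never)."""
    values = []
    for m in months_grid:
        unemp = 1 if reemployed_at is None or m < reemployed_at else 0
        values.append(predicted_cesd(m, unemp))
    return values


grid = [0, 5, 10]
print("remains unemployed:    ", [round(v, 4) for v in trajectory(grid)])
print("reemployed at 5 months:", [round(v, 4) for v in trajectory(grid, reemployed_at=5)])
```

Under Model B the drop at re-employment is always the same size (the UNEMP coefficient), whenever the job is found; the slope is unchanged.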

Decomposing the composite specification of Model B into a L1/L2 specification
$Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + \gamma_{20}\,UNEMP_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}\,TIME_{ij} + \pi_{2i}\,UNEMP_{ij} + \varepsilon_{ij}$
• Unlike time-invariant predictors, TV predictors go into the level-1 model
Level-2 Models: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, $\pi_{2i} = \gamma_{20}$
• Model B’s level-2 model for $\pi_{2i}$ has no residual!
• Model B automatically assumes that $\pi_{2i}$ is “fixed” (that it has the same value for everyone)
Should we accept this constraint?
• Should we assume that the effect of the person-specific predictor is constant across people?
• When predictors are time-invariant, we have no choice
• When predictors are time-varying, we can try to relax this assumption
(ALDA, Section 5.3.1, pp. 168-169)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 19

Trying to add back the “missing” level-2 stochastic variation in the effect of UNEMP
Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}\,TIME_{ij} + \pi_{2i}\,UNEMP_{ij} + \varepsilon_{ij}$
Level-2 Models: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, $\pi_{2i} = \gamma_{20} + \zeta_{2i}$
$\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$ and $\begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\\ \zeta_{2i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\ 0\\ 0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01} & \sigma_{02}\\ \sigma_{10} & \sigma_1^2 & \sigma_{12}\\ \sigma_{20} & \sigma_{21} & \sigma_2^2\end{bmatrix}\right)$
• It’s easy to allow the effect of UNEMP to vary randomly across people by adding in a level-2 residual
• Check your software to be sure you know what you’re doing….
But, you pay a price you may not be able to afford
• Adding this one term adds 3 new VCs
• If you have only a few waves, you may not have enough data
• Here, we can’t actually fit this model!!
Moral: The multilevel model for change can easily handle TV predictors, but…
• Think carefully about the consequences for both the structural and stochastic parts of the model.
• Don’t just “buy” the default specification in your software.
• Until you’re sure you know what you’re doing, always write out your model before specifying code to a computer package
So… Are we happy with Model B as the final model???
Is there any other way to allow the effect of UNEMP to vary: if not across people, across TIME?
(ALDA, Section 5.3.1, pp. 169-171)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 20
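The "3 new VCs" count on slide 20 follows from the size of the level-2 covariance matrix: with q random effects there are q variances and q(q-1)/2 distinct covariances. A minimal sketch of that bookkeeping; the function name is illustrative.

```python
def n_level2_vcs(n_random_effects: int) -> int:
    """Distinct elements of a symmetric q x q covariance matrix: q variances + q(q-1)/2 covariances."""
    q = n_random_effects
    return q * (q + 1) // 2


model_b = n_level2_vcs(2)        # random intercept and random TIME slope
with_unemp = n_level2_vcs(3)     # add a random effect for UNEMP
print(f"Model B level-2 VCs:             {model_b}")
print(f"with a random UNEMP effect:      {with_unemp}")
print(f"new VCs from one extra residual: {with_unemp - model_b}")
```

The three additions are the variance of the UNEMP effect and its covariances with the intercept and the TIME slope, which is a lot to ask of at most three waves per person.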

Model C: Might the effect of a TV predictor vary over time?
When analyzing the effects of time-invariant predictors, we automatically allowed predictors to affect the trajectory’s slope
Because of the way in which we’ve constructed the models with TV predictors, we’ve automatically constrained UNEMP to have only a “main effect” influencing just the trajectory’s level
To allow the effect of the TV predictor to vary over time, just add its interaction with TIME
$Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + \gamma_{20}\,UNEMP_{ij} + \gamma_{30}\,UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
Two possible (equivalent) interpretations:
• The effect of UNEMP differs across occasions
• The rate of change in depression differs by unemployment status
But you need to think very carefully about the hypothesized error structure:
• We’ve basically added another level-1 parameter to capture the interaction
• Just like we asked for the main effect of the TV predictor UNEMP, should we allow the interaction effect to vary across people?
• We won’t right now, but we will in a minute.
What happens when we fit Model C to data?
(ALDA, Section 5.3.2, pp. 171-172)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 21

Model C: Allowing the effect of a TV predictor to vary over time
• Main effect of TIME is now positive (!) & not stat sig ?!?!?!?!
• UNEMP*TIME interaction is stat sig (p<.05)
• Model B is a much poorer fit than C (Δ Deviance = 4.6, 1 df, p<.05)
Consistently employed (UNEMP=0): $\hat{Y}_j = 9.6167 + 0.1620\,MONTHS_j$
Consistently unemployed (UNEMP=1): $\hat{Y}_j = (9.6167 + 8.5291) + (0.1620 - 0.4652)\,MONTHS_j = 18.1458 - 0.3032\,MONTHS_j$
[Figure: CES-D (5 to 20) against Months since job loss (0 to 14); lines for UNEMP = 1 and UNEMP = 0]
What about people who get a job?
Should the trajectory for the reemployed be constrained to 0?
(ALDA, Section 5.3.2, pp. 171-172)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 22
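The first interpretation of Model C, that the effect of UNEMP differs across occasions, can be made concrete with a short worked calculation. Using the Model C estimates quoted on slide 22, the estimated gap between unemployed and employed people at month $t$ is the UNEMP coefficient plus $t$ times the interaction coefficient. The months chosen below are arbitrary illustrations.

```latex
\begin{align*}
\hat{Y}_j(\mathrm{UNEMP}=1) - \hat{Y}_j(\mathrm{UNEMP}=0)
  &= \hat{\gamma}_{20} + \hat{\gamma}_{30}\,\mathrm{MONTHS}_j
   = 8.5291 - 0.4652\,\mathrm{MONTHS}_j \\
\mathrm{MONTHS}_j = 0:&\quad 8.5291 \\
\mathrm{MONTHS}_j = 5:&\quad 8.5291 - 0.4652(5) = 6.2031 \\
\mathrm{MONTHS}_j = 10:&\quad 8.5291 - 0.4652(10) = 3.8771 \\
\mathrm{MONTHS}_j = 14:&\quad 8.5291 - 0.4652(14) = 2.0163
\end{align*}
```

So under Model C the estimated benefit of being employed shrinks the longer it has been since job loss, whereas under Model B it was a constant shift.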

How should we constrain the individual growth trajectory for the re-employed?
Should we remove the main effect of TIME? (which is the slope when UNEMP=0)
Yes, but this creates a lack of congruence between the model’s fixed and stochastic parts: dropping $\gamma_{10}\,TIME_{ij}$ from Model C leaves $\zeta_{1i}\,TIME_{ij}$ in the stochastic part
$Y_{ij} = \gamma_{00} + \gamma_{10}\,TIME_{ij} + \gamma_{20}\,UNEMP_{ij} + \gamma_{30}\,UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{1i}\,TIME_{ij} + \varepsilon_{ij}]$
So, let’s better align the parts by having UNEMP*TIME be both fixed and random
$Y_{ij} = \gamma_{00} + \gamma_{20}\,UNEMP_{ij} + \gamma_{30}\,UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{3i}\,UNEMP_{ij} \times TIME_{ij} + \varepsilon_{ij}]$
But, this actually fits worse (larger AIC & BIC)!
If we’re allowing the UNEMP*TIME slope to vary randomly, might we also need to allow the effect of UNEMP itself to vary randomly?
Model D: $Y_{ij} = \gamma_{00} + \gamma_{20}\,UNEMP_{ij} + \gamma_{30}\,UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{2i}\,UNEMP_{ij} + \zeta_{3i}\,UNEMP_{ij} \times TIME_{ij} + \varepsilon_{ij}]$
• UNEMP has both a fixed & random effect
• UNEMP*TIME has both a fixed & random effect
What happens when we fit Model D to data?
(ALDA, Section 5.3.2, pp. 172-173)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 23

Model D: Constraining the individual growth trajectory among the reemployed
Consistently unemployed: $\hat{Y}_j = (11.2666 + 6.8795) - 0.3254\,MONTHS_j = 18.1461 - 0.3254\,MONTHS_j$
Consistently employed: $\hat{Y}_j = 11.2666$
Best fitting model (lowest AIC and BIC)
What about people who get a job?
(ALDA, Section 5.3.2, pp. 172-173)
© Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 24

Recentering the effects of TIME (ALDA, Chapter 5, slide 25)

• All our examples so far have centered TIME on the first wave of data collection.
  - This allows us to interpret the level-1 intercept as individual i's true initial status.
  - While commonplace and usually meaningful, this approach is not sacrosanct.
• We always want to center TIME on a value that ensures that the level-1 growth parameters are meaningful, but there are other options:
  - Middle TIME point: focus on the "average" value of the outcome during the study
  - Endpoint: focus on "final status"
  - Any inherently meaningful constant can be used

(ALDA, Section 5.4, pp. 181-182)

Example for recentering the effects of TIME (ALDA, Chapter 5, slide 26)

Data source: Tomarken & colleagues (1997), American Psychological Society Meetings

Sample: 73 men and women with major depression who were already being treated with non-pharmacological therapy

Research design:
• Randomized trial to evaluate the efficacy of supplemental antidepressants (vs. placebo)
• On the pre-intervention night, the researchers prevented all participants from sleeping
• Each person was electronically paged 3 times a day (at 8 am, 3 pm, and 10 pm) to remind them to fill out a mood diary
• With full compliance (which didn't happen, of course) each person would have 21 mood assessments (most had at least 16 assessments, although 1 person had only 2 and 1 only 12)
• The outcome, POS, is the number of positive moods

Research question:
• How does POS change over time?
• What is the effect of medication on the trajectories of change?

(ALDA, Section 5.4, pp. 181-183)

How might we clock and code TIME? (ALDA, Chapter 5, slide 27)

• WAVE: great for data processing, but no intuitive meaning
• DAY: intuitively appealing, but doesn't distinguish readings each day
• READING: right idea, but how to quantify?
• TIME OF DAY: quantifies distance between the 3 readings (could also make unequal)
• TIME: days since study began (centered on first wave of data collection)
• (TIME-3.33): same as TIME but now centered on the study's midpoint
• (TIME-6.67): same as TIME but now centered on the study's endpoint

(ALDA, Section 5.4, pp 181-183)

Understanding what happens when we recenter TIME (ALDA, Chapter 5, slide 28)

Instead of writing separate models depending upon the representation for TIME, let's use a generic form:

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}(\text{TIME}_{ij} - c) + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model:
$\pi_{0i} = \gamma_{00} + \gamma_{01}\text{TREAT}_i + \zeta_{0i}$
$\pi_{1i} = \gamma_{10} + \gamma_{11}\text{TREAT}_i + \zeta_{1i}$
where $\begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\ 0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$

Notice how changing the value of the centering constant, c, changes the definition of the intercept in the level-1 model:

• When c = 0: $Y_{ij} = \pi_{0i} + \pi_{1i}\text{TIME}_{ij} + \varepsilon_{ij}$
  - $\pi_{0i}$ is the individual mood at TIME=0
  - Usually called "initial status"
• When c = 3.33: $Y_{ij} = \pi_{0i} + \pi_{1i}(\text{TIME}_{ij} - 3.33) + \varepsilon_{ij}$
  - $\pi_{0i}$ is the individual mood at TIME=3.33
  - Useful to think of as "mid-experiment status"
• When c = 6.67: $Y_{ij} = \pi_{0i} + \pi_{1i}(\text{TIME}_{ij} - 6.67) + \varepsilon_{ij}$
  - $\pi_{0i}$ is the individual mood at TIME=6.67
  - Useful to think about as "final status"

(ALDA, Section 5.4, pp 182-183)
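The effect of the centering constant can be checked with a few lines of arithmetic: the same straight line is re-expressed with a new intercept, while the slope and every predicted value stay put. This is a minimal sketch with made-up individual parameters (PI0 and PI1 are hypothetical, not estimates from the study).

```python
def recenter_intercept(pi0, pi1, c):
    """Intercept of the same line once TIME is replaced by (TIME - c)."""
    return pi0 + pi1 * c


def predict(intercept, slope, time, c=0.0):
    """Level-1 prediction from intercept + slope * (TIME - c)."""
    return intercept + slope * (time - c)


# Hypothetical individual: POS of 160 at TIME = 0, falling 2.5 per day.
PI0, PI1 = 160.0, -2.5

if __name__ == "__main__":
    for c in (0.0, 3.33, 6.67):
        pi0_c = recenter_intercept(PI0, PI1, c)
        # The prediction at day 5 is identical whichever c is used.
        print(f"c={c:4.2f}  intercept={pi0_c:8.3f}  slope={PI1}  "
              f"POS at day 5={predict(pi0_c, PI1, 5.0, c):.3f}")
```

Only the meaning of the intercept changes (status at TIME = c), which is why the choice of c leaves the fit and the rate-of-change estimates untouched.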

Comparing the results of using different centering constants for TIME (ALDA, Chapter 5, slide 29)

The choice of centering constant has no effect on:
• Goodness of fit indices
• Estimates for rates of change
• Within-person residual variance
• Between-person residual variance in rate of change

What are affected are the level-1 intercepts:
• $\gamma_{00}$ assesses the level of POS at time c for the control group (TREAT=0)
• $\gamma_{01}$ assesses the difference in POS between the groups (the TREATment effect) at time c:
  - -3.11 (ns) at study beginning
  - 15.35 (ns) at study midpoint
  - 33.80 (*) at study conclusion

[Figure: fitted POS trajectories (vertical axis 140 to 190) for the Treatment and Control groups, plotted against Days (0 to 7)]

(ALDA, Section 5.4, pp 183-186)

You can extend the idea of recentering TIME in lots of interesting ways (ALDA, Chapter 5, slide 30)

Example: Instead of focusing on rate of change, parameterize the level-1 model so it produces one parameter for initial status and one parameter for final status:

$Y_{ij} = \pi_{0i}\left(\dfrac{6.67 - \text{TIME}_{ij}}{6.67}\right) + \pi_{1i}\left(\dfrac{\text{TIME}_{ij}}{6.67}\right) + \varepsilon_{ij}$

• $\pi_{0i}$: individual initial status parameter
• $\pi_{1i}$: individual final status parameter

Advantage: You can use all your longitudinal data to analyze initial and final status simultaneously.

(ALDA, Section 5.4, pp 186-188)
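The initial-status/final-status parameterization replaces the intercept and TIME with two weights that always sum to 1. The sketch below builds those two predictors for a given TIME value; the constant 6.67 is the study endpoint from the slides, and the function names and example statuses are illustrative assumptions.

```python
STUDY_END = 6.67  # value of TIME at the final reading, as on the slides


def status_predictors(time):
    """Return the (initial-status weight, final-status weight) pair for one occasion."""
    w_initial = (STUDY_END - time) / STUDY_END
    w_final = time / STUDY_END
    return w_initial, w_final


def predict(initial_status, final_status, time):
    """Level-1 prediction: a weighted blend of initial and final status."""
    w_initial, w_final = status_predictors(time)
    return initial_status * w_initial + final_status * w_final


if __name__ == "__main__":
    # Hypothetical person: POS of 170 at the start and 150 at the end.
    for t in (0.0, 3.33, 6.67):
        print(t, status_predictors(t), round(predict(170.0, 150.0, t), 2))
```

At TIME = 0 the prediction is exactly the initial-status parameter and at TIME = 6.67 it is exactly the final-status parameter; in between it is the same straight line the usual intercept-and-slope form would give.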

Modeling discontinuous and nonlinear change
ALDA, Chapter Six

"Things have changed" Bob Dylan

Judith D. Singer & John B. Willett
Harvard Graduate School of Education

Chapter 6: Modeling discontinuous and nonlinear change (ALDA, Chapter 6, slide 2)

General idea: All our examples so far have assumed that individual growth is smooth and linear. But the multilevel model for change is much more flexible:

• Discontinuous individual change (§6.1): especially useful when discrete shocks or time-limited treatments affect the life course
• Using transformations to model non-linear change (§6.2): perhaps the easiest way of fitting non-linear change models
  - Can transform either the outcome or TIME
  - We already did this with ALCUSE (which was a square root of a sum of 4 items)
• Using polynomials of TIME to represent non-linear change (§6.3)
  - While admittedly atheoretical, it's very easy to do
  - Probably the most popular approach in practice
• Truly non-linear trajectories (§6.4)
  - Logistic, exponential, and negative exponential models, for example
  - A world of possibilities limited only by your theory (and the quality and amount of data)

Example for discontinuous individual change: Wage trajectories & the GED (ALDA, Chapter 6, slide 3)

Data source: Murnane, Boudett and Willett (1999), Evaluation Review

Sample: the same 888 male high school dropouts (from before)

Research design:
• Each was interviewed between 1 and 13 times after dropping out
• 34.6% (n=307) earned a GED at some point during data collection

OLD research questions:
• How do log(WAGES) change over time?
• Do the wage trajectories differ by ethnicity and highest grade completed?

Additional NEW research questions: What is the effect of GED attainment? Does earning a GED:
• affect the wage trajectory's elevation?
• affect the wage trajectory's slope?
• create a discontinuity in the wage trajectory?

(ALDA, Section 6.1.1, pp 190-193)

First steps: Think about how GED receipt might affect an individual's wage trajectory (ALDA, Chapter 6, slide 4)

Let's start by considering four plausible effects of GED receipt by imagining what the wage trajectory might look like for someone who got a GED 3 years after labor force entry (post dropout):

• A: No effect of GED whatsoever
• B: An immediate shift in elevation; no difference in rate of change
• D: An immediate shift in rate of change; no difference in elevation
• F: Immediate shifts in both elevation & rate of change

[Figure: the four hypothetical trajectories, LNW (1.5 to 2.5) against EXPER (0 to 10), with GED receipt marked]

How do we model trajectories like these within the context of a linear growth model?

(ALDA, Figure 6.1, p 193)

Including a discontinuity in elevation, not slope (Trajectory B) (ALDA, Chapter 6, slide 5)

Key idea: It's easy; simply include GED as a time-varying effect at level-1:

$Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \pi_{2i}\text{GED}_{ij} + \varepsilon_{ij}$

• Pre-GED (GED=0): $Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \varepsilon_{ij}$
• Post-GED (GED=1): $Y_{ij} = (\pi_{0i} + \pi_{2i}) + \pi_{1i}\text{EXPER}_{ij} + \varepsilon_{ij}$

[Figure: LNW (1.6 to 2.4) against EXPER (0 to 10)]
• $\pi_{0i}$: LNW at labor force entry
• $\pi_{1i}$: common rate of change pre-post GED
• $\pi_{2i}$: elevation differential on GED receipt

(ALDA, Section 6.1.1, pp 194-195)

Using an additional temporal predictor to capture the "extra slope" post-GED receipt (ALDA, Chapter 6, slide 6)

Including a discontinuity in slope, not elevation (Trajectory D):

$Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \pi_{3i}\text{POSTEXP}_{ij} + \varepsilon_{ij}$

• $\text{POSTEXP}_{ij}$ = 0 prior to GED
• $\text{POSTEXP}_{ij}$ = "Post GED experience," a new TV predictor that clocks "TIME since GED receipt" (in the same cadence as EXPER)
• Pre-GED (POSTEXP=0): $Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \varepsilon_{ij}$
• Post-GED (POSTEXP clocked in same cadence as EXPER): $Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \pi_{3i}\text{POSTEXP}_{ij} + \varepsilon_{ij}$

[Figure: LNW (1.6 to 2.4) against EXPER (0 to 10)]
• $\pi_{0i}$: LNW at labor force entry
• $\pi_{1i}$: rate of change pre GED
• $\pi_{3i}$: slope differential pre-post GED

(ALDA, Section 6.1.1, pp 195-198)
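Both discontinuity predictors can be derived from two pieces of information per occasion: the person's experience at that occasion and, if applicable, their experience when the GED was received. The sketch below is one possible coding. Treating the occasion that falls exactly at GED receipt as GED = 1 with POSTEXP = 0 is our assumption, and the function name is hypothetical.

```python
def ged_predictors(exper, ged_time=None):
    """Return (GED, POSTEXP) for one measurement occasion.

    exper    -- years of labor force experience at this occasion
    ged_time -- experience at GED receipt, or None if no GED was earned
    """
    if ged_time is None or exper < ged_time:
        return 0, 0.0               # before receipt, or never received
    return 1, exper - ged_time      # POSTEXP runs in the same cadence as EXPER


if __name__ == "__main__":
    # Hypothetical person who earns a GED 3 years after labor force entry.
    for exper in (0.5, 2.0, 3.0, 4.5, 7.0):
        ged, postexp = ged_predictors(exper, ged_time=3.0)
        print(f"EXPER={exper:3.1f}  GED={ged}  POSTEXP={postexp:3.1f}")
```

GED steps from 0 to 1 once and stays there, which produces the shift in elevation; POSTEXP grows one unit for each unit of EXPER after receipt, which produces the extra slope.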

Including discontinuities in both elevation and slope (Trajectory F) (ALDA, Chapter 6, slide 7)

Simple idea: Combine the two previous approaches:

$Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \pi_{2i}\text{GED}_{ij} + \pi_{3i}\text{POSTEXP}_{ij} + \varepsilon_{ij}$

• Pre-GED: $Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \varepsilon_{ij}$
• Post-GED: $Y_{ij} = (\pi_{0i} + \pi_{2i}) + \pi_{1i}\text{EXPER}_{ij} + \pi_{3i}\text{POSTEXP}_{ij} + \varepsilon_{ij}$

[Figure: LNW (1.6 to 2.4) against EXPER (0 to 10)]
• $\pi_{0i}$: LNW at labor force entry
• $\pi_{1i}$: rate of change pre GED
• $\pi_{2i}$: constant elevation differential on GED receipt
• $\pi_{3i}$: slope differential pre-post GED

(ALDA, Section 6.1.1, pp 195-198)

Many other types of discontinuous individual change trajectories are possible (ALDA, Chapter 6, slide 8)

What kinds of other complex trajectories could be used? (ALDA pp. 199-201)
• Effects on elevation and slope can depend upon timing of GED receipt
• You might have non-linear changes before or after the transition point
• The effect of GED receipt might be instantaneous but not endure
• The effect of GED receipt might be delayed
• There might be multiple transition points (e.g., on entry in college for GED recipients)

Just like a regular regression model, the multilevel model for change can include discontinuities, nonlinearities and other 'nonstandard' terms:
• Generally more limited by data, theory, or both, than by the ability to specify the model
• Extra terms in the level-1 model translate into extra parameters to estimate
• Think carefully about what kinds of discontinuities might arise in your substantive context

How do we select among the alternative discontinuous models?

(ALDA, Section 6.1.1, pp 199-201)
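Trajectories A, B, D and F are all special cases of the combined level-1 model, obtained by switching the elevation and slope differentials on or off. The sketch below evaluates that model for one hypothetical person. Every parameter value is invented for illustration, and treating the occasion at GED receipt as already "post-GED" is our assumption.

```python
def lnw_trajectory(exper, ged_time, pi0, pi1, pi2=0.0, pi3=0.0):
    """Level-1 prediction pi0 + pi1*EXPER + pi2*GED + pi3*POSTEXP (no residual)."""
    after_ged = ged_time is not None and exper >= ged_time
    ged = 1 if after_ged else 0
    postexp = exper - ged_time if after_ged else 0.0
    return pi0 + pi1 * exper + pi2 * ged + pi3 * postexp


# Hypothetical parameter values, chosen only to make the four shapes visible.
PI0, PI1 = 1.70, 0.04
SHAPES = {
    "A": {"pi2": 0.00, "pi3": 0.00},  # no effect of GED
    "B": {"pi2": 0.05, "pi3": 0.00},  # shift in elevation only
    "D": {"pi2": 0.00, "pi3": 0.01},  # shift in slope only
    "F": {"pi2": 0.05, "pi3": 0.01},  # shifts in both
}

if __name__ == "__main__":
    for label, extra in SHAPES.items():
        values = [lnw_trajectory(x, 3.0, PI0, PI1, **extra) for x in (0, 3, 6, 10)]
        print(label, [round(v, 3) for v in values])
```

Setting pi2 or pi3 to zero removes the corresponding discontinuity, which is the same nesting the model comparisons in the following slides rely on.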

Let's start with a "baseline model" (Model A) against which we'll compare alternative discontinuous trajectories (ALDA, Chapter 6, slide 9)

$Y_{ij} = \pi_{0i} + \pi_{1i}\text{EXPER}_{ij} + \pi_{2i}(\text{UERATE}_{ij} - 7) + \varepsilon_{ij}$

$\pi_{0i} = \gamma_{00} + \gamma_{01}(\text{HGC}_i - 9) + \zeta_{0i}$
$\pi_{1i} = \gamma_{10} + \gamma_{11}\text{BLACK}_i + \zeta_{1i}$
$\pi_{2i} = \gamma_{20}$

$\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$ and $\begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\ 0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$

• (UERATE-7) is the local area unemployment rate (added in the previous chapter as an example of a TV predictor), centered around 7% for interpretability
• This is the benchmark against which we'll evaluate discontinuous models
• 5 fixed effects, 4 random effects
• To appropriately compare this deviance statistic to more complex models, we need to know how many parameters have been estimated to achieve this value of deviance

(ALDA, Section 6.1.2, pp 201-202)

How we're going to proceed… (ALDA, Chapter 6, slide 10)

Instead of constructing tables of (seemingly endless) parameter estimates, we're going to construct a summary table that presents the:
• specific terms in the model (starting with the baseline just shown)
• n parameters (for d.f.)
• deviance statistic (for model comparison)

(ALDA, Section 6.1.2, pp 202-203)

First steps: Investigating the discontinuity in elevation by adding the effect of GED (ALDA, Chapter 6, slide 11)

• B: Add GED as both a fixed and random effect (1 extra fixed parameter; 3 extra random)
  - ΔDeviance=25.0, 4 df, p<.001: keep GED effect
• C: But does the GED discontinuity vary across people? (Do we need to keep the extra VCs for the effect of GED?)
  - ΔDeviance=12.8, 3 df, p<.01: keep VCs

What about the discontinuity in slope?

(ALDA, Section 6.1.2, pp 202-203)

Next steps: Investigating the discontinuity in slope by adding the effect of POSTEXP (without the GED effect producing a discontinuity in elevation) (ALDA, Chapter 6, slide 12)

• D: Adding POSTEXP as both a fixed and random effect (1 extra fixed parameter; 3 extra random)
  - ΔDeviance=13.1, 4 df, p<.05: keep POSTEXP effect
• E: But does the POSTEXP slope vary across people? (Do we need to keep the extra VCs for the effect of POSTEXP?)
  - ΔDeviance=3.3, 3 df, ns: don't need the POSTEXP random effects (but in comparison with A still need POSTEXP fixed effect)

What if we include both types of discontinuity?

(ALDA, Section 6.1.2, pp 203-204)
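Each comparison on these two slides is a deviance (likelihood ratio) test: the difference in deviance between nested models is referred to a chi-square distribution whose degrees of freedom equal the number of extra parameters. The sketch below computes the upper-tail probability using only the Python standard library. The deviance differences are our reading of the slides' values, and the helper name `chi2_sf` is ours.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: closed form exp(-y) * sum_{i<k} y^i / i!  with k = df / 2.
        k = df // 2
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(k))
    # Odd df: erfc term plus a finite sum over half-integer gamma values.
    k = (df - 1) // 2
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(
        y ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, k + 1)
    )
    return tail


# (label, deviance difference, df) as read from the slides.
COMPARISONS = [
    ("B vs A: add GED (fixed + random)", 25.0, 4),
    ("C vs B: drop GED variance components", 12.8, 3),
    ("D vs A: add POSTEXP (fixed + random)", 13.1, 4),
    ("E vs D: drop POSTEXP variance components", 3.3, 3),
]

if __name__ == "__main__":
    for label, delta, df in COMPARISONS:
        print(f"{label}: dDeviance={delta}, df={df}, p={chi2_sf(delta, df):.4f}")
```

The closed forms used here hold only for whole-number degrees of freedom, which is all these comparisons need; a statistics library would normally be used instead.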

Examining both discontinuities simultaneously (ALDA, Chapter 6, slide 13)

• F: Add GED and POSTEXP simultaneously (each as both fixed and random effects)
  - Comparison with B shows significance of POSTEXP
  - Comparison with D shows significance of GED

(ALDA, Section 6.1.2, pp 204-205)

Can we simplify this model by eliminating the VCs for POSTEXP (G) or GED (H)? (ALDA, Chapter 6, slide 14)

• Each results in a worse fit, suggesting that Model F (which includes both random effects) is better (even though Model E suggested we might be able to eliminate the VC for POSTEXP)
• We actually fit several other possible models (see ALDA) but F was the best alternative. So, how do we display its results?

(ALDA, Section 6.1.2, pp 204-205)

Displaying prototypical discontinuous trajectories (Log Wages for HS dropouts pre- and post-GED attainment) (ALDA, Chapter 6, slide 15)

Race:
• At dropout, no racial differences in wages
• Racial disparities increase over time because wages for Blacks increase at a slower rate

Highest grade completed:
• Those who stay longer have higher initial wages
• This differential remains constant over time

GED receipt has two effects:
• Upon GED receipt, wages rise immediately by 4.2%
• Post-GED receipt, wages rise annually by 5.2% (vs. 4.2% pre-receipt)

[Figure: prototypical LNW trajectories (1.6 to 2.4) against EXPERIENCE (0 to 10) for 9th and 12th grade dropouts, White/Latino and Black, with the point at which a GED was earned marked]

(ALDA, Section 6.1.2, pp 204-206)

Modeling non-linear change using transformations (ALDA, Chapter 6, slide 16)

When facing obviously non-linear trajectories, we usually begin by trying transformation:
• A straight line, even on a transformed scale, is a simple form with easily interpretable parameters
• Since many outcome metrics are ad hoc, transformation to another ad hoc scale may sacrifice little

Earlier, we modeled ALCUSE, an outcome that we formed by taking the square root of the researchers' original alcohol use measurement.

We can 'detransform' the findings and return to the original scale, by squaring the predicted values of ALCUSE and replotting.

The prototypical individual growth trajectories are now non-linear: by transforming the outcome before analysis, we have effectively modeled non-linear change over time.

[Figure: detransformed ALCUSE trajectories (0 to 2) against AGE (13 to 17) for COA = 1 and COA = 0, each at High and Low PEER]

So, how do we know what variable to transform using what transformation?

(ALDA, Section 6.2, pp 208-210)
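Detransforming is just squaring: a trajectory that is a straight line on the square-root scale becomes a curve on the original scale. The sketch below shows this with invented values. The intercept, slope, and the choice to center AGE at 14 are all hypothetical, not the fitted ALCUSE estimates.

```python
def predicted_sqrt_scale(age, intercept, slope, centering_age=14.0):
    """Linear trajectory on the transformed (square-root) scale."""
    return intercept + slope * (age - centering_age)


def detransform(sqrt_scale_value):
    """Return to the original metric by squaring the prediction."""
    return sqrt_scale_value ** 2


if __name__ == "__main__":
    # Hypothetical prototypical person: 0.5 at age 14, rising 0.3 per year.
    ages = (14, 15, 16)
    fitted = [predicted_sqrt_scale(a, 0.5, 0.3) for a in ages]
    original = [detransform(v) for v in fitted]
    for age, f, o in zip(ages, fitted, original):
        print(f"age {age}: sqrt scale {f:.2f}, original scale {o:.4f}")
    # Year-to-year gains are equal on the sqrt scale but grow on the original scale.
    print("gains on original scale:",
          [round(b - a, 4) for a, b in zip(original, original[1:])])
```

The equal yearly steps on the transformed scale turn into increasing steps after squaring, which is why the replotted prototypical trajectories bend upward.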

ALDA. Harvard Graduate School of Education. Singer & John B. 211-213) © Judith D. Singer & John B. Section 6. Plot many empirical growth trajectories You find linearizing transformations by moving “up” or “down” in the direction of the “bulge” Generic variable V compress scale (ALDA.1. Harvard Graduate School of Education. Section 6. Chapter 6. pp.2.2.1. Chapter 6. Willett. Willett. ALDA. slide 17 The effects of transformation for a single child in the Berkeley Growth Study The effects of transformation for a single child in the Berkeley Growth Study Down in TIME Up in IQ expand scale How else might we model non-linear change? (ALDA.The “Rule of the Bulge” and the “Ladder of Transformations” The “Rule of the Bulge” and the “Ladder of Transformations” Mosteller & Tukey (1977): EDA techniques for straightening lines Mosteller & Tukey (1977): EDA techniques for straightening lines Step 2: How do we know when to use which transformation? Step 1: What kinds of transformations do we consider? 1. pp. 2. 210-212) © Judith D. slide 18 .

Representing individual change using a polynomial function of TIME (ALDA, Chapter 6, slide 19)

Polynomial of the "zero order" (because $\text{TIME}^0 = 1$)
• Like including a constant predictor 1 in the level-1 model
• Intercept represents vertical elevation
• Different people can have different elevations

Polynomial of the "first order" (because $\text{TIME}^1 = \text{TIME}$)
• Familiar individual growth model
• Varying intercepts and slopes yield criss-crossing lines

Second order polynomial for quadratic change
• Includes both TIME and $\text{TIME}^2$
• $\pi_{0i}$ = intercept, but now both TIME and $\text{TIME}^2$ must be 0
• $\pi_{1i}$ = instantaneous rate of change when TIME=0 (there is no longer a constant slope)
• $\pi_{2i}$ = curvature parameter; the larger its value, the more dramatic its effect
• Peak is called a "stationary point"; a quadratic has 1

Third order polynomial for cubic change
• Includes TIME, $\text{TIME}^2$ and $\text{TIME}^3$
• Can keep on adding powers of TIME
• Each extra polynomial adds another stationary point; a cubic has 2

(ALDA, Section 6.3.1, pp. 213-217)

Example for illustrating use of polynomials in TIME to represent change (ALDA, Chapter 6, slide 20)

Source: Margaret Keiley & colleagues (2000), J of Abnormal Child Psychology

Sample: 45 boys and girls identified in 1st grade
• Goal was to study behavior changes over time (until 6th grade)

Research design:
• At the end of every school year, teachers rated each child's level of externalizing behavior using Achenbach's Child Behavior Checklist:
  - 3 point scale (0=rarely/never; 1=sometimes; 2=often)
  - 34 aggressive, disruptive, or delinquent behaviors
• Outcome: EXTERNAL, which ranges from 0 to 68 (simple sum of these scores)
• Predictor: FEMALE (are there gender differences?)

Research question:
• How does children's level of externalizing behavior change over time?
• Do the trajectories of change differ for boys and girls?

(ALDA, Section 6.3.2, p. 217)
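The interpretation of the polynomial parameters can be made concrete by evaluating a trajectory and its instantaneous rate of change. The sketch below does this for a hypothetical quadratic; the coefficient values are invented, and the stationary point uses the standard calculus result that a quadratic's slope is zero at TIME = -pi1 / (2 * pi2).

```python
def polynomial_trajectory(time, coefs):
    """Evaluate pi0 + pi1*TIME + pi2*TIME^2 + ... for coefs = [pi0, pi1, pi2, ...]."""
    return sum(c * time ** k for k, c in enumerate(coefs))


def instantaneous_rate(time, coefs):
    """First derivative of the polynomial trajectory at a given TIME."""
    return sum(k * c * time ** (k - 1) for k, c in enumerate(coefs) if k > 0)


def quadratic_stationary_point(pi1, pi2):
    """TIME at which a quadratic trajectory peaks (or troughs)."""
    if pi2 == 0:
        raise ValueError("a straight line has no stationary point")
    return -pi1 / (2 * pi2)


if __name__ == "__main__":
    coefs = [10.0, 4.0, -1.0]  # hypothetical pi0, pi1, pi2
    print("value at TIME=0:", polynomial_trajectory(0, coefs))      # the intercept
    print("rate at TIME=0:", instantaneous_rate(0, coefs))          # equals pi1
    peak = quadratic_stationary_point(coefs[1], coefs[2])
    print("stationary point at TIME =", peak,
          "where the rate is", instantaneous_rate(peak, coefs))
```

With these values the rate of change equals pi1 only at TIME = 0 and shrinks to zero at the stationary point, matching the statement that a quadratic has no constant slope.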

Selecting a suitable level-1 polynomial trajectory for change: Examining empirical growth plots (which invariably display great variability in temporal complexity) (ALDA, Chapter 6, slide 21)

[Figure: empirical growth plots for selected children, annotated as follows]
• Little change over time (flat line?)
• Linear decline (at least until 4th grade)
• Quadratic change (but with varying curvatures)
• Two stationary points? (suggests a cubic)
• Three stationary points? (suggests a quartic!)

When faced with so many different patterns, how do you select a common polynomial for analysis?

(ALDA, Section 6.3.2, pp 217-220)

Examining alternative fitted OLS polynomial trajectories: Order optimized for each child (solid curves) and a common quartic across children (dashed line) (ALDA, Chapter 6, slide 22)

• First impression: Most fitted trajectories provide a reasonable summary for each child's data
• Second impression: Maybe these ad hoc decisions aren't the best? (Would a quadratic do?)
• Third realization: We need a common polynomial across all cases (and might the quartic be just too complex?)

Using sample data to draw conclusions about the shape of the underlying true trajectories is tricky, so let's compare alternative models.

(ALDA, Section 6.3.2, pp 217-220)

Using model comparisons to test higher order terms in a polynomial level-1 model (ALDA, Chapter 6, slide 23)

• Add polynomial functions of TIME to the person-period data set
• Compare goodness of fit (accounting for all the extra parameters that get estimated)

• A: significant between- and within-child variation
• B: no fixed effect of TIME but significant var comps
  - ΔDeviance=18.5, 3 df, p<.01
• C: no fixed effects of TIME & $\text{TIME}^2$ but significant var comps
  - ΔDeviance=16.0, 4 df, p<.01
• D: still no fixed effects for TIME terms, but now VCs are ns also
  - ΔDeviance=11.1, 5 df, ns

Quadratic (C) is the best choice, and it turns out there are no gender differentials at all.

(ALDA, Section 6.3.3, pp 220-223)

Example for truly non-linear change (ALDA, Chapter 6, slide 24)

Data source: Terry Tivnan (1980), Dissertation at Harvard Graduate School of Education

Sample: 17 1st and 2nd graders
• During a 3 week period, Terry repeatedly played a two-person checkerboard game called Fox 'n Geese with each child, who was (hopefully) learning from experience
• Fox is controlled by the experimenter, at one end of the board
• Children have four geese, that they use to try to trap the fox

Great for studying cognitive development because:
• There exists a strategy that children can learn that will guarantee victory
• This strategy is not immediately obvious to children
• Many children can deduce the strategy over time

Research design:
• Each child played up to 27 games (each game is a "wave")
• The outcome, NMOVES, is the number of moves made by the child before making a catastrophic error (guaranteeing defeat); it ranges from 1 to 20

Research question:
• How does NMOVES change over time?
• What is the effect of a child's reading (or cognitive) ability? READ is the score on a standardized reading test

(ALDA, Section 6.4.1, pp. 224-225)

Selecting a suitable level-1 nonlinear trajectory for change: Examining empirical growth plots (and asking what features should the hypothesized model display?) (ALDA, Chapter 6, slide 25)

• A lower asymptote, because everyone makes at least 1 move and it takes a while to figure out what's going on
• An upper asymptote, because a child can make only a finite number of moves each game
• A smooth curve joining the asymptotes, that initially accelerates and then decelerates

These three features suggest a level-1 logistic change trajectory, which unlike our previous growth models will be non-linear in the individual growth parameters.

(ALDA, Section 6.4.2, pp. 225-228)

Understanding the logistic individual growth trajectory (which is anything but linear in the individual growth parameters) (ALDA, Chapter 6, slide 26)

$Y_{ij} = 1 + \dfrac{19}{1 + \pi_{0i} e^{-\pi_{1i}\text{TIME}_{ij}}} + \varepsilon_{ij}$

• Upper asymptote in this particular model is constrained to be 20 (1+19)
• $\pi_{0i}$ is related to, and determines, the intercept
  - The higher the value of $\pi_{0i}$, the lower the intercept
• $\pi_{1i}$ determines the rapidity with which the trajectory approaches the upper asymptote
  - When $\pi_{1i}$ is small, the trajectory rises slowly (often not reaching an asymptote)
  - When $\pi_{1i}$ is large, the trajectory rises more rapidly

[Figure: three panels of NMOVES (0 to 25) against Game (0 to 30), for $\pi_0$ = 150, $\pi_0$ = 15 and $\pi_0$ = 1.5, each showing curves for $\pi_1$ = 0.1, 0.3 and 0.5]

Models can be fit in the usual way, provided your software can do it.

(ALDA, Section 6.4.2, pp 226-230)
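The roles of the two logistic parameters are easy to check numerically. The sketch below evaluates the level-1 logistic trajectory (without the residual) for the parameter values shown in the slide's panels; the function name is ours.

```python
import math


def logistic_nmoves(game, pi0, pi1):
    """Expected NMOVES: lower asymptote 1, upper asymptote 1 + 19 = 20."""
    return 1.0 + 19.0 / (1.0 + pi0 * math.exp(-pi1 * game))


if __name__ == "__main__":
    for pi0 in (150.0, 15.0, 1.5):
        # At game 0 the exponential term is 1, so the intercept is 1 + 19 / (1 + pi0).
        print(f"pi0={pi0}: intercept = {logistic_nmoves(0, pi0, 0.3):.3f}")
    for pi1 in (0.1, 0.3, 0.5):
        print(f"pi1={pi1}: value at game 20 with pi0=15 -> "
              f"{logistic_nmoves(20, 15.0, pi1):.2f}")
```

Larger pi0 pushes the starting value toward the floor of 1, and larger pi1 moves the curve toward the ceiling of 20 in fewer games.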

Results of fitting logistic change trajectories to the Fox 'n Geese data (ALDA, Chapter 6, slide 27)

• The fitted trajectory begins low and rises smoothly and non-linearly
• The effect of READ is not statistically significant (note small n's), but better READers approach the asymptote more rapidly

(ALDA, Section 6.4.2, pp 229-232)

A limitless array of non-linear trajectories awaits… (each is illustrated in detail in ALDA, Section 6.4.3) (ALDA, Chapter 6, slide 28)

• $Y_{ij} = \alpha_i - \dfrac{1}{\pi_{1i}\text{TIME}_{ij}} + \varepsilon_{ij}$
• $Y_{ij} = \alpha_i - \dfrac{1}{\pi_{1i}\text{TIME}_{ij} + \pi_{2i}\text{TIME}_{ij}^2} + \varepsilon_{ij}$
• $Y_{ij} = \pi_{0i} e^{\pi_{1i}\text{TIME}_{ij}} + \varepsilon_{ij}$
• $Y_{ij} = \alpha_i - (\alpha_i - \pi_{0i}) e^{-\pi_{1i}\text{TIME}_{ij}} + \varepsilon_{ij}$

(ALDA, Section 6.4.3, pp 232-242)
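As one illustration from this list, the last form (with the term in alpha minus pi0 multiplied by a decaying exponential) starts at pi0 when TIME = 0 and approaches the ceiling alpha as TIME grows. The sketch below evaluates it without the residual; the parameter values are invented for illustration.

```python
import math


def negative_exponential(time, alpha, pi0, pi1):
    """alpha - (alpha - pi0) * exp(-pi1 * TIME): starts at pi0, levels off at alpha."""
    return alpha - (alpha - pi0) * math.exp(-pi1 * time)


if __name__ == "__main__":
    # Hypothetical individual: starts at 2, ceiling of 20, rate parameter 0.2.
    for t in (0, 5, 10, 20, 40):
        print(f"TIME={t:2d}: {negative_exponential(t, 20.0, 2.0, 0.2):.3f}")
```

Unlike the logistic curve, this trajectory has no initial acceleration phase: the fastest change happens at TIME = 0 and slows steadily from there.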

Using SAS Proc Mixed to fit the multilevel model for change

"Time is nature's way of keeping everything from happening at once" Woody Allen

Judith D. Singer & John B. Willett
Harvard Graduate School of Education

Resources to help you learn how to use SAS Proc Mixed (Using SAS Proc Mixed, slide 2)

Textbook Examples: Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence, by Judith D. Singer and John B. Willett

For each chapter, the page lists the datasets and worked examples in HLM, MLwiN, Mplus, SAS, SPlus, SPSS and Stata:

Ch 1   A framework for investigating change over time
Ch 2   Exploring longitudinal data on change
Ch 3   Introducing the multilevel model for change
Ch 4   Doing data analysis with the multilevel model for change
Ch 5   Treating time more flexibly
Ch 6   Modeling discontinuous and nonlinear change
Ch 7   Examining the multilevel model's error covariance structure
Ch 8   Modeling change using covariance structure analysis
Ch 9   A framework for investigating event occurrence
Ch 10  Describing discrete-time event occurrence data
Ch 11  Fitting basic discrete-time hazard models
Ch 12  Extending the discrete-time hazard model
Ch 13  Describing continuous-time event occurrence data
Ch 14  Fitting the Cox regression model
Ch 15  Extending the Cox regression model

What we'll do now: Use the specific models we just fit in Chapter Four to demonstrate how to use SAS PROC MIXED to fit these models to data:
• Model A: The unconditional means model
• Model B: The unconditional growth model
• Model C: The uncontrolled effects of COA
• Model D: The controlled effects of COA

Using SAS Proc Mixed to fit Model A (the unconditional means model)

Level-1 Model: $Y_{ij} = \pi_{0i} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$
Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, where $\zeta_{0i} \sim N(0, \sigma_0^2)$
Composite Model: $Y_{ij} = \gamma_{00} + \zeta_{0i} + \varepsilon_{ij}$

    proc mixed data=one method=ml covtest;
      class id;
      model alcuse = /solution;
      random intercept/subject=id;
    run;

• The proc mixed statement invokes the procedure, here using the dataset named “one.”
• The method=ml option tells SAS to use full maximum likelihood estimation. If you omit this option, by default SAS uses restricted maximum likelihood (as discussed in Chapter 4, slide 27).
• The covtest option tells SAS to display tests for the variance components. By default, SAS omits these tests (as discussed in Chapter 4, slide 23).
• The class id statement tells SAS to treat the variable ID as a categorical (in SAS’ terms, a classification) variable. If you omit this statement, by default, SAS would treat ID as a continuous variable.
• The model statement specifies the structural portion of the multilevel model for change. This specification ‘model alcuse = ’ may seem unusual but it’s the way SAS represents the unconditional means model (see Chapter 4, slide 9). The model includes no explicit predictor, but like any regression model, includes an implicit intercept by default.
• The /solution option on the model statement tells SAS to display the estimated fixed effects (as well as the associated standard errors and hypothesis tests).
• The random statement specifies the stochastic portion of the multilevel model for change. By default, SAS always includes a variance component for the level-1 residuals. In this unconditional means model, the ‘random intercept’ option tells SAS to also include a variance component for the intercept (allowing the means to vary across people).
• The /subject=id option tells SAS that the intercepts (the means in this unconditional means model) should be allowed to vary randomly across individuals (as identified by the classification variable ID).

Results of fitting Model A (the unconditional means model) to data

Model A: Unconditional means model
The Mixed Procedure

    Covariance Parameter Estimates
    Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
    Intercept   ID        0.5639     0.1191           4.73      <.0001
    Residual              0.5617     0.06203          9.06      <.0001

    Fit Statistics
    -2 Log Likelihood           670.2
    AIC (smaller is better)     676.2
    AICC (smaller is better)    676.3
    BIC (smaller is better)     683.4

    Solution for Fixed Effects
    Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept   0.9220     0.09571          81   9.63      <.0001
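The note on method=ml says that SAS falls back to restricted maximum likelihood when the option is omitted. As a minimal sketch, assuming the same dataset `one` and the variables `id` and `alcuse` used on the slides, the REML version of Model A could be requested explicitly with `method=reml` (dropping the option altogether should be equivalent):

```sas
/* Sketch only: Model A refit with restricted maximum likelihood (REML).  */
/* Assumes the dataset "one" with variables id and alcuse, as above.      */
proc mixed data=one method=reml covtest;
  class id;                        /* treat ID as a classification variable */
  model alcuse = / solution;       /* implicit intercept only               */
  random intercept / subject=id;   /* intercepts vary across individuals    */
run;
```

The estimates and fit statistics from such a run would generally differ somewhat from the full ML results; no REML output is reproduced here.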

Using SAS Proc Mixed to fit Model B (the unconditional growth model)

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}(AGE_{ij} - 14) + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model:

$$\pi_{0i} = \gamma_{00} + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \zeta_{1i}, \qquad \text{where } \begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{10}(AGE_{ij} - 14) + [\zeta_{0i} + \zeta_{1i}(AGE_{ij} - 14) + \varepsilon_{ij}]$

    proc mixed data=one method=ml covtest;
      class id;
      model alcuse = age_14/solution;
      random intercept age_14/type=un subject=id;
    run;

• Model B, the unconditional growth model, includes a single predictor, age_14, whose coefficient represents the slope of the level-1 individual growth trajectory. As before, SAS implicitly understands that the user wishes to include an intercept term. Because the predictor age_14 is centered at age 14 (the first wave of data collection), the intercept now represents “initial status.”
• As before, SAS implicitly assumes a variance component for the level-1 residuals. But because Model B includes a second random effect to capture the hypothesized level-2 stochastic variation, the random statement must be modified to include this second term, denoted by the temporal predictor AGE_14.
• The /type=un option, which stands for unstructured, is crucial: it tells SAS not to impose any structure on the variance-covariance matrix for the level-2 residuals.

Results of fitting Model B (the unconditional growth model) to data

Model B: Unconditional growth model
The Mixed Procedure

    Covariance Parameter Estimates
    Cov Parm   Subject   Estimate    Standard Error   Z Value   Pr Z
    UN(1,1)    ID         0.6244     0.1481            4.22     <.0001
    UN(2,1)    ID        -0.06844    0.07008          -0.98     0.3288
    UN(2,2)    ID         0.1512     0.05647           2.68     0.0037
    Residual              0.3373     0.05268           6.40     <.0001

    Fit Statistics
    -2 Log Likelihood           636.6
    AIC (smaller is better)     648.6
    AICC (smaller is better)    649.0
    BIC (smaller is better)     663.1

    Solution for Fixed Effects
    Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept   0.6513     0.1051           81   6.20      <.0001
    AGE_14      0.2707     0.06245          81   4.33      <.0001
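Model B depends on the predictor age_14 being centered at age 14, but the slides do not show how that variable is built. A minimal sketch of one way to create it follows; the input dataset name `alcohol` and the raw variable `age` are assumptions made for illustration, not names taken from the slides.

```sas
/* Sketch only: create the centered time predictor used from Model B on.  */
/* "alcohol" is a hypothetical name for the raw person-period dataset,    */
/* assumed to hold one row per person per wave with a variable "age".     */
data one;
  set alcohol;
  age_14 = age - 14;   /* equals 0 at age 14, so the intercept is initial status */
run;
```

Centering at the first wave is what lets the intercept be read as initial status; centering at a different age would change the meaning of the intercept but not of the slope.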

Using SAS Proc Mixed to fit Model C (Uncontrolled effects of COA)

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}(AGE_{ij} - 14) + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model:

$$\pi_{0i} = \gamma_{00} + \gamma_{01} COA_i + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \gamma_{11} COA_i + \zeta_{1i}, \qquad \text{where } \begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{01} COA_i + \gamma_{10}(AGE_{ij} - 14) + \gamma_{11} COA_i (AGE_{ij} - 14) + [\zeta_{0i} + \zeta_{1i}(AGE_{ij} - 14) + \varepsilon_{ij}]$

    proc mixed data=one method=ml covtest;
      class id;
      model alcuse = coa age_14 coa*age_14/solution;
      random intercept age_14/type=un subject=id;
    run;

• Like the companion Level-2 model, Model C adds two terms to register the uncontrolled effects of COA: (1) a main effect of COA, which captures the effect on the intercept (initial status); and (2) the cross-level interaction, COA*AGE_14, which captures the effect of COA on the rate of change.
• All other statements, including the random statement, are unchanged from Model B because we have only added new fixed effects (for COA) and not any new random effects.

Results of fitting Model C (the uncontrolled effects of COA) to data

Model C: Uncontrolled effects of COA
The Mixed Procedure

    Covariance Parameter Estimates
    Cov Parm   Subject   Estimate    Standard Error   Z Value   Pr Z
    UN(1,1)    ID         0.4876     0.1278            3.81     <.0001
    UN(2,1)    ID        -0.05934    0.06573          -0.90     0.3666
    UN(2,2)    ID         0.1506     0.05639           2.67     0.0038
    Residual              0.3373     0.05268           6.40     <.0001

    Fit Statistics
    -2 Log Likelihood           621.2
    AIC (smaller is better)     637.2
    AICC (smaller is better)    637.8
    BIC (smaller is better)     656.5

    Solution for Fixed Effects
    Effect        Estimate    Standard Error   DF   t Value   Pr > |t|
    Intercept      0.3160     0.1307           80    2.42     0.0179
    COA            0.7432     0.1946           82    3.82     0.0003
    AGE_14         0.2930     0.08423          80    3.48     0.0008
    COA*AGE_14    -0.04943    0.1254           82   -0.39     0.6944
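Model C's model statement spells out the main effect of COA, the main effect of AGE_14 and their cross-level interaction. SAS effect syntax also offers a bar operator that expands `coa|age_14` into those same three terms, so the sketch below should fit the same model; this shorthand is an addition for illustration and does not appear on the slides.

```sas
/* Sketch only: Model C written with the bar operator.                    */
/* coa|age_14 expands to: coa  age_14  coa*age_14                         */
proc mixed data=one method=ml covtest;
  class id;
  model alcuse = coa|age_14 / solution;
  random intercept age_14 / type=un subject=id;
run;
```

Writing the terms out in full, as the slides do, keeps the correspondence with the composite model more visible; the shorthand becomes more useful as further level-2 predictors are added.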

Using SAS Proc Mixed to fit Model D (Controlled effects of COA)

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i} TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$ (here $TIME_{ij}$ is $AGE_{ij} - 14$, as in the composite model)

Level-2 Model:

$$\pi_{0i} = \gamma_{00} + \gamma_{01} COA_i + \gamma_{02} PEER_i + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \gamma_{11} COA_i + \gamma_{12} PEER_i + \zeta_{1i}, \qquad \text{where } \begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{01} COA_i + \gamma_{02} PEER_i + \gamma_{10}(AGE_{ij} - 14) + \gamma_{11} COA_i (AGE_{ij} - 14) + \gamma_{12} PEER_i (AGE_{ij} - 14) + [\zeta_{0i} + \zeta_{1i}(AGE_{ij} - 14) + \varepsilon_{ij}]$

    proc mixed data=one method=ml covtest;
      class id;
      model alcuse = coa peer age_14 coa*age_14 peer*age_14/solution;
      random intercept age_14/type=un subject=id;
    run;

• Like the companion Level-2 model, Model D adds two terms to register the effects of PEER, so that the effects of COA are now controlled for PEER: (1) a main effect of PEER, which captures the effect on the intercept (initial status); and (2) the cross-level interaction, PEER*AGE_14, which captures the effect of PEER on the rate of change.
• All other statements, including the random statement, are unchanged from Model C because we have only added new fixed effects (for PEER) and not any new random effects.

Results of fitting Model D (the controlled effects of COA) to data

Model D: Controlled effects of COA
The Mixed Procedure

    Covariance Parameter Estimates
    Cov Parm   Subject   Estimate    Standard Error   Z Value   Pr Z
    UN(1,1)    ID         0.2409     0.09259           2.60     0.0046
    UN(2,1)    ID        -0.00612    0.05500          -0.11     0.9115
    UN(2,2)    ID         0.1391     0.05481           2.54     0.0056
    Residual              0.3373     0.05268           6.40     <.0001

    Fit Statistics
    -2 Log Likelihood           588.7
    AIC (smaller is better)     608.7
    AICC (smaller is better)    609.6
    BIC (smaller is better)     632.8

    Solution for Fixed Effects
    Effect         Estimate    Standard Error   DF   t Value   Pr > |t|
    Intercept      -0.3165     0.1481           79   -2.14     0.0356
    COA             0.5792     0.1625           82    3.56     0.0006
    PEER            0.6943     0.1115           82    6.23     <.0001
    AGE_14          0.4294     0.1137           79    3.78     0.0003
    COA*AGE_14     -0.01403    0.1248           82   -0.11     0.9107
    PEER*AGE_14    -0.1498     0.08564          82   -1.75     0.0840
