Hierarchical Linear Models, Second Edition

Advanced Quantitative Techniques in the Social Sciences

VOLUMES IN THE SERIES

HIERARCHICAL LINEAR MODELS: Applications and Data Analysis Methods
   Anthony S. Bryk and Stephen W. Raudenbush
MULTIVARIATE ANALYSIS OF CATEGORICAL DATA: Theory
   John P. Van de Geer
MULTIVARIATE ANALYSIS OF CATEGORICAL DATA: Applications
   John P. Van de Geer
STATISTICAL MODELS FOR ORDINAL VARIABLES
   Clifford C. Clogg and Edward S. Shihadeh
FACET THEORY: Form and Content
   Ingwer Borg and Samuel Shye
LATENT CLASS AND DISCRETE LATENT TRAIT MODELS: Similarities and Differences
   Ton Heinen
REGRESSION MODELS FOR CATEGORICAL AND LIMITED DEPENDENT VARIABLES
   J. Scott Long
LOG-LINEAR MODELS FOR EVENT HISTORIES
   Jeroen K. Vermunt
MULTIVARIATE TAXOMETRIC PROCEDURES: Distinguishing Types From Continua
   Niels G. Waller and Paul E. Meehl
STRUCTURAL EQUATION MODELING: Foundations and Extensions
   David Kaplan

Hierarchical Linear Models: Applications and Data Analysis Methods
Second Edition

Stephen W. Raudenbush
Anthony S. Bryk

Advanced Quantitative Techniques in the Social Sciences Series

Sage Publications
International Educational and Professional Publisher
Thousand Oaks / London / New Delhi

Copyright © 2002 by Sage Publications, Inc.

All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.

For information:

Sage Publications, Inc.
2455 Teller Road
Thousand Oaks, California 91320
E-mail: order@sagepub.com

Sage Publications, Ltd.
6 Bonhill Street
London EC2A 4PU
United Kingdom

Sage Publications India Pvt.
Ltd.
M-32 Market, Greater Kailash I
New Delhi 110 048 India

Printed in the United States of America

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

02 03 04 05 06  10 9 8 7 6 5 4 3 2 1

Acquiring Editor: C. Deborah Laughton
Editorial Assistant: Veronica Novak
Production Editor: Sanford Robinson
Typesetter/Designer: Technical Typesetting

Contents

Acknowledgments for the Second Edition
Series Editor's Introduction to Hierarchical Linear Models
Series Editor's Introduction to the Second Edition

1. Introduction
   Hierarchical Data Structure: A Common Phenomenon
   Persistent Dilemmas in the Analysis of Hierarchical Data
   A Brief History of the Development of Statistical Theory for Hierarchical Models
   Early Applications of Hierarchical Linear Models
      Improved Estimation of Individual Effects
      Modeling Cross-Level Effects
      Partitioning Variance-Covariance Components
   New Developments Since the First Edition
      An Expanded Range of Outcome Variables
      Incorporating Cross-Classified Data Structures
      Multivariate Models
      Latent Variable Models
      Bayesian Inference
   Organization of the Book

2.
The Logic of Hierarchical Linear Models
   Preliminaries
      A Study of the SES-Achievement Relationship in One School
      A Study of the SES-Achievement Relationship in Two Schools
      A Study of the SES-Achievement Relationship in J Schools
   A General Model and Simpler Submodels
      One-Way ANOVA with Random Effects
      Means-as-Outcomes Regression
      One-Way ANCOVA with Random Effects
      Random-Coefficients Regression Model
      Intercepts- and Slopes-as-Outcomes
      A Model with Nonrandomly Varying Slopes
      Section Recap
   Generalizations of the Basic Hierarchical Linear Model
      Multiple Xs and Multiple Ws
      Generalization of the Error Structures at Level 1 and Level 2
      Extensions Beyond the Basic Two-Level Hierarchical Linear Model
   Choosing the Location of X and W (Centering)
      Location of the Xs
      Location of Ws
   Summary of Terms and Notation Introduced in This Chapter
      A Simple Two-Level Model
      Notation and Terminology Summary
      Some Definitions
      Submodel Types
      Centering Definitions
      Implications for β0j

3. Principles of Estimation and Hypothesis Testing for Hierarchical Linear Models
   Estimation Theory
      Estimation of Fixed Effects
      Estimation of Random Level-1 Coefficients
      Estimation of Variance and Covariance Components
   Hypothesis Testing
      Hypothesis Tests for Fixed Effects
      Hypothesis Tests for Random Level-1 Coefficients
      Hypothesis Testing for Variance and Covariance Components
   Summary of Terms Introduced in This Chapter

4. An Illustration
   Introduction
   The One-Way ANOVA
      The Model
      Results
   Regression with Means-as-Outcomes
      The Model
      Results
   The Random-Coefficient Model
      The Model
      Results
   An Intercepts- and Slopes-as-Outcomes Model
      The Model
      Results
   Estimating the Level-1 Coefficients for a Particular Unit
      Ordinary Least Squares
      Unconditional Shrinkage
      Conditional Shrinkage
      Comparison of Interval Estimates
      Cautionary Note
   Summary of Terms Introduced in This Chapter

5. Applications in Organizational Research
   Background Issues in Research on Organizational Effects
   Formulating Models
      Person-Level Model (Level 1)
      Organization-Level Model (Level 2)
   Case 1: Modeling the Common Effects of Organizations via Random-Intercept Models
      A Simple Random-Intercept Model
      Example: Examining School Effects on Teacher Efficacy
      Comparison of Results with Conventional Teacher-Level and School-Level Analyses
      A Random-Intercept Model with Level-1 Covariates
      Example: Evaluating Program Effects on Writing
      Comparison of Results with Conventional Student- and Classroom-Level Analyses
   Case 2: Explaining the Differentiating Effects of Organizations via Intercepts- and Slopes-as-Outcomes Models
      Difficulties Encountered in Past Efforts at Modeling Regression Slopes-as-Outcomes
      Example: The Social Distribution of Achievement in Public and Catholic High Schools
      Applications with Both Random and Fixed Level-1 Slopes
   Special Topics
      Applications with Heterogeneous Level-1 Variance
      Example: Modeling Sector Effects on the Level-1 Residual Variance in Mathematics Achievement
      Data-Analytic Advice About the Presence of Heterogeneity at Level 1
   Centering Level-1 Predictors in Organizational Effects Applications
      Estimating Fixed Level-1 Coefficients
      Disentangling Person-Level and Compositional Effects
      Estimating Level-2 Effects While Adjusting for Level-1 Covariates
      Estimating the Variances of Level-1 Coefficients
      Estimating Random Level-1 Coefficients
      Use of Proportion
      Reduction in Variance Statistics
   Estimating the Effects of Individual Organizations
      Conceptualization of Organization-Specific Effects
      Commonly Used Estimates of School Performance
      Use of Empirical Bayes Estimators
      Threats to Valid Inference Regarding Performance Indicators
   Power Considerations in Designing Two-Level Organization Effects Studies

6. Applications in the Study of Individual Change
   Background Issues in Research on Individual Change
   Formulating Models
      Repeated-Observations Model (Level 1)
      Person-Level Model (Level 2)
   A Linear Growth Model
      Example: The Effect of Instruction on Cognitive Growth
   A Quadratic Growth Model
      Example: The Effects of Maternal Speech on Children's Vocabulary
   Some Other Growth Models
      More Complex Level-1 Error Structures
   Piecewise Linear Growth Models
   Time-Varying Covariates
   Centering of Level-1 Predictors in Studies of Individual Change
      Definition of the Intercept in Linear Growth Models
      Definitions of Other Growth Parameters in Higher-Order Polynomial Models
      Possible Biases in Studying Time-Varying Covariates
      Estimation of the Variance of Growth Parameters
   Comparison of Hierarchical, Multivariate Repeated-Measures, and Structural Equation Models
      Multivariate Repeated-Measures (MRM) Model
      Structural Equation Models (SEM)
      Case 1: Observed Data are Balanced
      Case 2: Complete Data are Balanced
      Case 3: Complete Data are Unbalanced
      Effects of Missing Observations at Level 1
   Using a Hierarchical Model to Predict Future Status
   Power Considerations in Designing Studies of Growth and Change

7. Applications in Meta-Analysis and Other Cases Where Level-1 Variances Are Known
   Introduction
      The Hierarchical Structure of Meta-Analytic Data
      Extensions to Other Level-1 "Variance-Known" Problems
      Organization of This Chapter
   Formulating Models for Meta-Analysis
      Standardized Mean Differences
      Level-1 (Within-Studies) Model
      Level-2
      (Between-Studies) Model
      Combined Model
   Estimation
   Example: The Effect of Teacher Expectancy on Pupil IQ
      Unconditional Analysis
      Conditional Analysis
   Bayesian Meta-Analysis
   Other Level-1 Variance-Known Problems
      Example: Correlates of Diversity
   The Multivariate V-Known Model
      Level-1 Model
      Level-2 Model
   Meta-Analysis of Incomplete Multivariate Data
      Level-1 Model
      Level-2 Model
      Illustrative Example

8. Three-Level Models
   Formulating and Testing Three-Level Models
      A Fully Unconditional Model
      Conditional Models
      Many Alternative Modeling Possibilities
      Hypothesis Testing in the Three-Level Model
      Example: Research on Teaching
   Studying Individual Change Within Organizations
      Unconditional Model
      Conditional Model
   Measurement Models at Level 1
      Example: Research on School Climate
      Example: Research on School-Based Professional Community and the Factors That Facilitate It
   Estimating Random Coefficients in Three-Level Models

9. Assessing the Adequacy of Hierarchical Models
   Introduction
      Thinking about Model Assumptions
      Organization of the Chapter
   Key Assumptions of a Two-Level Hierarchical Linear Model
   Building the Level-1 Model
      Empirical Methods to Guide Model Building at Level 1
      Specification Issues at Level 1
      Examining Assumptions about Level-1 Random Effects
   Building the Level-2 Model
      Empirical Methods to Guide Model Building at Level 2
      Specification Issues at Level 2
      Examining Assumptions about Level-2 Random Effects
   Robust Standard Errors
      Illustration
   Validity of Inferences when Samples are Small
      Inferences about the Fixed Effects
      Inferences about the Variance Components
      Inferences about Random Level-1 Coefficients
   Appendix
      Misspecification of the Level-1 Structural Model
      Level-1 Predictors Measured with Error

10. Hierarchical Generalized Linear Models
   The Two-Level HLM as a Special Case of HGLM
      Level-1 Sampling Model
      Level-1 Link Function
      Level-1 Structural Model
   Two- and Three-Level Models for Binary Outcomes
      Level-1 Sampling Model
      Level-1 Link Function
      Level-1 Structural Model
      Level-2 and Level-3 Models
      A Bernoulli Example: Grade Retention in Thailand
      Population-Average Models
      A Binomial Example: Course Failures During First Semester of Ninth Grade
   Hierarchical Models for Count Data
      Level-1 Sampling Model
      Level-1 Link Function
      Level-1 Structural Model
      Level-2 Model
      Example: Homicide Rates in Chicago Neighborhoods
   Hierarchical Models for Ordinal Data
      The Cumulative Probability Model for Single-Level Data
      Extension to Two Levels
      An Example: Teacher Control and Teacher Commitment
   Hierarchical Models for Multinomial Data
      Level-1 Sampling Model
      Level-1 Link Function
      Level-1 Structural Model
      Level-2 Model
      Illustrative Example: Postsecondary Destinations
   Estimation Considerations in Hierarchical Generalized Linear Models
   Summary of Terms Introduced in This Chapter

11. Hierarchical Models for Latent Variables
   Regression with Missing Data
      Multiple Model-Based Imputation
      Applying HLM to the Missing Data Problem
   Regression when Predictors are Measured with Error
      Incorporating Information about Measurement Error in Hierarchical Models
   Regression with Missing Data and Measurement Errors
   Estimating Direct and Indirect Effects of Latent Variables
      A Three-Level Illustrative Example with Measurement Error and Missing Data
      The Model
   A Two-Level Latent Variable Example for Individual Growth
   Nonlinear Item Response Models
      A Simple Item Response Model
      An Item Response Model for Multiple Traits
      Two-Parameter Models
   Summary of Terms Introduced in This Chapter
      Missing Data Problems
      Measurement Error Problems

12. Models for Cross-Classified Random Effects
   Formulating and Testing Models for Cross-Classified Random Effects
      Unconditional Model
      Conditional Models
   Example 1: Neighborhood and School Effects on Educational Attainment in Scotland
      Unconditional Model
      Conditional Model
      Estimating a Random Effect of Social Deprivation
   Example 2: Classroom Effects on Children's Cognitive Growth During the Primary Years
   Summary
   Summary of Terms Introduced in This Chapter

13. Bayesian Inference for Hierarchical Models
   An Introduction to Bayesian Inference
      Classical View
      Bayesian View
   Example: Inferences for a Normal Mean
      Classical Approach
      Bayesian Approach
      Some Generalizations and Inferential Concerns
   A Bayesian Perspective on Inference in Hierarchical Linear Models
      Full Maximum Likelihood (ML) Estimation of γ, T, and σ²
      REML Estimation of T and σ²
   The Basics of Bayesian Inference for the Two-Level HLM
      Model for the Observed Data
      Stage-1 Prior
      Stage-2 Prior
      Posterior Distributions
      Relationship Between Fully Bayes and Empirical Bayes Inference
   Example: Bayes Versus Empirical Bayes Meta-Analysis
      Bayes Model
      Parameter Estimation and Inference
      A Comparison Between Fully Bayes and Empirical Bayes Inference
   Gibbs Sampling and Other Computational Approaches
   Application of the Gibbs Sampler to Vocabulary Growth Data
   Summary of Terms Introduced in This Chapter

14. Estimation Theory
   Models, Estimators, and Algorithms
   Overview of Estimation via ML and Bayes
      ML Estimation
      Bayesian Inference
   ML Estimation for Two-Level HLMs
      ML Estimation via EM
      The Model
      M Step
      E Step
      Putting the Pieces Together
   ML Estimation for HLM via Fisher Scoring
      Application of Fisher-IGLS to Two-Level ML
   ML Estimation for the Hierarchical Multivariate Linear Model (HMLM)
      The Model
      EM Algorithm
      Fisher-IGLS Algorithm
      Estimation of Alternative Covariance Structures
      Discussion
   Estimation for Hierarchical Generalized Linear Models
      Numerical Integration for Hierarchical Models
      Application to Two-Level Data with Binary Outcomes
      Penalized Quasi-Likelihood
      Closer Approximations to ML
      Representing the Integral as a Laplace Transform
      Application of Laplace to Two-Level Binary Data
      Generalizations to Other Level-1 Models
   Summary and Conclusions

References
Index
About the Authors

To our loving wives, Stella Raudenbush and Sharon Greenberg

Acknowledgments for the Second Edition

The decade since the publication of the first edition of this book has produced substantial growth in knowledge about hierarchical models and a rapidly expanding range of applications.
This second edition results in part from the sustained and enormously satisfying collaboration between its authors, and also from collaboration and discussion with many colleagues too numerous to mention. Nevertheless, certain persons deserve special thanks for making this work possible. Ongoing methodological discussions involving Darrell Bock, Yuk Fai Cheong, Sema Kalaian, Rafa Kasim, Xiaofeng Liu, and Yasuo Miyazaki have challenged our thinking. Yeow Meng Thum's work helped inspire the multivariate applications found in Chapters 6 and 11. Mike Seltzer provided an extremely useful critique of the Bayesian approaches described in Chapter 13 and generously gave us permission to reproduce the last example in that chapter. Meng-Li Yang and Matheos Yosef were essential in developing the maximum likelihood estimation methods used for hierarchical generalized linear models (Chapter 10). Young-Yun Shin provided careful reviews and many helpful comments on the entire manuscript. Guanglei Hong's critique of an earlier draft helped shape Chapter 12 on cross-classified models. The work of Richard Congdon, applications programmer extraordinaire and long-time friend, shows up in every chapter of this book. Stuart Luppescu also assisted with preparation of data and new analyses for the second edition. Colleagues with the Project on Human Development in Chicago Neighborhoods (PHDCN), including Felton Earls, Rob Sampson, and Christopher Johnson, have had an important impact on the second edition, as revealed in examples of neighborhood effects in Chapters 10 and 11. Indeed, we thank the MacArthur Foundation, the National Institute of Justice, and the National Institute of Mental Health for grants to PHDCN, which supported key methodological work reported in the new chapters of this volume. Special thanks go to Pamela Gardner, who helped proof, edit, and type this entire volume.
Her amazing productivity and good humor were essential in this enterprise. Anonymous reviewers provided many helpful suggestions on the new chapters in this edition. And C. Deborah Laughton, methodology editor for Sage, showed steady support throughout; we also thank the series editor, Jan de Leeuw, for his encouragement.

Series Editor's Introduction to Hierarchical Linear Models

In the social sciences, data structures are often hierarchical in the following sense: We have variables describing individuals, but the individuals also are grouped into larger units, each unit consisting of a number of individuals. We also have variables describing these higher order units. The leading example is, perhaps, in education. Students are grouped in classes. We have variables describing students and variables describing classes. It is possible that the variables describing classes are aggregated student variables, such as number of students or average socioeconomic status. But the class variables could also describe the teacher (if the class has only one teacher) or the classroom (if the class always meets in the same room). Moreover, in this particular example, further hierarchical structure often occurs quite naturally. Classes are grouped in schools, schools in school districts, and so on. We may have variables describing school districts and variables describing schools (teaching style, school building, neighborhood, and so on).

Once we have discovered this one example of a hierarchical data structure, we see many of them. They occur naturally in geography and (regional) economics. In a sense, one of the basic problems of sociology is to relate properties of individuals and properties of groups and structures in which the individuals function. In the same way, in economics there is the problem of relating the micro and the macro levels. Moreover, many repeated measurements are hierarchical.
If we follow individuals over time, then the measurements for any particular individual are a group, in the same way as the school class is a group. If each interviewer interviews a group of interviewees, then the interviewers are the higher level. Thinking about these hierarchical structures a bit longer inevitably leads to the conclusion that many, if not most, social science data have this nested or hierarchical structure.

The next step, after realizing how important hierarchical data are, is to think of ways in which statistical techniques should take this hierarchical structure into account. There are two obvious procedures that have been somewhat discredited. The first is to disaggregate all higher order variables to the individual level. Teacher, class, and school characteristics are all assigned to the individual, and the analysis is done on the individual level. The problem with this approach is that if we know that students are in the same class, then we also know that they have the same value on each of the class variables. This violates the assumption of independence of observations that is basic for the classical statistical techniques. The other alternative is to aggregate the individual-level variables to the higher level and do the analysis on the higher level. Thus we aggregate student characteristics over classes and do a class analysis, perhaps weighted with class size. The main problem here is that we throw away all the within-group information, which may be as much as 80% or 90% of the total variation before we start the analysis. As a consequence, relations between aggregated variables are often much stronger, and they can be very different from the relation between the nonaggregated variables. Thus we waste information, and we distort interpretation if we try to interpret the aggregate analysis on the individual level. Thus aggregating and disaggregating are both unsatisfactory.
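The distortion from aggregation can be made concrete with a few lines of code. The following sketch uses made-up data (nine students in three classes; none of it comes from the book) to show how the correlation of class means can even reverse the sign of the within-class relation:

```python
# A tiny, self-contained illustration with hypothetical data: within every
# class the x-y relation is negative, yet the correlation computed from
# class means alone is a perfect +1.

def pearson(xs, ys):
    """Plain Pearson correlation, stdlib only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Three classes; each class's points are its mean plus (-1,+1), (+1,-1), (0,0),
# so the relation *inside* every class is perfectly negative.
class_means = [(0, 0), (5, 5), (10, 10)]
offsets = [(-1, 1), (1, -1), (0, 0)]
data = [(cm[0] + dx, cm[1] + dy) for cm in class_means for dx, dy in offsets]

pooled_r = pearson([x for x, _ in data], [y for _, y in data])   # disaggregated
agg_r = pearson([m[0] for m in class_means],
                [m[1] for m in class_means])                     # class means only
within_r = pearson([dx for dx, _ in offsets],
                   [dy for _, dy in offsets])                    # inside one class

print(f"within-class r = {within_r:+.2f}")   # -1.00
print(f"pooled r       = {pooled_r:+.2f}")   # mixes the two levels
print(f"aggregated r   = {agg_r:+.2f}")      # +1.00: opposite sign to within!
```

The aggregated analysis is not merely "stronger" here; it points the other way entirely, which is the ecological-fallacy danger the paragraph above warns about.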
If we limit ourselves to traditional linear model analysis, we know that the basic assumptions are linearity, normality, homoscedasticity, and independence. We would like to maintain the first two, but the last two (especially the independence assumption) should be adapted. The general idea behind such adaptations is that individuals in the same group are closer or more similar than individuals in different groups. Thus students in different classes can be independent, but students in the same class share values on many more variables. Some of these variables will not be observed, which means that they vanish into the error term of the linear model, causing correlation between disturbances. This idea can be formalized by using variance component models. The disturbances have a group and an individual component. Individual components are all independent; group components are independent between groups but perfectly correlated within groups. Some groups might be more homogeneous than other groups, which means that the variance of the group components can differ.

There is a slightly different way to formalize this idea. We can suppose that each of the groups has a different regression model, in the simple regression case with its own intercept and its own slope. Because groups are also sampled, we then can make the assumption that the intercepts and slopes are a random sample from a population of group intercepts and slopes. This defines random-coefficient regression models. If we assume this for the intercepts only, and we let all slopes be the same, we are in the variance-component situation discussed in the previous paragraph. If the slopes vary randomly as well, we have a more complicated class of models in which the covariances of the disturbances depend on the values of the individual-level predictors.
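The two formalizations in the paragraphs above can be written compactly. The following is a sketch in standard multilevel notation; the symbols (mu, u_j, r_ij, tau_00, sigma^2, T) are the conventional ones, not taken from this introduction:

```latex
% Variance-component model: the disturbance splits into a group
% component u_j and an individual component r_{ij}.
Y_{ij} = \mu + u_j + r_{ij}, \qquad
u_j \sim N(0, \tau_{00}), \quad r_{ij} \sim N(0, \sigma^2).

% Random-coefficient regression: each group j draws its own intercept
% and slope from a common population with covariance matrix T.
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}, \qquad
(\beta_{0j}, \beta_{1j}) \sim N\bigl((\gamma_{00}, \gamma_{10}), \mathbf{T}\bigr).
```

In the first model, two members of the same group correlate tau_00 / (tau_00 + sigma^2), the intraclass correlation, while members of different groups are independent. Fixing the slope at gamma_10 for all groups in the second model recovers the first, exactly as the text notes.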
In random-coefficient regression models, there is still no possibility to incorporate higher level variables describing classes or schools. For this we need multilevel models, in which the group-level model is again a linear model. Thus we assume that the slope of the student variable SAT depends linearly on the class variables of class size or teacher philosophy. There are linear models on both levels, and if there are more levels, there are more nested linear models. Thus we arrive at a class of models that takes hierarchical structure into account and that makes it possible to incorporate variables from all levels.

Until about 10 years ago, fitting such models was technically not possible. Then, roughly at the same time, techniques and computer programs were published by Aitkin and Longford, Goldstein and co-workers, and Raudenbush and Bryk. The program HLM, by Bryk and Raudenbush, was the friendliest and most polished of these products, and in rapid succession a number of convincing and interesting examples were published. In this book Bryk and Raudenbush describe the model, the algorithm, the program, and the examples in great detail. I think such a complete treatment of this class of techniques is both important and timely. Hierarchical linear models, or multilevel models, are certainly not a solution to all the data analysis problems of the social sciences. For this they are far too limited, because they are still based on the assumptions of linearity and normality, and because they still study the relatively simple regression structure in which a single variable depends on a number of others. Nevertheless, technically they are a big step ahead of the aggregation and disaggregation methods, mainly because they are statistically correct and do not waste information. I think the main gain, illustrated nicely in this book by the extensive analysis of the examples, is conceptual.
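The multilevel model described above, with a linear model at each level, can be sketched as follows (standard two-level notation; W_j stands for a class variable such as class size or teacher philosophy):

```latex
% Level 1: a separate regression within each class j.
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}

% Level 2: the class-specific intercept and slope themselves depend
% linearly on the class variable W_j, plus random group effects.
\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
```

Substituting the level-2 equations into level 1 produces a single combined equation containing the cross-level term gamma_11 W_j X_ij, which is precisely how a student-level slope can "depend linearly" on a class variable.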
The models for the various levels are nicely separated, without being completely disjointed. One can think about the possible mechanisms on each of the levels separately and then join the separate models in a joint analysis. In educational research, as well as in geography, sociology, and economics, these techniques will gain in importance in the next few years, until they also run into their natural limitations. To avoid these limitations, they will be extended (and have been extended) to more levels, multivariate data, path-analysis models, latent variables, nominal dependent variables, generalized linear models, and so on. Social statisticians will be able to do more extensive modeling, and they will be able to choose from a much larger class of models. If they are able to build up the necessary prior information to make a rational choice from the model class, then they can expect more power and precision. It is a good idea to keep this in the back of your mind as you use this book to explore this new exciting class of techniques.

Jan de Leeuw
Series Editor
