Janet K. Sheehan-Holt and M Cecil Smith
Northern Illinois University
March 17, 2000
Abstract
The purpose of this study was to demonstrate how multilevel analyses can be used with large-scale data sets such as the National Adult Literacy Survey (NALS). The results of ordinary least squares (OLS) regression and hierarchical linear modeling (HLM) analyses were compared for modeling predictors of literacy from the NALS. Our results indicate that contextual factors, such as mean income of the neighborhood, are important to take into account when predicting adult literacy proficiencies when studying racial and ethnic differences. Also, contextual effects estimates and their standard errors were found to differ between HLM and OLS. Finally, contextual-effects studies of adult literacy using OLS produced a different model of the predictors of adult literacy than did HLM. Statistical justification is given for the discrepant results between the two methods, and HLM is recommended as the appropriate statistical tool for studying predictors of literacy when using the NALS.
Using Multilevel Modeling to Investigate Predictors of Literacy in the National Adult Literacy Survey
Recent advances in statistical methodology and computing power have made more sophisticated data analytic tools such as hierarchical linear modeling (HLM) readily available. This methodological tool is well suited for research using national databases since (a) they often involve very large sample sizes and (b) complex sampling designs are often used. The intent of this study was to investigate the use of these tools, with an emphasis on HLM, for investigating substantive problems of interest in the National Adult Literacy Survey (NALS). The NALS (Kirsch et al., 1993) is the most recent and comprehensive survey conducted of American adults’ literacy skills and practices. The NALS data were gathered on a nationally representative sample of 26,091 adults, ages 16 and older, between January and August 1992. The sampling design for the NALS survey is a multistage cluster sample in which counties or groups of counties, i.e., primary sampling units (PSUs), are first randomly selected. From the PSUs, census blocks or groups of census blocks, i.e., segments, are randomly selected. At this stage, segments that were identified as high minority were over-sampled in order to ensure reliable estimates of Blacks’ and Hispanics’ literacy proficiencies. Households were then randomly selected from the segments, and one or two adults from each household were selected for the survey. Further details regarding data collection are found in Kirsch et al. (1993).
The cluster-sampling design of the NALS makes the data set a prime candidate for multilevel modeling techniques such as hierarchical linear modeling (HLM; Bryk & Raudenbush, 1992; de Leeuw & Kreft, 1995; Draper, 1995; Morris, 1995; Raudenbush, 1995; Rogosa & Saner, 1995). Like regression analysis, HLM techniques can condition on many background variables at the individual level. However, HLM has the advantage of yielding appropriate estimates of standard errors when there are moderate to high intraclass correlations (ICC), resulting from similarity among individuals within a higher-level unit. HLM is a technique evolved from the random coefficient tradition (Dempster, Rubin, & Tsutakawa, 1981; Swamy, 1973) in which individual-level parameters are assumed to vary randomly across higher-level units, such as neighborhoods. Therefore, variation in individual-level intercepts or slopes can be modeled with neighborhood-level variables. This eliminates the debate over whether disaggregated or aggregated analyses should be used with multilevel data, because both levels can be used to model individual-level variability. Even though data from the NALS are cluster-sampled at four levels, data are only collected at the individual level. Hence, using HLM would seemingly have only one advantage over OLS (ordinary least squares) regression analyses: yielding more appropriate estimates of standard errors when intraclass correlations are high. Because no data are reported at the higher levels (e.g., household, segment, and PSU), the ability of HLM to model cross-level effects would not seem to be as relevant to the NALS data. It is the purpose of this study, however, to demonstrate how both benefits of multilevel analyses can be obtained with data such as the NALS. Further, comparing the results of OLS to HLM analyses when modeling predictors of literacy from the NALS allows us to … ?
Contextual Analyses
We chose to model segment-level, as well as individual-level, variation in the NALS because segments represent groups of census blocks or small geographical regions and can therefore serve as a proxy for neighborhoods. By modeling both individual-level variation and segment-level variation, we could better represent the major influences on adult literacy skills and practices: individual attributes and characteristics as well as the respondents’ neighborhood characteristics. Since segment-level variables were not collected, however, it was necessary to take a contextual approach to model neighborhood variables, that is, to use the mean of individual-level variables as a measure of neighborhood-level characteristics. This type of contextual analysis has been researched extensively in the literature (e.g., see Willms’ review, 1986). [Jan: probably need to summarize this work briefly, in a half paragraph or so.] Cronbach and Webb (1975) addressed this problem of confounding contextual effects [unclear as to what you are referring!] by partitioning the variation into between-context and within-context components. The within-context regression is formed by first deviating the scores of the individual-level predictor from the segment-level mean, $(X_{ij} - \bar{X}_{.j})$, and regressing the outcome on the
deviated scores. The between-context component is determined by aggregating the data to the segment level for the regression analysis; hence the mean of Y is regressed on the mean of X. Cronbach and Webb (1975) demonstrated that very different conclusions can be reached when the conventional OLS analysis is replaced with such a partitioned analysis. The separate analyses approach used by Cronbach and Webb (1975) and detailed by Cronbach and Snow (1977) can be effectively employed to interpret contextual effects from hierarchically nested samples, particularly when intraclass correlations are low. The results of the separate analyses can also be obtained from a single regression analysis in which the outcome is regressed on both $X_{ij}$ and $\bar{X}_{.j}$:

$$Y_{ij} = \beta_0 + \beta_w X_{ij} + \beta_c \bar{X}_{.j} + \varepsilon_{ij} \qquad (1)$$

The within-groups regression coefficient is represented by $\beta_w$, while $\beta_c$ represents $\beta_b - \beta_w$, the difference between the between-groups regression coefficient and the within-groups regression coefficient. This difference is the contextual effect of X on Y.
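The relationship between the single regression of Equation 1 and the separate within-context and between-context analyses can be checked numerically. The following is a minimal sketch on simulated data, not the NALS analysis itself: the group sizes, slopes, and the helper `ols` are all hypothetical, and the between-context slope is computed at the individual level, which is equivalent to regressing group means of Y on group means of X with weights $n_j$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 neighborhoods of unequal size (all values are made up).
n_groups = 50
sizes = rng.integers(5, 40, size=n_groups)
group = np.repeat(np.arange(n_groups), sizes)

# Individual-level predictor with a neighborhood component (e.g., family income).
group_effect_x = rng.normal(0.0, 1.0, size=n_groups)
x = group_effect_x[group] + rng.normal(0.0, 1.0, size=group.size)

# Neighborhood mean of X, attached to every individual (the contextual variable).
x_bar = (np.bincount(group, weights=x) / sizes)[group]

# Outcome simulated with within-groups slope 2.0 and contextual effect 1.5.
y = 10.0 + 2.0 * x + 1.5 * x_bar + rng.normal(0.0, 1.0, size=group.size)


def ols(columns, response):
    """Least-squares coefficients for an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(response))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


# Single regression (Equation 1): Y on X_ij and the group mean of X.
_, beta_w, beta_c = ols([x, x_bar], y)

# Separate analyses: within-context slope from the deviated scores ...
_, beta_within = ols([x - x_bar], y)
# ... and between-context slope from Y on the group means of X.
_, beta_between = ols([x_bar], y)

print("beta_w (single regression):", beta_w)
print("beta_c (single regression):", beta_c)
print("beta_b - beta_w (separate analyses):", beta_between - beta_within)
```

In this sketch the coefficient on the group mean in the single regression matches the difference between the between-context and within-context slopes, which is the contextual effect described above.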
Multilevel Modeling
Another technique which has been used to make cross-level inferences within school-effects studies is multilevel modeling. Multilevel modeling has particular merit when analyzing data which have high intraclass correlations due to the hierarchical structure of the data. When analyzing data with intraclass correlations using conventional regression analysis, the data are forced to fit a model that does not reflect how they were collected. Conversely, multilevel techniques draw strength from appropriately modeling the data at each level of the sampling design. In multilevel modeling a separate micro-level model is defined for each macro unit. In a neighborhood-effects study this would mean that individual-level regression coefficients are modeled by neighborhood-level variables (de Leeuw & Kreft, 1986).
Random-Intercepts Models
Random-intercepts models are a particular type of multilevel model, often used to make cross-level inferences, in which the intercepts are not assumed to be constant for all contexts. These multilevel models circumvent a limitation of the OLS separate analyses approach, the assumption that the intercepts across all second-level units, e.g., neighborhoods, are homogeneous. The extent to which mean literacy proficiencies and practices vary across neighborhoods determines the extent to which a multilevel model will better fit the data than an OLS model. This variation can be measured with an intraclass correlation. When the intraclass correlation is high the average outcome varies considerably across the level-two units. In addition to providing a more realistic model of the data, the random-intercepts model is also an improvement over the conventional multiple regression model because it calculates the correct standard errors. Moreover, the random-intercepts model improves the estimation of the parameters for the separate neighborhoods.
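The intraclass correlation and the improved estimates for separate neighborhoods can be illustrated with a small simulation. This is a rough sketch under stated assumptions rather than the estimation used by HLM software: the data are simulated, the variance components come from one-way ANOVA method-of-moments formulas instead of maximum likelihood, and the neighborhood means are pulled toward the raw grand mean for simplicity. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical literacy scores for 40 neighborhoods of unequal size.
n_groups = 40
sizes = rng.integers(3, 60, size=n_groups)
group = np.repeat(np.arange(n_groups), sizes)
true_means = rng.normal(270.0, 20.0, size=n_groups)               # between-group sd 20
y = true_means[group] + rng.normal(0.0, 45.0, size=group.size)   # within-group sd 45

n_total = group.size
group_means = np.bincount(group, weights=y) / sizes
grand_mean = y.mean()

# One-way ANOVA mean squares.
ss_between = np.sum(sizes * (group_means - grand_mean) ** 2)
ss_within = np.sum((y - group_means[group]) ** 2)
ms_between = ss_between / (n_groups - 1)
ms_within = ss_within / (n_total - n_groups)

# Method-of-moments variance components; n0 adjusts for unequal group sizes.
n0 = (n_total - np.sum(sizes ** 2) / n_total) / (n_groups - 1)
sigma2 = ms_within                                   # within-neighborhood variance
tau00 = max((ms_between - ms_within) / n0, 0.0)      # variance of neighborhood means

# Intraclass correlation: share of outcome variance lying between neighborhoods.
icc = tau00 / (tau00 + sigma2)

# Reliability of each neighborhood mean; small neighborhoods are less reliable.
reliability = tau00 / (tau00 + sigma2 / sizes)

# Reliability-weighted means: unreliable means are pulled toward the grand mean.
shrunken_means = reliability * group_means + (1.0 - reliability) * grand_mean

print("estimated ICC:", icc)
print("reliability, smallest vs. largest neighborhood:",
      reliability[np.argmin(sizes)], reliability[np.argmax(sizes)])
```

The design choice here is to keep the variance-component step as simple as possible; a full HLM fit would estimate the same quantities jointly with the regression coefficients.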
An empirical Bayes estimation procedure is used to weight the regression coefficient estimates of each neighborhood by a reliability coefficient calculated for each neighborhood. This process is known as shrinkage because the estimates are “shrunk” toward the estimated group mean coefficients. Those neighborhoods providing less reliable estimates experience the most shrinkage (Cheung, Keeves, Sellin & Tsoi, 1990; Raudenbush, 1988). The resulting shrinkage estimates are
more precise parameter estimates than those generated through ordinary least squares methods. The contextual model is one in which there is one individual-level predictor, $X_{ij}$, and one group-level predictor, $\bar{X}_{.j}$ (Bryk & Raudenbush, 1992):

$$Y_{ij} = \beta_{0j} + \beta_{1j}(X_{ij} - \bar{X}_{..}) + r_{ij}$$
$$\beta_{0j} = \gamma_{00} + \gamma_{01}\bar{X}_{.j} + u_{0j}$$
$$\beta_{1j} = \gamma_{10} \qquad (2)$$

In this model, in which the level-one predictor is centered around its grand mean, $\gamma_{10}$ represents the within-groups regression coefficient and $\gamma_{01}$ represents the contextual
effect of the predictor on Y. Therefore, both OLS and HLM can be used to study contextual effects. The OLS estimates should be unbiased, yet their standard errors may be negatively biased (Bryk & Raudenbush, 1992).
Three questions were addressed in this study. First, is a contextual analysis preferred when investigating predictors of adult literacy from the NALS? Second, how do the estimates of within-groups effects and contextual effects differ between OLS and HLM methods? Third, do these differences result in different conclusions about the predictors of adult literacy?
Method
Measures of Literacy
Measures of adult literacy used in this study included literacy proficiencies and literacy practices. NALS literacy proficiencies were reported using three scales, prose, document, and quantitative (PDQ) literacy, each ranging from 0 to 500. For efficiency, NALS used a matrix sampling approach in which each test-taker answered only a portion of the items for each literacy proficiency scale. Therefore, by design, there is much missing data. Multiple imputation procedures are then used to generate a set of plausible values for the respondent’s scores within each of the three literacy scales, using the respondent’s raw scores, information about the respondent’s background, and test item
data (e.g., item difficulty). Our analyses used the average of the plausible values for each scale. The nature and variety of adults’ literacy practices were determined through the NALS background survey. Respondents were asked about their reading of newspapers, books, magazines, and documents, and about their use of writing and quantitative skills, for both work-related and personal reasons. The data pertaining to reading practices, that is, reading periodicals (newspapers and magazines), books, and documents for work and personal reasons, are examined in this study. [Jan: do we need to explain how we arrived at the reading practice scores?]
Predictors of Literacy
Predictions of adult literacy skills and practices were made from the following individual-level predictors: educational attainment, age, race, labor force participation, basic skills training (i.e., whether the respondent had ever participated in basic skills education), parents’ educational attainment, English as the primary home language, disability or long-term illness, newspaper reading practice (i.e., the number of newspaper sections read per week), and total family income. Race was coded as three dummy-coded variables: race1, race2, and race3. Whites were coded as 0 for all three variables, so that race1 represented the mean difference between Blacks and Whites, race2 represented the mean difference between Hispanics and Whites, and race3 represented the mean difference between other minority groups and Whites. Mean income of the neighborhood was used as a contextual variable since, in previous research, it has been shown to be a statistically significant predictor of literacy proficiencies. Other contextual variables (e.g., mean educational attainment of the neighborhood), however, were not statistically significant predictors (Sheehan, Smith, & E, 1997) and so were not used in our analyses.
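The dummy coding of race and the construction of the contextual income variable can be sketched as follows. The six respondents, their incomes, the category labels, and the segment identifiers are invented for illustration and are not NALS records or NALS variable names.

```python
import numpy as np

# Hypothetical respondents: race category, family income, and segment id.
race = np.array(["White", "Black", "Hispanic", "Other", "White", "Black"])
income = np.array([40.0, 25.0, 30.0, 35.0, 50.0, 20.0])  # made-up values, in $1,000s
segment = np.array([0, 0, 1, 1, 2, 2])

# Three dummy variables with Whites as the reference group (0 on all three).
race1 = (race == "Black").astype(int)
race2 = (race == "Hispanic").astype(int)
race3 = (race == "Other").astype(int)

# Contextual variable: mean family income of each respondent's segment,
# repeated for every respondent living in that segment.
segment_sizes = np.bincount(segment)
segment_mean_income = np.bincount(segment, weights=income) / segment_sizes
mean_income = segment_mean_income[segment]

print(np.column_stack([race1, race2, race3]))
print(mean_income)
```

With Whites scoring 0 on all three dummy variables, each dummy coefficient in a regression is a contrast against Whites, holding the other predictors constant.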
Procedures
Initially, regression analyses were performed predicting each of the measures of adult literacy from the full set of predictors, excluding the neighborhood-level variable. These results were compared to HLM contextual analyses using the mean income of the neighborhood as a control variable. Differences in the conclusions one could make from the two analyses are discussed. Second, an abbreviated analysis of adult literacy was conducted with both HLM and OLS to compare the contextual effects between the two methods. In these analyses only $X_{ij}$ and $\bar{X}_{.j}$ were used as predictors, to partition the regression effect into within-groups and between-groups components. The contextual effect (i.e., the effect of mean income of the neighborhood on literacy proficiencies and practices) was then calculated and compared between the two analysis methods. Third, full analyses of adult literacy using all the predictors, including the contextual effects, were conducted to determine if differences in the contextual effects between the OLS and the HLM methods resulted in any important differences in modeling adult literacy. In the HLM analyses, the intercepts were allowed to vary across the level-two units, but the slopes were fixed.
Results
Intraclass correlations for the segment level were low to moderate for the measures of adult literacy. Literacy practices had low intraclass correlations, ranging from .016 for personal document reading to .026 for newspaper reading, while literacy proficiencies had moderate intraclass correlations, ranging from .134 for document literacy to .164 for prose literacy. In the first phase of the analysis the statistically significant predictors from the OLS analyses and HLM contextual analyses were compared. The only coefficients for which the conclusions from the two analyses differed were the relationships between race and newspaper, personal document, and work document reading. Although no statistically significant race effects were detected by HLM for personal document and work document reading, a statistically significant effect for race1 was detected using OLS regression for work document reading (b = -1.023, t(14,270) = -3.056, p < .01) and a statistically significant effect for race2 was detected using OLS regression for personal document reading (b = -.571, t(17,296) = -2.077, p < .05).
For newspaper reading, a statistically significant effect for race1 was detected with HLM (γ = 1.61, t = 2.596, p < .05), but not with OLS regression. For the second phase of the study, contextual analyses were conducted using both HLM and OLS regression for purposes of comparing the estimates of the contextual effects, $b_c$. Simplified models of literacy were used to facilitate interpretation, in which only the predictors family income, $X_{ij}$, and mean income of the neighborhood, $\bar{X}_{.j}$, were used, as described in Equations 1 and 2. Results of both sets of analyses are presented in
Table 1. All of the coefficients were statistically significant, with p < .001. However, the magnitude of the coefficients and their standard errors differed between the two analyses. The difference in the contextual effects estimates between the two methods increased with larger intraclass correlations. For the reading practices, which had low intraclass correlations, the standardized difference between the estimates ranged from 0.307 for work documents to 1.414 for newspaper reading. However, the standardized differences between the coefficients for prose, document, and quantitative literacy were 5.59, 5.36, and 6.36, respectively. The OLS coefficients for the contextual effects were consistently larger and had smaller standard errors than the HLM estimates. In phase three of the data analysis, predictors of adult literacy were modeled with OLS regression analyses by including the mean income of the neighborhood as a variable in the model, as well as the other predictors. Similarly, HLM was used to predict adult literacy using mean income of the neighborhood as a predictor at level two and the remaining predictors at level one. The statistically significant findings were compared between the two approaches to determine if important differences in the prediction of adult literacy were apparent. In only two cases were predictors found to be statistically significant in the HLM analyses but not in the OLS analyses. However, in many instances, predictors were found to be statistically significant only in the OLS analyses. These variables are listed in Table 2. Overall, there were numerous differences in the conclusions that could be drawn about the predictors of adult literacy.
Discussion
A contextual-effects analysis of the NALS data allowed for the estimation of the effect of the mean income of the neighborhood on adult literacy proficiencies and practices, when controlling for family income.
When doing so, a very different picture of racial group differences in adult literacy emerges than is typically found (Kirsch, Jungeblut, Jenkins, & Kolstad, 1993). By controlling for mean income of the neighborhood, the HLM approach uncovered a mean difference in newspaper reading favoring Blacks, which was not detected in the OLS analysis of individual predictors. Further, where the HLM analysis did not detect any significant differences between the different ethnic groups for work document and personal document reading, the OLS analysis found Blacks to read fewer work documents than Whites and Hispanics and to
read fewer personal documents than Whites. It is apparently important to control for mean income of the neighborhood when studying racial differences because poverty levels are likely to be higher within some minority-populated neighborhoods. Otherwise, analyses will result in a bias toward majority (White) groups (*** need citations***-why?). It was also discovered that a contextual-effects analysis using OLS regression yields different estimates of the contextual effect and its standard error than HLM analyses, particularly with moderate intraclass correlations (e.g., in the .15 range). It is expected that the standard errors of the contextual effects would be negatively biased in regression analyses, because of the violation of the independence assumption when there is substantial variation across level-two units, as indicated by the intraclass correlation (Bryk & Raudenbush, 1992). Although estimates of the contextual effects should be unbiased in OLS, HLM yields more efficient estimates when samples are unbalanced (Bryk & Raudenbush, 1992, pp. 38, 122-123). It cannot be ascertained whether the OLS estimates were unbiased in these comparisons without knowledge of the parameter values. However, this consistent discrepancy between estimates of the two analysis methods suggests that further investigation of this issue may be warranted. As indicated by the third phase of the analysis of this study a contextual-effects analysis using OLS differs markedly from an HLM contextual analysis of this data set. There were several instances in which the predictors were statistically significant in the OLS analyses but not in the HLM analyses. This result occurred even with the outcome measures that had lower intraclass correlations across neighborhoods. The regression estimates in both HLM and OLS are weighted estimates, which depend on nj. If sample sizes are equivalent across the level-two units, estimates from both should be equivalent. 
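The claim that the two weighting schemes agree when group sizes are equal can be checked with a small numerical sketch. It assumes precision weights of the form $1/(\sigma^2/n_j + \tau_{00})$ for the multilevel estimate, following Bryk and Raudenbush (1992), and considers only the simplest fixed effect, the overall mean of the group means. The group means, the variance components, and the function names are hypothetical.

```python
import numpy as np


def weighted_mean(values, weights):
    """Weighted average of values using the supplied weights."""
    return float(np.sum(weights * values) / np.sum(weights))


def compare_weightings(group_means, sizes, sigma2, tau00):
    """Return (n_j-weighted mean, precision-weighted mean) of the group means.

    sigma2 is the within-group variance and tau00 the variance of the true
    group means; both are treated as known here, which is an assumption.
    """
    group_means = np.asarray(group_means, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    precision = 1.0 / (sigma2 / sizes + tau00)
    return weighted_mean(group_means, sizes), weighted_mean(group_means, precision)


# Hypothetical neighborhood means and variance components.
group_means = np.array([250.0, 270.0, 300.0])
sigma2, tau00 = 2000.0, 400.0

equal = compare_weightings(group_means, [20, 20, 20], sigma2, tau00)
unequal = compare_weightings(group_means, [5, 20, 80], sigma2, tau00)

print("equal group sizes:  ", equal)
print("unequal group sizes:", unequal)
```

With equal sizes both weights are constant across groups, so the two estimates coincide. With unequal sizes the size-weighted estimate is dominated by the largest group, while the precision weights level off as $n_j$ grows because $\tau_{00}$ does not shrink with sample size.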
However, in the NALS data, as in many national databases, there are very unequal sample sizes across the sampling units. In this case, the fixed-effects coefficients in an HLM analysis will be weighted by $\Delta_j^{-1} = (V_j + \tau_{00})^{-1}$, where $V_j = \sigma^2 / n_j$ when variances are equal across the $j$ level-two units, and $\tau_{00}$ represents the variance in mean outcomes across the level-two units. In contrast, OLS estimates are weighted solely by $n_j$; hence differences in estimates may occur between the two, particularly when there is considerable variance in mean outcomes across the level-two units (Bryk & Raudenbush,
1992). When this occurs, analysts are advised to note that HLM estimators should be more efficient, particularly when group sizes are very uneven. As noted by Glass and Hopkins (1996, p. 246), since most estimators are consistent and unbiased when the sample size is large, the choice of an estimator is typically based on efficiency. However, the major difference between HLM and OLS results lies in the estimates of the standard errors: the OLS estimates are known to be negatively biased when there is variation across the level-two units (Cheung, Keeves, Sellin, & Tsoi, 1990). Therefore, in the NALS data, where sample sizes are unbalanced and the measures of literacy proficiencies have moderate intraclass correlations, we endorse the use of the HLM method in estimating and testing predictors of adult literacy. The major disadvantage of HLM, however, is that it is a univariate approach in which there can only be one outcome variable in a model. In complex national surveys in which data are collected on many intervening or mediating variables, this is a non-trivial issue because the relationships among several outcome variables cannot then be explored. Hence a multivariate technique that takes into account the multilevel nature of the data would be useful. Recent developments in structural equation modeling (SEM) make possible covariance structure modeling of nested data (Muthén, 1994; Muthén & Satorra, 1995). Kaplan and Elliott (1997) recently demonstrated how such a multilevel modeling approach could be used to study various educational quality indicators (e.g., the organizational characteristics of schools). Therefore, multilevel structural equation modeling might be particularly useful for investigating complex models of adult literacy with the NALS. [Jan: seems like this ends on kind of an ambiguous note, like we’re leaving something unsaid; do we need to be more definitive in our conclusion?]
References
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage.
Cheung, K. C., Keeves, J. P., Sellin, N., & Tsoi, S. C. (1990). The analysis of multilevel data in educational research: Studies of problems and their solutions. International Journal of Educational Research, 14, 215-319.
Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods. New York: Irvington.
Cronbach, L. J., & Webb, N. (1975). Between-class and within-class effects in a reported aptitude x treatment interaction: Reanalysis of a study by G. L. Anderson. Journal of Educational Psychology, 67, 717-724.
de Leeuw, J., & Kreft, I. G. (1986). Random coefficients models for multilevel analysis. Journal of Educational Statistics, 11, 57-85.
de Leeuw, J., & Kreft, I. G. (1995). Questioning multilevel models. Journal of Educational and Behavioral Statistics, 20, 171-189.
Dempster, A. P., Rubin, D. B., & Tsutakawa, R. K. (1981). Estimation in covariance components models. Journal of the American Statistical Association, 76, 341-353.
Draper, D. (1995). Inference and hierarchical modeling in the social sciences. Journal of Educational and Behavioral Statistics, 20, 115-147.
Glass, G., & Hopkins, K. (1996). Statistical methods in education and psychology (3rd ed.). Needham Heights, MA: Allyn & Bacon.
Kaplan, D., & Elliott, P. R. (1997). A model-based approach to validating education indicators using multilevel structural equation modeling. Journal of Educational and Behavioral Statistics, 22, 323-347.
Kirsch, I. S., Jungeblut, A., Jenkins, L., & Kolstad, A. (1993). Adult literacy in America: A first look at the results of the National Adult Literacy Survey. Princeton, NJ: Educational Testing Service.
Morris, C. N. (1995). Hierarchical models for educational data: An overview. Journal of Educational and Behavioral Statistics, 20, 190-200.
Muthén, B. O. (1994). Multilevel covariance structure analysis. Sociological Methods & Research, 22, 376-398.
Muthén, B. O., & Satorra, A. (1995). Complex sample data in structural equation modeling. In P. V. Marsden (Ed.), Sociological Methodology (pp. 267-316). Washington, DC: American Sociological Association.
Raudenbush, S. W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13, 85-116.
Sheehan, J. K., Smith, M C., & E, N. (1997, March). Hierarchical modeling of contextual effects on literacy proficiencies: An analysis of NALS. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.
Swamy, P. A. (1973). Criteria, constraints, and multi-collinearity in random coefficient regression models. Annals of Economic and Social Measurement, 2, 429-450.
Willms, J. D. (1986). Social class segregation and its relationship to pupils’ examination results in Scotland. American Sociological Review, 55, 224-241.