A Review of Statistical Analyses of The National Adult Literacy Survey: Implications For Policy Recommendations

Running Head: STATISTICAL ANALYSES USING THE NALS
A Review of Statistical Analyses of the National Adult Literacy Survey:
Implications for Policy Recommendations
Janet K. Sheehan-Holt
M Cecil Smith
Northern Illinois University
Paper presented at the annual meeting of the American Educational Research
Association, New Orleans, April 24-28, 2000.

Abstract
A content review of all studies published using the National Adult Literacy
Survey was conducted for three purposes: to assess whether authors accounted for
important design features of the NALS; to examine the complexity and appropriateness
of the statistical analyses conducted; and to identify innovative data analytic procedures
that capitalized on unique aspects of the data to address important adult literacy issues.
From this content review we conclude that important findings regarding adult literacy,
which can inform policy, have resulted from these data. Yet appropriate attention needs to
be given to the design issues of the NALS in order to make accurate inferences from the
data. Opportunities exist to formulate adult literacy models which can mirror the
complexities of the factors related to adult literacy when full use is made of the data and
these design considerations are taken into account.

A Review of Statistical Analyses of the National Adult Literacy Survey:
Implications for Policy Recommendations
Large-scale national databases provide social scientists with a bounty of
potentially useful information that can be used to examine questions that cannot be easily
addressed in smaller, local studies. The National Adult Literacy Survey (NALS), for
example, is the only large and national-level survey that contains extensive information
about adults' literacy abilities. Data regarding educational, language, and occupational
characteristics that are assumed to be critical to literacy are present in the NALS. These
data became widely available to social scientists in 1993. Since that time, the
NALS has been used for several studies that have investigated relationships between
literacy and various background factors (e.g., age, gender, ethnicity, occupation).
Educational and social policy recommendations have been made as a result of a few of
these studies, although the extent to which these recommendations have been acted
upon is unknown.
Analysts who use data from national surveys such as the NALS can take
advantage of the large-scale nature of the survey to test multivariate models
involving many variables and to make inferences from the data at the national level.
However, analysts must also work within the constraints of the original survey design,
the specific populations studied, and the questions and tasks posed to participants in the
survey. In some ways, analysis of large-scale data sets is quite different from analysis of
smaller-scale surveys because unique features of the survey design need to be considered.
In the NALS, features of the data that require special treatment include (a) the unequal
probability of selection for certain minority groups, (b) the cluster sampling design, and
(c) the use of multiple plausible values to represent respondents’ literacy proficiencies
along three dimensions: prose, document, and quantitative literacy.
The primary purpose of the current study is to report the analytic methods used by
researchers, including ourselves, when analyzing data from the 1992 NALS to account
for the special design considerations of the survey. Another purpose is to report how the
data were used, with special attention to the use of screening and/or control variables, and
whether multivariate models were tested. The final purpose of the study is to describe
some innovative methods that NALS analysts have used to address research related to
adult literacy.
In the next sections, we begin by describing the issues and recommended
procedures for working with the NALS. Next, we describe the advantages of using
a large-scale survey, such as the NALS, and suggest methods for making full use of the
data. Last, we discuss some of the innovative ways researchers have used the NALS data
and the inferences they were able to make from such analyses.
Design Considerations
Sampling weights. Sampling weights are necessary in the analysis of the NALS
data to account for such sampling characteristics as the oversampling of Blacks and
Hispanics in high-minority segments, the possibility of non-response bias, and the
combination of state and national samples. These sampling weights are provided in the
NALS data; however, data analysts must apply the weights in their analyses in
order to get accurate population estimates of the effects of interest.

In the NALS data file many weights are provided: base weights, a series of
replicate weights, and the final weights. The base weight is the reciprocal of the
probability of selection for a respondent, which reflects all stages of sampling. The
composite weights are then derived from the base weights by multiplying by a
compositing factor, which combines the state and national data in an optimal manner.
The final sample weights were then calculated by raking the composite weights to known
population totals (National Center for Education Statistics [NCES], 1997). Weighting the
data in this manner helps ensure that the estimates are representative of the population.
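The weighting steps described above can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions, not the NCES procedure: the selection
probabilities, margins, population totals, and proficiency scores are invented, the
compositing step is omitted, and the function name rake is hypothetical.

```python
import numpy as np

def rake(weights, groups_a, groups_b, totals_a, totals_b, n_iter=50):
    """Iteratively rescale weights so weighted counts match two known margins."""
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(n_iter):
        for groups, totals in ((groups_a, totals_a), (groups_b, totals_b)):
            for level, target in totals.items():
                mask = groups == level
                w[mask] *= target / w[mask].sum()
    return w

# Hypothetical respondents (not NALS data).
p_select = np.array([0.002, 0.002, 0.004, 0.004, 0.001, 0.001])
sex = np.array(["f", "m", "f", "m", "f", "m"])
region = np.array(["n", "n", "s", "s", "s", "n"])
scores = np.array([280.0, 265.0, 240.0, 255.0, 300.0, 290.0])

# Base weight: reciprocal of the overall probability of selection.
base_w = 1.0 / p_select

# Final weight: base weights raked to assumed population totals.
final_w = rake(base_w, sex, region,
               totals_a={"f": 1500.0, "m": 1500.0},
               totals_b={"n": 1600.0, "s": 1400.0})

# Weighted estimate of the population mean.
weighted_mean = np.sum(final_w * scores) / np.sum(final_w)
print(round(weighted_mean, 2))
```

In practice the analyst would not rebuild the weights, since the final weights are supplied
in the data file; the sketch only shows why an unweighted mean and a weighted mean can differ.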
Sampling variability estimation. Estimates of sampling variability of statistics
based on the NALS may be negatively biased since the data were collected via a cluster
sampling design. That is, observations within a cluster may be more similar than those
between clusters and, consequently, the data may violate the assumption of independence
necessary for most parametric statistical tests. To help analysts obtain accurate estimates
of standard errors, the NALS User's Guide (NCES, 1997) recommends several procedures.
First, a mathematical approximation procedure, Taylor series expansion, may be used
to determine the corrected standard errors for most parametric statistics. A second
method uses a series of replicate weights, provided in the NALS data, to calculate
jackknife variance estimates from repeated subsamples of the NALS. These can then be
used to generate a final variance estimate of the statistic of interest. A third approach,
discussed in the NALS User's Guide, is to adjust the estimated standard error by a factor
derived from the design effect. Although the design effect will vary depending on the
statistic calculated, Rock estimated that the design effect for the NALS is, on average,
equal to 2.0 (NCES, 1997).
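The replicate-weight approach can be illustrated with a small simulated example. The
sketch below assumes a paired jackknife in which each replicate zeroes the weights in one
PSU of a pair and doubles them in the other, and in which the variance is the unscaled sum
of squared deviations of the replicate estimates from the full-sample estimate. The data,
the number of pairs, and the function names are hypothetical; the multiplier appropriate to
the NALS replicate weights should be taken from the User's Guide.

```python
import numpy as np

def weighted_mean(y, w):
    return np.sum(w * y) / np.sum(w)

def jackknife_variance(y, full_w, replicate_w):
    """Return the full-sample estimate and its jackknife variance."""
    theta = weighted_mean(y, full_w)
    reps = np.array([weighted_mean(y, replicate_w[:, r])
                     for r in range(replicate_w.shape[1])])
    return theta, np.sum((reps - theta) ** 2)

# Simulated data: 4 pairs of PSUs, 5 respondents per PSU (not NALS data).
rng = np.random.default_rng(0)
n_pairs, per_psu = 4, 5
pair = np.repeat(np.arange(n_pairs), 2 * per_psu)
half = np.tile(np.repeat([0, 1], per_psu), n_pairs)
y = rng.normal(270, 50, size=pair.size) + 10 * pair
full_w = np.full(pair.size, 100.0)

# Build one replicate weight per pair: drop one PSU, double its partner.
replicate_w = np.tile(full_w[:, None], (1, n_pairs))
for r in range(n_pairs):
    replicate_w[(pair == r) & (half == 0), r] = 0.0
    replicate_w[(pair == r) & (half == 1), r] *= 2.0

theta, jk_var = jackknife_variance(y, full_w, replicate_w)
print(theta, np.sqrt(jk_var))
```

With the actual data file the replicate weights are already provided, so only the
jackknife_variance step would be needed.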

Another statistical method that can be used to generate correct estimates of
standard errors from data collected with a cluster sampling design is hierarchical linear
modeling (HLM; Bryk & Raudenbush, 1992). This method allows parameter estimates for
individuals to vary depending on characteristics of the cluster to which the individual
belongs; hence it does not require the assumption of independent errors that is common to
ordinary least squares (OLS) regression and other parametric techniques. Moreover, this
variation can be explicitly modeled with cluster-level variables. In this way, the outcomes
of interest can be modeled with data collected at any unit of analysis.
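A generic two-level model of the kind described above can be written as follows. The
notation is the one commonly used for such models and is offered as a sketch; the symbols
(individual i in cluster j, individual-level predictor X, cluster-level predictor W) are
illustrative and are not taken from any particular NALS analysis.

```latex
\begin{align}
Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^{2}) \\
\beta_{0j} &= \gamma_{00} + \gamma_{01} W_{j} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00}) \\
\beta_{1j} &= \gamma_{10} + \gamma_{11} W_{j} + u_{1j}
\end{align}
```

In the model with no predictors, the share of outcome variance that lies between clusters
is the intraclass correlation:

```latex
\rho = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}
```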
Multiple plausible values. Prose, document and quantitative proficiencies in the
NALS are assessed using a variant of a matrix sampling technique. Different sets of
prose, document, and quantitative literacy items are assigned to different respondents at
random, allowing the assessment of a broad range of literacy areas while minimizing time
demands for survey administration. Because individuals take different skills tests, it is
inappropriate to estimate literacy proficiencies with a percentage correct score because
score differences may depend on the difficulty of items in the set. To assess
literacy proficiency reliably, plausible value methodology is used. The plausible values
contain information about both the individual’s level of proficiency and the measurement
imprecision in the estimation process. Plausible values are not meant to be used as
estimates of an individual’s proficiency score; in fact, they may be biased estimates of
individual-level proficiency. However, they can be used to generate unbiased estimates of
population effects. The plausible value methodology relates an individual’s item
responses to their proficiency using a three-parameter logistic model. Sets of plausible
values of proficiencies are then generated (NCES, 1997).

The correct method to infer population effects from multiple plausible values is to
follow the method of Mislevy, Johnson, and Muraki (1992), which takes into account both
measurement and sampling error. This method is computationally laborious, however,
since it requires the analysis to be computed separately for each plausible value.
Parameter estimates are then computed as the average of the parameter estimates
from the separate runs. The calculation of the standard error, however, requires not only
averaging the sampling variances from the separate runs, but also estimating the variability
among the parameter estimates across runs. The method of Mislevy et al. (1992) combines this
information, resulting in standard error estimates that account for both sampling error and
measurement error. Alternatives to the plausible value methodology are not wholly
acceptable (NCES, 1998). If the analyst simply averages the plausible values, this may
produce better estimates of an individual’s ability, yet it will not produce consistent
estimates of population effects or error variance. Using one plausible value will yield
accurate estimates of the population effect; however, standard error estimates will be
negatively biased (NCES, 1998).
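The combining step can be sketched as follows, using the multiple-imputation form in which
the total variance is the average sampling variance plus (1 + 1/M) times the variance of
the M estimates. The function name and the five regression coefficients and standard errors
are hypothetical; each standard error is assumed to have already been corrected for the
sampling design.

```python
import numpy as np

def combine_plausible_values(estimates, standard_errors):
    """Pool M per-plausible-value estimates into one estimate and standard error."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(standard_errors, dtype=float)
    m = est.size
    point = est.mean()                 # average of the M estimates
    within = np.mean(se ** 2)          # average sampling variance
    between = est.var(ddof=1)          # variance among the M estimates
    total = within + (1.0 + 1.0 / m) * between
    return point, np.sqrt(total)

# Hypothetical coefficient from the same model run once per plausible value.
estimates = [12.1, 11.8, 12.4, 12.0, 11.7]
standard_errors = [0.50, 0.52, 0.49, 0.51, 0.50]

point, se = combine_plausible_values(estimates, standard_errors)
print(point, se)
```

Because the between-run term is added, the pooled standard error is larger than the
standard error from any single run, which is the bias that arises when only one plausible
value is analyzed.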
Uses of Data
The English Background Questionnaire (EBQ) of the 1992 NALS was used to
collect a broad range of information that would enhance our understanding of the factors
related to adults’ literacy skills. The EBQ contained six sections: (a) general and
language background, (b) educational background and experiences, (c) political and
social participation, (d) labor force participation, (e) literacy activities and collaboration,
and (f) demographic information. Although improvements have been recommended in
several areas of the EBQ (Smith & Sheehan-Holt, 2000), data from the survey are very
rich and allow the possibility of forming complex theoretical models of adult literacy
which can be empirically tested.
Items in the EBQ may be used as screening variables to define a specific subset of
the population to be studied. Because the total sample size of the NALS is in excess of
26,000, sub-populations can be selected that are specific to the issue under study while
still maintaining a sample size large enough to support strong inferences. The data from the
NALS also lend themselves to multivariate analyses, which more closely mirror the
complexities of many social science phenomena. Because survey data are most often
used in correlational analyses, there is a distinct advantage in statistically controlling, via
multiple regression techniques and other methods, for important background
characteristics which may influence adult literacy. One of the advantages of large-scale
surveys such as the NALS is that statistical control can be easily done without significant
“cost” to the error degrees of freedom of the statistic. Further, there is a much larger array
of variables than would be available in a typical small-scale survey. Therefore, many
different types of characteristics can be controlled for in the analyses. A third use of the
data, which capitalizes on both the large sample size and the breadth of the survey, is
to form complex, multivariate models that can be empirically tested. Complex analytic
tools such as structural equation modeling (SEM) and hierarchical linear modeling
require large samples in order to make accurate inferences from the results. Also, the
formation of models of indirect, direct, and reciprocal effects, such as is possible with
SEM, necessitates that a large enough set of variables is available to correctly specify the
model. The broader the survey, the more likely these types of complex models can be
tested.
Method
Published articles and reports based on the 1992 National Adult Literacy Survey were
reviewed for this study (see Appendix). This content review of the studies focused on
three aspects of the methodology. First, the analytic methods the researchers used to take
into account important design features of the NALS were assessed. Specifically, the
studies were reviewed to determine (a) whether sampling weights were used to take into
account the unequal probability of selection of various subgroups, (b) whether the cluster
sampling design of the survey was taken into consideration, and (c) how the proficiency
scores were used. Second, we examined the complexity of the analyses to determine if
the researchers took advantage of the wide array of variables in the dataset to screen for
particular subpopulations and conduct multivariate analyses that may more closely mirror
the complexities of adult literacy. Third, we identified and highlighted innovative data
analytic procedures that capitalized on unique aspects of the NALS to address important
issues involving adult literacy.
Results
Sampling weights. Eleven of the fourteen studies (79%) reviewed reported or
implied that sampling weights were used to obtain accurate population estimates of the
effects of interest. Two of the three studies which did not report using sampling weights
used inferential analyses and appeared to be interested in generalizing to a broader
population, which would not be appropriate without sampling weights. One study which
did not report using sampling weights was primarily descriptive in nature and therefore
would not have been as affected by disregarding sampling weights.

Sampling variability estimation. The cluster-sampling design of the NALS data
was taken into account in some manner in eight of the 14 studies (57%). The most
common method used to adjust for the possible non-independence of errors was to divide
the sample size by a conservative (upward) estimate of the average design effect of the
NALS, yielding a smaller effective sample size. Four of the studies adjusted the sample
size in this manner. The design effect
used in these studies concurs with what Rock recommends in the NALS User’s Guide
(NCES, 1997). In one study the design effect was first estimated using bootstrapped
estimates of OLS coefficients. Standard errors were estimated to be 1.2 to 1.4 times the
standard errors assuming simple random sampling. This was rounded upward to a
conservative design effect of 2.0 for all subsequent analyses.
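The arithmetic behind this rounding can be made explicit. Assuming the reported figures are
ratios of standard errors and that the design effect is defined, as is conventional, on the
variance scale, the bootstrap results imply design effects between roughly 1.4 and 2.0:

```latex
\mathrm{deff} = \left( \frac{SE_{\text{cluster}}}{SE_{\text{SRS}}} \right)^{2},
\qquad 1.2^{2} = 1.44, \qquad 1.4^{2} = 1.96,
\qquad n_{\text{eff}} = \frac{n}{\mathrm{deff}}
```

For a hypothetical subsample of n = 1,000, a design effect of 2.0 gives an effective sample
size of 500, which is equivalent to multiplying each simple-random-sampling standard error
by the square root of 2, about 1.41.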
The other three studies that took into account the cluster sampling design did
so using HLM. In one of these studies individual states were used as the second-level
variable in order to ensure consistent estimation of within-state parameters and to
examine inter-state variability in mean outcomes and regression coefficients. However,
the results produced were nearly identical to corresponding OLS results; therefore, OLS
estimates were used in all subsequent analyses. In the other two studies the segment was
used as the second-level variable. The authors reported a moderate intraclass correlation
coefficient (.164) for prose literacy proficiency between segments. The segment level
was also used as a second-level variable because it served as a proxy for neighborhoods,
i.e., census blocks or groups of census blocks.
Multiple plausible values. Thirteen of the studies reviewed used at least one
literacy proficiency scale in some of their analyses. Only four of these studies (31%)
reported using multiple plausible value methodology to estimate literacy proficiencies.

These four studies all appeared to use the procedure recommended by Mislevy et al.
(1992). One study reported using the first plausible value to estimate literacy
proficiencies. Of the remaining eight studies which investigated literacy skills, none
provided details about which method, if any, was used to determine accurate estimates of
literacy proficiency. Five of the studies (38%) averaged the plausible values across the
prose, document and quantitative scales because of reported high intercorrelations among
the literacy scales.
Uses of Data
The majority of the studies reviewed (71%) used at least one variable in the data
set to narrow the population to a particular subsample of the total population. The most
common variables used to delimit the data were age, born in the US, and spoke English
before entering school. The majority of the studies (79%) reported controlling for
demographic characteristics or other characteristics in at least some of the analyses. A
variety of different control variables were used across the studies; however, age, ethnicity,
educational attainment, and gender were the most common control variables used. Eight
studies (57%) used some inferential analyses such as simple t-tests, chi-squares and
ANOVAs, where no control variables were used.
Complex models that were testable by regression analysis or
multivariate methods were formed in twelve studies (86%). Ten studies utilized multiple
regression techniques or hierarchical linear modeling to form predictive models. Two
studies used structural equation modeling to test bona fide multivariate models.
Innovative Analyses
Both the large sample size and the wide array of variables in the NALS data allow
the analyst to use a variety of modeling methods (e.g., hierarchical linear modeling,
structural equation modeling) that are not possible with many data sets. This section will
discuss innovative analytic methods which were used to uncover factors related to adult
literacy.
Occupational variation. In their study of gender and ethnic differences in earnings
and employment status, Raudenbush and Kasim (1998) used an analytic method that
helped them determine if earnings and employment status differentials could be best
explained by an occupational preference perspective. They hypothesized that different
subpopulations (e.g., males and females) may earn different wages either because they
prefer different occupations or because they have different cognitive skills.
Raudenbush and Kasim reasoned, however, that inequalities within occupations cannot be
explained solely by occupational preferences. If these inequalities also cannot be
explained by cognitive skill, then they suggest that other explanations must exist for pay
and employment inequality.
To test these hypotheses Raudenbush and Kasim (1998) formulated HLM models
in which individuals were nested within occupation. This procedure allowed for
disentangling the within- and between-group occupational differences. Raudenbush and
Kasim concluded that about two-thirds of the African-American versus European-
American male gap in earnings, and essentially all of the gap in unemployment, lies
within occupations. Literacy explains more than half of the between-occupation gap, but
less than half of the within-occupation gap. Having lower literacy skills, therefore, denies
Blacks access to favorable occupations and helps to explain wage inequality on the basis
of ethnicity. Raudenbush and Kasim put forth a different explanation for gender
differences, however. The majority of the gender gap is within occupation, and
controlling for literacy has little or no effect on this gender gap. Consequently, they
concluded that neither job preference theories nor cognitive skill deficiencies account for
the gender gap in earnings, and that their results are consistent with a labor-force
discrimination explanation.
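One common way to separate within-occupation from between-occupation differences in a
two-level model is to enter a group indicator both as a deviation from its occupation mean
and as the occupation mean itself. The specification below is a sketch of that general
technique under our own notation; it is not claimed to be the exact model Raudenbush and
Kasim estimated.

```latex
Y_{ij} = \gamma_{00}
       + \gamma_{W} \left( G_{ij} - \bar{G}_{\cdot j} \right)
       + \gamma_{B} \, \bar{G}_{\cdot j}
       + u_{0j} + r_{ij}
```

Here Y is the earnings outcome for individual i in occupation j, G is an indicator of group
membership, and the occupation mean of G is the proportion of the occupation belonging to
that group. The coefficient on the deviation term is the gap within occupations, and the
coefficient on the occupation mean is the gap between occupations. Adding literacy
proficiency as a predictor and comparing how much each coefficient shrinks indicates how
much of each gap literacy accounts for.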
Contextual effects. We have previously conducted analyses to determine
predictors of literacy practices and skills using HLM, although in a different manner
from Raudenbush and Kasim (Sheehan-Holt & Smith, 2000; Smith & Sheehan, 1998).
We determined that the segment level of the NALS data collection design could serve as
a proxy for a neighborhood, since segments were composed of census blocks or groups of
census blocks. We therefore used the segment as the second level in our HLM analyses
and found that literacy skills, but not practices, had moderate intraclass correlations
across segments. By using segment as a second-level variable, we could (a) appropriately
model individual-level effects even with non-independent errors within neighborhoods
and (b) appropriately model the neighborhood-level variation with contextual
variables. As a consequence, we created a contextual variable at the neighborhood level,
mean income of the neighborhood, and determined that not only was this variable
significantly related to literacy even when controlling for an individual’s family income,
but that it is an important control variable in the prediction of adult literacy.
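Constructing a contextual variable of this kind amounts to aggregating an individual-level
variable within each segment and attaching the result to every respondent in that segment.
The sketch below shows one way to do this with NumPy. The segment identifiers and incomes
are invented, the function name is hypothetical, and the mean is unweighted for simplicity.

```python
import numpy as np

def segment_means(values, segment_ids):
    """Return, for each individual, the mean of `values` within that individual's segment."""
    _, inverse = np.unique(segment_ids, return_inverse=True)
    sums = np.bincount(inverse, weights=values)
    counts = np.bincount(inverse)
    return (sums / counts)[inverse]

# Hypothetical respondents: segment identifier and family income (in thousands).
segment_ids = np.array([101, 101, 101, 202, 202, 303])
family_income = np.array([30.0, 40.0, 50.0, 20.0, 24.0, 75.0])

# Contextual (level-2) variable: mean income of the respondent's neighborhood.
neighborhood_income = segment_means(family_income, segment_ids)

# Individual income expressed relative to the neighborhood mean.
income_within = family_income - neighborhood_income
print(neighborhood_income, income_within)
```

Entering both the neighborhood mean and the individual's own income in a model is what
allows a neighborhood effect to be estimated while controlling for family income.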
Literacy selection and development. Reder (1995, 1998) has advanced the
understanding of the relationships between education, adult literacy, and other learner
characteristics using structural equation modeling to test competing models of adult
literacy. He hypothesized that if educational attainment is a primary cause of adult
literacy (literacy development), a model with a unidirectional path from education to
literacy would fit the data best. However, if literacy operates on educational attainment
(literacy selection), a model with a unidirectional flow from literacy to education would
be a better fitting model. He tested these two competing models, along with a reciprocal-
effects model, with a bi-directional flow between literacy and education, to determine
which model best represents the interaction of these two factors. He determined that the
reciprocal-effects model best explains the education-literacy relationship. Reder (1995)
also extended the model to include the exogenous variables of self-reported learning
disability, gender, age, age squared, minority status, and parents’ education, as well as
social and economic outcomes. Using SEM he was also able to conclude that, in general,
learning disability status has substantial direct and indirect effects on education, literacy,
and social and economic outcomes.
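The three competing models can be summarized with a pair of structural equations. The
notation below is ours and is intended only as a sketch of the logic, not as Reder's exact
specification: E is educational attainment, L is literacy proficiency, and x is a vector of
exogenous background variables.

```latex
\begin{align}
E &= \beta_{EL} \, L + \boldsymbol{\gamma}_{E}^{\top} \mathbf{x} + \zeta_{E} \\
L &= \beta_{LE} \, E + \boldsymbol{\gamma}_{L}^{\top} \mathbf{x} + \zeta_{L}
\end{align}
```

The literacy development model fixes the path from literacy to education at zero
(beta_EL = 0), the literacy selection model fixes the path from education to literacy at
zero (beta_LE = 0), and the reciprocal-effects model frees both. Estimating both paths at
once requires that each equation contain at least one exogenous variable excluded from the
other, which is one reason a broad set of background variables matters for this kind of
model.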
Discussion
The majority of the studies involving secondary analysis of the NALS data
reported using sampling weights, which lends validity to the inferences made regarding
population effects. Only two studies appeared to be making inferences to the population
without correctly weighting the data. It is possible, however, that the data were properly
weighted in these studies but that the weighting was not reported in the description of the
data analysis procedures.
Slightly over half of the reviewed studies accounted for the sampling design in
some manner, either by using an estimate of the design effect to reduce the effective
sample size or by using HLM. Using the design effect to adjust the sample size has the
advantage that it takes into account the possible dependence of observations due to all
stages of the sampling design. However, this method is lacking in that design effects
differ depending on the statistic used and the variables used in a particular analysis
(NCES, 1997). None of the studies evaluated appeared to vary the design effect
adjustment depending on the statistic used. However, the investigators probably
overestimated the design effect and as such erred by being more conservative.
The HLM method of accounting for the sampling design is very effective, as it is
particularly suited for this type of data. However, a shortcoming of the method is that the
HLM software (Bryk, Raudenbush, & Congdon, 1996) can only analyze three stages of
sampling design (the individual level and two other stages) at one time. There is some
evidence that there is not much random variation in outcomes between states or PSUs,
although there is at the segment level (Raudenbush & Kasim, 1998; Sheehan-Holt &
Smith, 2000; Smith & Sheehan, 1998). Therefore, even a two-level model may be
sufficient to model the NALS data. Almost half of the published NALS research,
however, did not take the sampling design into account in any manner when making
statistical inferences. This may have resulted in negatively biased standard errors and
overreporting of significant effects related to adult literacy.
The limited number of studies that utilized plausible value methodology raises
some concern about the accuracy of the reported results of statistical tests involving adult
literacy. Although the plausible values take into account both literacy proficiency and
measurement error, the standard error is underestimated when only one plausible value is
used. Averaging the plausible values causes even more serious problems since this results
in inconsistent estimates. We suggest that the limited use of the plausible value
methodology arises because (a) it is not well understood how to use the plausible
values correctly and (b) very few statistical software programs have built-in routines for
appropriately analyzing plausible values, the notable exception being HLM (Bryk,
Raudenbush, & Congdon, 1996). However, because the main focus of most of the studies
using the NALS is adult literacy, it is critical to inform NALS users about correct methods
for estimating literacy proficiencies.
Uses of Data
NALS users have taken advantage of the wide array of variables in NALS to
screen particular subpopulations and control demographic and respondent characteristics.
Most of the analyzed studies used regression or multivariate techniques to form
prediction models in which irrelevant characteristics were statistically controlled.
However, in some cases, the authors could have made stronger conclusions by employing
greater statistical controls. For example, in one study comparisons of literacy skills were
made between community college graduates and baccalaureate degree recipients without
controlling for income level or any other indicator of socioeconomic status, which may
vary considerably between these groups.
Although many of the studies employed regression analysis or multivariate
methods in part of the analysis, there were still many cases of t-tests and ANOVAs used
for making inferences, when multivariate methods would have more appropriately
captured the complex relationships among the variables. For example, in one study,
gender differences in literacy were examined through the use of t-tests for each ethnic
group, education level, and age group. The only gender differences detected were those
which varied simply as a function of one variable; differences that might vary
as a function of several variables were not detected. Therefore, the authors' inferences
regarding the “literacy gender gap” would have been stronger if ethnicity, education, and
age had all been used simultaneously as control variables.
Many of the studies combined several variables to test more involved models.
Using these more complex models, the researchers were able to separate between- and
within-occupational differences in earnings and employment, determine how contextual
effects are related to adult literacy, and investigate the reciprocal effects of literacy and
education.
Using the NALS to Inform Policy
Much of the information gained from the NALS can be important to
policymakers, given that literacy can be viewed as a result of investment in human capital
and is dependent on many social and economic factors. The NALS studies we have
examined have accomplished important work which indicates that there are many
influences on adult literacy. Some of these influencing factors can themselves be
influenced, modified, and shaped through thoughtful social and educational policies. The
NALS research indicates that adult literacy plays a key role in explaining the Black-White
earnings and employment inequalities for males. Yet literacy does not generally
explain the gender inequalities in earnings and employment (Raudenbush & Kasim,
1998). Also, some of our work with the NALS has questioned the overall effectiveness of
adult basic education programs for improving literacy skills to obtain the larger social
benefit of a literate society (Sheehan-Holt & Smith, 2000). Moreover, Reder’s
investigations of the relationships between adult literacy and educational attainment
indicate that educational attainment not only affects literacy acquisition, but that literacy
itself has an impact on educational attainment. As such, the investigations using the
NALS may be critical in informing policymakers.
Policymakers need to be aware of the differences in the treatment of data and
types of analyses across the various studies of the NALS. For instance,
our work suggests that not all researchers are using sampling weights, which limits
inferences to population effects. Most NALS research is not taking into account the
hierarchical structure of the design, which can result in incorrect inferences about effects.
Most of the NALS investigations have not used or have not reported using the multiple
plausible value methodology correctly to measure literacy proficiencies. Incorrect usage
of the plausible values may affect the accuracy of inferences regarding literacy
proficiencies. However, our work does show that researchers, for the most part, are
utilizing many of the NALS items to control for irrelevant effects and testing complex
adult literacy models. Further, several sophisticated analyses have been conducted which
have uncovered interesting findings that challenge conventional wisdom regarding ethnic
and gender differences in earnings and employment (Raudenbush & Kasim, 1998), the
limits of basic skills education in improving literacy skills (Sheehan-Holt & Smith,
2000), and the reciprocal effects of education and literacy (Reder, 1995, 1998).
Important findings regarding adult literacy have resulted from the handful of
NALS investigations conducted to date. Despite these positive outcomes, appropriate
attention must be given to the design issues of the NALS in order to make accurate
inferences. When these design issues are taken into account, and the data are used
appropriately, more accurate and complex adult literacy models can be derived and
tested.
References
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models:
Applications and data analysis methods. Newbury Park, CA: Sage.
Bryk, A.S., Raudenbush, S.W., & Congdon, R. (1996). HLM (Version 4)
[Computer software]. Chicago, IL: Scientific Software International.
Mislevy, R.J., Johnson, E.G., & Muraki, E. (1992). Scaling procedures in NAEP.
Journal of Educational Statistics, 17, 131-154.
National Center for Education Statistics. (1997). National Adult Literacy Survey
Public Use Data Tape User’s Guide.
National Center for Education Statistics. (1998). Third International Math and
Science Study User’s Guide.
Raudenbush, S.W., & Kasim, R.M. (1998). Cognitive skill and economic
inequality: Findings from the National Adult Literacy Survey. Harvard Educational
Review, 68, 33-79.
Reder, S. (1998a). Literacy, education and learning disabilities. Technical Report
No. 95-xx. Philadelphia: National Center on Adult Literacy.
Reder, S. (1998b). Literacy selection and literacy development: Structural
equation models of the reciprocal effects of education and literacy. In M C. Smith (Ed.):
Literacy for the twenty-first century: Research, policy, practices and the National Adult
Literacy Survey (pp. 139-158). Westport, CT: Praeger.
Sheehan-Holt, J.K., & Smith, M C. (2000). Does basic skills education affect
adults’ literacy proficiencies and reading practices? Reading Research Quarterly.

Smith, M C., & Sheehan-Holt, J.K. (2000). Evaluation of the 1992 Background
Questionnaire: An analysis of uses with recommendations for revisions. NCES working
paper series.
Smith, M C. (1996). Differences in adults’reading practices and literacy
proficiencies. Reading Research Quarterly, 31, 196-219.
Smith, M C., & Sheehan, J.K. (1998). Adults’reading practices and their
associations with literacy proficiencies. In M C. Smith (Ed.): Literacy for the twenty-first
century: Research, policy, practices and the National Adult Literacy Survey (pp. 79-93).
Westport, CT: Praeger.

Appendix
Studies Examined for Content Review
Friedman, L., & Davenport, E. (1998). Literacy gender gaps: Evidence from the
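National Adult Literacy Survey. In M C. Smith (Ed.): Literacy for the twenty-first
century: Research, policy, practices and the National Adult Literacy Survey. Westport,
CT: Praeger.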
Gerber, S., & Finn, J.D. (1998). Learning document skills at school and at work.
Journal of Adolescent and Adult Literacy, 42, 32-44.
Greenberg, E.J., Swaim, P.L., & Teixeira, R.A. (1995). Workers with higher
literacy skills not as well rewarded in rural areas. Rural Development Perspectives, 10,
45-52.
Howard, J., & Obetz, W.S. (1996). Using the NALS to characterize the literacy of
community college graduates. Journal of Adolescent and Adult Literacy, 39, 462-467.
Pryor, F.L., & Schaffer, D. (1997). Wages and the university educated: A paradox
resolved. Monthly Labor Review, 120(7), 3-15.
Raudenbush, S.W., & Kasim, R.M. (1998). Cognitive skill and economic
inequality: Findings from the National Adult Literacy Survey. Harvard Educational
Review, 68, 33-79.
Reder, S. (1998a). Literacy, education and learning disabilities. Technical Report
No. 95-xx. Philadelphia: National Center on Adult Literacy.
Reder, S. (1998b). Literacy selection and literacy development: Structural
equation models of the reciprocal effects of education and literacy. In M C. Smith (Ed.):
Literacy for the twenty-first century: Research, policy, practices and the National Adult
Literacy Survey (pp. 139-158). Westport, CT: Praeger.
Sheehan-Holt, J.K., & Smith, M C. (2000). Does basic skills education affect
adults’ literacy proficiencies and reading practices? Reading Research Quarterly.
Smith, M C. (1996). Differences in adults’ reading practices and literacy
proficiencies. Reading Research Quarterly, 31, 196-219.
Smith, M C., & Sheehan, J.K. (1998). Adults’ reading practices and their
associations with literacy proficiencies. In M C. Smith (Ed.): Literacy for the twenty-first
century: Research, policy, practices and the National Adult Literacy Survey (pp. 79-93).
Westport, CT: Praeger.
Venezky, R., Kaplan, D., & Yu, F. (1998, August). Literacy practices and voting
behavior: An analysis of the 1992 National Adult Literacy Survey. Washington, DC:
National Center for Education Statistics.
Vogel, S.A., & Reder, S. (1998a). Literacy proficiency among adults with self-
reported learning disabilities. In M C. Smith (Ed.): Literacy for the twenty-first century:
Research, policy, practices and the National Adult Literacy Survey (pp. 159-174).
Westport, CT: Praeger.
Vogel, S.A., & Reder, S. (1998b). Educational attainment of adults with learning
disabilities. In S.A. Vogel and S. Reder (Eds.), Learning disabilities, literacy, and adult
education (pp. 29-41). Baltimore: Paul H. Brookes.
