Article

The Cattell–Horn–Carroll Model of Cognition for Clinical Assessment

Journal of Psychoeducational Assessment, 1–21
© The Author(s) 2016
Reprints and permissions: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0734282916651360
jpa.sagepub.com
Abstract
The Cattell–Horn–Carroll (CHC) model is a comprehensive model of the major dimensions
of individual differences that underlie performance on cognitive tests. Studies evaluating the
generality of the CHC model across test batteries, age, gender, and culture were reviewed and
found to be overwhelmingly supportive. However, less research is available to evaluate the CHC
model for clinical assessment. The CHC model was shown to provide good to excellent fit in
nine high-quality data sets involving popular neuropsychological tests, across a range of clinically
relevant populations. Executive function tests were found to be well represented by the CHC
constructs, and a discrete executive function factor was found not to be necessary. The CHC
model could not be simplified without significant loss of fit. The CHC model was supported as a
paradigm for cognitive assessment, across both healthy and clinical populations and across both
nonclinical and neuropsychological tests. The results have important implications for theoretical
modeling of cognitive abilities, providing further evidence for the value of the CHC model as a
basis for a common taxonomy across test batteries and across areas of assessment.
Keywords
Cattell–Horn–Carroll, executive function, confirmatory factor analysis, invariance
Introduction
The construct validities of cognitive ability tests used for clinical diagnostic assessment, espe-
cially neuropsychological tests, do not appear to be well established. For example, Dodrill (1997,
1999) pointed out that commonly cited neuropsychological constructs (e.g., attention, learning,
and motor abilities) are not clearly and consistently supported by empirical research. Other stud-
ies have identified uncertainty in the construct validities of various neuropsychological tests
(e.g., Chaytor & Schmitter-Edgecombe, 2003; Dodrill, 1997, 1999; Gansler, Jerram, Vannorsdall,
& Schretlen, 2011; Jurado & Rosselli, 2007; Salthouse, 2005; Salthouse, Atkinson, & Berish,
2003; Spooner & Pachana, 2006).
Corresponding Author:
Paul A. Jewsbury, Melbourne School of Psychological Sciences, The University of Melbourne, Parkville, Victoria 3010, Australia.
Email: jewsbury@unimelb.edu.au

Sometimes, validity interpretations rely on clinical usage and established practice as much as on rigorous construct validity evaluations (Lezak, Howieson, & Loring, 2004; E. Strauss, Sherman, & Spreen, 2006). An example is the taxonomy of “neurocognitive domains” provided
in Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric
Association [APA], 2013). The taxonomy apparently derives from informal clinical usage, with-
out a clear empirical or theoretical justification, but is intended to provide a guide to diagnostic
assessment practices and interpretation of individual patient mental status.
In contrast to less formal clinical taxonomies, the Cattell–Horn–Carroll (CHC) model is based
on psychometric intelligence and cognitive ability research conducted over much of the last cen-
tury (McGrew, 2005; Reynolds, Keith, Flanagan, & Alfonso, 2013). The CHC model is a factor
analysis–based model, which describes the major (broad ability) and minor (narrow ability)
sources or factors of individual differences captured by cognitive tests. The factor structure of
cognitive tests provides a critical test of construct validity and also provides insight on the cogni-
tive abilities, as represented by factors, that underlie cognitive test performance (M. E. Strauss &
Smith, 2009; Widaman & Reise, 1997). For clinical assessment, the most relevant constructs in
the CHC model include the broad constructs of visuospatial ability (Gv), working memory
(Gsm), long-term memory encoding and retrieval (Glr), acquired knowledge or crystallized abil-
ity (Gc), processing speed (Gs), and fluid reasoning (Gf). However, there is also an additional
level of more specific constructs known as narrow abilities, and there are other less well-under-
stood broad constructs, such as auditory ability (Ga) and tactile ability (Gh; McGrew, 2009).
The CHC model is the result of the integration of John Carroll’s (1993) exploratory factor
analytical review of over 460 data sets and the developing consensus in the intelligence literature
around the work of Raymond Cattell, John Horn, and other scholars represented by modern Gf–
Gc theory (McGrew, 2005). The CHC model is the most strongly supported, empirically derived
taxonomy of cognitive abilities (Ackerman & Lohman, 2006; Kaufman, 2009; McGrew, 2005;
Newton & McGrew, 2010) and has influenced the development of most contemporary intelli-
gence tests (Bowden, 2013; Kaufman, 2009; Keith & Reynolds, 2010). For a description of CHC
constructs, see McGrew (2009), Schneider and McGrew (2012), or the Supplemental Materials.
For a history of the CHC model, see Schneider and Flanagan (2015), Schneider and McGrew
(2012), and Ortiz (2015).
The present article is based on the premise that carefully conducted group studies, using well-
researched psychometric methodology and guided by the high-quality cognitive ability research
incorporated in the CHC model, can be used to address current questions in clinical construct
validity. However, the CHC model is primarily supported by studies with nonclinical cognitive
ability tests in community and educational samples (Carroll, 1993). In contrast, clinical assess-
ment often involves clinical tests, or tests specifically developed for assessment of clinical cogni-
tive symptoms, which have been less studied with respect to the CHC model. Furthermore,
clinical assessment often involves special populations, such as individuals with disorders or par-
ticular brain injuries. Finally, some constructs that are commonly assessed in clinical assessment
are not present in the CHC model, such as executive function.
Therefore, for the CHC model to have utility in clinical and neuropsychological assessment,
the critical issues are (a) the generality of the CHC constructs across tests, (b) the generality of
the CHC model across populations, and (c) the potential integration of neuropsychological con-
structs, most notably executive function, into the CHC model.
Table 1. Summary of Studies Showing CHC-Consistent Models for Popular Intelligence Batteries,
Analyzed Alone or With Another Intelligence Battery.
Woodcock (1990) showed that a CHC precursor, modern Gf–Gc theory, and by extension the
CHC model, was consistent with the factorial structure of data sets with Woodcock–Johnson–
Revised in conjunction with the Kaufman Assessment Battery for Children, the Stanford–Binet
IV, the Wechsler Intelligence Scale–III, or the Wechsler Adult Intelligence Scale–Revised
(WAIS-R), respectively. Several additional cross-battery factor analyses have been conducted
recently, and these also show that the CHC constructs are independent of the test used to measure
the respective constructs (see Table 1).
One cross-battery study of particular value involved the Wechsler Intelligence Scale for
Children–III, Wechsler Intelligence Scale for Children–IV, Kaufman Assessment Battery for
Children–II, Woodcock–Johnson III, and Peabody Individual Achievement Test–Revised test
batteries in a single analysis (Reynolds et al., 2013). All children in the sample were administered
the Kaufman Assessment Battery for Children–II along with one or more of the other test batter-
ies as part of the Kaufman Assessment Battery for Children–II test validation process (Kaufman
& Kaufman, 2004). Reynolds and colleagues (2013) found that all but one of the 39 subtests
loaded on the predicted CHC factor and that the CHC factors generalized across each battery.
Woodcock–Johnson III Picture Recognition was found to load better on the long-term memory
encoding and retrieval ability (Glr) factor than on the expected visuospatial ability (Gv) factor,
but this is not incongruent with the CHC model and may instead suggest that Picture Recognition
is primarily dependent on associative abilities rather than on visuospatial abilities. Analyses to date, when conducted in a careful confirmatory factor analysis framework, support the hypothesis that CHC constructs transcend particular test batteries. This is an
important observation because if the CHC model generalizes to other test batteries and popula-
tions, then the CHC model may provide a useful practical guide to test development and interpre-
tation and, ultimately, a general model of diagnostic assessment.
Table 2. Summary of Measurement Invariance Studies of CHC-Consistent Models of Cognitive Tests.
Note. CHC = Cattell–Horn–Carroll; WAIS-R = Wechsler Adult Intelligence Scale–Revised; WMS-R = Wechsler
Memory Scale–Revised; WISC-IV = Wechsler Intelligence Scale for Children–IV; WJ = Woodcock–Johnson.
Method
Data Analysis
Confirmatory factor analysis was conducted with Mplus Version 6.1 (Muthén & Muthén, 2010)
with maximum likelihood estimation. Goodness of fit was evaluated on the basis of the maxi-
mum likelihood chi-square, as well as commonly reported fit indices including the root mean
square error of approximation (RMSEA), the standardized root mean square residual (SRMR),
the comparative fit index (CFI), and the nonnormed fit or Tucker–Lewis index (TLI). The fit
indices were compared with the cutoff values suggested by Hu and Bentler (1999), namely, <.06
for the RMSEA, <.08 for the SRMR, and >.95 for the CFI and TLI as indicating good fit.
However, the caveats voiced by Marsh, Hau, and Wen (2004) were considered, in particular the
caveat that it is harder to satisfy Hu and Bentler’s cutoff rules for good model fit with a relatively
large number of indicators (viz., more than two or three per factor).
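The approximate-fit indices above can be computed directly from the model and baseline (independence-model) chi-square statistics. A minimal sketch using the standard formulas, with hypothetical values rather than any of the article's results; SRMR is omitted because it requires the residual correlation matrix:

```python
def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """RMSEA, CFI, and TLI from model (m) and baseline (b) chi-squares; n = sample size."""
    # RMSEA: per-degree-of-freedom misfit, rescaled by sample size.
    rmsea = (max(chi2_m - df_m, 0.0) / (df_m * (n - 1))) ** 0.5
    # CFI: proportionate reduction in noncentrality relative to the baseline model.
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m)
    cfi = 1.0 - d_m / d_b if d_b > 0 else 1.0
    # TLI (nonnormed fit index): compares chi-square/df ratios, penalizing complexity.
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1.0)
    return rmsea, cfi, tli

# Hypothetical values: model chi2 = 110 on df = 100, baseline chi2 = 2000 on df = 120, N = 300.
rmsea, cfi, tli = fit_indices(110.0, 100, 2000.0, 120, 300)
print(round(rmsea, 3), round(cfi, 3), round(tli, 3))  # 0.018 0.995 0.994
```

These values would satisfy the Hu and Bentler (1999) cutoffs (RMSEA < .06, CFI and TLI > .95).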
For most data sets, only the correlation or covariance matrices were available. The raw, indi-
vidual-level data set was only available for the data set from Duff, Schoenberg, Scott, and Adams
(2005). The analysis for this data set was conducted with full information maximum likelihood
estimation based on the raw scores. To account for skewness in the neuropsychological variables,
nonnormality robust estimators were also used for the data from Duff and colleagues (specifi-
cally, MLR or robust maximum likelihood with chi-square asymptotically equivalent to the
Yuan-Bentler T2* test statistic; MLM or maximum likelihood with Satorra-Bentler chi-square
statistic; and MLMV or maximum likelihood with mean- and variance- adjusted chi-square sta-
tistic; note that other differences exist in the standard error estimation and missing data treatment;
details in Muthén & Muthén, 2010, and the Supplemental Materials).
1. To allow for a confirmatory analysis to be conducted, at least the correlation matrix was
available either in the article or from the authors.
2. To ensure an adequate sample size, the sample included at least 200 participants.
3. To be relevant for the present topic, the data set had tests commonly used in neuropsycho-
logical assessment.
4. To allow identification of multiple CHC constructs, the data set had at least 15 different
tests or subtests. This was chosen as an arbitrary but objective criterion to attempt to
avoid factor solutions with sole indicators and to ensure that there would be adequate
sampling of the CHC constructs, especially to model alongside a potential executive
function factor where possible. Because most data sets of cognitive batteries were con-
sidered, a priori, likely to yield at least four CHC factors (typically Gv/Gf, Gc, Gsm,
and Gs), a minimum of three indicators is desirable to identify a factor (Brown, 2006;
Kline, 2011), and at least three additional indicators would be required to identify an
executive factor; 15 indicators was considered a workable minimum number of
indicators.
5. To provide confidence that the CHC constructs were correctly identified, the data set had
tests with generally accepted and well-established construct validity (e.g., Wechsler
Intelligence Scales for Adults or Children, Wechsler Memory Scales, Stanford–Binet
Intelligence Scales, or Woodcock–Johnson Intelligence Scales) along with tests of more
controversial construct validity (e.g., executive function tests).
The following two criteria were optional, to obtain as wide a variety of data sets as possible, but were sought because of their special relevance to the present topic.
6. Ideally, the population was relevant to neuropsychological assessment (e.g., a clinical
population).
7. Ideally, some tests are identified as executive function tests by the study authors.
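The indicator arithmetic behind criterion 4 can be checked by counting observed moments against free parameters; a hypothetical helper (the function and values are illustrative, not from the article), assuming unit-variance, freely correlated factors:

```python
def cfa_dof(n_indicators, n_loadings, n_factors, n_residual_corrs=0):
    """Degrees of freedom for a CFA with unit-variance, freely correlated factors."""
    n_moments = n_indicators * (n_indicators + 1) // 2   # observed variances and covariances
    n_free = (n_loadings                                  # factor loadings
              + n_indicators                              # residual variances
              + n_factors * (n_factors - 1) // 2          # factor correlations
              + n_residual_corrs)                         # correlated residuals
    return n_moments - n_free

# 15 indicators with simple structure across five factors (e.g., Gv, Gc, Gsm, Gs, Gf):
print(cfa_dof(15, 15, 5))  # 80
```

A comfortably positive df leaves room to add further loadings, such as those of a candidate executive function factor, while keeping the model over-identified.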
Procedure
The models reported below were specified, a priori, to be consistent both with conceptual descrip-
tions of CHC theory and previous research (Carroll, 1993; Flanagan, McGrew, & Ortiz, 2000;
McGrew, 2009). When there were multiple indicators from the same test, the residuals were
allowed to correlate to account for method variance (Kline, 2011; Larrabee, 2003). After the
model was estimated, any nonsignificant factor loadings and residual correlations were removed
from the model. The standardized residuals and modification indices were examined, but post
hoc modifications were made with reluctance (MacCallum, Roznowski, & Necowitz, 1992).
Modifications were only made when the associated modification index was significant and very
large relative to other modification indices for the same model, and the modification was theo-
retically interpretable. The one post hoc modification, in one data set, that met this criterion is
described in detail below.
The possible addition of an executive function construct to the respective CHC models, speci-
fied for each data set, was evaluated by adding an executive function factor to each model if the
original authors hypothesized certain indicators to be executive function tests. Wherever executive
function factors were specified in the present study, the tests selected as executive function indica-
tors were exactly consistent with the original authors’ classification of executive function tests.
This strategy required that the executive function indicators were double-loaded on the relevant
CHC factor and the new executive function factor. Loading the executive function indicators on
both the relevant CHC factor and the new executive function factor corresponds to the dominant
conceptual view that executive function indicators are confounded with nonexecutive variance
(known as the task impurity problem; Miyake, Friedman, Emerson, Witzki, Howerter, & Wager,
2000). However, this double-loaded model may be underidentified. Therefore, as a second possible executive function model, the loadings of the executive function tests on CHC factors were removed, such that executive function tests were loaded only on the executive function factor.
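The two executive function model variants can be written out in lavaan-style model syntax; a sketch generated in Python, where the test names and CHC "home" factor assignments are illustrative stand-ins, not the article's exact Mplus specification:

```python
# Illustrative sketch (not the article's Mplus input): lavaan-style syntax for the
# two executive function (EF) model variants.
ef_tests = ["WCST", "COWAT", "TrailsB", "Similarities", "DigitSpanBackward"]

# Hypothetical CHC "home" factor for each EF test -- an assumed assignment,
# not the article's exact classification.
chc_home = {"WCST": "Gf", "COWAT": "Glr", "TrailsB": "Gs",
            "Similarities": "Gc", "DigitSpanBackward": "Gsm"}

# Variant 1 (task-impurity model): each EF test double-loads on its CHC factor and on EF.
impure = ["EF =~ " + " + ".join(ef_tests)]
impure += [f"{factor} =~ {test}" for test, factor in chc_home.items()]

# Variant 2: EF tests load only on the EF factor; the CHC loadings are removed.
ef_only = ["EF =~ " + " + ".join(ef_tests)]

print("\n".join(impure))
```

In a full specification each CHC factor would also carry its nonexecutive indicators; the fragment shows only the loadings that differ between the two variants.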
Finally, the hypothesis that putative executive function tests might measure executive func-
tions specific to each test was investigated. Reliable unique variance for each test was estimated
with a method described in the Supplemental Materials. The hypothesis that putative executive
function tests have greater unique variance than nonexecutive tests was examined with a t test.
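The unique-variance comparison is an ordinary two-sample t test on the reliable unique variance estimates. A self-contained sketch with made-up values (the article's actual estimates appear in its Table 4):

```python
def pooled_t(xs, ys):
    """Student's two-sample t statistic (pooled variance) and its degrees of freedom."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    ssx = sum((v - mx) ** 2 for v in xs)
    ssy = sum((v - my) ** 2 for v in ys)
    sp2 = (ssx + ssy) / (nx + ny - 2)                 # pooled variance
    t = (mx - my) / (sp2 * (1 / nx + 1 / ny)) ** 0.5
    return t, nx + ny - 2

# Hypothetical reliable unique variance estimates (%): 5 EF tests vs. 6 non-EF tests.
ef = [15, 20, 10, 18, 12]
non_ef = [13, 9, 16, 11, 14, 15]
t, df = pooled_t(ef, non_ef)   # small t on few df: no significant EF excess
```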
Results
Nine data sets were selected for reanalysis. These data sets, along with the fit statistics of the associated CHC model, are shown in Table 3. Due to space limitations, only one reanalysis is described here in full detail as an example. The remaining reanalyses are described in full detail in the Supplemental Materials, and only the overall results are reported in the main body of the text.
Table 3. Selected Studies and Fit Statistics for the CHC Model.
Note. All above studies used adult samples. CHC = Cattell–Horn–Carroll; RMSEA = root mean square error of
approximation; SRMR = standardized root mean square residual; CFI = comparative fit index; TLI = Tucker–Lewis
index.
Figure 1. Final model for the Duff, Schoenberg, Scott, and Adams (2005) reanalysis.
Note. WCST = Wisconsin Card Sorting Test; Gsm = working memory; ROCFT = Rey–Osterrieth Complex Figure
Test; Gv = visuospatial ability; Gf = fluid reasoning; Glr = long-term memory encoding and retrieval; Gc = acquired
knowledge or crystallized ability; RAVLT = Rey Auditory Verbal Learning Test; COWAT = Controlled Oral Word
Association Test; Gs = processing speed.
In this data set, the original authors described five indicators as executive function tests.
Adding an executive function factor modeled by Wisconsin Card Sort Test, Controlled Oral Word
Association Test, Trail Making Test–Part B, WAIS-R Similarities, and WAIS-R Digit Span–
backward to the CHC model, with each test also loaded onto the relevant CHC factor, produced
a model with a nonpositive definite latent variable covariance matrix. This may be related to high
estimated correlations between the executive function factor and Gsm and Gs (r = .91, SE = .32,
and r = 1.05, SE = .14, respectively). The alternate model, where the indicators of the executive
function factor were only loaded on the executive function factor, also resulted in a nonpositive
definite latent variable covariance matrix, and similarly high estimated correlations. As a consequence, neither variant of the executive function model was a viable alternative, and the executive function factor was found to be statistically redundant.
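The inadmissible solutions can be made concrete: an estimated correlation above 1.0 forces the latent covariance matrix to fail positive definiteness, which Sylvester's criterion (all leading principal minors positive) detects. A small pure-Python sketch; the EF–Gsm (.91) and EF–Gs (1.05) values echo those reported above, while the Gsm–Gs correlation (.70) is assumed for illustration:

```python
def det(m):
    """Determinant by cofactor expansion (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_positive_definite(m):
    """Sylvester's criterion: every leading principal minor must be positive."""
    return all(det([row[:k] for row in m[:k]]) > 0 for k in range(1, len(m) + 1))

# Latent correlations among EF, Gsm, and Gs like those estimated for the EF model;
# the .70 entry is an illustrative assumption.
latent_corr = [[1.00, 0.91, 1.05],
               [0.91, 1.00, 0.70],
               [1.05, 0.70, 1.00]]
print(is_positive_definite(latent_corr))   # False
```

Any correlation matrix with an off-diagonal entry above 1.0 fails this check, so software such as Mplus flags the solution as inadmissible.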
Table 4 shows the estimates and standard errors for the unique variances for each indicator in
the data set. On average, test indicators were made up of 54% (SD = 13) variance explained by
the CHC constructs, 32% (SD = 10) unreliable variance, and 13% (SD = 15) reliable unique vari-
ance. The variance accounted for in the model by the correlated residuals is counted in the unique
variance. The unique variance of the five executive function measures (M = 15%) was not sig-
nificantly different from the unique variance observed for the 22 nonexecutive function measures
(M = 13%; t = .21, df = 25, p = .84).
The reanalyses for the remaining eight data sets produced the same pattern of results as
those observed for the reanalysis of the Duff et al. (2005) data. The full description of the con-
firmatory factor analyses for all nine data sets is provided in the Supplemental Materials. In
Table 4. Unique Variance Estimates and Standard Errors for the Final CHC Model for the Duff,
Schoenberg, Scott, and Adams (2005) Reanalysis.
Note. CHC = Cattell–Horn–Carroll; WMS-R = Wechsler Memory Scale–Revised; WAIS-R = Wechsler Adult
Intelligence Scale–Revised; SE = standard error.
aPaolo, Axelrod, and Tröster (1996).
bRuff, Light, Parker, and Levin (1996).
cGoldstein and Watson (1989).
dWechsler (1987).
eMitrushina and Satz (1991).
fWechsler (1981).
gSnow, Tierney, Zorzitto, Fisher, and Reid (1989).
every case, after the initial CHC model was specified, only one modification was made across
any of the data sets, aside from dropping nonsignificant loadings that had negligible effects on
the fit indices (see Supplemental Materials). As described above, residuals from WAIS-R
Block Design and WAIS-R Object Assembly were allowed to correlate in the Duff et al. reanal-
ysis. The correlation was replicated in the reanalyses of Goldstein and Shelly’s (1972);
Salthouse, Fristoe, and Rhee’s (1996); and Bowden, Cook, Bardenhagen, Shores, and
Carstairs’s (2004) data sets.
The only uncertainty in classifying the measures according to CHC theory was due to the
tactile indicators in the Goldstein and Shelly (1972) data set. Little is known about the latent
structure of tactile tests (Decker, 2010; Stankov, Seizova-Cajić, & Roberts, 2001). The modeling
of the tactile indicators was necessarily partly exploratory, where two alternate models were used
to represent the tactile tests in the reanalysis of Goldstein and Shelly’s data set. While the results
supported a left–right (or nondominant–dominant) dichotomy, further research is necessary to
confirm whether this apparent dichotomy is replicable and whether it extends beyond tactile tests, for example to psychomotor tests.
As shown in Table 3, all CHC models fit excellently according to established cutoff criteria
for approximate fit statistics (Hu & Bentler, 1999). A highly significant loss of fit was observed
in all cases where the CHC model was simplified by merging the most highly correlated factors
(see Supplemental Materials). In all studies where an executive function factor could be specified
alongside the CHC models, the model was inadmissible. Even when the executive function factor
was specified independently from the CHC factors, in all cases the resulting model had a nonpositive definite latent covariance matrix associated with the executive function factor. This suggests that the executive function factor was a linear function of the CHC factors and statistically
redundant. Similarly, in these studies the putative executive function tests did not have signifi-
cantly greater unique variance than nonexecutive function tests (see Supplemental Materials).
Together, these results suggest that there is no distinct general executive function factor and that
the putative executive function indicators do not individually measure specific executive func-
tions separate from CHC constructs.
Discussion
In all reanalyses, the CHC model fit excellently and in line with the widely adopted, conservative
fit guidelines described by Hu and Bentler (1999) and critiqued by Marsh et al. (2004). The finding that the CHC model fit well across all data sets, considering that the data sets shared many tests
in common that were modeled exactly the same for each data set, provides good evidence that the
CHC model is an excellent-fitting model that is replicable and consistent across diverse tests and
populations. In particular, the data sets together provided replicated evidence for the CHC con-
struct validity for many of the most popular neuropsychological tests and batteries (Rabin, Barr,
& Burton, 2005). Furthermore, the CHC construct validity was supported across a range of clini-
cally relevant populations, including patients referred for neuropsychological evaluation, com-
munity, elderly, and at-risk for Alzheimer’s disease populations (see Table 3). Finally, the CHC
model was found to apply equally well to traditional instruments such as the WAIS and putative
executive function measures that are commonly believed to measure constructs beyond the CHC
constructs.
For every data set, the CHC model could not be reduced to fewer factors without significant
loss of fit. This finding has several implications. First, cognitive ability could not be reduced to a
single latent variable, thus showing the superiority of multiple-factor models of cognitive ability
over a single-factor model of general intelligence (Schneider & Newman, 2015). Second, the
results further support the CHC broad factors as distinct, well-supported constructs and the supe-
riority of theory-based confirmatory factor analysis for the selection of the number of factors
over exploratory methods (Keith, Caemmerer, & Reynolds, 2016). Finally, the results suggest
that merging and collapsing across CHC broad factors to produce aggregated constructs such as
executive function is not empirically supported (Jewsbury et al., 2016).
This article was based on the best quality data sets from the first author’s unpublished PhD
dissertation that involved reanalysis of 31 published data sets (Jewsbury, unpublished). Based on
Table 5. Empirically Verified CHC Construct Validity of Popular Neuropsychological Tests.
(continued)
Table 5. (continued)
the results of all 31 reanalyses, empirically verified CHC classification for the most popular clini-
cal cognitive tests is given in Table 5.
That the CHC model fit well in each data set provides evidence that there are no additional constructs
measured by commonly used clinical tests examined in the present study, over and above the
CHC broad factors. The examination of unique variance provided a direct test of the hypothesis
that the unexplained variance is greater for executive as opposed to nonexecutive tests. The
results failed to support the hypothesis that there is more unique variance in putative executive
tests. Furthermore, the size of the estimated unique variances suggests that there is limited capac-
ity for putative executive function tests to have additional predictive and diagnostic utility above
what is attributable to the common factors in the CHC model.
The putative executive function tests were distributed across CHC constructs such as Gs,
Gsm, Gv, and Gf. In other words, tests commonly grouped under the executive function rubric do
not load on the same construct. This finding of heterogeneous construct loadings has two impor-
tant implications. First, the results suggest that there is no unitary executive function construct
underlying all executive function tests, consistent with arguments by Parkin (1998) based on
neuropsychological evidence. Executive function should not be referred to as a separate domain
of cognition on the same level as broad CHC constructs such as processing speed (Gs) and visuo-
spatial abilities (Gv). Averaging or combining various executive function test scores potentially
leads to results that confound cognitive constructs. Therefore, systematic reviews and meta-anal-
yses should not group tests under the executive function rubric. Rather the CHC taxonomy may
be more useful for systematic reviews and meta-analyses (Loughman, Bowden, & D’Souza,
2014). Second, the results suggest that equating executive function with Gf, as has been advo-
cated (e.g., Blair, 2006; Decker, Hill, & Dean, 2007), may be misleading, as not all executive
function tests are Gf tests.
Adoption of the CHC model as the basic taxonomy of cognitive abilities in both clinical and
nonclinical populations would allow for more contentious issues to be properly evaluated. A
common view is that studies of nonclinical or mixed clinical populations may obscure cognitive
differences specific to a certain clinical condition or set of conditions (e.g., Delis, Jacobson,
Bondi, Hamilton, & Salmon, 2003). However, an empirically based and well-supported factor
model does not deny the possibility of condition-specific dimensions of cognition but instead
would allow the issues to be evaluated directly with the methods of measurement invariance
(Meredith, 1993).
Conclusion
Analysis of a representative sample of the best available relevant data sets revealed that the same
cognitive constructs that are reflected in test scores in community and educational samples
appear to underlie individual differences captured by neuropsychological tests, including in vari-
ous clinically relevant populations. The present results suggest that the CHC model of cognitive
abilities is an empirically grounded taxonomy for the evaluation of construct validity of diagnos-
tic cognitive tests and provides a basic theoretical paradigm for clinical cognitive assessment.
Finally, to paraphrase an anonymous reviewer, the results provide evidence for a common tax-
onomy of cognitive abilities that enables greater consistency in the meaning and interpretation of
test results across test batteries and practitioners alike.
Acknowledgments
The authors thank the two anonymous reviewers for their constructive criticism and suggestions.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
The supplementary material is available at http://jpa.sagepub.com/supplemental
References
Ackerman, P. L., & Lohman, D. F. (2006). Individual differences in cognitive functions. In P. A. Alexander
& P. Winne (Eds.), Handbook of educational psychology (2nd ed., pp. 139-161). Mahwah, NJ:
Lawrence Erlbaum.
Alvarez, J. A., & Emory, E. (2006). Executive function and the frontal lobes: A meta-analytic review.
Neuropsychology Review, 16, 17-42. doi:10.1007/s11065-006-9002-x
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.).
Arlington, VA: American Psychiatric Publishing.
Barch, D. M. (2005). The cognitive neuroscience of schizophrenia. Annual Review of Clinical Psychology, 1, 321-353. doi:10.1146/annurev.clinpsy.1.102803.143959
Blair, C. (2006). How similar are fluid cognition and general intelligence? A developmental neuroscience
perspective on fluid cognition as an aspect of human cognitive ability. Behavioral and Brain Sciences,
29, 109-125. doi:10.1017/S0140525X06009034
Bowden, S. C. (2013). Theoretical convergence in assessment of cognition. Journal of Psychoeducational
Assessment, 31, 148-156. doi:10.1177/0734282913478035
Bowden, S. C., Cook, M. J., Bardenhagen, F. J., Shores, E. A., & Carstairs, J. R. (2004). Measurement invari-
ance of core cognitive abilities in heterogeneous neurological and community samples. Intelligence,
32, 363-389. doi:10.1016/j.intell.2004.05.002
Bowden, S. C., Gregg, N., Bandalos, D., Davis, M., Coleman, C., Holdnack, J. A., & Weiss, L. G.
(2008). Latent mean and covariance differences with measurement equivalence in college stu-
dents with developmental difficulties versus the Wechsler Adult Intelligence Scale–III/Wechsler
Memory Scale–III normative sample. Educational and Psychological Measurement, 68, 621-642.
doi:10.1177/0013164407310126
Bowden, S. C., Lange, R. T., Weiss, L. G., & Saklofske, D. H. (2008). Invariance of the measurement model
underlying the Wechsler Adult Intelligence Scale–III in the United States and Canada. Educational and Psychological Measurement, 68, 1024-1040. doi:10.1177/0013164408318769
Bowden, S. C., Lissner, D., McCarthy, K. A. L., Weiss, L. G., & Holdnack, J. A. (2007). Metric and struc-
tural equivalence of core cognitive abilities measured with the Wechsler Adult Intelligence Scale–III in
the United States and Australia. Journal of Clinical and Experimental Neuropsychology, 29, 768-780.
doi:10.1080/13803390601028027
Bowden, S. C., Ritter, A. J., Carstairs, J. R., Shores, E. A., Pead, S., Greeley, J. D., & Clifford, C. C. (2001).
Factorial invariance for combined Wechsler Adult Intelligence Scale–Revised and Wechsler Memory
Scale–Revised scores in a sample of clients with alcohol dependency. The Clinical Neuropsychologist,
15, 69-80. doi:10.1076/clin.15.1.69.1910
Bowden, S. C., Saklofske, D. H., & Weiss, L. G. (2011). Augmenting the core battery with supplementary
subtests: Wechsler Adult Intelligence Scale–IV measurement invariance across the United States and
Canada. Assessment, 18, 133-140. doi:10.1177/1073191110381717
Bowden, S. C., Weiss, L. G., Holdnack, J. A., Bardenhagen, F. J., & Cook, M. J. (2008). Equivalence of a
measurement model of cognitive abilities in U.S. standardization and Australian neuroscience samples.
Assessment, 15, 132-144. doi:10.1177/1073191107309345
Bowden, S. C., Weiss, L. G., Holdnack, J. A., & Lloyd, D. (2006). Age-related invariance of abilities
measured with the Wechsler Adult Intelligence Scale–III. Psychological Assessment, 18, 334-339.
doi:10.1037/1040-3590.18.3.334
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York, NY: Guilford Press.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York, NY:
Cambridge University Press.
Chapman, J. P., & Chapman, L. J. (1983). Reliability and the discrimination of normal and pathological
groups. Journal of Nervous and Mental Disease, 171, 658-661. doi:10.1097/00005053-198311000-
00003
Chaytor, N., & Schmitter-Edgecombe, M. (2003). The ecological validity of neuropsychological tests:
A review of the literature on everyday cognitive skills. Neuropsychology Review, 13, 181-197.
doi:10.1023/B:NERV.0000009483.91468.fb
Chen, H., Keith, T., Chen, Y., & Chang, B. (2009). What does the WISC-IV measure? Validation of the
scoring and CHC-based interpretative approaches. Journal of Research in Education Sciences, 54,
85-108.
Chen, H., & Zhu, J. (2008). Factor invariance between genders of the Wechsler Intelligence Scale for
Children–Fourth Edition. Personality and Individual Differences, 45, 260-266. doi:10.1016/j.
paid.2008.04.008
Chen, H., & Zhu, J. (2012). Measurement invariance of WISC-IV across normative and clinical samples.
Personality and Individual Differences, 52, 161-166. doi:10.1016/j.paid.2011.10.006
Decker, S. L. (2010). Tactile measures in the structure of intelligence. Canadian Journal of Experimental
Psychology, 64, 53-59. doi:10.1037/a0015845
Decker, S. L., Hill, S. K., & Dean, R. S. (2007). Evidence of construct similarity in executive func-
tions and fluid reasoning abilities. International Journal of Neuroscience, 117, 735-748.
doi:10.1080/00207450600910085
Delis, D. C., Jacobson, M., Bondi, M. W., Hamilton, J. M., & Salmon, D. P. (2003). The myth of test-
ing construct validity using factor analysis or correlations with normal or mixed clinical populations:
Lessons from memory assessment. Journal of International Neuropsychological Society, 9, 936-946.
doi:10.1017/S1355617703960139
Denckla, M. B. (1994). Measurement of executive function. In G. R. Lyon (Ed.), Frames of reference for
the assessment of learning disabilities: New views on measurement issues (pp. 117-142). Baltimore,
MD: Paul H. Brookes.
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64, 135-168. doi:10.1146/
annurev-psych-113011-143750
Dickinson, D., Goldberg, T. E., Gold, J. M., Elvevåg, B., & Weinberger, D. R. (2011). Cognitive fac-
tor structure and invariance in people with schizophrenia, their unaffected siblings, and controls.
Schizophrenia Bulletin, 37, 1157-1167. doi:10.1093/schbul/sbq018
Dickinson, D., Ragland, J. D., Calkins, M. E., Gold, J. M., & Gur, R. C. (2006). A comparison of cog-
nitive structure in schizophrenia patients and healthy controls using confirmatory factor analysis.
Schizophrenia Research, 85, 20-29. doi:10.1016/j.schres.2006.03.003
Dodrill, C. B. (1997). Myths of neuropsychology. The Clinical Neuropsychologist, 11, 1-17. doi:10.1080/
13854049708407025
Dodrill, C. B. (1999). Myths of neuropsychology: Further considerations. The Clinical Neuropsychologist,
13, 562-572. doi:10.1076/1385-4046(199911)13:04;1-Y;FT562
Dowling, N. M., Hermann, B., La Rue, A., & Sager, M. A. (2010). Latent structure and factorial invariance
of a neuropsychological test battery for the study of preclinical Alzheimer’s disease. Neuropsychology,
24, 742-756. doi:10.1037/a0020176
Duff, K., Schoenberg, M. R., Scott, J. G., & Adams, R. L. (2005). The relationship between executive
functioning and verbal and visual learning and memory. Archives of Clinical Neuropsychology, 20,
111-122. doi:10.1016/j.acn.2004.03.003
Elliott, C. D. (2007). Differential Ability Scales–II. San Antonio, TX: Pearson.
Flanagan, D. P., McGrew, K. S., & Ortiz, S. O. (2000). The Wechsler Intelligence Scales and Gf-Gc theory:
A contemporary approach to interpretation. Boston, MA: Allyn & Bacon.
Floyd, R. G., Bergeron, R., Hamilton, G., & Parra, G. R. (2010). How do executive functions fit with the
Cattell-Horn-Carroll model? Some evidence from a joint factor analysis of the Delis-Kaplan executive
function system and the Woodcock-Johnson III tests of cognitive abilities. Psychology in the Schools,
47, 721-738. doi:10.1002/pits.20500
Friedman, N. P., Miyake, A., Corley, R. P., Young, S. E., DeFries, J. C., & Hewitt, J. K. (2006). Not
all executive functions are related to intelligence. Psychological Science, 17, 172-179. doi:10.1111/
j.1467-9280.2006.01681.x
Gansler, D. A., Jerram, M. W., Vannorsdall, T. D., & Schretlen, D. J. (2011). Does the Iowa Gambling Task
measure executive function? Archives of Clinical Neuropsychology, 26, 706-717. doi:10.1093/arclin/
acr082
Genderson, M. R., Dickinson, D., Diaz-Asper, C. M., Egan, M. F., Weinberger, D. R., & Goldberg,
T. E. (2007). Factor analysis of neurocognitive tests in a large sample of schizophrenic pro-
bands, their siblings, and healthy controls. Schizophrenia Research, 94, 231-239. doi:10.1016/j.
schres.2006.12.031
Gladsjo, J. A., McAdams, L. A., Palmer, B. W., Moore, D. J., Jeste, D. V., & Heaton, R. K. (2004). A six-
factor model of cognition in schizophrenia and related psychotic disorders: Relationships with clinical
symptoms and functional capacity. Schizophrenia Bulletin, 30, 739-754.
Golden, C. J., Kane, R., Sweet, J., Moses, J. A., Cardellino, J. P., Templeton, R., . . . Graber, B.
(1981). Relationship of the Halstead-Reitan Neuropsychological Battery to the Luria-Nebraska
Neuropsychological Battery. Journal of Consulting and Clinical Psychology, 49, 410-417.
doi:10.1037/0022-006X.49.3.410
Goldstein, G., & Shelly, C. H. (1972). Statistical and normative studies of the Halstead Neuropsychological
Test Battery relevant to a neuropsychiatric hospital setting. Perceptual and Motor Skills, 34, 603-620.
doi:10.2466/pms.1972.34.2.603
Goldstein, G., & Watson, J. R. (1989). Test-retest reliability of the Halstead-Reitan battery and the
WAIS in a neuropsychiatric population. The Clinical Neuropsychologist, 3, 265-272. doi:10.1080/
13854048908404088
Greenaway, M. C., Smith, G. E., Tangalos, E. G., Geda, Y. E., & Ivnik, R. J. (2009). Mayo older
Americans normative studies: Factor analysis of an expanded neuropsychological battery. The Clinical
Neuropsychologist, 23, 7-20. doi:10.1080/13854040801891686
Lezak, M. D., Howieson, D. B., & Loring, D. W. (2004). Neuropsychological assessment (4th ed.). New York,
NY: Oxford University Press.
Loring, D. W., & Larrabee, G. J. (2006). Sensitivity of the Halstead and Wechsler Test Batteries to brain
damage: Evidence from Reitan’s original validation sample. The Clinical Neuropsychologist, 20, 221-
229. doi:10.1080/13854040590947443
Loughman, A., Bowden, S. C., & D’Souza, W. (2014). Cognitive functioning in idiopathic generalised
epilepsies: A systematic review and meta-analysis. Neuroscience & Biobehavioral Reviews, 43, 20-34.
doi:10.1016/j.neubiorev.2014.02.012
MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure
analysis: The problem of capitalization on chance. Psychological Bulletin, 111, 490-504.
Marsh, H. W., Hau, K.-T., & Wen, Z. (2004). In search of golden rules. Structural Equation Modeling, 11,
320-341. doi:10.1207/s15328007sem1103_2
McCabe, D. P., Roediger, H. L., McDaniel, M. A., Balota, D. A., & Hambrick, D. Z. (2010). The relation-
ship between working memory capacity and executive functioning: Evidence for a common executive
attention construct. Neuropsychology, 24, 222-243. doi:10.1037/a0017619
McGrew, K. S. (2005). The Cattell-Horn-Carroll theory of cognitive abilities. In D. P. Flanagan & P. L.
Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (2nd ed., pp. 136-
181). New York, NY: Guilford Press.
McGrew, K. S. (2009). CHC theory and the human cognitive abilities project: Standing on the shoulders of
the giants of psychometric intelligence research. Intelligence, 37, 1-10. doi:10.1016/j.intell.2008.08.004
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58,
525-543. doi:10.1007/BF02294825
Meredith, W., & Teresi, J. A. (2006). An essay on measurement and factorial invariance. Medical Care, 44,
S69-S77. doi:10.1097/01.mlr.0000245438.73837.89
Mitrushina, M., & Satz, P. (1991). Effect of repeated administration of a neuropsychological battery in the
elderly. Journal of Clinical Psychology, 47, 790-801. doi:10.1002/1097-4679(199111)47:6<790::AID-
JCLP2270470610>3.0.CO;2-C
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The
unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A
latent variable analysis. Cognitive Psychology, 41, 49-100. doi:10.1006/cogp.1999.0734
Muthén, L. K., & Muthén, B. O. (2010). Mplus user’s guide version 6. Los Angeles, CA: Author.
Newton, J. H., & McGrew, K. S. (2010). Introduction to the special issue: Current research in Cattell–Horn–
Carroll–based assessment. Psychology in the Schools, 47, 621-634. doi:10.1002/pits.20495
Ortiz, S. O. (2015). CHC theory of intelligence. In S. Goldstein, D. Princiotta, & J. A. Naglieri (Eds.),
Handbook of intelligence: Evolutionary theory, historical perspective, and current concepts (pp. 209-
228). New York, NY: Springer.
Paolo, A. M., Axelrod, B. N., & Tröster, A. I. (1996). Test-retest stability of the Wisconsin Card Sorting
Test. Assessment, 3, 137-143. doi:10.1177/107319119600300205
Parkin, A. J. (1998). The central executive does not exist. Journal of the International Neuropsychological
Society, 4, 518-522. doi:10.1017/S1355617798005128
Penadés, R., Catalán, R., Rubia, K., Andrés, S., Salamero, M., & Gastó, C. (2007). Impaired response
inhibition in obsessive compulsive disorder. European Psychiatry, 22, 404-410. doi:10.1016/j.
eurpsy.2006.05.001
Phelps, L., McGrew, K. S., Knopik, S. N., & Ford, L. (2005). The general (g), broad, and narrow CHC
stratum characteristics of the WJ III and WISC-III tests: A confirmatory cross-battery investigation.
School Psychology Quarterly, 20, 66-88. doi:10.1521/scpq.20.1.66.64191
Pontón, M. O., Gonzalez, J. J., Hernandez, I., Herrera, L., & Higareda, I. (2000). Factor analysis of the
neuropsychological screening battery for Hispanics (NeSBHIS). Applied Neuropsychology, 7, 32-39.
doi:10.1207/S15324826AN0701_5
Rabbitt, P. (1997). Introduction: Methodologies and models in the study of executive function. In P. Rabbitt
(Ed.), Methodology of frontal and executive function (pp. 1-38). Hove, UK: Psychology Press.
Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical neuropsychologists in
the United States and Canada: A survey of INS, NAN, and APA Division 40 members. Archives of
Clinical Neuropsychology, 20, 33-65. doi:10.1016/j.acn.2004.02.005
Reynolds, M. R., Keith, T. Z., Fine, J. G., Fisher, M. E., & Low, J. A. (2007). Confirmatory factor struc-
ture of the Kaufman Assessment Battery for Children: Consistency with Cattell-Horn-Carroll theory.
School Psychology Quarterly, 22, 511-539. doi:10.1037/1045-3830.22.4.511
Reynolds, M. R., Keith, T. Z., Flanagan, D. P., & Alfonso, V. C. (2013). A cross-battery, reference variable,
confirmatory factor analytic investigation of the CHC taxonomy. Journal of School Psychology, 51,
535-555. doi:10.1016/j.jsp.2013.02.003
Roca, M., Parr, A., Thompson, R., Woolgar, A., Torralva, T., Antoun, N., . . . Duncan, J. (2010). Executive
function and fluid intelligence after frontal lobe lesions. Brain, 133, 234-247.
Roid, G. H. (2003). Stanford-Binet Intelligence Scales, Fifth Edition: Technical manual. Itasca, IL: Riverside
Publishing.
Royall, D. R., Lauterbach, E. C., Cummings, J. L., Reeve, A., Rummans, T. A., Kaufer, D. I., . . . Coffey,
C. E. (2002). Executive control function: A review of its promise and challenges for clinical research.
A report from the committee on research of the American Neuropsychiatric Association. The Journal
of Neuropsychiatry & Clinical Neurosciences, 14, 377-405. doi:10.1176/jnp.14.4.377
Ruff, R. M., Light, R. H., Parker, S. B., & Levin, H. S. (1996). Benton Controlled Oral Word Association
Test: Reliability and updated norms. Archives of Clinical Neuropsychology, 11, 329-338. doi:10.1093/
arclin/11.4.329
Salthouse, T. A. (2005). Relations between cognitive abilities and measures of executive functioning.
Neuropsychology, 19, 532-545. doi:10.1037/0894-4105.19.4.532
Salthouse, T. A., Atkinson, T. M., & Berish, D. E. (2003). Executive functioning as a potential mediator
of age-related cognitive decline in normal adults. Journal of Experimental Psychology: General, 132,
566-594. doi:10.1037/0096-3445.132.4.566
Salthouse, T. A., Fristoe, N., & Rhee, S. H. (1996). How localized are age-related effects on neuropsycho-
logical measures? Neuropsychology, 10, 272-285. doi:10.1037/0894-4105.10.2.272
Sanders, S., McIntosh, D. E., Dunham, M., Rothlisberg, B. A., & Finch, H. (2007). Joint confirmatory fac-
tor analysis of the Differential Ability Scales and the Woodcock-Johnson Tests of Cognitive Abilities–
Third Edition. Psychology in the Schools, 44, 119-138. doi:10.1002/pits.20211
Schneider, W. J., & Flanagan, D. P. (2015). The relationship between theories of intelligence and intelligence
tests. In S. Goldstein, D. Princiotta, & J. A. Naglieri (Eds.), Handbook of intelligence: Evolutionary
theory, historical perspective, and current concepts (pp. 317-340). New York, NY: Springer.
Schneider, W. J., & McGrew, K. S. (2012). The Cattell–Horn–Carroll model of intelligence. In D. P.
Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues
(3rd ed., pp. 99-144). New York, NY: Guilford Press.
Schneider, W. J., & Newman, D. A. (2015). Intelligence is multidimensional: Theoretical review and impli-
cations of specific cognitive abilities. Human Resource Management Review, 25, 12-27.
Shallice, T. (1982). Specific impairments of planning. Philosophical Transactions of the Royal Society of
London, Series B: Biological Sciences, 298, 199-209. doi:10.1098/rstb.1982.0082
Sherer, M., Scott, J. G., Parsons, O. A., & Adams, R. L. (1994). Relative sensitivity of the WAIS-R subtests
and selected HRNB measures to the effects of brain damage. Archives of Clinical Neuropsychology, 9,
427-436. doi:10.1093/arclin/9.5.427
Siedlecki, K. L., Manly, J. J., Brickman, A. M., Schupf, N., Tang, M. X., & Stern, Y. (2010). Do neu-
ropsychological tests have the same meaning in Spanish speakers as they do in English speakers?
Neuropsychology, 24, 402-411. doi:10.1037/a0017515
Snow, W. G., Tierney, M. C., Zorzitto, M. L., Fisher, R. H., & Reid, D. (1989). WAIS-R test-retest reliabil-
ity in a normal elderly sample. Journal of Clinical and Experimental Neuropsychology, 11, 423-428.
doi:10.1080/01688638908400903
Spooner, D. M., & Pachana, N. A. (2006). Ecological validity in neuropsychological assessment: A case
for greater consideration in research with neurologically intact populations. Archives of Clinical
Neuropsychology, 21, 327-337. doi:10.1016/j.acn.2006.04.004
Stankov, L., Seizova-Cajić, T., & Roberts, R. D. (2001). Tactile and kinesthetic perceptual processes
within the taxonomy of human cognitive abilities. Intelligence, 29, 1-29. doi:10.1016/S0160-
2896(00)00038-6
Strauss, E., Sherman, E. M., & Spreen, O. (2006). A compendium of neuropsychological tests: Administration,
norms, and commentary (3rd ed.). New York, NY: Oxford University Press.
Strauss, M. E., & Smith, G. T. (2009). Construct validity: Advances in theory and methodology. Annual
Review of Clinical Psychology, 5, 1-25. doi:10.1146/annurev.clinpsy.032408.153639
Taub, G. E., & McGrew, K. S. (2004). A confirmatory factor analysis of Cattell-Horn-Carroll theory and
cross-age invariance of the Woodcock-Johnson tests of cognitive abilities III. School Psychology
Quarterly, 19, 72-87. doi:10.1521/scpq.19.1.72.29409
Tucker, L. R. (1958). An inter-battery method of factor analysis. Psychometrika, 23, 111-136. doi:10.1007/
BF02289009
Tuokko, H. A., Chou, P. H. B., Bowden, S. C., Simard, M., Ska, B., & Crossley, M. (2009). Partial measure-
ment equivalence of French and English versions of the Canadian Study of Health and Aging neuropsy-
chological battery. Journal of the International Neuropsychological Society, 15, 416-425. doi:10.1017/
S1355617709090602
Tusing, M. E., & Ford, L. (2004). Examining preschool cognitive abilities using a CHC framework.
International Journal of Testing, 4, 91-114. doi:10.1207/s15327574ijt0402_1
Wechsler, D. (1981). Wechsler Adult Intelligence Scale–Revised. New York, NY: The Psychological
Corporation.
Wechsler, D. (1987). Wechsler Memory Scale–Revised. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (2003). Wechsler Intelligence Scale for Children–Fourth Edition (WISC-IV). San Antonio,
TX: The Psychological Corporation.
Weiss, L. G., Keith, T. Z., Zhu, J., & Chen, H. (2013a). WAIS-IV and clinical validation of the four-
and five-factor interpretative approaches. Journal of Psychoeducational Assessment, 31, 94-113.
doi:10.1177/0734282913478030
Weiss, L. G., Keith, T. Z., Zhu, J., & Chen, H. (2013b). WISC-IV and clinical validation of the four-
and five-factor interpretative approaches. Journal of Psychoeducational Assessment, 31, 114-131.
doi:10.1177/0734282913478032
Whitely, S. E. (1983). Construct validity: Construct representation versus nomothetic span. Psychological
Bulletin, 93, 179-197. doi:10.1037/0033-2909.93.1.179
Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments:
Applications in the substance use domain. In K. Bryant & M. Windle (Eds.), The science of prevention:
Methodological advances from alcohol and substance abuse research (pp. 281-324). Washington, DC:
American Psychological Association.
Woodcock, R. W. (1990). Theoretical foundations of the WJ-R measures of cognitive ability. Journal of
Psychoeducational Assessment, 8, 231-258. doi:10.1177/073428299000800303
Woodcock, R. W., McGrew, K. S., & Mather, N. (2001). Woodcock-Johnson Tests of Achievement. Itasca,
IL: Riverside Publishing.