
The development and psychometric validation of a Critical Thinking Disposition Scale
Edward M. Sosu ∗
School of Education, University of Aberdeen, King’s College, AB24 5UA Aberdeen, UK

Article history: Received 14 May 2012; Received in revised form 16 August 2012; Accepted 12 September 2012; Available online xxx.

Keywords: Critical thinking; Dispositions; Psychometric validation; Scale development; Multigroup confirmatory factor analysis

Abstract

This article describes the development and psychometric evaluation of a Critical Thinking Disposition Scale (CTDS). Items for the scale were generated using taxonomies of important thinking dispositions discussed in the literature. Participants comprised two cohorts of first-year undergraduate and graduate students enrolled on a programme in education. Psychometric evaluation was carried out in two studies. In Study 1, an exploratory factor analysis (n = 467) revealed a two-factor model for the scale: Critical Openness and Reflective Scepticism. In Study 2 (n = 371), a multigroup confirmatory factor analysis (MGCFA) supported the two-factor model across both undergraduate and graduate groups. Results from the MGCFA showed that both groups understood the items in a similar way, and the CTDS successfully discriminated between these theoretically different groups. Educators, psychologists and researchers may find the CTDS a useful tool for measuring individuals' disposition to critical thinking. Immediate future research should focus on establishing the strength of the relationship between the CTDS and other cognitive measures of critical thinking.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Critical thinking has become an educational ideal with most policy makers and educationists calling for the development
of critical attitudes in students (Ennis, 2008; McBride, Xiang, & Wittenburg, 2002; Stapleton, 2010). Most definitions of
critical thinking acknowledge the importance of both cognitive (i.e. skills) and dispositional (i.e. propensity) dimensions in
the thinking process (Facione, Sanchez, Facione, & Gainen, 1995; Fasko, 2003; Ku, 2009; Lawrence, Serdikoff, Zinn, & Baker,
2009). For instance, McPeck (1981) defined critical thinking as “the propensity and skill to engage in an activity with reflective
scepticism” (p. 8). A more recent perspective which combines elements from other definitions proposes that critical thinking
is “the propensity and skills to engage in activity and ‘mental activity’ with reflective skepticism focused on deciding what
to believe or do. . .” (Fasko, 2003, p. 8).
The cognitive dimension of critical thinking has a long tradition and generally emphasises reasoning and logical thinking,
skills closely associated with intellectual ability. It focuses on an individual’s ability to comprehend a problem and to come up
with reasonable solutions for the identified problem. This dimension has been the most researched and there are numerous
instruments dedicated to its measurement (see e.g., Ku, 2009). Key cognitive skills usually include the ability to make inferences, recognise assumptions, and carry out deductions, interpretations, analyses, and evaluations of arguments (e.g., American Philosophical Association [APA], 1990; Halpern, 1998; Watson & Glaser, 1980). Dispositions, on the other hand, concern the tendency to do something (Ennis, 1996), or the manner in which individuals approach a task (Ku & Ho, 2010). In the literature, critical

∗ Tel.: +44 1224274518. E-mail address: e.sosu@abdn.ac.uk


thinking dispositions have been defined as a constellation of attitudes, intellectual virtues or habits of mind, describing the way an individual reasons, argues and makes decisions (Facione et al., 1995; Perkins, Jay, & Tishman, 1993). According to
some theorists, having a disposition to think critically implies having the ability to do so. These theorists argue that critical
thinking ability can exist without disposition to use it, but having a disposition implies having the associated ability (Facione
et al., 1995; Norris, 2003). This latter view is not entirely supported by the available evidence. However, there is consistent
support for the existence of both a dispositional and a cognitive dimension (e.g., Ku & Ho, 2010; Macpherson & Stanovich,
2007; Taube, 1997; West, Toplak, & Stanovich, 2008).
In contrast to the cognitive domain, interest in the dispositional dimension is a more recent phenomenon. Several empirical studies linking critical thinking dispositions to performance in differing domains, as well as to wider psychological characteristics, have emerged over the past two decades. For instance, the disposition to think critically has been linked with improved academic performance, deep learning, good professional practice, professional expertise, anxiety, ego-resilience and overcoming cognitive bias in reasoning (El-sayed, Sleem, El-sayed, & Ramada, 2011; Facione et al., 1995; Fahim, Bagherkazemi, & Alemi, 2010; Kwon, Onwuegbuzie, & Alexander, 2007; Macpherson & Stanovich, 2007; West et al., 2008). Significant relationships have also been found between disposition to critical thinking and some personality characteristics (e.g., Clifford, Boufal, & Kurtz, 2004). To this end, it has been suggested that personality characteristics such as openness to experience are conceptually related to critical thinking (Clifford et al., 2004).
This emergent interest in thinking dispositions has, however, not been matched by attention to their measurement. A review of the extant literature shows a paucity of instruments for measuring thinking dispositions, and theorists within the field have lamented the limited attention given to this issue (Edman, 2009; Ennis, 2003; Halpern, 2003; Ku, 2009; Norris, 2003). This is despite the fact that the availability of suitable dispositional measures is crucial in determining whether programmes have been successful in nurturing critical thinking attitudes in participants (Ku, 2009). The absence of valid and reliable dispositional instruments also calls into question claims that thinking dispositions are associated with improved performance in various domains of life. The main objective of the current study was therefore to develop a Critical Thinking Disposition Scale and to evaluate its psychometric properties.

2. Taxonomies of important thinking dispositions

Several taxonomies of important thinking dispositions have been described in the literature (Table 1). These taxonomies generally range from broad characteristics that are empirically derived to specific theoretical propositions. The first systematic approach to defining thinking dispositions was taken in 1987 by the American Philosophical Association (APA), which commissioned a study to explore the notion of critical thinking and its operationalisation for purposes of assessment. Findings from this study, known as the Delphi Report (APA, 1990), proposed 19 broad dispositions which critical thinkers are expected to possess. Drawing on the Delphi Report, Facione and Facione (1992) argued, following an exploratory factor analysis, for seven rather than the original 19 dispositional dimensions of critical thinking (cf. Table 1).

Table 1
Taxonomies of important thinking dispositions.

APA Delphi Report (1990), 19 dispositions: Inquisitiveness; well-informed; alertness to use CT; trust in reasoned inquiry; self-confidence in one's own ability to reason; open-mindedness; flexibility in considering alternatives; understand opinions of others; fair-mindedness; honesty in facing own biases; prudence in making judgments; revise views where change is warranted; clarity in stating concern; working with complexity; diligence in seeking relevant information; reasonableness in selecting and applying criteria; focusing attention on the concern at hand; persistence in face of difficulties; precision.

Facione and Facione (1992), 7 dispositions: Inquisitiveness; open-mindedness; systematicity; analyticity; truth-seeking; critical thinking self-confidence; maturity.

Perkins, Jay, and Tishman (1993), 7 dispositions: Broad and adventurous; sustain intellectual curiosity; clarify and seek understanding; planful and strategic; intellectually careful; seek and evaluate reasons; metacognitive.

Halonen (1995), 5 dispositions: Tentativeness; scepticism; tolerance of ambiguity; appreciation of individual differences; regard for ethical practices.

Ennis (1996), 12 dispositions: Seek alternatives and be open to them; endorse a position when it is justified to do so; well-informed; consider other points of view; clear about intended meaning; determine, and maintain focus on, the conclusion or question; seek and offer reasons; take into account the total situation; reflectively aware of own beliefs; discover and listen to others' views and reasons; take into account others' feelings and level of understanding; be concerned about others' welfare.

Halpern (1998), 5 dispositions: Willingness to engage in and persist at a complex task; habitual use of plans and the suppression of impulsive activity; flexibility or open-mindedness; willingness to abandon nonproductive strategies in an attempt to self-correct; awareness of social realities so that thoughts can become actions.


Both the Delphi Report and subsequent studies by Facione and colleagues provided the first comprehensive empirical synthesis of what should characterise a disposition to critical thinking.
Several other theorists have identified discrete characteristics that indicate a disposition to critical thinking (Table 1). However, very little overarching theoretical perspective connects these discrete characteristics. Only Perkins et al. (1993) combined their proposed dispositions into a comprehensive theory. According to their triadic dispositional theory of thinking, there are seven "master" dispositions that characterise good thinking. Each of the seven dispositions is further characterised by a triad of inclinations, sensitivities and abilities. For instance, the disposition to be broad and adventurous entails the tendency to be 'open-minded' (inclination), to be 'alert to sweeping generalities' (sensitivity), and to 'adopt flexible thinking' (ability).

An important observation in Table 1 is the similarity in the dispositional characteristics identified by the different taxonomies. Notions of open-mindedness, intellectual curiosity, and reflective thinking pervade the different taxonomies. Further, the taxonomies suggest that dispositions to critical thinking are multidimensional. However, there is no clarity about the ideal number of dispositional dimensions. It can be argued that some of the discrete dispositions could be clustered into more parsimonious dimensions due to similarities in meaning. This potential for parsimony was acknowledged by Perkins et al. (1993) in discussing their triadic model. Finally, questions exist about whether or not all identified characteristics constitute dispositions to critical thinking. Cases in point are 'critical thinking self confidence' (Facione & Facione, 1992) and 'being concerned about others' welfare' (Ennis, 1996). While these constitute desirable virtues, they may not be strictly defined as dispositions to critical thinking. Another example is the sensitivities and abilities components identified in Perkins et al.'s model. Ennis (1996) contends that sensitivities and abilities are not essential for every disposition. Most theorists (e.g., Ennis, 1996; Halpern, 1998) agree that abilities are best constituted as a separate dimension of critical thinking.

3. Measurement

There is general agreement in the extant literature that measures of critical thinking should capture and reflect cognitive as
well as dispositional components. However, as mentioned earlier, analyses of existing instruments show a greater emphasis
on measurement of cognitive dimensions with little or no consideration given to the dispositional components of critical
thinking (Ennis, 2003; Halpern, 2003; Ku, 2009; Norris, 2003). Currently, the only existing instrument specifically developed to measure thinking dispositions is the California Critical Thinking Disposition Inventory (CCTDI: Facione & Facione, 1992). This instrument comprises 75 items measuring seven dispositions. Although widely used, the scale is not without its problems. Results from cross-validation studies show inconsistencies in the pattern of item loadings, excessive cross-loading of items, overlap of constructs, and instability of the hypothesised factor structure (e.g., Bondy, Koenigse, Ishee, & Williams, 2001; Kakai, 2003; Walsh & Hardy, 1997; Walsh, Seldomridge, & Badros, 2007), calling into question the validity and reliability of the CCTDI subscales. These issues have led authors to recommend the creation of a shorter version of the CCTDI (e.g., Walsh et al., 2007).
Alternative instruments used by researchers to measure critical thinking dispositions include the Need for Cognition Scale (Cacioppo, Petty, Feinstein, & Jarvis, 1996) and adapted versions of the NEO Five-Factor Inventory (Costa & McCrae, 1992). However, these instruments have not been specifically developed to measure thinking dispositions, and are therefore of limited explanatory power.
On the whole, the recognition that cognitive skills alone are not a sufficient measure of critical thinking has not been matched by progress in the development of dispositional instruments (Ennis, 2008; Halpern, 1998). This absence of appropriate
instruments makes it difficult to ascertain the extent to which intervention programmes aimed at improving critical thinking
attitudes have been successful (Aukes, Geertsma, Cohen-Schotanus, Zwierstra, & Slaets, 2007; Kember & Leung, 2000; Ku,
2009). It also calls into question the evidential basis for claims about critical thinking dispositions. The urgent need to develop
new and more adequate dispositional measures has been recognised in the literature (e.g., Edman, 2009; Ennis, 2003; Ku,
2009; Norris, 2003). The current study aims to fill this research gap by developing an instrument to measure critical thinking
dispositions. It attempts to address the validity and reliability difficulties that have plagued existing dispositional measures
by adopting exploratory and confirmatory strategies to validate the new instrument. Specifically, the current study is aimed
at achieving two main goals:

1. To develop a Critical Thinking Disposition Scale (CTDS), and
2. To establish the validity and reliability of the instrument using multigroup structural equation modelling techniques.

4. Method

4.1. Item development

An exploratory approach was used in constructing items for the critical thinking dispositions scale. To ensure comprehensive coverage of key dispositions, each of the taxonomies discussed above (see Table 1) was drawn on to generate an initial pool of 98 items. Each taxonomy was considered individually and a pool of items was generated to reflect the key

dispositions identified in the particular taxonomy. During this process, discussions were held with two other researchers in Education regarding whether identified dispositions actually qualified as critical thinking. No items were generated for a disposition where there was a lack of consensus within the literature that it constitutes a disposition to critical thinking. For instance, items were not generated for Facione and Facione's (1992) 'critical thinking self-confidence' and 'maturity' dispositions, or for Ennis' (1996) 'concern for others' welfare' scale, because these were deemed not to constitute strict dimensions of critical thinking. The initial pool of items was subsequently reviewed and refined to eliminate duplications. Forty-six items were retained through this process.
The second stage involved structuring the 46 items as five-point Likert-type responses (1 = strongly disagree, 5 = strongly
agree) for piloting. Six faculty colleagues with varying disciplinary backgrounds and four graduate students attending a
summer school in measurement and structural equation modelling were requested to complete the questionnaires to see if
items were relevant measures of critical thinking disposition, and if choices were easy to make. They were also encouraged
to check for possible ambiguities and ease of interpretability of items. Several items were flagged, reworded or deleted
through this process to ensure content validity. A final 24-item pool was retained for empirical testing.

4.2. Participants and procedure

Two separate samples were used for the study. Participants in each sample were made up of first-year undergraduate students in education, and graduate students enrolled on a one-year education programme. The first sample (obtained in 2008) consisted of a total of 467 (128 undergraduate and 338 graduate) students; 79% were female and 21% male. The average age band of this sample was 26. The second sample (obtained in 2009) consisted of a total of 371 (114 undergraduate and 257 graduate) students; 78% were female and 22% male. The average age band was 25. The questionnaires were administered at a general lecture. A covering letter inviting participants to take part in the study explained its purpose and assured participants of the confidentiality of the data provided. Participants completed the questionnaire and all responses were collected immediately after the lecture.

4.3. Data analysis

A strategy employing exploratory factor analysis (EFA) followed by multigroup confirmatory factor analysis (MGCFA) was adopted. The EFA was conducted using data from the first sample, followed by an MGCFA on the second sample. Both analyses were conducted using the latent variable software program Mplus 6.0 (Muthen & Muthen, 2010). The use of Mplus provided an opportunity for testing the goodness-of-fit of both models.

4.3.1. Exploratory factor analysis


The EFA was aimed at ascertaining the initial factor structure of the items specified in the instrument and at retaining those items that exhibited good psychometric properties. EFA was useful at this stage of scale development because a strong theory was lacking (Koufteros, 1999). The procedure helps to determine the number of latent factors that underlie a set of items and the degree to which the items are related to the factors (Kahn, 2006). In so doing, EFA contributes to the development of a theory which can subsequently be tested using more confirmatory procedures (Haig, 2005; Henson & Roberts, 2006; Kahn, 2006; Russell, 2002).

4.3.2. Multigroup confirmatory factor analysis


Multigroup confirmatory factor analysis (MGCFA) was used to test the stability of the scale derived from the exploratory analysis across different groups in an attempt to establish construct validity. Confirmatory analysis helps to determine the structural validity and reliability of measurement instruments (Kline, 2005; Noar, 2003). More specifically, MGCFA provides an opportunity for testing measurement invariance (equivalence) across different groups (Joreskog, 1971; Steenkamp & Baumgartner, 1998). Measurement invariance examines the extent to which a proposed factor structure, the meaning of items and latent means are equivalent across different groups (Byrne, 2012; Davidov, 2008). Invariance is generally seen as a prerequisite for comparing latent constructs across different groups (Davidov, 2008).
The test of invariance involves several logical steps with increasing levels of restrictions (Byrne, 2012; Vandenberg &
Lance, 2000). In a preliminary phase, baseline models that fit each group are established separately. When models for each
group fit the data well, one can proceed to impose invariance constraints (Byrne, Shavelson, & Muthen, 1989). The first
stage of measurement invariance begins with the least restrictive model, configural invariance. This involves simultaneously
testing the baseline models in a single analysis where the factor structure (number of factors and items) is the same across
groups. This is the only requirement and no other equality constraints are imposed. Achievement of configural invariance
suggests that the structure of the model is the same across the different groups. This provides the model against which
models with stricter restrictions are compared (Byrne, 2012; Vandenberg & Lance, 2000).
Metric invariance constitutes the next level in the test of equivalency. This requires specifying the factor structure as well
as factor loadings to be equal across the different groups (Byrne, 2012; Vandenberg & Lance, 2000). Achievement of metric
invariance suggests that the meaning of constructs is the same across the different groups, and that participants in both
groups understand the questions in the same way (Davidov, 2008). Thus, item level scores can be meaningfully compared
across groups (Richardson, Ratner, & Zumbo, 2007).


The final and highest level of invariance is scalar invariance. This assesses the equivalence of item intercepts across groups.
Here, item intercepts, factor structure and factor loadings are specified as equal across groups. Scalar invariance is needed
to justify mean comparisons across groups (Meredith, 1993). Scalar invariance implies that differences in item means are
the results of differences in the means of corresponding latent factors (Byrne, 2012; Davidov, 2008).
Full measurement invariance (i.e., all factor loadings and intercepts constrained to be equal) is most often not achieved.
Where that is the case, many researchers argue that the testing of partial invariance is a legitimate process (Byrne, 2012;
Steenkamp & Baumgartner, 1998). Partial invariance involves relaxing the model restrictions so that some items are allowed
to vary across the different groups. According to Byrne et al. (1989), achievement of partial measurement invariance is
sufficient for cross-group comparison especially where the number of noninvariant items is small.
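To make the hierarchy concrete, it can be summarised in standard multigroup CFA notation (the equations below are a sketch added for exposition, not reproduced from the article). For the item vector in group g:

    \begin{aligned}
    \text{Model:}\quad & \mathbf{x}_g = \boldsymbol{\tau}_g + \boldsymbol{\Lambda}_g \boldsymbol{\xi}_g + \boldsymbol{\delta}_g \\
    \text{Configural:}\quad & \text{same pattern of fixed zeros in } \boldsymbol{\Lambda}_g \text{ across groups} \\
    \text{Metric:}\quad & \boldsymbol{\Lambda}_1 = \boldsymbol{\Lambda}_2 = \boldsymbol{\Lambda} \\
    \text{Scalar:}\quad & \boldsymbol{\Lambda}_1 = \boldsymbol{\Lambda}_2,\ \boldsymbol{\tau}_1 = \boldsymbol{\tau}_2
    \ \Rightarrow\ \mathrm{E}(\mathbf{x}_g) = \boldsymbol{\tau} + \boldsymbol{\Lambda}\boldsymbol{\kappa}_g
    \end{aligned}

Under full or partial scalar invariance, group differences in observed item means are attributable to differences in the latent factor means κ_g, which is what licenses the latent mean comparison reported in Section 5.2.5.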

4.3.3. Model evaluation


The exploratory and confirmatory models were evaluated using multiple indices, following the state of practice (Boomsma, 2000; MacCallum & Austin, 2000; McDonald & Ho, 2002). The Satorra–Bentler χ2 goodness-of-fit statistic, the Tucker–Lewis Index (TLI), the comparative fit index (CFI), the root mean square error of approximation (RMSEA) and the standardised root mean square residual (SRMR) were used in evaluating the various models. Model fit is generally achieved when the χ2 value is not significant (Kline, 2005). However, due to the sensitivity of the χ2 statistic (Hu & Bentler, 1995), it was not relied upon to determine model fit. Other rules of thumb were used to determine the adequacy of the overall model. Models with TLI and CFI values greater than .90 and .95 were taken as indicative of "acceptable" and "good" fit respectively (Hu & Bentler, 1999; Marsh, Hau, & Wen, 2004; Medsker, Williams, & Holahan, 1994). With respect to the RMSEA, values lower than .05 indicate good fit, with those ranging from .05 to .08 indicative of acceptable fit (Browne & Cudeck, 1993; Hu & Bentler, 1999; Kline, 2005). SRMR values below .08 were considered evidence of good fit while those just below .10 signified acceptable fit (Hu & Bentler, 1999; Kline, 2005). Other criteria for model acceptability included the significance of parameter estimates, and the interpretability of the obtained solution.
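Purely as an illustrative aid (not part of the article), these rules of thumb can be gathered into a small helper; the function name and the way the cut-offs are combined are this sketch's own choices:

    def fit_verdict(cfi, tli, rmsea, srmr):
        """Classify model fit using the cut-offs adopted above
        (Hu & Bentler, 1999; Browne & Cudeck, 1993; Kline, 2005)."""
        if cfi >= .95 and tli >= .95 and rmsea < .05 and srmr < .08:
            return "good"
        if cfi >= .90 and tli >= .90 and rmsea <= .08 and srmr < .10:
            return "acceptable"
        return "poor"

    # Example: the configural model reported later in Table 5(a).
    print(fit_verdict(cfi=.93, tli=.92, rmsea=.057, srmr=.056))  # "acceptable"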
The sequence of invariance tests implies that the models are nested. It is therefore possible to evaluate the decrease in model fit for each additional restriction; absence of deterioration in the model suggests support for invariance. For the present study, the corrected chi-square difference test (ΔMLMχ2) was computed to evaluate the difference in model fit for the nested models (Byrne, 2012; MPlus Support, 2010). There is support for invariance if the chi-square difference test is non-significant, as this shows that the model is not deteriorating. The choice of model evaluation indices for EFA and CFA was influenced by the availability and type of fit indices reported in Mplus. All items were scored in the same direction prior to analysis.
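For readers wishing to reproduce such comparisons, the scaled (Satorra–Bentler) chi-square difference can be computed from the scaled chi-squares, degrees of freedom and scaling correction factors that Mplus reports (Satorra & Bentler, 2001). A minimal sketch; the article's tables report the resulting difference values but not the scaling correction factors, so the correction factors in the example call are hypothetical:

    from scipy.stats import chi2

    def sb_chisq_diff(t0, df0, c0, t1, df1, c1):
        """Scaled chi-square difference test for nested MLM models.
        t0, df0, c0: scaled chi-square, df and scaling correction factor of
        the more restrictive model; t1, df1, c1: the less restrictive model."""
        cd = (df0 * c0 - df1 * c1) / (df0 - df1)   # correction for the difference
        trd = (t0 * c0 - t1 * c1) / cd             # scaled chi-square difference
        ddf = df0 - df1
        return trd, ddf, chi2.sf(trd, ddf)

    # Hypothetical correction factors c0 = 1.10, c1 = 1.12, for illustration only.
    trd, ddf, p = sb_chisq_diff(t0=147.80, df0=95, c0=1.10, t1=137.43, df1=86, c1=1.12)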

5. Results

5.1. Study 1: exploratory factor analysis

A total sample of 467 (sample 1) was used for the EFA to determine the latent structure of the critical thinking disposition items. Descriptive statistics and the correlation matrix were used to screen the 24 items. Items that displayed no correlation with other items were considered poor and eliminated. This statistical analysis was combined with a critical review of each statement to make sure that the pattern of responses and the wording of items were meaningful and free from ambiguity. This screening process led to an iterative elimination of 11 items that were deemed to be ambiguous and could possibly introduce error into the results. The mean values for the remaining 13 items ranged from 3.67 (item U) to 4.53 (item J). The standard deviations ranged from 0.49 to 0.77, and the skew and kurtosis indices from −0.18 to −1.33 and −0.06 to 6.91 respectively. Considering the sample size, it can be argued that the skew and kurtosis indices reported here did not make a substantive difference to the analysis (Tabachnick & Fidell, 2007).
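A sketch of such a screen in pandas (illustrative only, not the author's code; the 0.20 flagging threshold and the `items` DataFrame of 24 item scores are assumptions of this example):

    import numpy as np
    import pandas as pd

    def screen_items(items, min_r=0.20):
        """Return item descriptives and flag items whose highest absolute
        correlation with any other item falls below `min_r`."""
        r = items.corr()
        np.fill_diagonal(r.values, np.nan)     # ignore self-correlations
        max_r = r.abs().max()
        flagged = max_r[max_r < min_r].index.tolist()
        desc = pd.DataFrame({"mean": items.mean(), "sd": items.std(),
                             "skew": items.skew(), "kurtosis": items.kurt(),
                             "max_inter_item_r": max_r})
        return desc, flagged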
The decision on the total number of latent factors to extract was determined using multiple criteria. Kaiser’s (1960) rule
(eigenvalue > 1) revealed the presence of three factors while Cattell’s (1966) scree test suggested the extraction of two or
three factors. The two factor solution of the scree test was supported by results of Horn’s (1965) parallel analysis, a procedure
that is considered a more accurate assessment of the number of factors to retain (Henson & Roberts, 2006). The eigenvalues generated from the EFA and from the parallel analysis, which was conducted using MonteCarlo parallel analysis software (Watkins, 2000), are shown in Table 2. A factor is retained for extraction if its EFA eigenvalue is greater than the corresponding parallel analysis value (Pallant, 2006).

Table 2
Comparison of results from EFA and parallel analysis.

Component   Eigenvalue generated from EFA   Criterion value from parallel analysis(a)   Decision
1           3.968                           1.283                                       Accept
2           1.224                           1.215                                       Accept
3           1.009                           1.159                                       Reject

(a) Based on a randomly generated data matrix similar to the EFA sample (13 variables × 467 respondents).
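The parallel-analysis criterion in Table 2 can be re-derived with a short simulation. A minimal numpy sketch, assuming a 467 × 13 score matrix (the article used Watkins' MonteCarlo PA program; taking the mean, rather than e.g. the 95th percentile, of the random eigenvalues is one common convention):

    import numpy as np

    def parallel_analysis(data, n_iter=1000, seed=0):
        """Horn's (1965) parallel analysis: compare observed eigenvalues of
        the item correlation matrix with mean eigenvalues from random normal
        data of the same shape (n respondents x p items)."""
        rng = np.random.default_rng(seed)
        n, p = data.shape
        obs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
        rand = np.empty((n_iter, p))
        for i in range(n_iter):
            sim = rng.standard_normal((n, p))
            rand[i] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
        crit = rand.mean(axis=0)
        return obs, crit, obs > crit   # retain factor k while obs[k] > crit[k]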


Table 3
CF-equamax pattern/structure coefficients for the Critical and Reflective Thinking Scale.

Item   Pattern I   Pattern II   Structure I   Structure II   Communality
N      .530        .087         .576          .370           .337
C      .506        .090         .554          .360           .313
D      .493        .090         .540          .352           .298
A      .485        .109         .543          .368           .303
O      .389        −.036        .396          .171           .137
J      .351        .057         .381          .244           .148
E      .344        .097         .395          .280           .163
T      −.032       .883         .439          .866           .750
S      .098        .612         .425          .664           .448
K      .257        .341         .439          .478           .276
U      .255        .399         .468          .535           .333

NB: See Appendix A for the wording of items. Pattern loadings above .30 are treated as salient.

The data were specified as continuous variables and estimated using the maximum likelihood (ML) estimator with CF-
equamax rotation. Unlike other methods, ML provides the means to conduct significance tests and to derive confidence
intervals (Fabrigar, Wegener, MacCallum, & Strahan, 1999). The ML method is also recommended when the goal is to deter-
mine the latent factors that underlie one’s data (Fabrigar et al., 1999; Russell, 2002). CF-equamax is an oblique rotation
that produces unbiased factor pattern loadings (Schmitt & Sass, 2011). It is particularly suitable when a new measure is
being developed and item quality is questionable; items are thought to measure multiple factors; and/or researchers seek to
remove items with large cross loadings (Schmitt & Sass, 2011). A .30 cut-off point was specified and used for extraction. This
cut-off value represents factor loadings that are of practical significance (Cudeck & O’Dell, 1994). Additionally, it allowed for
the retention of a sufficient number of items in order to have stable factors (Kahn, 2006).
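As an illustration of this extraction step outside Mplus, the sketch below uses the open-source factor_analyzer package. Two hedges: factor_analyzer does not implement the CF-equamax rotation used in the article, so an oblique oblimin rotation is substituted, and `items` is assumed to be a DataFrame holding the 13 retained item scores:

    import pandas as pd
    from factor_analyzer import FactorAnalyzer

    fa = FactorAnalyzer(n_factors=2, method="ml", rotation="oblimin")
    fa.fit(items)                                  # items: 467 x 13 DataFrame
    loadings = pd.DataFrame(fa.loadings_, index=items.columns,
                            columns=["Factor I", "Factor II"])
    print(loadings.round(3))                       # pattern coefficients
    print(fa.phi_)                                 # factor correlations (oblique rotations)
    print(fa.get_communalities().round(3))         # communalities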
The initial specification for the EFA requested two factors. The fit indices suggested adequate fit for the two-factor model: χ2(53) = 116.55, TLI = .91, CFI = .94, RMSEA = .051 (90% CI .038–.063), SRMR = .037. Inspection of the rotated and structural loadings showed that 'item R' cross-loaded on both factors and its values were below the cut-off value of .30. The deletion of this item resulted in the simultaneous deletion of another item (item P) with loadings below the cut-off value. The deletion of these two items resulted in improved model fit, χ2(34) = 56.52, TLI = .96, CFI = .98, RMSEA = .038 (90% CI .019–.055), SRMR = .028, with all items loading uniquely onto only one factor. The final CF-equamax rotated
loadings and structural loadings are presented in Table 3. The first factor was labelled ‘Critical Openness’ and the second
‘Reflective Scepticism’. The highest loading item on the Critical Openness scale was item N (‘I usually try to think about the
bigger picture during a discussion’), while that for Reflective Scepticism was item T (‘I often re-evaluate my experiences so
that I can learn from them’). The observed correlation between the two latent factors was moderate (r = .53). The coefficient
of internal consistency for the total scale in Study 1 was high (Cronbach’s alpha = .79).
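Because the raw data are not reproduced here, this coefficient cannot be re-computed from the article alone, but the standard formula is compact; a minimal sketch:

    import numpy as np

    def cronbach_alpha(scores):
        """Cronbach's alpha for an (n_respondents, n_items) array:
        alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]
        item_var = scores.var(axis=0, ddof=1).sum()
        total_var = scores.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_var / total_var)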
Following the recommendations of several authors (e.g., Fabrigar et al., 1999; Henson & Roberts, 2006; Kahn, 2006), alternative rotations and varying numbers of factors were iteratively explored. These, however, did not provide meaningful and interpretable solutions. For instance, although three- and four-factor models appeared to fit the data well, the additional factors were difficult to interpret. The presence of only a limited number of items per factor also raised questions about the reliability and stability of these factors. Since the extraction of one, three or four factors was not supported by results from the parallel analysis, subsequent evaluation of the EFA proceeded on the basis of retaining two factors.

5.2. Study 2: confirmatory factor analysis

The aim of the CFA was twofold: firstly, to test the fit of the hypothesised two-factor model obtained in Study 1 and, secondly, to test measurement invariance across the undergraduate and graduate samples. Acceptable evidence of invariance would enable an exploration of mean differences between the two groups. A different sample (sample 2) was used for this analysis. All models were estimated using the robust maximum likelihood estimator (MLM) due to the non-normal distribution
of the data (Byrne, 2012; Muthen & Muthen, 2010). Descriptive statistics for the final 11 items in Study 2 are as follows:
Mean values ranged from 3.76 (item S) to 4.51 (item J). The standard deviation ranged from 0.53 to 0.77, and the skew and
kurtosis indices from −0.29 to −0.73 and −0.36 to 1.49 respectively. These skew and kurtosis indices are unlikely to make
any substantive difference to the analysis due to the sample size (Tabachnick & Fidell, 2007).

5.2.1. Baseline model


As discussed earlier, testing of multigroup invariance is preceded by the establishment of a well-fitting baseline model for each group (Byrne, 2012). Since baseline models have no between-group constraints, they can be analysed separately for each group. Baseline models were specified for the undergraduate and graduate groups using results from Study 1 (Fig. 1). Analysis of the two-factor model revealed less than optimal fit for both the undergraduate and the graduate group (Table 4, Model 1). However, all items loaded significantly onto their respective factors.


[Fig. 1. Hypothesised two-factor CFA model for the Critical Thinking Disposition Scale: Reflective Scepticism measured by items K, S, T and U; Critical Openness measured by items A, C, D, E, J, N and O.]

Inspection of the models' modification indices showed that a residual covariance involving items A and C would contribute to the improvement of the model for both groups. These two items were relatively similar in wording and specified to measure the same construct. It is, however, important to note that the procedure of correlating residuals in SEM methodology is controversial (e.g., Landis, Edwards, & Cortina, 2009). Indicator error covariance is only advisable where there is a substantive theoretical basis for doing so, and an opportunity to cross-validate the results with an independent sample (e.g., Brown, 2006). The suggested error correlation between items A ('I am often on the lookout for new ideas') and C ('I often use new ideas to shape the way I do things') is theoretically justifiable on the grounds that both items contain very similar wording involving the notion of "new ideas", and are specified to measure the same construct. Evidence within the SEM literature suggests that similarly worded items can induce error correlation between indicators (e.g., Brown, 2006; Kline, 2005; Podsakoff, MacKenzie, Lee, & Podsakoff, 2003; Schwartz et al., 2012), and allowing residual correlations to account for similar item wording is not problematic (Lindwall, Asci, & Hagger, 2011; Schwartz et al., 2012). Alternative approaches to dealing with similarly worded items include either deletion of one item or a test to evaluate the internal consistency of the items (e.g., Grawitch & Barber, 2010; Landis et al., 2009).
It was reasoned that items A and C each represented a unique dimension of the 'Critical Openness' scale: even though they are similarly worded, they differ in meaning. As a result, a decision was taken to allow for the residual covariance and to test internal consistency rather than deleting one item. Additionally, the MGCFA procedure employed within this study provided an opportunity to cross-validate this new specification in different independent samples. A respecified model incorporating the covariance between these items led to significant improvement in the model (Table 4, Model 2), representing achievement of acceptable model fit for both groups. The specified covariance was also statistically significant. Although there was evidence of misspecifications unique to each group, a decision was taken not to carry out further modification because of difficulties in cross-validating such changes with an independent sample. The final baseline models and standardised factor loadings are represented in Figs. 2 and 3.
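A minimal single-group sketch of this baseline specification, for readers who want to reproduce it outside Mplus. It uses the open-source semopy package with lavaan-style syntax as a stand-in; note that semopy's default ML estimator differs from the MLM estimator used in the article:

    import semopy

    # Two-factor baseline model; 'A ~~ C' adds the residual covariance.
    desc = """
    CriticalOpenness =~ A + C + D + E + J + N + O
    ReflectiveScepticism =~ K + S + T + U
    A ~~ C
    """
    model = semopy.Model(desc)
    model.fit(df)                    # df: respondents-by-items DataFrame
    print(model.inspect())           # parameter estimates
    print(semopy.calc_stats(model))  # chi-square, CFI, TLI, RMSEA, ...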

5.2.2. Configural invariance


Having established separate baseline models for the undergraduate and graduate groups, a test for configural invariance
was undertaken. This involved simultaneously estimating the separate baseline models as a multigroup model without any
equality constraints (Byrne, 2012). Assessment of this model indicated a good fit (Table 5a). All items produced significant loadings on their specified factors.

5.2.3. Metric invariance


The next level of invariance tested was metric invariance. Metric invariance guarantees that undergraduate and graduate participants understand the items in a similar way and that the factors are related to the items equally across both groups. The parameters of interest were the factor loadings and the commonly specified residual covariance. In the first

Table 4
Baseline confirmatory factor analysis for undergraduate and graduate students.

Undergraduate baseline model
  Model 1: initial EFA model                  MLMχ2 = 88.66,  df = 43, CFI = .79, TLI = .74, RMSEA = .097, SRMR = .074
  Model 2: covariance between items A and C   MLMχ2 = 56.72,  df = 42, CFI = .93, TLI = .91, RMSEA = .056, SRMR = .055   (2 vs 1: ΔMLMχ2 = 15.82, Δdf = 1, p = .001)

Graduate baseline model
  Model 1: initial EFA model                  MLMχ2 = 132.87, df = 43, CFI = .84, TLI = .80, RMSEA = .090, SRMR = .066
  Model 2: covariance between items A and C   MLMχ2 = 76.96,  df = 42, CFI = .94, TLI = .92, RMSEA = .057, SRMR = .051   (2 vs 1: ΔMLMχ2 = 55.92, Δdf = 1, p = .001)


[Fig. 2. Final undergraduate baseline model for the Critical Thinking Disposition Scale, showing standardised factor loadings and the A–C residual covariance; the correlation between Reflective Scepticism and Critical Openness was .83.]

[Fig. 3. Final graduate baseline model for the Critical Thinking Disposition Scale, showing standardised factor loadings and the A–C residual covariance; the correlation between Reflective Scepticism and Critical Openness was .81.]

Table 5
Test for invariance of the Critical Reflective Thinking Scale across undergraduate and graduate students.

Configural invariance
  (a) No constraints: MLMχ2 = 137.43, df = 86, CFI = .93, TLI = .92, RMSEA = .057, SRMR = .056

Metric invariance
  (b) All factor loadings invariant: MLMχ2 = 147.80, df = 95, CFI = .93, TLI = .92, RMSEA = .055, SRMR = .062 (b vs a: ΔMLMχ2 = 10.08, Δdf = 9, ns)
  (c) All factor loadings invariant; 1 common residual covariance invariant: MLMχ2 = 148.69, df = 96, CFI = .93, TLI = .92, RMSEA = .055, SRMR = .064 (c vs b: ΔMLMχ2 = 0.462, Δdf = 1, ns)

Scalar invariance
  (d) All factor loadings invariant; 1 common residual covariance invariant; all item intercepts invariant: MLMχ2 = 232.88, df = 107, CFI = .84, TLI = .83, RMSEA = .080, SRMR = .115 (d vs c: ΔMLMχ2 = 84.353, Δdf = 11, p < .001)
  (e) Partial scalar invariance (all factor loadings invariant; 1 common residual covariance invariant; all item intercepts invariant except items A, D, K, O, T and U): MLMχ2 = 173.72, df = 101, CFI = .91, TLI = .90, RMSEA = .063, SRMR = .077 (e vs c: ΔMLMχ2 = 24.905, Δdf = 5, p < .001)

NB: Model comparison requires comparing a more restrictive model with a less restrictive one. Model (e) was compared with (c) because it is more restrictive than (c); model (e) cannot be compared with (d) because (d) is more restrictive.


step, factor loadings between the items and corresponding factors were constrained to be the same across the two groups.
Goodness-of-fit statistics related to this model indicated a good model fit (Table 5b). All factor loadings were significant
and model fit was not significantly different from the configural model. The next step of measurement invariance involved
testing invariance of the commonly specified residual covariance. Although the testing of equivalence of residual covariance
is generally not recommended, it provides useful reliability information about items with commonly specified residual
covariances (Byrne, 2012). Model fit results were very similar to the model testing for invariant factor loadings (Table 5c).
The results overall indicated good model fit. This finding suggests that the specified residual covariance between items ‘A
and C’ is operating equivalently across the undergraduate and graduate groups (Byrne, 2012). In other words, the degree of
overlap between the two items is the same across the two groups. The conclusion is that both student groups understand
the questions equally and items load equally on their corresponding factors in the two groups.

5.2.4. Scalar invariance


The final analysis was a test for scalar invariance which allows for the comparison of factor means across groups. This
was conducted by constraining the intercepts of the items to be equal across the two groups in addition to factor loadings
and the common residual covariance. Results from the analysis of this model suggested that it should be rejected (Table 5d).
In other words, there was no support for invariance of intercepts across the two groups. The model was less fitting than
the models that were previously tested and only the RMSEA index was acceptable. Examination of the modification indices
suggests that allowing some of the intercepts to vary across both groups could improve this model. This will imply the
testing of a partial scalar invariance, a condition that is sufficient for mean comparison across groups (Byrne, 2012; Byrne
et al., 1989; Steenkamp & Baumgartner, 1998). Simultaneous testing of subsequent models in which six of the intercepts
were freely estimated (items A, D, K, O, T, U) resulted in acceptable model fit (Table 5e). Although a more improved model
fit was achievable by freely estimating more intercepts across both groups, the current model was chosen on the grounds of
parsimony and to allow for an identified model in the testing of latent mean differences (see e.g., Byrne, 2012, pp. 248–254).

5.2.5. Differences in latent means


The achievement of partial scalar invariance provided a legitimate basis for carrying out a test of latent mean differences between the undergraduate and graduate groups. Comparing means across groups requires constraining the factor mean for one group to zero in order to identify the model (Muthen & Muthen, 2010); this group then serves as the reference group against which the latent mean for the other group is compared. This makes it possible to test whether latent means differ across groups. The factor mean for the graduate group was constrained to zero. The fit for this model was good: χ2(97) = 158.82, TLI = .91, CFI = .92, RMSEA = .059 (90% CI .042–.075), SRMR = .064. The results indicated significant differences in latent mean scores across the groups on both the 'Critical Openness' and the 'Reflective Scepticism' scales. The difference in latent mean score for the undergraduate students was −.59 for Critical Openness and −.45 for Reflective Scepticism. This corresponds to a 4.13 (.59 × 7) and a 1.8 (.45 × 4) point difference on the total subscale scores between undergraduate and graduate students respectively. In line with the expected hypothesis, the results show that graduate students scored significantly higher on both dimensions of critical thinking disposition than undergraduate students. The coefficient of internal consistency for the total scale in Study 2 was high (Cronbach's alpha = .81).

6. Discussion and conclusion

The current study was aimed at achieving two main goals. These were to develop an instrument for the assessment of
critical thinking dispositions, and to establish the validity and reliability of the instrument. Results from exploratory (EFA)
and multigroup confirmatory factor analyses (MGCFA) provided support for the Critical Thinking Disposition Scale (CTDS).
A two-factor model obtained through EFA was subsequently confirmed through MGCFA on a separate sample. The MGCFA also showed that the hypothesised two-factor model was the same for both undergraduate and graduate students, and that both groups understood the items in the scale in a similar way. The coefficient of internal consistency for the scale was equally high in both Studies 1 and 2. Finally, there was evidence to suggest that the instrument successfully discriminated between groups that were theoretically different, and that comparison of mean scores across groups yielded reliable and valid results.
The CTDS is an 11-item instrument that measures two dispositional domains: 'Critical Openness' and 'Reflective Scepticism'. The Critical Openness subscale reflects the tendency to be actively open to new ideas, critical in evaluating these ideas, and willing to modify one's thinking in the light of convincing evidence. The Reflective Scepticism subscale, on the other hand, conveys the tendency to learn from one's past experiences and to be questioning of evidence. The two dimensions appear to capture the perspectives inherent in definitions of critical thinking and in the taxonomies of thinking dispositions (see e.g., Ennis, 1996; Fasko, 2003; Perkins et al., 1993). The Critical Openness subscale is similar to the open-mindedness construct, which is one of the few agreed-upon dispositions and has consistently been found to be distinct from other dispositional dimensions (Ennis, 1996; Facione & Facione, 1992). The elements of reflection and scepticism captured by the second subscale are implicit in definitions of critical thinking (see e.g., Fasko, 2003; Halonen, 1995; McPeck, 1981). Overall, these two subscales capture very important dimensions of critical thinking disposition. The argument from the findings of this study is that both a degree of Critical Openness and a degree of Reflective Scepticism are required for an individual to have a disposition to critical thinking. This parsimonious two-factor dispositional taxonomy can serve as a useful framework for research into the measurement of critical thinking.


Support for the validity and reliability of the CTDS comes from the MGCFA. Results from this analysis show that the factor structure of the CTDS is equivalent across undergraduate and graduate groups and that participants in both groups understood the items in the same way. Additionally, the instrument was able to discriminate between the two groups in line with general expectations. For instance, the difference in latent means observed between graduates and undergraduates is concomitant with general developmental perspectives and with previous studies which suggest that years of formal education have a significant impact on critical thinking processes (McCarthy, Schuster, Zehr, & McDougal, 1999; Perry, 1970; Profetto-McGrath, 2003). Furthermore, it concurs with findings from previous studies that graduate students show higher levels of thinking dispositions in comparison to their undergraduate peers (Kember & Leung, 2000). Finally, results of Cronbach's alpha show a high coefficient of internal consistency for the scale. Together, these results demonstrate strong evidence of reliability and validity.
There are several practical and research implications of the current study. Firstly, educators, psychologists and researchers may find the CTDS a useful tool for collecting information on students' disposition to critical thinking at the start and end of an intervention programme. The absence of suitable dispositional measures calls into question the extent to which previous programmes have been successful in nurturing critical thinking attitudes in participants (Ku, 2009); the current study helps to fill this gap by providing a dedicated dispositional instrument. Secondly, the instrument can be used as a screening test to identify aspects of critical thinking disposition that need to be nurtured in students, and to provide feedback that will assist students in developing critical attitudes (Ennis, 2003; Ku, 2009). For example, results from the instrument can serve as a catalyst for discussion with students and for providing them with information on aspects of thinking dispositions that they need to develop. Thirdly, the findings can inform future research on critical thinking dispositions. Researchers can use this instrument in longitudinal studies that aim to evaluate the impact of dispositions on various domains of learning and professional performance. The proposed two-factor dispositional taxonomy can also provide a parsimonious framework within which discussions and alternative measures could be conceptualised.
In developing this scale, the current study has been innovative in several ways. Previous approaches to instrument development have mostly relied on principal component analysis (e.g., Facione & Facione, 1992). This is one of the few studies to use a combination of exploratory and multigroup confirmatory factor analysis to test the stability of hypothesised structures in the development of a new instrument. The method used in this study can therefore provide a useful strategy for future studies aimed at developing new measurement instruments. Further, this is one of the very few studies that have attempted to test the invariance of a critical thinking dispositional measure across different groups. Finally, the study proposed a two-factor dispositional taxonomy of critical thinking. This model is more parsimonious than existing models and incorporates the key dispositional elements inherent in existing taxonomies.
The limitations of the study should be taken into account in using this instrument. Firstly, the CTDS measures only the dispositional dimension of critical thinking. This does not mean that the emphasis should be on dispositions alone; rather, the dispositional measure should be used in combination with existing cognitive measures to gain a fuller understanding of a person's critical thinking (Ennis, 2003; Ku, 2009). Secondly, the sample for the current study involved undergraduate and graduate students enrolled on one programme, and this could potentially bias the nature of responses. It is, however, important to state that the graduate students had taken courses in different disciplines prior to enrolling on the programme and so should bring some variability to the responses. Thirdly, the observed correlation between the two factors is quite high. This may suggest the existence of redundant factors or of a higher-order structure. As indicated earlier, the test of a one-factor model produced a poorer fit to the data. Considering that the observed correlations are below the general cut-off of .85 (e.g., Brown, 2006), it can be concluded that the two factors measure a higher-order factor, 'disposition to critical thinking'. Fourthly, the potential for socially desirable responding cannot be ignored. A way to address this is to incorporate measures of social desirability during future data collection and to evaluate their effects on the responses provided. Fifthly, it is important to note that the current instrument is a general dispositional measure and it is possible that thinking dispositions are domain specific (McPeck, 1990). However, such general dispositions are desirable as they can help test the assumption of transferability of dispositions across domains (e.g., Ennis, 2008; Perkins et al., 1993). Finally, the case for a two-factor model does not necessarily reject the validity of other theoretical constructs that have been proposed to measure disposition to critical thinking. It may be that future advances in measurement theory will lead to the development of more comprehensive instruments.
It is anticipated that initial future research with the CTDS will seek to provide further evidence for the validity of the instrument. Immediate research will focus on establishing the strength of the relationship between this dispositional measure and other cognitive measures of critical thinking, as well as its predictive validity. This is in line with the claim that capturing both the attitudinal and the cognitive component is needed to provide a holistic evaluation of an individual's critical thinking performance (Ku, 2009). Additional focus will be placed on varying the sample of the study to cover other disciplines and professional groups in order to establish broader reliability for the instrument. Finally, there are plans to vary the response length (e.g., 6–9) and anchor words (e.g., Not at all – Always) in an attempt to further develop the robustness of the instrument.
In conclusion, although several theorists have emphasised the importance of critical thinking dispositions, little empirical work has been carried out to develop dedicated instruments for measuring thinking dispositions. Apart from the California Critical Thinking Disposition Inventory, the current study represents the only attempt at developing a dedicated instrument for measuring dispositions to critical thinking. The study employed both exploratory and multigroup confirmatory factor analysis techniques to offer evidence for the validity, reliability, and stability of the factor structure of the CTDS. It also proposes


a parsimonious dispositional taxonomy. It is hoped that the CTDS will be used to aid future applied work and research on
critical thinking.

Appendix A.

The following preliminary suggestions can be used in deriving composite scores. The scores of the 11 items can be summed
to provide an overall dispositional score for an individual with a range of 11–55. Scores between 11 and 34 will indicate low
disposition; 35–44 moderate disposition; and 45–55 high disposition. A useful strategy should also include examination of
subscale scores. The total score for the Critical Openness scale will range from 7 to 35 with the following cut-offs (7–21 low;
22–28 moderate; 29–35 high). Reflective Scepticism will have a range of 4–20 with cut-off ranges being 4–12 low; 13–16
moderate; and 17–20 high.

Critical Openness
  N   I usually try to think about the bigger picture during a discussion
  C   I often use new ideas to shape (modify) the way I do things
  D   I use more than one source to find out information for myself
  A   I am often on the lookout for new ideas
  O   I sometimes find a good argument that challenges some of my firmly held beliefs
  J   It's important to understand other people's viewpoint on an issue
  E   It is important to justify the choices I make

Reflective Scepticism
  T   I often re-evaluate my experiences so that I can learn from them
  S   I usually check the credibility of the source of information before making judgements
  K   I usually think about the wider implications of a decision before taking action
  U   I often think about my actions to see whether I could improve them
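These scoring rules translate directly into code. A small sketch (the function name and dictionary interface are this example's own; `responses` holds the 1–5 Likert scores keyed by item letter):

    def ctds_scores(responses):
        """Composite CTDS scores using the preliminary cut-offs above."""
        openness = ["N", "C", "D", "A", "O", "J", "E"]   # subscale range 7-35
        scepticism = ["T", "S", "K", "U"]                # subscale range 4-20

        def band(score, low_max, mod_max):
            # low up to low_max, moderate up to mod_max, high above
            return "low" if score <= low_max else ("moderate" if score <= mod_max else "high")

        co = sum(responses[i] for i in openness)
        rs = sum(responses[i] for i in scepticism)
        total = co + rs                                  # overall range 11-55
        return {"critical_openness": (co, band(co, 21, 28)),
                "reflective_scepticism": (rs, band(rs, 12, 16)),
                "total": (total, band(total, 34, 44))}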

References

American Philosophical Association. (1990). Critical thinking: A statement of expert consensus for purposes of educational assessment and instruction (“The
Delphi Report”). ERIC Document Reproduction, No. ED 315423.
Aukes, L. C., Geertsma, J., Cohen-Schotanus, J., Zwierstra, R. P., & Slaets, J. P. J. (2007). The development of a scale to measure personal reflection in medical
practice and education. Medical Teacher, 29, 177–182.
Bondy, K. N., Koenigseder, L. A., Ishee, J. H., & Williams, B. G. (2001). Psychometric properties of the California critical thinking tests. Journal of Nursing
Measurement, 9, 309–328.
Boomsma, A. (2000). Reporting analyses of covariance structures. Structural Equation Modeling, 7(3), 461–483.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: The Guilford Press.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen, & J. S. Long (Eds.), Testing structural equation models (pp. 136–162).
Newbury Park, CA: Sage.
Byrne, B. M. (2012). Structural equation modeling with Mplus: Basic concepts, applications, and programming. NY, USA: Taylor & Francis Group.
Byrne, B. M., Shavelson, R. J., & Muthen, B. O. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement
invariance. Psychological Bulletin, 105, 456–466.
Cacioppo, J. T., Petty, R. E., Feinstein, J. A., & Jarvis, W. B. G. (1996). Dispositional differences in cognitive motivation: The life and times of individuals varying in
need for cognition. Psychological Bulletin, 119, 197–253.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276.
Clifford, J. S., Boufal, M. M., & Kurtz, J. E. (2004). Personality traits and critical thinking skills in college students: Empirical tests of a two-factor theory.
Assessment, 11, 169–176.
Costa, P. T., & McCrae, R. R. (1992). Revised NEO personality inventory and NEO five factor inventory: Professional manual. Odessa, FL: Psychological Assessment Resources.
Cudeck, R., & O’Dell, L. L. (1994). Applications of standard error estimates in unrestricted factor analysis: Significance tests for factor loadings and correlations.
Psychological Bulletin, 115, 475–487.
Davidov, E. (2008). A cross-country and cross-time comparison of the human values measurements with the second round of the European social survey.
Survey Research Methods, 2, 33–46.
Edman, L. R. O. (2009). Are they ready yet? Developmental issues in teaching thinking. In D. S. Dunn, J. S. Halonen, & R. A. Smith (Eds.), Teaching critical
thinking in psychology: A handbook of best practices (pp. 35–48). NJ, USA: Wiley-Blackwell.
El-sayed, R. S., Sleem, W. F., El-sayed, N. M., & Ramada, F. A. (2011). Disposition of staff nurses’ critical thinking and its relation to quality of their performance
at Mansoura University Hospital. Journal of American Science, 7, 388–395.
Ennis, R. H. (1996). Critical thinking dispositions: Their nature and assessability. Informal Logic, 18, 165–182.
Ennis, R. H. (2003). Critical thinking assessment. In D. Fasko (Ed.), Critical thinking and reasoning: Current research, theory and practice (pp. 293–313). Cresskill,
NJ: Hampton Press.
Ennis, R. H. (2008). Nationwide testing of critical thinking for higher education: Vigilance required. Teaching Philosophy, 31, 1–26.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological
Methods, 4, 272–299.
Facione, P. A., & Facione, N. C. (1992). California critical thinking disposition inventory. Millbrae, CA: California Academic Press.
Facione, P. A., Sanchez, C. A., Facione, N. C., & Gainen, J. (1995). The dispositions towards critical thinking. Journal of General Education, 44, 1–25.
Fahim, M., Bagherkazemi, M., & Alemi, M. (2010). The relationship between test takers' critical thinking ability and their performance on the reading section
of TOEFL. Journal of Language Teaching and Research, 1, 830–837.
Fasko, D. (2003). Critical thinking: Origins, historical development, future directions. In D. Fasko (Ed.), Critical thinking and reasoning: Current research, theory
and practice (pp. 3–18). Cresskill, NJ: Hampton Press.
Grawitch, M. J., & Barber, L. K. (2010). Work flexibility or nonwork support? Theoretical and empirical distinctions for work-life initiatives. Consulting
Psychology Journal: Practice and Research, 62, 169–188.
Haig, B. D. (2005). Exploratory factor analysis, theory generation, and scientific method. Multivariate Behavioral Research, 40, 303–329.
Halonen, J. S. (1995). Demystifying critical thinking. Teaching of Psychology, 22, 75–81.

Halpern, D. F. (1998). Teaching critical thinking for transfer across domains: Dispositions, skills, structure training, and metacognitive monitoring. American
Psychologist, 53, 449–455.
Halpern, D. F. (2003). The “how” and “why” of critical thinking assessment. In D. Fasko (Ed.), Critical thinking and reasoning: Current research, theory and
practice (pp. 355–366). Cresskill, NJ: Hampton Press.
Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common errors and some comment on improved practice.
Educational and Psychological Measurement, 66, 393–416.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179–185.
Hu, L., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues and applications (pp. 76–99). Thousand
Oaks, CA: Sage.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation
Modeling, 6, 1–55.
Joreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409–426.
Kahn, J. H. (2006). Factor analysis in counseling psychology research, training, and practice: Principles, advances, and applications. The Counseling Psychol-
ogist, 34(5), 684–718.
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141–151.
Kakai, H. (2003). Re-examining the factor structure of the California critical thinking disposition inventory. Perceptual and Motor Skills, 96, 435–438.
Kember, D., & Leung, D. Y. P. (2000). Development of a questionnaire to measure the level of reflective thinking. Assessment and Evaluation in Higher Education,
25, 381–389.
Kline, R. B. (2005). Principles and practice of structural equation modeling. New York, USA: The Guilford Press.
Koufteros, X. A. (1999). Testing a model of pull production: A paradigm for manufacturing research using structural equation modelling. Journal of Operations
Management, 17, 467–488.
Ku, K. Y. L. (2009). Assessing students’ critical thinking performance: Urging for measurements using multi-response format. Thinking Skills and Creativity,
4, 70–76.
Ku, K. Y. L., & Ho, I. T. (2010). Dispositional factors predicting Chinese students’ critical thinking performance. Personality and Individual Differences, 48,
54–58.
Kwon, N., Onwuegbuzie, A. J., & Alexander, L. (2007). Critical thinking disposition and library anxiety: Affective domains on the space of information seeking
and use in academic libraries. College and Research Libraries, 68, 268–278.
Landis, R. S., Edwards, B. D., & Cortina, J. M. (2009). On the practice of allowing correlated residuals among indicators in structural equation modeling. In C.
E. Lance, & R. J. Vandenberg (Eds.), Statistical and methodological myths and urban legends (pp. 193–218). New York, NY: Routledge.
Lawrence, N. K., Serdikoff, S. L., Zinn, T. E., & Baker, S. C. (2009). Have we demystified critical thinking? In D. S. Dunn, J. S. Halonen, & R. A. Smith (Eds.),
Teaching critical thinking in psychology: A handbook of best practices (pp. 23–33). NJ, USA: Wiley-Blackwell.
Lindwall, M., Asci, H., & Hagger, M. S. (2011). Factorial validity and measurement invariance of the Revised Physical Self-Perception Profile (PSPP-R) in three
countries. Psychology, Health & Medicine, 16, 115–128.
MacCallum, R. C., & Austin, J. T. (2000). Application of structural equation modeling in psychological research. Annual Review of Psychology, 51,
201–226.
Macpherson, R., & Stanovich, K. E. (2007). Cognitive ability, thinking dispositions, and instructional set as predictors of critical thinking. Learning and
Individual Differences, 17, 115–127.
Marsh, H. W., Hau, K., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis testing approaches to setting cutoff values for fit indexes and
dangers in overgeneralizing Hu and Bentler’s (1999) findings. Structural Equation Modeling, 11, 320–341.
McBride, R. E., Xiang, P., & Wittenburg, D. (2002). Dispositions toward critical thinking: The preservice teacher’s perspective. Teachers and Teaching: Theory
and Practice, 8, 29–40.
McCarthy, P., Schuster, P., Zehr, P., & McDougal, D. (1999). Evaluation of critical thinking in a baccalaureate nursing program. Journal of Nursing Education, 38,
142–144.
McDonald, R. P., & Ho, M.-H. R. (2002). Principles and practice in reporting structural equation analyses. Psychological Methods, 7(1), 64–82.
McPeck, J. E. (1981). Critical thinking and education. New York: St. Martin’s Press.
McPeck, J. E. (1990). Critical thinking and subject specificity: A reply to Ennis. Educational Researcher, 19(4), 10–12.
Medsker, G. J., Williams, L. J., & Holahan, P. J. (1994). A review of current practices for evaluating causal models in organisational behavior and human
resources management research. Journal of Management, 20, 439–464.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525–543.
Mplus Support. (2010). Chi-square difference testing using the Satorra-Bentler scaled chi-square. http://www.statmodel.com/chidiff.shtml
Muthen, L. K., & Muthen, B. O. (2010). Mplus user’s guide (6th ed.). Los Angeles: Authors.
Noar, S. M. (2003). The role of structural equation modeling in scale development. Structural Equation Modeling, 10(4), 622–647.
Norris, S. P. (2003). The meaning of critical thinking test performance: The effects of abilities and dispositions on scores. In D. Fasko (Ed.), Critical thinking
and reasoning: Current research, theory and practice (pp. 315–329). Cresskill, NJ: Hampton Press.
Pallant, J. (2006). A step-by-step guide to data analysis using SPSS version 15. England: Open University Press.
Perkins, D. N., Jay, E., & Tishman, S. (1993). Beyond abilities: A dispositional theory of thinking. Merrill-Palmer Quarterly, 39, 1–21.
Perry, W. G. (1970). Forms of intellectual and ethical development in the college years. New York: Academic Press.
Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and
recommended remedies. Journal of Applied Psychology, 88(5), 879–903.
Profetto-McGrath, J. (2003). The relationship between critical thinking skills and critical thinking disposition of baccalaureate nursing students. Journal of
Advanced Nursing, 43, 569–577.
Richardson, C. G., Ratner, P. A., & Zumbo, B. D. (2007). A test of the age-based measurement invariance and temporal stability of Antonovsky’s Sense of
Coherence Scale. Educational and Psychological Measurement, 67, 679–696.
Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality
and Social Psychology Bulletin, 28, 1629–1646.
Schmitt, T. A., & Sass, D. A. (2011). Rotation criteria and hypothesis testing for exploratory factor analysis: Implications for factor pattern loadings and
interfactor correlations. Educational and Psychological Measurement, 71, 95–113.
Schwartz, S. J., Park, I. J. K., Huynh, Q.-L., Zamboanga, B. L., Umaña-Taylor, A. J., Lee, R. M., et al. (2012). The American identity measure: Development and
validation across ethnic group and immigrant generation. Identity: An International Journal of Theory and Practice, 12, 93–128.
Stapleton, P. (2010). A survey of attitudes towards critical thinking among Hong Kong secondary school teachers: Implications for policy change. Thinking
Skills and Creativity, 6, 14–23.
Steenkamp, J.-B. E. M., & Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25,
78–90.
Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). New York: Pearson Education Inc.
Taube, K. T. (1997). Critical thinking ability and disposition as factors of performance on a written critical thinking test. Journal of General Education, 46,
129–164.
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices and recommendations for
organizational research. Organizational Research Methods, 3, 4–69.
Walsh, C. M., & Hardy, R. C. (1997). Factor structure stability of the California Critical Thinking Disposition Inventory across sex and various students’ majors.
Perceptual and Motor Skills, 85, 1211–1228.

Walsh, C. M., Seldomridge, L. A., & Badros, K. K. (2007). California Critical Thinking Disposition Inventory: Further factor analytic examination. Perceptual
and Motor Skills, 104, 141–151.
Watkins, M. W. (2000). Monte Carlo PCA for parallel analysis [Computer software]. State College, PA: Ed & Psych Associates.
Watson, G., & Glaser, E. M. (1980). Watson–Glaser critical thinking appraisal. Cleveland, OH: Psychological Corporation.
West, R. F., Toplak, M. E., & Stanovich, K. E. (2008). Heuristics and biases as measures of critical thinking: Associations with cognitive ability and thinking
dispositions. Journal of Educational Psychology, 100, 930–941.
