Evolving the validity of a mental toughness measure: Refined versions of the
Mental Toughness Questionnaire-48
Article in Stress and Health · November 2020

DOI: 10.1002/smi.3004
Running head: EVOLVING THE VALIDITY OF A MENTAL TOUGHNESS MEASURE
Evolving the validity of a mental toughness measure:
Refined versions of the Mental Toughness Questionnaire-48
Masato Kawabata1,2*, Toby G. Pavey3, Tristan J. Coulter3
1 Nanyang Technological University,
National Institute of Education, Singapore 637616, Singapore
E-mail: masato-k@hotmail.com
*corresponding author
2 The University of Queensland,
School of Human Movement and Nutrition Sciences, Brisbane QLD 4072, Australia
3 Queensland University of Technology,
School of Exercise and Nutrition Sciences, Kelvin Grove, QLD 4059, Australia
Manuscript accepted: 28 October 2020

Acknowledgements
The present study was conducted without financial support or preregistration. Parts of
this paper were presented at the Association for Applied Sport Psychology's 2018 Annual
Conference, Toronto, Canada (October 2018).
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or
financial relationships that could be construed as a potential conflict of interest.
Data Accessibility Statement
The datasets used and/or analyzed during the current study will be available from authors
on reasonable request.
Abstract
The Mental Toughness Questionnaire-48 (MTQ48) is a 48-item self-report instrument to
measure one’s level of mental toughness. Despite its wide popularity in psychological
studies, the questionnaire has been criticized for its weak factorial validity. The present study
aimed to re-assess the factorial validity of the instrument and propose alternative models to
provide researchers with theoretically and practically useful instruments to measure mental
toughness. Two studies were conducted using large samples of university students (Study 1: n
= 2,186; Study 2: n = 3,209). In Study 1, none of the 1-, 4-, and 6-factor models with 48 items
fit the data set satisfactorily. Instead, two refined 18- and 6-item versions of the
questionnaire, covering 6 aspects of mental toughness, were proposed: the Short MTQ and
Very Short MTQ. Both measures demonstrated excellent fit to the data. These results were
replicated with a larger independent sample in Study 2. With the Short MTQ, it is possible to
represent mental toughness as a multidimensional construct consisting of a global mental
toughness factor and 6 specific factors. The Very Short MTQ is a practical tool for occasions
where constraints prevent use of the Short MTQ. The refined questionnaires are promising
options to measure and understand individuals’ mental toughness with the MTQ.
Keywords: Confirmatory factor analysis; Exploratory structural equation modeling; Scale
improvement; Personal characteristics; Multidimensionality

Evolving the validity of a mental toughness measure:
Refined versions of the Mental Toughness Questionnaire-48
In the performance psychology literature, the past two decades have witnessed an
exponential increase in research and applied interest in the topic of mental toughness.
Broadly defined, mental toughness is a personality trait that determines how people deal
effectively with challenges, stressors, and pressure, regardless of the circumstances (Clough
& Strycharczyk, 2015). To outline this capacity, researchers have long debated the core
components of mental toughness (e.g., confidence, emotional regulation, persistence) and
subsequently developed several conceptual models (e.g., Jones, Hanton, & Connaughton,
2007). Based on these models, they have also devised a collection of self-report instruments
for assessing mental toughness (see Coulter, Mallett, & Singer, 2018), of which the most
widely used is the Mental Toughness Questionnaire-48 (MTQ48; Clough, Earle, & Sewell,
2002).
The MTQ48 operationalizes the 4/6Cs model of mental toughness. Clough et al.
(2002) considered mental toughness an extension of the psychological construct of hardiness
(Kobasa, 1979), which buffers the impacts of stress. They proposed that mental toughness is
a multidimensional construct based on the three aspects of hardiness – Commitment,
Challenge, and Control (in life and emotion regulation) – together with the construct of
Confidence (in one’s abilities and interpersonal relationships). Since its publication, the
MTQ48 has been widely regarded as a promising tool for assessing mental toughness (see
Lin, Mutz, Clough, & Papageorgiou, 2017 for a recent review; see also Perry, Clough, Crust,
Earle, & Nicholls, 2013; Vaughan, Hanna, & Breslin, 2018). However, despite its widespread
use, the MTQ48 has been criticized for its validity issues (e.g., Birch, Crampton,
Greenlees, Lowry, & Coffee, 2017; Gucciardi, Hanton, & Mallett, 2012).
Recent studies examining the psychometric properties of the MTQ48 based on the
4/6C model have not demonstrated full support for the factor structure of the instrument.
From these studies, it has been reported that there are consistent factorial validity issues with
the MTQ48 at both overall and individual parameter levels: a) poor or unsatisfactory overall
fit of hypothesized measurement models to data within the framework of confirmatory factor
analysis (CFA) and exploratory structural equation modeling (ESEM) (1-, 4-, and 6-factor
CFA models: Birch et al., 2017; Gucciardi et al., 2012; Perry et al., 2013; 1- and 4-factor
ESEM models: Vaughan et al., 2018); b) weak convergent validity (items either cross-loading
or not loading well onto target factors; Birch et al., 2017; Vaughan et al., 2018); and c) lack
of discriminant validity between several factors in CFA models (excessively high correlations
[e.g., r > .90] between the Confidence (Abilities) and Control (Life) factors; Gucciardi et al.,
2012; Perry et al., 2013).
Some researchers have also expressed concerns about the scale development and content
validity of the instrument (i.e., adequacy of item content, clarity, and structure; see Birch et
al., 2017; Gucciardi et al., 2012; Vaughan et al., 2018). Consequently, the emerging evidence
has raised major concerns about the questionnaire’s construct validity, which cast doubt on
the legitimacy of earlier findings and on the validity of the MTQ48 as a
conceptual representation of the 4/6C model.
Considering its wide popularity in psychological studies, it is constructive to resolve
the factorial validity issue of the MTQ48. Rather than merely criticizing the instrument, scale
development should be seen as an ongoing process, and efforts to improve the measure
should also be respected and encouraged (Kawabata, Mallett, & Jackson, 2008; Mallett,
Kawabata, & Newcombe, 2007). As developers of the MTQ48, Clough and colleagues have
welcomed refinement of their measure on an ongoing basis (Perry et al., 2013). Several
approaches can be taken to refine the instrument. A first step is to identify good and
problematic items for measuring each target factor. Subsequently, problematic items need to be
either replaced with new, good items or removed from the instrument. Because the MTQ48 is
copyrighted material, the latter option (removing items) is the one suitable for non-developers
of the MTQ48 who wish to refine the questionnaire.
In progressing measurement, it is also important to address measurement issues (e.g.,
factorial validity issues, in the present study) from theoretical, empirical, and practical
perspectives (see Mallett et al., 2007, for the details of the three perspectives). A factorial
validity issue that has been overlooked in past psychometric studies on the MTQ48 is how to
represent a global construct of mental toughness, based on the 4/6C model. For example, a
single global mental toughness score has been calculated based on the total or averaged score
of the 48 items and used in several research studies (e.g., Gerber et al., 2013; Papageorgiou,
Wong, & Clough, 2017). However, such a global representation of mental toughness has not
been supported empirically in the psychometric studies on the MTQ48. For instance, Perry et
al. (2013) examined a single factor measurement model consisting of all the 48 items and
reported that the model did not fit their data satisfactorily according to the overall goodness
of fit indices.
Importantly, it should be clearly understood that in the single factor measurement
model, the mental toughness construct is specified as a unidimensional construct, rather than
a multidimensional construct. Mental toughness is conceptualized as a multidimensional
construct, based on the 4/6C model (Clough et al., 2002), and the single factor measurement
model is not suitable to represent the multidimensionality of the mental toughness construct.
Instead, hierarchical (i.e., higher order) and bifactor (i.e., general-specific) models, in which
global and specific factors coexist, should be employed to examine the presence of a global
construct (Morin, Arens, & Marsh, 2016).
In considering how to obtain a global mental toughness score, another interesting
question emerges from a practical perspective. Namely, which score should be used for
correlation analysis when structural equation modeling (SEM) is unavailable: a scale score or
a factor score? Scale scores, obtained by summing or averaging item scores, are often used
when the sample size is too small to analyze data within the framework
of SEM. Even when the sample size is large enough to examine the factor structure of an
instrument with CFA or ESEM in the preliminary analysis, and it is possible to calculate
factor scores, some researchers still use scale scores for correlation analysis (see
Papageorgiou et al., 2017). However, it is important for researchers to understand that sum
scoring implies a highly restrictive model that differs from the model used to validate the
scale through factor analysis (see McNeish & Wolf, 2020, for details of this issue).
Latent scores and correlations are corrected for measurement error within the SEM
framework, whereas scale scores are based purely on item responses, which contain random
measurement error (Morin et al., 2016). Because this error attenuates them, Pearson’s
correlations based on scale scores tend to be lower than CFA-based latent correlations
(see Mallett et al., 2007, for further details of this tendency). However, it is unknown how
different outcomes emerge when correlation analysis is conducted with Pearson’s correlations
based on scale scores and factor scores, rather than latent correlations obtained from the SEM
framework. This question is practically important for the MTQ48 users to confidently
conduct correlation analysis based on scale scores and then interpret the results when SEM is
unavailable due to methodological reasons (e.g., a small sample size).
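The tendency for scale-score correlations to be lower than latent correlations can be illustrated with the classical attenuation formula, in which the expected observed correlation equals the error-free correlation multiplied by the square root of the product of the two score reliabilities. The following is a minimal sketch only: the function names are hypothetical, the formula assumes purely random measurement error, and it is not the procedure used in the present study.

```python
import math


def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation between two error-laden scale scores."""
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_r(observed_r, rel_x, rel_y):
    """Spearman's correction for attenuation (inverse of attenuated_r)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return observed_r / math.sqrt(rel_x * rel_y)
```

Under these assumptions, the lower the reliability of either score, the further the observed Pearson correlation is expected to fall below the corresponding latent correlation.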

Dagnall et al. (2019) recently examined the factor structure of other shorter versions
of the MTQ48 (MTQ18: Clough et al., 2002; MTQ10: Papageorgiou et al., 2018) for the
responses collected from 944 high school students within the framework of CFA. They
reported that the overall fit of the 4-factor first-order model and the bifactor model was
acceptable for the MTQ10 responses from high school students. However, information about
individual parameters, such as latent factor correlations and factor loadings, was not
sufficiently reported for each of the models in their study. As a result, it is unclear if the two
CFA models fit the MTQ10 data at the individual parameter level.
The current investigation aimed to improve the MTQ48 by resolving its factorial
validity issue from theoretical, empirical, and practical perspectives. To this end, the
investigation was conducted in three stages, consisting of two studies. Stage 1 of the first
study involved re-assessing the factorial validity and reliability of the MTQ48 with a large
sample of university students. After confirming the lack of factorial validity, Stage 2 of the
study involved proposing refined versions (short and very short versions) of the MTQ48 to
provide researchers with theoretically and practically useful instruments to confidently
measure mental toughness through the 4/6Cs framework. In doing so, other previously
overlooked measurement issues were also addressed with the refined versions of the
questionnaire – specifically, a) a full review of the questionnaire’s face validity, b) the
multidimensional and hierarchical representation of a global mental toughness construct, and
c) comparisons of correlation values between latent correlations from ESEM models and
Pearson’s correlations based on scale and factor scores. Stage 3 was conducted in the second
study for the cross-validation of the refined versions of the MTQ with another independent
sample. To evaluate the usefulness of the newly refined versions of the MTQ48, their
psychometric properties were rigorously compared with those of the MTQ18 (Clough et al.,
2002) and the MTQ10 (Papageorgiou et al., 2018) in the present study.
Study 1
Method
Participants. A total of 2,186 university students (802 men, 1,384 women; mean age =
23.9 years, SD = 8.1, range: 16-70 years), whose first language was English,
participated in the study. The majority of participants (93.3%) were Australian and the rest
(6.7%) were American, British, or Canadian. Participants’ majors were health (29.2%),
science and engineering (20.9%), business (15.6%), law (12.4%), creative industry (11.3%),
and education (10.5%).
Measures.
The Mental Toughness Questionnaire-48 (MTQ48). The MTQ48 (Clough et al.,
2002) is a self-report instrument designed to measure one’s level of mental toughness. The
MTQ48 covers four components of mental toughness and consists of six subscales:
commitment, challenge, control (emotion and life), and confidence (abilities and
interpersonal). These components and subscales are abbreviated to 4/6Cs. Respondents were
asked to indicate the degree to which they generally agreed with the statement of each item
on a 5-point Likert-type scale, ranging from 1 (strongly disagree) to 5 (strongly agree). In the
MTQ48, 22 items include negatively worded statements and the scores of these items were
reversed for analyses (e.g., “At times I expect things to go wrong”).
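The reverse-scoring step can be sketched as follows, assuming responses are stored as integers from 1 to 5. The function names and the example item keys are hypothetical and are not taken from the MTQ48 materials; which items are reverse-scored must come from the instrument's scoring key.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Reverse a Likert response so that 1 <-> 5, 2 <-> 4, and 3 stays 3."""
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} outside {scale_min}-{scale_max}")
    return scale_min + scale_max - response


def score_items(responses, reversed_items):
    """Return a copy of `responses` with the listed items reverse-scored.

    `responses` maps item number -> raw response; `reversed_items` is the
    set of negatively worded item numbers (hypothetical in this sketch).
    """
    return {
        item: reverse_score(value) if item in reversed_items else value
        for item, value in responses.items()
    }
```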
The Depression Anxiety Stress Scales-21 items (DASS-21). To examine the
concurrent validity of the refined versions of the MTQ, participants’ perceived stress levels
were measured with the stress subscale of the DASS-21 (Lovibond & Lovibond, 1995). Only
the stress subscale, consisting of 7 items, was used in the present study. Respondents were
asked to indicate the degree to which each statement applied to them over the last week on a
4-point Likert-type scale ranging from 0 (did not apply to me at all) to 3 (applied to me very
much or most of the time).
Procedure. The current study was approved by the institutional ethics review
committee of Queensland University of Technology and adhered to the guidelines for ethical
practice. University students in Australia were invited via email to participate in the study.
Participation was voluntary and informed consent was obtained from each of the participants
before they started completing an online survey.
In the first stage, the factorial validity and reliability of the MTQ48 were re-examined
with the current large sample. After observing the lack of factorial validity of the MTQ48,
problematic items related to the factorial validity issue were identified statistically. In the
second stage, two refined versions of the MTQ48 – an 18-item short version (Short MTQ
[S-MTQ]) and a 6-item brief version (Very Short MTQ [VS-MTQ]) – were proposed from
theoretical and empirical perspectives and, subsequently, their factorial validity and
reliability, as well as concurrent validity, were evaluated statistically.
Item selection. Previous research (Birch et al., 2017; Gucciardi et al., 2012; Vaughan
et al., 2018) has identified conceptual misfit in several of the MTQ48 items that question its
adequacy to represent the dimension definitions of an underpinning 4/6C model. Despite this
acknowledgement, a full review of the questionnaire’s face validity is yet to be reported. In
the current study, Lynn’s (1986) Content Validation Index (CVI) guidelines were used to
conduct the face validity check on the MTQ48 items. Following these guidelines, three
independent experts separately reviewed the content and structure of the items. The experts
are well published in mental toughness literature and experienced in the procedures and
stages involved in designing psychological inventories.

In reviewing the MTQ48 items, the experts were asked to rate the relevancy of each
item to its hypothesized factor definition. Relevancy was rated across a 4-point scale, where
‘1’ implies an irrelevant item and ‘4’ a very relevant item (see Lynn, 1986). The CVI score
for an item was determined by the proportion of experts who rated it as content valid. With
the MTQ48 being a previously published instrument, items were only considered content
valid if they received a score of 4 from all expert raters (i.e., 100%). The CVI score for each
item ranged from 0 (0 × 3 raters) to 12 (4 × 3 raters). The CVI score for the whole
questionnaire was calculated as the proportion of total items judged content valid (i.e., the
percentage of 48 items consistently scoring 4 in the rating process). In conducting their
review, the experts were also asked to clarify decisions made in rating each item’s relevancy
to its factor definition.
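The rating rule described above can be sketched in a few lines, assuming each expert's rating is an integer on the 4-point relevancy scale. All function names are hypothetical, and the sketch only mirrors the rule as stated in this section (an item counts as content valid only if every rater gives it a 4).

```python
def item_cvi_sum(ratings):
    """Sum of the experts' relevancy ratings for one item (each rating 1-4)."""
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 4")
    return sum(ratings)


def is_content_valid(ratings):
    """Strict rule used here: every expert must give the top rating of 4."""
    return all(r == 4 for r in ratings)


def scale_cvi(all_ratings):
    """Proportion of items judged content valid across the questionnaire."""
    valid = sum(is_content_valid(r) for r in all_ratings.values())
    return valid / len(all_ratings)
```

Here `all_ratings` is assumed to map each item number to the list of its three expert ratings.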
Identifying reliable and clear definitions of the 4/6Cs is a convoluted task. Different
sources (e.g., technical manual, user guide, empirical articles) are not always consistent in
their descriptive language of each dimension. Moreover, the breadth of descriptors linked to
each dimension (i.e., the attributes of people with high and low scores) makes it problematic to
articulate what is, in fact, the core definition of each 4/6C component. Using the MTQ48
technical manual (Clough, Perry, Crust, Strycharczyk, & Rowlands, 2015) and other
comprehensive reviews of the MTQ48 (e.g., Clough & Strycharczyk, 2012), the research team
identified consistent descriptors that define the questionnaire’s 6 main subscales as follows.
• Challenge: The extent to which a person is likely to view a challenge or setback as an
opportunity.
• Commitment: The extent to which an individual is likely to persist with a goal, despite
any problems or obstacles that arise.
• Control Emotion: The extent to which people control their anxieties and emotions.
• Control Life: The extent to which people believe they have sufficient control over their
lives and the environment around them.
• Confidence Abilities: The degree of confidence people have in their abilities to
successfully complete tasks.
• Confidence Interpersonal: The extent to which people are prepared to assert themselves
and deal with social challenge or ridicule.
Expert raters were instructed to only review the relevancy of items against these core
definitions. Collectively, this review panel identified 13 of 48 items (27.1%) to be content
valid (see Supporting Information).
Data analyses. To examine the factor structure of the MTQ48, confirmatory factor
analysis (CFA) and exploratory structural equation modeling (ESEM) were conducted with
Mplus (Version 8.4; Muthén & Muthén, 1998-2019) using the robust maximum
likelihood (MLR) estimator. In the CFA model, each item was allowed to load on only one
target factor and all non-target cross-loadings were constrained to be zero. In the ESEM
model, all items were allowed to load on every factor and all factor loadings were estimated
by imposing appropriate restrictions on the factor loading matrix and the factor covariance
matrix (Asparouhov & Muthén, 2009; Marsh et al., 2010). An oblique geomin rotation was
used in the ESEM model, because the MTQ48 factors are expected to covary and the geomin
rotation criterion is the most effective criterion when the true factor loading structure is
unknown (Asparouhov & Muthén, 2009).
In the first stage, CFA and ESEM were conducted for three hypothesized models (1-,
4-, and 6-factor models) with 48 items. Clough et al. (2002) also proposed 18 items for a
shorter version of the MTQ48 (Commitment: Items 11, 35, 42; Challenge: Items 14, 23, 30;
Control Life: Item 2; Control Emotion: Items 21, 27, 31, 37; Confidence Abilities: Items 3,
13, 16, 36; Confidence Interpersonal: Items 17, 43, 46). Furthermore, Papageorgiou et al.
(2018) proposed 10 items for another shorter version of the MTQ48 (Commitment: Items 11,
42; Challenge: Items 23, 30; Control Life: Item 2; Control Emotion: Items 27, 31; Confidence
Abilities: Items 3, 16, 36). Dagnall et al. (2019) recently examined the factor structure of
other shorter versions of the MTQ48 (MTQ18: Clough et al., 2002; MTQ10: Papageorgiou et
al., 2018) within the framework of CFA. They reported the results of 1- and 4-factor CFA
measurement models. For completeness, the 18-item and 10-item 1-factor and 4-factor
models were examined with CFA and ESEM in the present study. In the second stage, CFA
and ESEM were conducted for the models with selected items. When a first-order
measurement model fit the data adequately, hierarchical and bifactor ESEM models were also
examined. Following the procedures by Morin et al. (2016) and Morin and Asparouhov
(2018), an orthogonal target rotation was used for the bifactor ESEM.
To assess overall model fit, several criteria were used: the MLR chi-square statistic
(Muthén & Muthén, 1998–2019), the comparative fit index (CFI; Bentler, 1990), the
Tucker-Lewis index (TLI; Tucker & Lewis, 1973), the root mean square error of approximation
(RMSEA; Steiger, 1990), and the standardized root mean square residual (SRMR; Hu & Bentler,
1998). Values on the CFI and TLI that are greater than 0.90 and 0.95 are generally taken to
reflect acceptable and excellent fits to the data, respectively (e.g., Marsh et al., 2010). For the
RMSEA, values of 0.05 or less indicate a close fit, and values of 0.08 or less indicate an
adequate fit (Browne & Cudeck, 1993). Values on the SRMR that are less than 0.08 indicate an
adequate fit (Hu & Bentler, 1998). Conventional multiple cut-off values (i.e., the CFI and
TLI ≥ 0.90, the RMSEA ≤ 0.08, the SRMR ≤ 0.08) were considered minimum thresholds for
accepting overall model fit. To assess the fit of individual items, standardized factor
loadings and residuals were carefully examined.
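The cut-off values listed above can be expressed as a simple screening rule. This is a hedged sketch: the function names and the three-level labels are assumptions made for illustration, and combining the CFI, TLI, and RMSEA criteria into a single "excellent" label is a simplification rather than a rule stated in the sources cited.

```python
def meets_minimum_fit(cfi, tli, rmsea, srmr):
    """Check the conventional minimum cut-offs: CFI/TLI >= .90, RMSEA/SRMR <= .08."""
    return cfi >= 0.90 and tli >= 0.90 and rmsea <= 0.08 and srmr <= 0.08


def describe_fit(cfi, tli, rmsea, srmr):
    """Label overall fit as 'excellent', 'acceptable', or 'inadequate'."""
    if not meets_minimum_fit(cfi, tli, rmsea, srmr):
        return "inadequate"
    # Stricter benchmarks: CFI/TLI >= .95 and RMSEA <= .05 (close fit).
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.05:
        return "excellent"
    return "acceptable"
```

Such a rule only screens overall fit; individual loadings and residuals still need to be inspected, as noted above.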

After confirming that the hypothesized factor structure of the MTQ was tenable for
the current data, the internal consistency reliability of the MTQ responses was assessed using
Cronbach’s (1951) coefficient alpha (α) and McDonald’s (1999) coefficient omega (ω).
Coefficient α is a widely used but often misunderstood measure of reliability (Hayes &
Coutts, 2020). A high α is not an indicator of unidimensionality, and it is necessary to establish
that a set of items measures a single construct before reporting α as a measure of the
reliability of the observed scores (Hayes & Coutts, 2020). The assumption of equal
factor loadings (tau equivalence) is essential for α, whereas ω does not rely on this assumption.
Methodologists (e.g., Hayes & Coutts, 2020; Raykov & Marcoulides, 2011) recommend
using ω instead of α because ω is a more general estimator of reliability. However, α has been
commonly used in the literature on the measurement in mental toughness. Therefore, both
reliability coefficients were reported in the present study, for the sake of completeness.
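To make the relation between the two coefficients concrete, the sketch below computes α from an item covariance (or correlation) matrix and ω from the standardized loadings of a single-factor model with uncorrelated residuals. The function names are hypothetical and the example loadings in the usage note are illustrative, not estimates from the present data.

```python
def cronbach_alpha(cov):
    """Coefficient alpha from a k x k item covariance (or correlation) matrix."""
    k = len(cov)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_var = sum(cov[i][i] for i in range(k))
    total_var = sum(sum(row) for row in cov)
    return (k / (k - 1)) * (1 - item_var / total_var)


def mcdonald_omega(loadings):
    """Omega for one factor from standardized loadings (uncorrelated residuals)."""
    common = sum(loadings) ** 2
    unique = sum(1 - l ** 2 for l in loadings)
    return common / (common + unique)


def implied_corr(loadings):
    """Correlation matrix implied by a one-factor model with these loadings."""
    k = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(k)]
            for i in range(k)]
```

With equal loadings (e.g., `[.6, .6, .6]`) the two coefficients coincide, whereas with unequal loadings (e.g., `[.9, .5, .3]`) α computed from the implied correlation matrix falls below ω, which is the consequence of violating tau equivalence.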
Results
Descriptive analyses. The means of the 48 item scores ranged from 2.24 (SD = 1.07)
to 4.27 (SD = .66). The items with the lowest and highest mean scores were Item 27 (Control
[Emotion]: “I tend to worry about things well before they actually happen”) and Item 19
(Commitment: “I can generally be relied upon to complete the tasks I am given”),
respectively.
CFA and ESEM.
Stage 1: Re-examination of the factor structure of the MTQ48. None of the 1-, 4-, and
6-factor CFA models with 48 items fit the data adequately (see Table 1). Although values
on the RMSEA and SRMR were acceptable, values on the CFI and TLI were consistently
below minimum acceptable levels for the three models. As with the CFA models, none of the
1-, 4-, and 6-factor ESEM models with 48 items fit the data satisfactorily. To identify
problematic items, the factor loadings of the 48 items were carefully examined based on the
solution of the 6-factor ESEM model. It was found that 12 of the 22 items containing negatively
worded statements did not load well on their target factor. The wording effect of
negatively worded items (Wang, Chen, & Jin, 2014) was apparent in the data.
The 1-factor CFA models with the 18 items (Clough et al., 2002) and 10 items
(Papageorgiou et al., 2018) did not fit the data adequately (Table 1). These results were
consistent with the unsatisfactory fit of the 1-factor CFA model reported in Dagnall et al.
(2019). Although they correlated seven pairs of error terms for the MTQ18 and two pairs for
the MTQ10 to achieve an adequate model fit, freeing parameters on the basis of
modification indices without substantive justification is discouraged (Byrne, 2005). Dagnall
et al. reported goodness-of-fit indices for the 4-factor CFA model; however, the solution of
the model was improper here because the latent correlation between Challenge and Control
was greater than 1. This indicates that the two factors were not empirically distinguishable.
As for the ESEM results, the 1-factor ESEM model with the 18 items (Clough et al.,
2002) and 10 items (Papageorgiou et al., 2018) did not fit the data either (see Table 1). The
4-factor ESEM model with the 18 items did not fit the data adequately, whereas the 4-factor ESEM
model with 10 items showed an excellent overall fit to the data. However, inspection of item
factor loadings revealed that half of the 10 items did not load on their target factor.
Collectively, the 4-factor CFA models with the 18 items and 10 items produced
improper solutions, and the 4-factor ESEM models with the 18 items and 10 items did not fit
the data adequately at the overall and individual parameter levels, respectively. Therefore,
hierarchical and bifactor CFA and ESEM models with the 18 items and 10 items were not
examined in this study.

Stage 2: Refined versions of the MTQ. At least three items are technically required to
test the fit of a single factor model and calculate the model-based McDonald’s (1999)
coefficient omega. To provide theoretically and practically useful instruments to measure
mental toughness through the 4/6Cs framework, five items showing the next highest CVI scores
with good face validity (Challenge: 1 item; Control Emotion: 1 item; Control Life: 1 item;
Confidence Abilities: 2 items) were added to the 13 items with high face validity (see the
Item selection section) so that there were three items for each factor. For Confidence
Abilities, Items 18 and 24 were selected although their CVI scores (4 for both) were slightly
lower than those of Items 3 (CVI = 5) and 13 (CVI = 6). The rationale for selecting Items 18
and 24 was that they are the only other items in this factor that link to ability (or lack of ability).
Consequently, 18 items (3 items × 6 factors) were included in the refined short version of the
MTQ (Short MTQ: S-MTQ) (Commitment: Items 7, 29, 47; Challenge: Items 4, 44, 48;
Control Life: Items 2, 12, 41; Control Emotion: Items 27, 31, 45; Confidence Abilities: Items
8, 18, 24; Confidence Interpersonal: Items 20, 43, 46).
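The item-to-factor assignment just listed can be written down as a target pattern, which also makes the difference between the CFA and ESEM specifications explicit. This is an illustrative sketch: the item numbers are those given in the text, while the variable and function names are hypothetical and the matrix is not Mplus syntax.

```python
# Item-to-factor assignment of the S-MTQ as listed in the text (MTQ48 item numbers).
S_MTQ = {
    "Commitment": [7, 29, 47],
    "Challenge": [4, 44, 48],
    "Control Life": [2, 12, 41],
    "Control Emotion": [27, 31, 45],
    "Confidence Abilities": [8, 18, 24],
    "Confidence Interpersonal": [20, 43, 46],
}


def target_pattern(assignment):
    """Return (items, factors, matrix) with 1 for a target loading, 0 otherwise.

    In a CFA the 0 cells are loadings fixed to zero; in an ESEM every cell
    is estimated, and the pattern only marks which loadings are intended.
    """
    factors = list(assignment)
    items = sorted(i for members in assignment.values() for i in members)
    if len(items) != len(set(items)):
        raise ValueError("an item is assigned to more than one factor")
    matrix = [[1 if item in assignment[f] else 0 for f in factors]
              for item in items]
    return items, factors, matrix
```

Calling `target_pattern(S_MTQ)` yields an 18 × 6 matrix in which every row contains exactly one target loading.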
Subsequently, CFA and ESEM were conducted with the 18 items. The 4- and 6-factor
CFA models did not fit the data adequately (see Table 1). The overall fit of the 4-factor
ESEM model was satisfactory according to all the overall fit indices. However, it was found
that none of the three items for Confidence Abilities loaded well on their target factor (factor
loadings varying from -.03 to .16). Instead, they loaded on a non-target factor of Control
(factor loadings varying from .32 to .56). The 6-factor ESEM model fit the data very well.
In the 6-factor ESEM model, latent correlations between the six factors ranged from .13 to
.42, and factor loadings for the target factor ranged from .12 to .73 (see Table 2). The internal
consistency coefficients (α; ω [95% CI] in order) for the six subscales of the S-MTQ were
Control Emotion (.62; .63 [.60-.65]), Control Life (.69; .70 [.67-.72]), Challenge (.69; .71
[.69-.74]), Commitment (.65; .66 [.64-.69]), Confidence Interpersonal (.60; .63 [.60-.66]), and
Confidence Abilities (.63; .64 [.62-.67]).
Because the 6-factor first-order ESEM model fit the data satisfactorily at both overall
and individual parameter levels, corresponding bifactor and hierarchical ESEM models were
also tested. Both bifactor and hierarchical ESEM models fit the data very well (see Table 1).
The bifactor ESEM solution shows that the global mental toughness factor was well-defined
by the presence of strong and significant target loadings from all the 18 items (ranging from
.22 to .68). The six specific factors were also well-defined through strong and significant
target loadings from 16 of 18 items (ranging from .30 to .70). The loadings of two items (Item
47 for Commitment; Item 41 for Control Life) on their target factors were non-significant, but
these items loaded substantively on the global mental toughness factor (> .56). As for the
hierarchical ESEM solution, the six first-order factors were well-defined through strong and
significant target loadings from all 18 items (ranging from .14 to .86). The factor loadings of
most first-order factors on the global mental toughness factor were significant and substantial,
ranging from .40 to .72. However, the loadings from Control Life (.05) and Confidence Interpersonal
(.13) on the global mental toughness factor were non-significant. These results indicated that
Control Life and Confidence Interpersonal were not related to the global mental toughness
factor in the hierarchical ESEM model. Given that the higher-order mental toughness factor
was unable to explain correlations among the 6 first-order factors, the bifactor ESEM model
seems to represent the S-MTQ responses better than the hierarchical ESEM model.
To develop a very short version of the MTQ (VS-MTQ) that covers all the six
components of the MTQ with the minimum number of items, one item was selected for each
of six components from the S-MTQ. In doing so, reverse score items were not selected to
exclude potential wording effects (Wang et al., 2014) from the single factor model. Four sets
18
of competing models were proposed from theoretical and statistical perspectives. Statistical
parameters (e.g., overall fit of the CFA model, standardized factor loadings, and internal
consistency coefficients) were similar between the four different models, but one set of six
items (Items 4, 7, 8, 12, 20, 31; all with 100% CVI scores of 12 for the items) was considered
best among them against core dimension definitions. The fit of both 1-factor CFA and ESEM
models with the finally selected six items was excellent (see Table 1). Factor loadings
ranged from .42 (Confidence Interpersonal: Item 20) to .69 (Confidence Abilities: Item 8) in
the CFA and ESEM models. The internal consistency coefficients (α; ω [95% CI] in order)
for the single factor were .72; .72 (.70-.74).
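As an illustration of how an omega coefficient can be obtained within a factor-analytic framework, the following minimal sketch computes omega for a one-factor model from standardized loadings, assuming uncorrelated residuals. The function name and the four middle loadings are hypothetical; only the endpoints (.42 and .69) are taken from the text, so the result is not expected to reproduce the reported coefficient exactly.

```python
def omega_from_loadings(loadings):
    """Omega for a one-factor model with standardized loadings.

    Assumes uncorrelated residuals, so each residual variance is 1 - loading**2.
    """
    total = sum(loadings)
    residual = sum(1.0 - loading ** 2 for loading in loadings)
    return total ** 2 / (total ** 2 + residual)


# Hypothetical loadings: only .42 and .69 are reported in the text.
hypothetical_loadings = [0.42, 0.50, 0.55, 0.60, 0.65, 0.69]
omega = omega_from_loadings(hypothetical_loadings)
print(f"omega = {omega:.2f}")
```

With the actual six standardized loadings from the fitted model substituted for the hypothetical values, the same formula would apply.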
Concurrent validity. Latent correlations between the refined versions of the MTQ
(S-MTQ and VS-MTQ) and the stress factor of the DASS-21 were assessed to examine the
concurrent validity of the S-MTQ and VS-MTQ responses. For the S-MTQ, six factors were
specified as ESEM factors with target rotation and the stress factor was specified as a CFA
factor. Given that Clough et al. (2002) considered mental toughness an extension of hardiness
that buffers the impacts of stress (Kobasa, 1979), individuals who are mentally tough are less
likely to perceive stress symptoms. Gerber et al. (2013) reported negative correlations
between mental toughness subscale scores and a perceived stress score. Thus, it was assumed
that the six MTQ factors would negatively correlate with the stress factor. Both the models
provided an acceptable fit to the data (the first-order MTQ ESEM with the stress CFA model:
χ2 [193, N = 2,186] = 909.68, p < .001; CFI = .957, TLI = .933, RMSEA = .041, SRMR
= .031; the bifactor MTQ ESEM with the stress CFA model: χ2 [180, N = 2,186] = 771.03, p
< .001; CFI = .964, TLI = .940, RMSEA = .039, SRMR = .030). As expected, all the first-
order MTQ factors in the 6-factor ESEM model were significantly and negatively correlated
with the stress factor, ranging from -.17 (Confidence Interpersonal) to -.65 (Control
Emotion). The global mental toughness factor in the bifactor ESEM model also provided a
negative correlation (-.45) with the stress factor (see Table 3).
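As a rough check on the fit statistics above, the RMSEA point estimate can be recovered from the chi-square value, its degrees of freedom, and the sample size. The sketch below uses the standard (non-robust) formula; the function name is hypothetical, and robust estimators may yield slightly different values, so agreement with the reported indices should not be assumed for every model.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA point estimate from a chi-square statistic (standard, non-robust formula)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Values reported for the two S-MTQ models that include the stress factor.
first_order = rmsea(909.68, 193, 2186)
bifactor = rmsea(771.03, 180, 2186)
print(f"{first_order:.3f} {bifactor:.3f}")  # prints "0.041 0.039"
```

For these two models the standard formula reproduces the reported RMSEA values of .041 and .039.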
For the VS-MTQ, both the first-order mental toughness factor and the stress factor
were specified as CFA factors. The model fit the data adequately (MLRχ2 [63, N = 2,186] =
504.794, p < .001; CFI = .926, TLI = .908, RMSEA = .053, SRMR = .053). The first-order
mental toughness factor was significantly negatively correlated (-.49) with the stress factor. It
was found that the latent correlation (i.e., -.45) between the global mental toughness factor
identified with the S-MTQ and the stress factor was slightly lower than the one (i.e., -.49)
between the first-order mental toughness factor identified with the VS-MTQ and the stress
factor; however, their values were comparable.
Finally, correlations between the refined versions of the MTQ (S-MTQ and VS-MTQ)
and the DASS-21 Stress were re-computed by using scale and factor scores to examine if
Pearson’s correlation coefficients based on scale and factor scores were compatible with
latent correlation coefficients obtained from the ESEM models, in which latent constructs
were corrected for measurement errors (see Table 3). Each subscale score was the total of
item scores under the subscale. Factor scores of the S-MTQ were calculated based on the
standardized factor loadings obtained from the bifactor ESEM model. Factor scores of the
VS-MTQ and DASS-21 Stress were calculated based on the standardized factor loadings
from a 1-factor CFA model. The results of correlation analyses between the S-MTQ and the
DASS-21 Stress are summarized in Table 4.
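The scale-score approach described above can be sketched as follows: a scale score is the sum of the item scores, and Pearson's correlation is then computed between scale scores. The item numbers are those reported for the VS-MTQ; the simulated respondents, the continuous stress score, and all function names are hypothetical and serve only to make the example runnable.

```python
import random
from math import sqrt

VS_MTQ_ITEMS = [4, 7, 8, 12, 20, 31]  # item numbers reported for the VS-MTQ


def scale_score(responses, items):
    """Sum the item scores of one (sub)scale; `responses` maps item number to score."""
    return sum(responses[item] for item in items)


def pearson(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Simulated respondents (hypothetical data, not the study sample).
random.seed(0)
people = []
for _ in range(200):
    trait = random.gauss(0, 1)
    # 5-point Likert responses driven by a single latent trait plus noise.
    answers = {item: min(5, max(1, round(3 + trait + random.gauss(0, 1))))
               for item in VS_MTQ_ITEMS}
    stress = -0.5 * trait + random.gauss(0, 1)  # hypothetical stress score
    people.append((answers, stress))

mt_scores = [scale_score(answers, VS_MTQ_ITEMS) for answers, _ in people]
stress_scores = [stress for _, stress in people]
r = pearson(mt_scores, stress_scores)
print(f"r = {r:.2f}")
```

Because no reverse-scored items were selected for the VS-MTQ, no recoding step is needed before summing in this sketch.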
For scale scores, Pearson’s correlation coefficients between the S-MTQ subscales
were found to be comparable with the latent correlation coefficients based on the 6-factor ESEM
model (see Tables 3 and 4). Again, only for scale scores, Pearson’s correlation coefficients
between the S-MTQ scores and the DASS-21 Stress score were similar to their corresponding
latent correlation coefficients, except for the ones between Control Emotion and Stress scores
(see Tables 3 and 4). With regard to the Pearson’s correlation coefficient between the VS-
MTQ and the DASS-21 Stress, it was -.35 (p < .001) for both scale and factor scores, but
slightly weaker than the corresponding latent correlation coefficient (-.49).
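The gap between the scale-score correlation (-.35) and the latent correlation (-.49) is in the direction expected when observed scores contain measurement error. The classical correction for attenuation gives a rough sense of the size of this effect. It is not the method used in the study, where latent correlations came from the fitted models. In the sketch below, the reliability of the stress score is an assumed value, because it is not reported in this section.

```python
from math import sqrt


def disattenuate(r_observed, rel_x, rel_y):
    """Classical correction for attenuation: r / sqrt(rel_x * rel_y)."""
    return r_observed / sqrt(rel_x * rel_y)


r_scale = -0.35     # reported scale-score correlation (VS-MTQ with DASS-21 Stress)
rel_vs_mtq = 0.72   # reported alpha for the VS-MTQ in Study 1
rel_stress = 0.80   # ASSUMPTION: stress reliability is not given in this section

r_corrected = disattenuate(r_scale, rel_vs_mtq, rel_stress)
print(f"corrected r = {r_corrected:.2f}")
```

Under this assumed reliability the corrected value moves toward, but does not reach, the reported latent correlation.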
Discussion
Study 1 aimed to re-evaluate the factorial validity of the instrument and propose
alternative models to improve its validity with refined versions of the questionnaire
(the S-MTQ and VS-MTQ).
None of the 1-, 4-, and 6-factor CFA and ESEM models with 48 items fit the data
adequately (see Table 1). These results were consistent with those reported in previous
studies (e.g., Gucciardi et al., 2012; Perry et al., 2013). Furthermore, 1-factor CFA models
with the 18 items proposed by Clough et al. (2002) and the 10 items selected by
Papageorgiou et al. (2018) did not fit the data either. The solutions of the 4-factor CFA model
with the 18 items and 10 items were improper, as Challenge and Control were not empirically
distinguishable. In a CFA model, in which cross-loadings are constrained to be zero, factor
correlations are likely to be inflated unless all non-target loadings are close to zero (Marsh et
al., 2010).
Dagnall et al. (2019) reported Pearson’s correlations for the MTQ18 subscale scores,
but they did not report the latent correlations between the four factors in their study.
Therefore, it is unknown if the improper solutions of the 4-factor CFA model with the
MTQ18 are specific to the current sample. In Study 1, the 4-factor ESEM models with the 18
items and 10 items were also examined. The overall fit of the model with 18 items was
unsatisfactory. Despite the excellent overall fit of the model with 10 items, half of the 10 items
did not load on their target factors. Consequently, the hypothesized factor structures of the
MTQ18 and MTQ10 were invalid for the current large sample of university students.
Based on the CVI, 18 items were selected for the S-MTQ. The 6-factor ESEM model,
with 18 items, fit the data very well. Considering that the corresponding 6-factor CFA model
did not fit the data satisfactorily, it was apparent that some items loaded on
non-target factors. However, for most items the cross-loadings on non-target factors were far
smaller than the significant and substantial target loadings (see Table 2).
Thus, the 6-factor ESEM solution showed six well-defined factors. Because the structure of
the six factors was well defined, the corresponding bifactor and hierarchical ESEM models were
examined further. It was found that the bifactor ESEM model represented the S-MTQ
responses better than the hierarchical ESEM model. The well-defined, bifactor structures of
the S-MTQ responses support the multidimensionality of the mental toughness construct.
As for the internal consistency reliability of the S-MTQ responses, both alpha and
omega coefficients were lower than .70 for four of six factors (Control Emotion,
Commitment, Confidence Interpersonal, Confidence Abilities). Tóth-Király, Morin, Bőthe,
Orosz, and Rigó (2018) stated that “lower level of reliability would be more concerning for
research on scale scores than fully latent variables, given that latent variables are naturally
corrected for measurement errors, and thus perfectly reliable” (p. 278). In the present study,
omega coefficients were calculated within the framework of factor analysis and all of them
were above .60. Thus, the observed ω values were considered reasonable. The coefficient α is
affected by the number of items and, under certain conditions, increases as the number of items
increases (Hayes & Coutts, 2020). Given that the α coefficient for each subscale was
calculated with only three items, the observed α values would also be reasonable. The concurrent
validity of the S-MTQ responses was examined as a between-construct study. As hypothesized, all the
specific MTQ factors and the global mental toughness factor were negatively associated with
the stress factor. Thus, the concurrent validity of the S-MTQ responses was well supported.
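The dependence of α on the number of items, noted above, can be illustrated with the standardized alpha formula, α = k·r̄ / (1 + (k − 1)·r̄), where k is the number of items and r̄ the mean inter-item correlation. This is a simplified sketch that assumes equal item variances and a constant mean inter-item correlation as items are added. The function names are hypothetical, and the starting value of .65 is used purely as an example of a three-item subscale.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def implied_mean_r(alpha, k):
    """Mean inter-item correlation implied by a given alpha for k items."""
    return alpha / (k - alpha * (k - 1))


# Example: a three-item subscale with alpha = .65.
mean_r = implied_mean_r(0.65, 3)
for k in (3, 6, 8):
    # Alpha rises with k when the mean inter-item correlation is held constant.
    print(k, round(standardized_alpha(k, mean_r), 2))
```

Under these assumptions, the same average inter-item correlation that gives α = .65 with three items would give a value close to .79 with six items, which is why a three-item α below .70 need not signal poor items.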
In the second stage, the VS-MTQ was proposed as a very short version of the MTQ to
provide a practical tool for occasions where constraints prevent use of the S-MTQ. In the VS-
MTQ, each item is a representative indicator of each component proposed in the 4/6C model,
and the single factor is specified as a global mental toughness construct that encompasses the
six components of the 4/6C model. Both 1-factor CFA and ESEM models with the six items
showed an excellent fit to the data, and factor loadings of the six items were adequate. The
internal consistency coefficients for the single factor were also acceptable. The VS-MTQ is
practically useful as it is far more parsimonious than the MTQ48.
Study 2
The purpose of Study 2 was to cross-validate the factor structure of the newly-refined
S-MTQ and VS-MTQ with a larger independent sample. The establishment of measurement
invariance is required to make appropriate group comparisons (Chen, 2007; Cheung &
Rensvold, 2002). However, the MTQ48 has rarely been subject to such examination
(see Vaughan et al., 2018, for an exception). Therefore, measurement invariance was also tested
for the responses to the S-MTQ and VS-MTQ.
Method
Participants. A total of 3,209 university students (1,206 men, 2,003 women; Mage =
24.0, SD = 8.4, age range: 16-66 years) voluntarily participated in Study 2. Their
first language was English and most of them (94.0%) were Australians. Participants’ majors
were health (29.4%), science and engineering (21.3%), business (15.7%), law (12.1%),
creative industry (12.0%), and education (9.5%). There were no overlapping participants
between Studies 1 and 2.

Measures. Participants in Study 2 were also asked to complete the MTQ48 (Clough
et al., 2002) on a 5-point Likert-type scale ranging from 1 (strongly disagree) to 5 (strongly
agree) as well as the stress subscale of the DASS-21 (Lovibond & Lovibond, 1995) on a 4-
point Likert-type scale ranging from 0 (did not apply to me at all) to 3 (applied to me very
much or most of the time).
Procedure. The online survey was conducted in the same way as in Study 1. Participants
provided informed consent before starting the online survey.
Data analyses. To examine the factor structure of the S-MTQ and VS-MTQ, CFA
and ESEM were conducted with the same procedures as Study 1. For completeness, other
models of the MTQ48, analyzed in Study 1, were also evaluated in Study 2.
Measurement invariance was tested across gender for the combined sample of Studies
1 and 2. Equality constraints were hierarchically imposed on the parameters across the gender
samples in the following sequence: configural invariance (no constraints), factor loadings,
intercepts, and uniqueness of observed variables. The invariance of two nested measurement
models was considered to be tenable when the overall pattern of goodness-of-fit indexes was
adequate and the changes in the values of the CFI and RMSEA were negligible (i.e., less than
or equal to .01 for CFI and .015 for RMSEA; Chen, 2007; Cheung & Rensvold, 2002).
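The decision rule for nested invariance models can be sketched as a simple comparison of fit indices. The sketch below covers only the change criteria (CFI and RMSEA) and leaves the judgment of overall fit to the analyst. The class and function names and all fit values are hypothetical, and differences are rounded to three decimals because fit indices are reported at that precision.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    cfi: float
    rmsea: float


def invariance_tenable(less_constrained, more_constrained,
                       max_cfi_drop=0.01, max_rmsea_rise=0.015):
    """Return True if adding constraints changes CFI and RMSEA only negligibly."""
    cfi_drop = round(less_constrained.cfi - more_constrained.cfi, 3)
    rmsea_rise = round(more_constrained.rmsea - less_constrained.rmsea, 3)
    return cfi_drop <= max_cfi_drop and rmsea_rise <= max_rmsea_rise


# Hypothetical fit indices for three nested models.
configural = Fit(cfi=0.975, rmsea=0.040)
loadings = Fit(cfi=0.972, rmsea=0.041)
intercepts = Fit(cfi=0.955, rmsea=0.047)

print(invariance_tenable(configural, loadings))   # True: CFI drops by .003
print(invariance_tenable(loadings, intercepts))   # False: CFI drops by .017
```

Each model in the sequence is compared with the preceding, less constrained model.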
Results and Discussion
CFA and ESEM. The results of CFA and ESEM on the S-MTQ responses were
similar to Study 1. The 4-factor ESEM model showed adequate overall fit to the data.
Consistent with Study 1, however, none of the three items for Confidence Abilities loaded on
their target factor (factor loadings varying from .02 to .17); instead, they loaded on the non-target factor
of Control (factor loadings varying from .40 to .54). The fit of the 6-factor ESEM model was
excellent based on all the overall fit indices (see Table 5). In the 6-factor ESEM model, latent
correlations between the six factors ranged from .14 to .39, and factor loadings for the target
factor ranged from .12 to .76. The internal consistency coefficients (α; ω [95% CI] in order)
for the six subscales of the S-MTQ were Control Emotion (.63; .63 [.61-.66]), Control Life
(.65; .66 [.64-.69]), Challenge (.71; .72 [.71-.74]), Commitment (.67; .67 [.65-.70]),
Confidence Interpersonal (.59; .61 [.58-.64]), and Confidence Abilities (.59; .68 [.63-.71]). Like
Study 1, both alpha and omega coefficients were lower than .70 for most factors. However,
the observed values of α and ω in Study 2 were comparable with those in Study 1 and
considered reasonable, as stated earlier. Both bifactor and hierarchical ESEM models fit the
data very well (see Table 5). In the bifactor model, all target loadings for the specific factor
were significant, ranging from .07 to .61, and target loadings for the global mental toughness
factor from all the 18 items were also significant, ranging from .25 to .68. These results
replicated the finding that the global mental toughness factor and the six specific
factors were well defined. In the hierarchical model, all target loadings for the first-order factor were significant,
ranging from .11 to 1.00, and the factor loadings of most of the first-order factors on the
global mental toughness factor were significant and substantial, from .67 to .90. However, the
loadings from Challenge and Confidence Abilities on the global mental toughness factor were
found to be non-significant. These results also replicated the finding that correlations among the 6
first-order factors were not explained well by the higher-order mental toughness factor. Study 2 thus
cross-validated the finding that the bifactor ESEM model represented the S-MTQ responses better than the
hierarchical ESEM model.
As for the VS-MTQ, the fit of both 1-factor CFA and ESEM models to the data was
excellent (see Table 5). Factor loadings ranged from .47 (Confidence Interpersonal: Item 20)
to .69 (Confidence Abilities: Item 8) in the CFA and ESEM models. The internal consistency
coefficients (α; ω [95% CI] in order) for the single factor were .72; .73 (.71-.74). Consistent
with Study 1, none of the 1-, 4-, and 6-factor CFA and ESEM models with 48 items fit the data
satisfactorily. Furthermore, the 1- and 4-factor CFA and ESEM models with the 18 items by
Clough et al. (2002) and the 10 items by Papageorgiou et al. (2018) fit the data
unsatisfactorily or produced improper solutions (see Table 5).
The results of the invariance analyses are summarized in Table 6. For both S-MTQ
and VS-MTQ, measurement invariance across gender was achieved at the factor-loading and
uniqueness levels, but not at the intercept level. These results indicated that the strength of
relationships between items and the underlying factors is identical across gender, but the
origin of the latent variable may differ. Measurement invariance at the factor-loading level is
a prerequisite for meaningful cross-group comparison (Cheung & Rensvold, 2002). The
comparison of relationships between the mental toughness factor, measured by the S-MTQ
and the VS-MTQ, and other external variables is possible across gender.
Instrument validation is an ongoing process and cross-validation studies are necessary
(e.g., Kawabata et al., 2008), since parameter estimates are unique to the sample on which
they are based. The results of Study 2 cross-validated the factor structure of the S-MTQ and
VS-MTQ for a large independent sample of university students.
General Discussion
In the present study, two refined versions of the MTQ48 were proposed to improve
the questionnaire by resolving its factorial validity issue from theoretical, empirical, and
practical perspectives: the S-MTQ and VS-MTQ. The results of the two studies strongly
supported the factorial and concurrent validity, as well as reliability, of the responses to both
the refined versions. The S-MTQ and VS-MTQ are psychometrically sound, but much shorter
than the original MTQ48.

Another advantage of the S-MTQ is that there are three items for each of the six factors
and it is possible to empirically examine mental toughness as a multidimensional construct. If
researchers are interested in measuring each aspect of the mental toughness construct,
proposed by Clough et al. (2002), or the global mental toughness factor, the S-MTQ would be
a suitable option. Alternatively, if researchers are interested in briefly measuring mental
toughness, as one of many constructs in their study, and scoring it as a unidimensional single
score, the VS-MTQ would be a possible choice for that need. Such usage of short and very
short versions of an instrument is also seen for other psychological constructs, such as the big
five personality traits (Gosling, Rentfrow, & Swann, 2003) and flow (Jackson, Martin, &
Eklund, 2008).
Both the refined versions were developed based on data from a large sample in Study
1 and their factorial validity and reliability were replicated with an even larger sample in
Study 2. The advantages of the S-MTQ and VS-MTQ over the MTQ18 (Clough et al., 2002)
and MTQ10 (Papageorgiou et al., 2018) are that a) the selected 18 items were considered to
have high or good content validity, b) they measure all six components proposed in the 6C
model, and c) they possess sound psychometric properties demonstrated through rigorous
examinations of the measures.
Measurement invariance has been rarely tested for the MTQ48 (see Vaughan et al.,
2018). Therefore, measurement invariance across gender was also tested for the S-MTQ and
VS-MTQ responses. The results of the invariance test showed that it is possible to compare the
relationships between the mental toughness factor measured by the S-MTQ and VS-MTQ and
other variables across gender.
Correlations between the refined versions of the MTQ (S-MTQ and VS-MTQ) and
the DASS-21 Stress were re-computed by using scale and factor scores and compared with
latent correlations obtained from the ESEM models. Pearson’s
correlation coefficients based on scale and factor scores were found to be comparable with their
corresponding latent correlation coefficients. These results were somewhat surprising,
because latent correlations are corrected for measurement errors, whereas scale scores are
purely based on items which include a part of random measurement error (Morin et al.,
2016). The findings suggest that when data are collected from a large sample, Pearson’s
correlations based on scale scores can be used confidently to examine relationships between
the refined versions of the MTQ (S-MTQ and VS-MTQ) and other constructs. The absolute
values of Pearson’s correlations based on scale scores were found to be compatible with
latent correlations. The information observed in the comparisons is practically useful for
applied researchers to use and interpret Pearson’s correlation coefficients based on scale
scores, as the correlation coefficients are often used in their research.
Lastly, in evolving the validity of the MTQ48, we were cognizant of the potential
statistical and conceptual consequences of refining the instrument. Previous debate has raised
concerns about the problems associated with scale purification and an overemphasis on seeking
statistical fit (Clough, Earle, Perry, & Crust, 2012). The VS-MTQ, in particular, has a
significant reduction in items from its predecessor, which, some researchers might argue,
lacks the conceptual essence and breadth of the original work. In this study, the aim of
conducting the item face validity check was an attempt to maintain the integrity of the 4/6C
model, as originally defined. The cost of strictly matching each item to its main dimensional
definition (for Confidence, Commitment, etc.) meant only 13 items were retained. The
rejection of 35 items in this procedure did not go unnoticed and, presumably, represented a
large proportion of the conceptual work originally included in the MTQ48’s design.
However, on the surface, 25 of the 35 rejected items seem to be measuring something else,
outside their allocated factor; for instance, they appear more relevant to other psychological
constructs – such as self-esteem, optimism, focus, decision making, motivation, coping, and
extraversion (see Supporting Information) – some of which are represented in other mental
toughness frameworks (e.g., Gucciardi, Hanton, Gordon, Mallett, & Temby, 2015). The other
10 rejected items were either conceptually vague (e.g., Item 9), used ambiguous language
(e.g., Item 35), lacked specificity (e.g., Item 21), or were incorrectly categorized (e.g., Item 5).
It appears the MTQ48’s designers have included items that either measure their
dimension directly – and, hence, are considered a “good fit” in this study (with 100% CVI
score for items) – or, in the case of the 25 rejected items, relate to other constructs and actions
associated with high and low scores in that factor. For example, people who see setbacks as
opportunities (Challenge definition) might also be individuals that cope well with their
problems (e.g., Item 23). For others who persist through obstacles (Commitment definition),
keeping focused could be something that they do well, too (e.g., Item 22). Similarly, people
that believe in themselves (Confidence definition) might be equally optimistic (e.g., Item 16),
while those struggling to manage their emotions (Control – Emotion definition) could also,
conceivably, find themselves excessively worrying about the future (e.g., Item 27).
By removing 25 items denoting constructs associated with high and low scores for the
4/6Cs, our approach might be criticized for leaving out “the language of the participants” in
the questionnaire’s original design (see Clough et al., 2012, p. 284). Our decision, however,
with the face validity check, was to focus on items that match the core definitions for each
dimension. If it is later decided that these associated constructs and correlates are actually
central components of mental toughness, the 4/6C model may be better understood as one
currently masking a broader, underlying framework. For example, if optimism is later agreed
to be a key aspect of mental toughness, Items 13, 15, 16, and 32 quickly become much more
relevant, according to the face validity review in this study (see Supporting Information). As
it stands, these items sit across 2 dimensions of the 4/6C model – namely, the Control (Life;
Item 15) and Confidence (Abilities; Items 13, 16, 32) dimensions. If deemed to be important
to the conceptual building blocks of mental toughness, reconfiguring the MTQ48 around
these additional constructs might allow more of the original items to be included
in further statistical analysis. Hypothetically, it would raise the inventory’s CVI score above
the existing 27.1%, reported here. This step, of course, would require expansion to the
original 4/6C perspective of mental toughness, proposed by Clough et al. (2002).
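The hypothetical gain described above can be made concrete with simple arithmetic. The sketch assumes that the 27.1% figure corresponds to the 13 retained items out of 48, and that the four optimism-related items (Items 13, 15, 16, and 32) are among the rejected items that would be reinstated. Both points are assumptions made for illustration rather than results reported in the study.

```python
TOTAL_ITEMS = 48
RETAINED_ITEMS = 13                # items kept after the face validity check
OPTIMISM_ITEMS = [13, 15, 16, 32]  # items the text links to optimism

current_share = RETAINED_ITEMS / TOTAL_ITEMS
# ASSUMPTION: all four optimism items would be added to the retained set.
hypothetical_share = (RETAINED_ITEMS + len(OPTIMISM_ITEMS)) / TOTAL_ITEMS

print(f"{current_share:.1%} -> {hypothetical_share:.1%}")  # prints "27.1% -> 35.4%"
```

Whether such a change is warranted is a conceptual question about the 4/6C model, not something this calculation can settle.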
Limitations
The S-MTQ and VS-MTQ were proposed as alternative models to evolve the validity
of the MTQ. Their psychometric properties were rigorously examined and cross-validated
with two large data sets. However, both are university student cohorts. Considering that the
MTQ48 has been widely used in education, business, military, and sport, the validity and
reliability of the S-MTQ and VS-MTQ should be further evaluated by examining different
types of validity (e.g., predictive validity) with individuals from different domains (e.g.,
businesspeople and athletes), backgrounds (i.e., other/non-Westernized countries), and stages
of education (e.g., high school) in future research. In the present study, the data were
collected by using the original MTQ48. However, test length is one of multiple factors that
affect true and observed variance of scores, because respondents to a longer questionnaire are
more likely to become tired or disinterested and less likely to answer carefully or honestly (Hayes &
Coutts, 2020; Raykov & Marcoulides, 2011). Thus, it is recommended that data be collected with
the S-MTQ or the VS-MTQ for further evaluations of their psychometric properties.
Conclusion
Clough and colleagues welcomed refinement of their MTQ48 questionnaire on an
ongoing basis (Perry et al., 2013). In following this suggestion, the current study aimed to
improve the MTQ48 by resolving its factorial validity issue and provide researchers with
theoretically and practically useful instruments to confidently measure the mental toughness
construct. The unique and significant contribution of the study was to identify problematic
items that were associated with the issue and propose alternative models to improve the
validity of the MTQ. Based on the findings of the present study, the S-MTQ and VS-MTQ
are considered valid measures of mental toughness, as defined by the 4/6C framework.

References
Asparouhov, T., & Muthén, B. (2009). Exploratory structural equation modeling. Structural
Equation Modeling, 16, 397–438.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin,
107, 238–246.
Birch, P. D. J., Crampton, S., Greenlees, I. A., Lowry, R. G., & Coffee, P. (2017). The
Mental Toughness Questionnaire-48: A re-examination of factorial validity.
International Journal of Sport Psychology, 48, 331-355.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A.
Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 445–455).
Newbury Park, CA: Sage.
Byrne, B. M. (2005). Factor analytic models: Viewing the structure of an assessment
instrument from three perspectives. Journal of Personality Assessment, 85, 17-32.
Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance.
Structural Equation Modeling, 14, 464–504.
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing
measurement invariance. Structural Equation Modeling, 9, 233–255.
Clough, P., Earle, K., Perry, J. L., & Crust, L. (2012). Comment on “Progressing
measurement in mental toughness: A case example of the Mental Toughness
Questionnaire 48” by Gucciardi, Hanton, and Mallett (2012). Sport, Exercise, and
Performance Psychology, 1, 283–287.
Clough, P., Earle, K., & Sewell, D. (2002). Mental toughness: the concept and its
measurement. In I. Cockerill (Ed.), Solutions in sport psychology (pp. 32–43).
London: Thomson.
Clough, P., Perry, J., Crust, L., Strycharczyk, D., & Rowlands, C. (2015). The MTQ48
technical manual. Wales: AQR International.
Clough, P., & Strycharczyk, D. (2015). Developing mental toughness: Coaching strategies to
improve performance, resilience and wellbeing. Kogan Page Publishers.
Clough, P., & Strycharczyk, D. (2012). Developing mental toughness: Improving
performance, wellbeing and positive behavior in others. London: Kogan Page.
Coulter, T. J., Mallett, C. J., & Singer, J. A. (2018). A three-domain personality analysis of a
mentally tough athlete. European Journal of Personality, 32, 6-29.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika,
16, 297-334.
Dagnall, N., Denovan, A., Papageorgiou, K. A., Clough, P. J., Parker, A., & Drinkwater, K.
G. (2019). Psychometric assessment of shortened Mental Toughness Questionnaires
(MTQ): Factor structure of the MTQ-18 and the MTQ-10. Frontiers in Psychology,
10, 1933.
Gerber, M., Kalak, N., Lemola, S., Clough, P. J., Perry, J. L., Pühse, U., . . . Brand, S. (2013).
Are adolescents with high mental toughness levels more resilient against stress?
Stress and Health, 29, 164-171.
Gosling, S. D., Rentfrow, P. J., & Swann, W. B. (2003). A very brief measure of the Big-Five
personality domains. Journal of Research in Personality, 37, 504–528.
Gucciardi, D. F., Hanton, S., Gordon, S., Mallett, C. J., & Temby, P. (2015). The concept of
mental toughness: Tests of dimensionality, nomological network, and traitness.
Journal of Personality, 83, 26-44.
Gucciardi, D. F., Hanton, S., & Mallett, C. J. (2012). Progressing measurement in mental
toughness: A case example of the Mental Toughness Questionnaire 48. Sport,
Exercise, and Performance Psychology, 1, 194-214.

Hayes, A. F., & Coutts, J. J. (2020). Using omega rather than Cronbach’s alpha for estimating
reliability. But… Communication Methods and Measures, 14, 1-24.
Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to
underparameterized model misspecification. Psychological Methods, 3, 424–453.
Jackson, S. A., Martin, A. J., & Eklund, R. C. (2008). Long and short measures of flow: The
construct validity of the FSS-2, DFS-2, and new brief counterparts. Journal of Sport
& Exercise Psychology, 30, 561-587.
Jones, G., Hanton, S., & Connaughton, D. (2007). A framework of mental toughness in the
world’s best performers. The Sport Psychologist, 21, 243-264.
Kawabata, M., Mallett, C. J., & Jackson, S. A. (2008). The Flow State Scale-2 and
Dispositional Flow Scale-2: Examination of their factorial validity and reliability for
Japanese adults. Psychology of Sport and Exercise, 9, 465–485.
Kobasa, S. C. (1979). Stressful life events, personality and health: An enquiry into hardiness.
Journal of Personality and Social Psychology, 37, 1–11.
Lin, Y., Mutz, J., Clough, P. J., & Papageorgiou, K. A. (2017). Mental toughness and
individual differences in learning, educational and work performance, psychological
well-being, and personality: A systematic review. Frontiers in Psychology, 8, 1345.
Lovibond, S. H., & Lovibond, P. F. (1995). Manual for the Depression Anxiety & Stress
Scales (2nd ed.). Sydney: Psychology Foundation.
Lynn, M. R. (1986). Determination and quantification of content validity. Nursing Research,
35, 382–385.
Mallett, C. J., Kawabata, M., & Newcombe, P. (2007). Progressing measurement in sport
motivation with the SMS-6: A response to Pelletier, Vallerand, and Sarrazin.
Psychology of Sport and Exercise, 8, 622-631.

Marsh, H. W., Lüdtke, O., Muthén, B., Asparouhov, T., Morin, A. J. S., Trautwein, U., &
Nagengast, B. (2010). A new look at the big-five factor structure through exploratory
structural equation modeling. Psychological Assessment, 22, 471–491.
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.
McNeish, D., & Wolf, M. G. (2020). Thinking twice about sum scores. Behavior Research
Methods (online first).
Morin, A. J. S., Arens, A. K., & Marsh, H. W. (2016). A bifactor exploratory structural
equation modeling framework for the identification of distinct sources of construct-
relevant psychometric multidimensionality. Structural Equation Modeling, 23, 116-
139.
Morin, A. J. S., & Asparouhov, T. (2018). Estimation of a hierarchical exploratory structural
equation model (ESEM) using ESEM-within-CFA. Montreal, QC: Substantive
Methodological Synergy Research Laboratory.
Muthén, L. K., & Muthén, B. (1998–2019). Mplus user’s guide (8th ed.). Los Angeles, CA:
Muthén & Muthén.
Papageorgiou, K. A., Wong, B., & Clough, P. J. (2017). Beyond good and evil: Exploring the
mediating role of mental toughness on the dark triad of personality traits. Personality
and Individual Differences, 119, 19-23.
Papageorgiou, K. A., Malanchini, M., Denovan, A., Clough, P. J., Shakeshaft, N., Schofield,
K., & Kovas, Y. (2018). Longitudinal associations between narcissism, mental
toughness and school achievement. Personality and Individual Differences, 131, 105-
110.
Perry, J. L., Clough, P. J., Crust, L., Earle, K., & Nicholls, A. R. (2013). Factorial validity of the
Mental Toughness Questionnaire-48. Personality and Individual Differences, 54, 587-
592.
Raykov, T., & Marcoulides, G. A. (2011). Introduction to psychometric theory. New York:
Routledge.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation
approach. Multivariate Behavioral Research, 25, 173–180.
Tóth-Király, I., Morin, A. J. S., Bőthe, B., Orosz, G., & Rigó, A. (2018). Investigating the
multidimensionality of need fulfillment: A bifactor exploratory structural equation
modeling representation. Structural Equation Modeling, 25, 267–286.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor
analysis. Psychometrika, 38, 1–10.
Vaughan, R., Hanna, D., & Breslin, G. (2018). Psychometric properties of the Mental
Toughness Questionnaire 48 (MTQ48) in elite, amateur and nonathletes. Sport,
Exercise, and Performance Psychology, 7, 128-140.
Wang, W. C., Chen, H. F., & Jin, K. Y. (2014). Item response theory models for wording
effects in mixed-format scales. Educational and Psychological Measurement, 75,
157-178.
Table 1
Summary of Goodness-of-Fit Statistics for Specified Models (N = 2,186)
Items                                  Model               χ2         df    CFI   TLI   SRMR  RMSEA  90% CI
48 items                               1-factor CFA        12665.08   1080  .624  .607  .072  .070   .069 – .071
                                       4-factor CFA        11318.42   1074  .668  .651  .069  .066   .065 – .067
                                       6-factor CFA        9968.11    1065  .711  .694  .070  .062   .061 – .063
18 items (Clough et al., 2002)         1-factor CFA        3028.397   135   .695  .655  .076  .099   .096 – .102
         (Dagnall et al., 2019)        4-factor CFA        –ᵃ         –     –     –     –     –      –
10 items (Papageorgiou et al., 2018)   1-factor CFA        819.502    35    .843  .798  .057  .101   .095 – .107
18 items (S-MTQ)                       4-factor CFA        1611.15    129   .838  .808  .056  .072   .069 – .076
                                       6-factor CFA        1045.86    120   .899  .871  .048  .059   .056 – .063
6 items (VS-MTQ)                       1-factor CFA        49.05      9     .976  .959  .022  .045   .033 – .058
48 items                               4-factor ESEM       6871.01    942   .832  .799  .038  .054   .052 – .055
                                       6-factor ESEM       4294.55    855   .903  .871  .026  .043   .042 – .044
18 items                               1-factor ESEM       3474.62    135   .692  .651  .076  .106   .103 – .109
                                       4-factor ESEM       867.97     87    .928  .873  .031  .064   .066 – .068
10 items                               1-factor ESEM       976.40     35    .839  .792  .057  .111   .111 – .117
                                       4-factor ESEM       35.120ᵇ    11    .996  .983  .010  .032   .020 – .044
18 items (S-MTQ)                       4-factor ESEM       661.13     87    .947  .906  .025  .055   .051 – .059
                                       6-factor ESEM       235.19     60    .984  .958  .014  .037   .032 – .042
                                       Bifactor ESEM       130.57     48    .992  .976  .010  .028   .022 – .034
                                       Hierarchical ESEM   281.65ᶜ    69    .980  .956  .018  .038   .033 – .042
6 items (VS-MTQ)                       1-factor ESEM       58.54      9     .975  .958  .022  .050   .038 – .063
Note. CFA = confirmatory factor analysis; ESEM = exploratory structural equation modeling; CFI = robust comparative fit index;
TLI = Tucker-Lewis index; SRMR = standardized root mean square residual; RMSEA = robust root mean square error of approximation;
CI = confidence interval; S-MTQ = the short version of the Mental Toughness Questionnaire; VS-MTQ = the very short version of the
Mental Toughness Questionnaire. ᵃ Solutions were improper; ᵇ Half of the 10 items did not load on their target factors;
ᶜ ESEM within CFA was estimated with maximum likelihood estimation.
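
As a rough plausibility check on the RMSEA column of Table 1, the point estimate can be recomputed from each model's χ2, df, and N. The sketch below assumes the common formula RMSEA = sqrt(max(χ2 − df, 0) / (df × (N − 1))). The function name `rmsea` and the use of N − 1 (some programs divide by N) are assumptions made for illustration, and the robust estimates reported in the table need not follow this simple formula for every row.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a chi-square statistic (N - 1 convention, assumed)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # Negative (chi2 - df) is truncated at zero, as is conventional.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# (chi-square, df) pairs copied from three rows of Table 1; N = 2,186.
N = 2186
rows = {
    "48 items, 1-factor CFA": (12665.08, 1080),
    "48 items, 6-factor CFA": (9968.11, 1065),
    "6 items (VS-MTQ), 1-factor CFA": (49.05, 9),
}

for label, (chi2, df) in rows.items():
    print(f"{label}: RMSEA = {rmsea(chi2, df, N):.3f}")
```

For these three rows the recomputed values round to the tabled .070, .062, and .045.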
Table 2
Standardized Factor Loadings for ESEM Solution of the S-MTQ (N = 2,186)

Item F1 F2 F3 F4 F5 F6 R
F1: Commitment
Item 07 .406 .151 .135 .142 -.077 .013 .651
Item 29 .667 .120 .029 .037 .113 .026 .368
Item 47 .433 .009 .088 .033 .153 .058 .673
F2: Challenge
Item 04 .033 .616 .067 .049 .046 .010 .522
Item 44 .066 .663 .006 .065 .028 .093 .428
Item 48 .288 .255 .101 .172 .053 .023 .618
F3: Control Life
Item 02 .048 .047 .638 .082 .045 .078 .435
Item 12 .009 .028 .626 .038 .150 -.002 .464
Item 41 .196 .094 .149 -.114 .509 .005 .546
F4: Control Emotion
Item 27 -.031 -.026 -.078 .405 .390 .042 .625
Item 31 .079 .119 .114 .553 -.059 .025 .554
Item 45 .069 .085 .098 .474 .069 .072 .603
F5: Confidence Abilities
Item 08 .117 .156 .297 .188 .124 .106 .576
Item 18 .058 .042 .165 .098 .553 .066 .459
Item 24 -.152 .002 .144 .293 .314 -.059 .703
F6: Confidence Interpersonal
Item 20 .131 .230 .126 -.047 -.009 .281 .736
Item 43 -.131 .096 .035 .044 -.057 .725 .444
Item 46 .158 -.078 -.018 -.017 .109 .631 .543
Note. ESEM = exploratory structural equation modeling; S-MTQ = the short version of the
Mental Toughness Questionnaire; F = factor; R = residuals. Item numbers are based on the
MTQ48 (Clough et al., 2002). ESEM was estimated with an oblique geomin rotation. Target
factor loadings are presented in bold and all targeted factor loadings were significant at p
< .001.
Table 3
Latent Factor Correlations Between the S-MTQ and the DASS-21 Stress (N = 2,186)

6-factor ESEM
Subscale                        CM    CH    CL    CE    CA    CI    ST
Commitment (CM)                 –                                   -.36
Challenge (CH)                  .55   –                             -.23
Control Life (CL)               .46   .54   –                       -.46
Control Emotion (CE)            .33   .47   .58   –                 -.65
Confidence Abilities (CA)       .29   .05   .31   .29   –           -.48
Confidence Interpersonal (CI)   .34   .42   .35   .29   .12   –     -.17

Bifactor ESEM
Subscale                        ST
Commitment (CM)                 -.19
Challenge (CH)                  .03
Control Life (CL)               -.24
Control Emotion (CE)            -.50
Confidence Abilities (CA)       -.24
Confidence Interpersonal (CI)   .06
Global Mental Toughness         -.45

Note. S-MTQ = the short version of the Mental Toughness Questionnaire; DASS-21 = the Depression Anxiety Stress Scale-21; ST = Stress;
CFA = confirmatory factor analysis; ESEM = exploratory structural equation modelling. In the model, the S-MTQ factors were specified as
ESEM factors with target rotation and the Stress factor was specified as a CFA factor. All latent correlations larger than |.07| were significant at
p < .01.
Table 4
Pearson’s Correlations Between the S-MTQ and the DASS-21 Stress Based on Scale and
Factor Scores (N = 2,186)
Subscale CM CH CL CE CA CI STs STf
Specific factor
Commitment (CM) – .66 .28 .20 .01 .24 -.30 -.05
Challenge (CH) .54 – .40 .43 .14 .42 -.25 -.11
Control Life (CL) .50 .47 – .53 .62 .23 -.43 -.33
Control Emotion (CE) .40 .45 .47 – .68 .22 -.48 -.44
Confidence Abilities (CA) .44 .42 .61 .54 – .10 -.43 -.45
Confidence Interpersonal (CI) .29 .35 .29 .25 .27 – -.10 -.08
Global factor
Mental Toughness (Total) – – – – – – -.47 –
Mental Toughness (Factor) – – – – – – – -.47
Note. S-MTQ = the short version of the Mental Toughness Questionnaire; DASS-21 = the
Depression Anxiety Stress Scale-21; STs = Stress (scale score); STf = Stress (factor score);
CFA = confirmatory factor analysis; ESEM = exploratory structural equation modelling.
Pearson’s correlations of the S-MTQ subscale scores are below the diagonal, while correlations
of the S-MTQ factor scores are above the diagonal. All correlations larger than |.02| were
significant at p < .001.

Table 5
Summary of Goodness-of-Fit Statistics for Specified Models (N = 3,209)
Items                                  Model               χ2         df    CFI   TLI   SRMR  RMSEA  90% CI
48 items                               1-factor CFA        18327.28   1080  .609  .592  .073  .071   .070 – .071
                                       4-factor CFA        16244.10   1074  .656  .639  .071  .066   .065 – .067
                                       6-factor CFA        14236.511  1065  .701  .684  .071  .062   .061 – .063
18 items                               1-factor CFA        4413.795   135   .677  .634  .077  .099   .097 – .102
10 items                               1-factor CFA        –ᵃ         –     –     –     –     –      –
18 items (S-MTQ)                       4-factor CFA        2421.36    129   .823  .790  .058  .074   .072 – .077
                                       6-factor CFA        1650.63    120   .882  .849  .050  .063   .060 – .066
6 items (VS-MTQ)                       1-factor CFA        76.57      9     .973  .955  .023  .048   .039 – .059
48 items                               4-factor ESEM       9363.60    942   .835  .835  .037  .053   .052 – .054
                                       6-factor ESEM       5512.87    855   .908  .879  .025  .041   .040 – .042
18 items                               1-factor ESEM       5076.21    135   .673  .629  .077  .107   .104 – .109
                                       4-factor ESEM       1247.92    87    .923  .865  .031  .064   .061 – .068
10 items                               1-factor ESEM       1438.06    35    .824  .774  .059  .112   .107 – .117
                                       4-factor ESEM       –ᵇ         –     –     –     –     –      –
18 items (S-MTQ)                       4-factor ESEM       1008.43    87    .941  .895  .026  .057   .054 – .061
                                       6-factor ESEM       305.06     60    .984  .960  .013  .036   .032 – .040
                                       Bifactor ESEM       183.99     48    .991  .972  .010  .030   .025 – .034
                                       Hierarchical ESEM   444.07ᶜ    69    .976  .948  .020  .040   .037 – .044
6 items (VS-MTQ)                       1-factor ESEM       58.54      9     .975  .958  .022  .050   .038 – .063
Note. CFA = confirmatory factor analysis; ESEM = exploratory structural equation modeling; CFI = robust comparative fit index;
TLI = Tucker-Lewis index; SRMR = standardized root mean square residual; RMSEA = robust root mean square error of approximation;
CI = confidence interval; S-MTQ = the short version of the Mental Toughness Questionnaire; VS-MTQ = the very short version of the
Mental Toughness Questionnaire. ᵃ Solutions were improper; ᵇ Solutions did not converge;
ᶜ ESEM within CFA was estimated with maximum likelihood estimation.
Table 6
Summary of fit statistics for testing measurement invariance (N = 5,395)

Model  Description                          χ2        df    CFI   TLI   SRMR  RMSEA  Model comparison  ΔCFI  ΔRMSEA
The Short MTQ (18-item bifactor ESEM)
M1     Configural invariance                315.27    96    .992  .973  .010  .029   –                 –     –
M2     Factor loadings invariant            456.71    173   .989  .981  .017  .025   M1 vs. M2         .003  .004
M3     Model 2 with intercepts invariant    1048.15   191   .967  .947  .017  .041   M2 vs. M3         .022  -.016
M4     Model 2 with uniqueness              577.592   191   .985  .976  .029  .027   M2 vs. M4         .004  .002
The Very Short MTQ (6-item CFA)
M6     Configural invariance                128.73    18    .972  .954  .023  .048   –                 –     –
M7     Factor loadings invariant            143.25    24    .970  .963  .031  .043   M6 vs. M7         .002  .005
M8     Model 7 with intercepts invariant    470.77    30    .889  .889  .072  .074   M7 vs. M8         .081  -.031
M9     Model 7 with uniqueness              182.981   30    .962  .962  .042  .043   M7 vs. M9         .008  .000
Note. MTQ = Mental Toughness Questionnaire; ESEM = exploratory structural equation modeling; CFA = confirmatory factor analysis; CFI =
comparative fit index; TLI = Tucker-Lewis index; SRMR = standardized root mean square residual; RMSEA = root mean square error of
approximation.
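
The ΔCFI and ΔRMSEA columns of Table 6 are simple differences between nested models, and a short sketch can make the comparison logic explicit. The example below recomputes them for the Very Short MTQ block, taking each delta as reference model minus constrained model so that the signs match the table. The cutoffs `MAX_CFI_DROP` and `MAX_RMSEA_RISE` are commonly cited rules of thumb and are an assumption here, not values stated in this section. All names are hypothetical.

```python
from decimal import Decimal

# CFI and RMSEA for the VS-MTQ models, copied from Table 6.
# Each entry: model -> (CFI, RMSEA, reference model or None).
VS_MTQ = {
    "M6": ("0.972", "0.048", None),
    "M7": ("0.970", "0.043", "M6"),
    "M8": ("0.889", "0.074", "M7"),
    "M9": ("0.962", "0.043", "M7"),
}

# ASSUMPTION: rule-of-thumb cutoffs, not taken from this section.
MAX_CFI_DROP = Decimal("0.010")
MAX_RMSEA_RISE = Decimal("0.015")


def compare_nested(models):
    """Return {model: (delta_cfi, delta_rmsea, supported)} for each nested step.

    Deltas are reference minus constrained model, matching the table's signs.
    """
    results = {}
    for name, (cfi, rmsea, ref) in models.items():
        if ref is None:
            continue  # the configural model has no reference model
        ref_cfi, ref_rmsea, _ = models[ref]
        delta_cfi = Decimal(ref_cfi) - Decimal(cfi)
        delta_rmsea = Decimal(ref_rmsea) - Decimal(rmsea)
        # A rise in RMSEA appears as a negative delta, hence the sign flip.
        supported = delta_cfi <= MAX_CFI_DROP and -delta_rmsea <= MAX_RMSEA_RISE
        results[name] = (delta_cfi, delta_rmsea, supported)
    return results


for model, (d_cfi, d_rmsea, ok) in compare_nested(VS_MTQ).items():
    print(f"{model}: dCFI = {d_cfi}, dRMSEA = {d_rmsea}, within cutoffs: {ok}")
```

Under these assumed cutoffs, only the step that constrains the intercepts (M7 vs. M8) is flagged, because CFI drops by .081 and RMSEA rises by .031.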

Supporting Information
Face Validity Analysis of MTQ48 Items
Subscale, definition, items CVI scores / review Rationale
Challenge: The extent to which a person is likely to view a challenge or setback as an opportunity.
Item 4. Challenges… 4, 4, 4 Good fit
Item 6. Unexpected… 2, 1, 2 Emphasis on inability to cope rather than on viewing challenge as opportunity
Item 14. I often wish… 1, 1, 1 More aligned with life control
Item 23. I generally… 3, 3, 2 Emphasis on coping rather than on seeing challenge as opportunity
Item 30. I am generally… 2, 2, 1 More aligned with reaction time / deliberation
Item 40. I usually look… 1, 2, 1 Emphasis on seeking variety in life
Item 44. I usually enjoy… 4, 4, 4 Good fit
Item 48. I can usually… 3, 2, 3 Emphasis on coping
Commitment: The extent to which an individual is likely to persist with a goal, despite any problems or obstacles that arise.
Item 1. I usually find… 1, 1, 1 No emphasis on motivation in the face of problems / obstacles
Item 7. I don’t usually… 4, 4, 4 Good fit

Item 11. I just don’t… 1, 1, 1 Emphasis more on deliberation / decision making / time management and organization. Inconsistent item structure
Item 19. I can generally… 3, 3, 3 Missing emphasis on completing task despite problems
Item 22. I am easily… 2, 1, 2 Emphasis on focus
Item 25. I generally try… 3, 3, 3 Emphasis on effort
Item 29. When faced… 4, 4, 4 Good fit

Item 35. I usually find… 2, 1, 1 Ambiguous language (mental effort) and does not involve commitment to a particular goal.
Item 39. I can normally… 2, 1, 1 Ambiguous language (mental effort), emphasis on concentration
Item 42. I usually find… 2, 2, 2 Emphasis on enjoyment rather than commitment
Item 47. When I face… 4, 4, 4 Good fit
Control Emotion: The extent to which people control their anxieties and emotions.
Item 21. I generally find... 2, 1, 1 Ambiguous emphasis on controlling emotions

Item 26. When I am… 2, 2, 2 Conflicting language - ‘letting others know’ indicates both attempt and inability to control emotions
Item 27. I tend to… 1, 2, 1 Emphasis on concern
Item 31. Even when… 4, 4, 4 Good fit
Item 34. I generally hide... 1, 2, 2 Emphasis on hiding (not controlling) emotion
Item 37. When I am… 1, 1, 1 Emphasis on motivation

Item 45. I can usually… 4, 4, 4 Good fit
Control Life: The extent to which people believe they have sufficient control over their lives and the environment around them.
Item 2. I generally feel… 3, 3, 3 Lacks specificity (e.g., could relate to life and/or emotions factor)
Item 5. When working… 1, 1, 1 More aligned with interpersonal confidence
Item 9. I usually find… 1, 1, 1 Conceptually vague - unclear how item links to definition
Item 12. I generally feel… 4, 4, 4 Good fit
Item 15. Whenever I try… 1, 1, 1 Emphasis on pessimism
Item 33. Things just… 1, 1, 1 Conceptually vague - unclear how item links to definition
Item 41. I feel that… 4, 4, 4 Good fit
Confidence Abilities: The degree of confidence people have in their abilities to successfully complete tasks.
Item 3. I generally feel… 1, 2, 2 Emphasis on self-esteem
Item 8. I am generally… 4, 4, 4 Good fit

Item 10. At times I… 1, 2, 1 Normal occurrence even for people high in confidence, emphasis on pessimism
Item 13. However bad… 2, 2, 2 Emphasis on optimism
Item 16. I generally look... 1, 1, 1 Emphasis on optimism

Item 18. At times I… 1, 2, 1 Normal occurrence even for people high in confidence, emphasis on depression
Item 24. I do not… 1, 2, 1 Emphasis on self-esteem / negative perfectionism
Item 32. If something… 1, 1, 1 Emphasis on pessimism / external locus of control
Item 36. When I make… 1, 1, 1 Emphasis on anxiety / emotion control
Confidence Interpersonal: The extent to which people are prepared to assert themselves and deal with social challenge or ridicule.
Item 17. I usually speak… 2, 2, 2 Emphasis on extraversion (not about preparedness)
Item 20. I usually take… 4, 4, 4 Good fit
Item 28. I often feel… 2, 2, 2 Emphasis on introversion (not about preparedness)
Item 38. I am… 2, 2, 2 Emphasis on happiness to allocate
Item 43. If I feel… 4, 4, 4 Good fit

Item 46. In discussions… 4, 4, 4 Good fit
Note. MTQ48 = the Mental Toughness Questionnaire-48; CVI = Content Validity Index.
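
The ratings in the table above can be tallied to see how many items were consistently scored 4. The sketch below transcribes the three scores listed for each item and counts the items on which all three scores equal 4. Treating "all three scores equal 4" as the criterion for content validity is an assumption made for illustration, as are the names `RATINGS` and `unanimous_fours`.

```python
# Three CVI scores per item, transcribed from the Supporting Information table.
RATINGS = {
    "Challenge": {4: (4, 4, 4), 6: (2, 1, 2), 14: (1, 1, 1), 23: (3, 3, 2),
                  30: (2, 2, 1), 40: (1, 2, 1), 44: (4, 4, 4), 48: (3, 2, 3)},
    "Commitment": {1: (1, 1, 1), 7: (4, 4, 4), 11: (1, 1, 1), 19: (3, 3, 3),
                   22: (2, 1, 2), 25: (3, 3, 3), 29: (4, 4, 4), 35: (2, 1, 1),
                   39: (2, 1, 1), 42: (2, 2, 2), 47: (4, 4, 4)},
    "Control Emotion": {21: (2, 1, 1), 26: (2, 2, 2), 27: (1, 2, 1), 31: (4, 4, 4),
                        34: (1, 2, 2), 37: (1, 1, 1), 45: (4, 4, 4)},
    "Control Life": {2: (3, 3, 3), 5: (1, 1, 1), 9: (1, 1, 1), 12: (4, 4, 4),
                     15: (1, 1, 1), 33: (1, 1, 1), 41: (4, 4, 4)},
    "Confidence Abilities": {3: (1, 2, 2), 8: (4, 4, 4), 10: (1, 2, 1), 13: (2, 2, 2),
                             16: (1, 1, 1), 18: (1, 2, 1), 24: (1, 2, 1),
                             32: (1, 1, 1), 36: (1, 1, 1)},
    "Confidence Interpersonal": {17: (2, 2, 2), 20: (4, 4, 4), 28: (2, 2, 2),
                                 38: (2, 2, 2), 43: (4, 4, 4), 46: (4, 4, 4)},
}


def unanimous_fours(ratings):
    """Return {subscale: sorted item numbers whose scores are all 4}."""
    return {
        subscale: sorted(item for item, scores in items.items()
                         if all(score == 4 for score in scores))
        for subscale, items in ratings.items()
    }


unanimous = unanimous_fours(RATINGS)
total_items = sum(len(items) for items in RATINGS.values())
n_unanimous = sum(len(items) for items in unanimous.values())
percent = 100 * n_unanimous / total_items

for subscale, items in unanimous.items():
    print(f"{subscale}: {items}")
print(f"{n_unanimous} of {total_items} items ({percent:.1f}%)")
```

On the transcribed scores this tally gives 13 of 48 items (27.1%), with one to three such items per subscale.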
