Measurement Invariance of the Short Dark Tetrad across Cultures and Genders
Institute for Sexual and Gender Minority Health and Wellbeing, Northwestern University, USA
This is an unedited manuscript accepted for publication. The manuscript will undergo
copyediting, typesetting, and review of resulting proof before it is published in its final form.
MEASUREMENT INVARIANCE SD4 2
Abstract
The last two decades have seen a plethora of scientific examinations of the Dark Triad (narcissism,
psychopathy, Machiavellianism) and Dark Tetrad traits (Dark Triad + sadism) in a variety of
contexts. Short scales for the assessment of these traits have been very influential and widely used.
Building upon previous research, the 28-item Short Dark Tetrad (SD4) was introduced as a measure
for the assessment of the Dark Tetrad traits. A recent study found that the SD4 is invariant across
genders, but little is known concerning invariance across cultures. Therefore, we tested
extant findings on MI across genders. The analyses suggested configural MI across cultures, metric
MI between genders in a US sample, and scalar MI across genders in a German sample. To address
Because the SD4 showed only modest fit in the samples, we further computed Exploratory Structural
Equation Models. Those were mostly consistent with the original model structure and indicated
that allowing marginal cross-loadings among the factors improves model fit. Possible
Keywords: Dark Personality, Short Dark Tetrad, Dark Triad, Gender, Culture
Measurement Invariance of the Short Dark Tetrad across Cultures and Genders
Since the Dark Triad was introduced (Paulhus & Williams, 2002), research in the area of
antagonistic personality traits has burgeoned. The Dark Triad comprises subclinical forms of
narcissism (i.e., striving for admiration by others; restoration of the grandiose self after ego
threats), Machiavellianism (Mach; i.e., cynical, distrustful, strategic orientation; self-control), and
psychopathy (i.e., impulsivity, aggression, and antisocial behavior; Back et al., 2013; Jones,
2017; Paulhus & Williams, 2002; Skeem et al., 2011). Previous work illustrated the importance
of these traits in a variety of everyday contexts, such as romantic relationships (Jonason et al.,
2012), school (Stellwagen & Kerig, 2013), and work settings (O’Boyle et al., 2012). The Dark
Triad was recently expanded by sadism (i.e., deriving feelings of joy from hurting others or from
seeing others suffer) and thus became a Dark Tetrad (Chabrol et al., 2009; Paulhus, 2014). As the
instruments originally proposed by Paulhus and Williams revealed undesired overlaps (especially
psychopathy and Mach scales; Grosz et al., 2020) or suffered from unfavorable content coverage
(Paulhus & Jones, 2015), a plethora of research dealt with the development and evaluation of
measures (e.g., Blötner & Bergold, 2021; Jonason & Webster, 2010; Jones & Paulhus, 2014;
Paulhus & Jones, 2015). Specialized short scales focusing on each trait’s specifics became
increasingly popular and reduced some of the issues of earlier scales. One of those is the Short
Dark Tetrad (SD4; Paulhus et al., 2020). It is the successor of the widely used Short Dark Triad
(SD3; Jones & Paulhus, 2014). The SD4 can be validly interpreted concerning central correlates
of narcissism, psychopathy, Mach, and sadism (Blötner et al., 2021; Neumann et al., 2021).
However, establishing an instrument’s nomological network is not sufficient to conclude that it is
useful. Beyond being in line with theoretical expectations about the underlying constructs,
the structural stability of the measure must be examined across different groups (measurement
invariance [MI]). In doing so, users can be sure that differences between groups indicate different
MI testing is a hierarchical process of imposing increasingly strict equality restrictions on a latent model
involving two or more independent groups (or two or more measurement occasions within one
person in the case of longitudinal MI). The most common tests of equality refer to the item-factor
composition (configural MI), loadings (metric MI), and item intercepts (scalar MI) across groups.
Numerous measures cannot withstand more severe restrictions (Putnick & Bornstein, 2016).
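The hierarchy described above can be written compactly in common multi-group factor-analytic notation. The symbols below are not taken from the manuscript; they follow general textbook usage and are meant only as a sketch of what each level constrains.

```latex
% Requires amsmath. Respondent i in group g (g = 1, ..., G).
% x: observed item scores, tau: item intercepts, Lambda: factor loadings,
% eta: latent traits with mean kappa_g, epsilon: residuals.
\begin{align}
  \mathbf{x}_{ig} &= \boldsymbol{\tau}_g + \boldsymbol{\Lambda}_g \boldsymbol{\eta}_{ig} + \boldsymbol{\varepsilon}_{ig} \\
  \text{Configural:}\quad & \text{same pattern of fixed and free elements in } \boldsymbol{\Lambda}_g \text{ for all } g \\
  \text{Metric:}\quad & \boldsymbol{\Lambda}_1 = \dots = \boldsymbol{\Lambda}_G = \boldsymbol{\Lambda} \\
  \text{Scalar:}\quad & \boldsymbol{\Lambda}_1 = \dots = \boldsymbol{\Lambda}_G = \boldsymbol{\Lambda}, \quad
                        \boldsymbol{\tau}_1 = \dots = \boldsymbol{\tau}_G = \boldsymbol{\tau} \\
  \text{Under scalar MI:}\quad & E(\mathbf{x}_{ig}) = \boldsymbol{\tau} + \boldsymbol{\Lambda} \boldsymbol{\kappa}_g
\end{align}
```

Read this way, once loadings and intercepts are equal across groups, differences in expected item scores can only stem from differences in the latent means, which is why scalar MI is usually treated as the precondition for comparing latent means.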
Thus far, the SD4 has been used in German, US, and Canadian samples (Blötner et al.,
2021; Furnham & Horne, 2021; Neumann et al., 2021; Paulhus et al., 2020, 2021), and MI has
been demonstrated across genders (Neumann et al., 2021), but the degree of invariance across
different cultures is still unclear (Blötner et al., 2021). Manifestations of the Dark Tetrad
traits could differ across cultures as cultural norms allow, dictate, or prohibit particular behaviors
to members of particular groups (Eagly & Wood, 1991; Hofstede et al., 2010) and therefore
might shape expressions of the Dark Tetrad. To ensure meaningful interpretations of the SD4
subscales across cultures, we tested whether it is invariant between samples from different
et al. (2021).
Method
Samples
We used two samples, which we derived from extant studies. First, we used data from
Blötner et al.’s (2021) study on the German version of the SD4 (N = 594). Second, and with
permission from the authors, we used Webster and Wongsomboon’s (2020) SD4 data involving
participants from the US (N = 451, complete data available for 428 participants). Since the two
samples differed regarding the expected (Webster and Wongsomboon, 2020, suggested that their
participants were between 18 and 23 years of age) or observed age distributions (Blötner et al.,
2021: Mage = 28.4, SDage = 9.0, ranging from 18 to 79 years), we carried out the culture-related
analyses in two different ways. First, we computed the analyses using the total samples. Second,
we restricted the German sample to freshmen between ages 18 and 23 (N = 170) so that the
samples agree regarding the age ranges (Webster & Wongsomboon, 2020). On the other hand, for
the analyses of gender-related MI within each sample, we used the whole datasets, excluded
Measures
Dark Tetrad
Webster and Wongsomboon (2020) and Blötner et al. (2021) presented the English and
the German version of the SD4 (Paulhus et al., 2020), respectively, to assess narcissism,
psychopathy, Machiavellianism, and sadism. Each scale comprises seven items. Five-point Likert
strongly agree). Estimations of reliability ranged from Cronbach’s α = .63 to .78 (see Table 1).
Table 1
Cronbach’s α Coefficients of the Subscales of the Short Dark Tetrad per Sample
Facet               German (N = 594)   Restricted German (N = 170)   US (N = 451)
Machiavellianism    .70                .68                           .63
Narcissism          .77                .74                           .78
Psychopathy         .74                .68                           .75
Sadism              .69                .70                           .78
Note. Restricted German = Subset of the total German sample, entailing only participants between ages 18 and 23.
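For readers who want to see how coefficients of this kind are obtained, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The function name and the toy responses are hypothetical; the snippet is not the code used in the original studies.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item responses."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Transpose so that each tuple holds all responses to one item
    items = list(zip(*item_scores))
    # Sum of the sample variances of the individual items
    item_variance_sum = sum(variance(column) for column in items)
    # Sample variance of the respondents' sum scores
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses of five people to a seven-item subscale (1-5 Likert format)
toy_responses = [
    [2, 3, 2, 1, 2, 3, 2],
    [4, 4, 5, 3, 4, 4, 3],
    [1, 2, 1, 1, 2, 1, 2],
    [3, 3, 4, 2, 3, 3, 3],
    [5, 4, 4, 4, 5, 5, 4],
]
print(round(cronbach_alpha(toy_responses), 2))
```

With only seven items per subscale, α tends to stay moderate even when the items are reasonably homogeneous, which is worth keeping in mind when reading the values in Table 1.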
Analysis Plan
We used the R package semTools (version 0.5-4; Jorgensen et al., 2021) to examine
ensured if the differences between the fit measures of the respective models were smaller than the
cutoffs proposed by Chen (2007; ΔCFI ≤ .010, accompanied by ΔRMSEA ≤ .015). Differences
between the models were thereby due to imposing equality restrictions on specific parameters
among groups as compared to free estimations of these parameters. We deemphasized the Δχ²-
test as it is overly sensitive to negligible changes and the SRMR as it lacks sensitivity to detect
non-invariance (Chen, 2007). We did not test partial MI because the SD4 subscales are very
concise — yielding a potentially high ratio of non-invariant items per factor — and because there
out analyses of gender-related MI per sample. Therefore, we applied the same procedures as we
When using the entire German and US samples, the descriptive model fit measures
changed substantially when imposing equal loadings to German and US data (ΔCFI = -.013,
ΔRMSEA = .001; see Table 2). Accordingly, the SD4 revealed configural MI across cultures and
factor structures can be meaningfully compared between German and US participants (Putnick &
Bornstein, 2016). However, when we computed the analysis with the restricted German sample
(i.e., only freshmen between 18 and 23 years of age), the analyses exhibited metric MI between
the cultures so that factor loadings can also be meaningfully compared between the samples
(please find the respective results in Table S1 in the supplement; Blötner et al., 2022).
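The decision rule can be illustrated with the configural and metric fit values reported in Table 2 for the total samples. This is a minimal sketch, assuming that the cutoffs are read as "CFI may drop by at most .010 and RMSEA may rise by at most .015"; the function name is hypothetical and the snippet is not part of the original analysis scripts.

```python
def invariance_step_supported(cfi_less, cfi_more, rmsea_less, rmsea_more,
                              max_cfi_drop=0.010, max_rmsea_rise=0.015):
    """Compare a less restricted model with a more restricted one.

    Returns (delta_cfi, delta_rmsea, supported), where the deltas are
    'more restricted minus less restricted', rounded to three decimals.
    """
    delta_cfi = round(cfi_more - cfi_less, 3)
    delta_rmsea = round(rmsea_more - rmsea_less, 3)
    # The stricter level is retained only if fit does not deteriorate beyond the cutoffs
    supported = (-delta_cfi <= max_cfi_drop) and (delta_rmsea <= max_rmsea_rise)
    return delta_cfi, delta_rmsea, supported


# Configural vs. metric model across cultures (values from Table 2)
delta_cfi, delta_rmsea, metric_supported = invariance_step_supported(
    cfi_less=0.799, cfi_more=0.786, rmsea_less=0.062, rmsea_more=0.063
)
print(delta_cfi, delta_rmsea, metric_supported)
```

Here the drop in CFI exceeds .010 although the change in RMSEA is small, so the metric model is not retained and the configural level is the highest one supported for the total samples.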
The SD4 revealed scalar MI across genders in the German sample, but not in the US
sample — as indicated by acceptable or unacceptable changes of the CFIs when imposing equal
item intercepts to men and women (cutoff ΔCFI ≤ .010; Chen, 2007; see Table 2). Thus,
comparisons of latent means are appropriate between German men and German women and
factor loadings are comparable between US men and US women (Putnick & Bornstein, 2016).
Table 2
Tests of Measurement Invariance of the SD4 Across Genders and Cultures
Level of MI     χ² (df)           CFI     RMSEA    ΔCFI     ΔRMSEA
MI of the SD4 Across Cultures a
Configural      1,942.76 (688)    .799    .062     —        —
Metric          2,043.50 (712)    .786    .063     -.013    .001
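The degrees of freedom in the table can be reproduced by simple parameter counting. The sketch below assumes a two-group, four-factor model with 28 items, a mean structure, uncorrelated residuals, no cross-loadings, one identification constraint per factor, and latent means fixed to zero; these modeling details are assumptions made for illustration and are not stated explicitly in the text.

```python
def multigroup_cfa_df(n_items, n_factors, n_groups, equal_loadings=False):
    """Degrees of freedom of a simple-structure multi-group CFA with mean structure."""
    # Observed moments per group: variances, covariances, and item means
    moments = n_items * (n_items + 1) // 2 + n_items
    # Loadings plus factor (co)variances, minus one identification constraint per factor
    loadings_and_factor_cov = n_items + n_factors * (n_factors + 1) // 2 - n_factors
    # Add residual variances and item intercepts (one of each per item)
    free_per_group = loadings_and_factor_cov + n_items + n_items
    df = n_groups * (moments - free_per_group)
    if equal_loadings:
        # Equating loadings saves (items - factors) parameters per additional group
        df += (n_groups - 1) * (n_items - n_factors)
    return df


configural_df = multigroup_cfa_df(28, 4, 2)
metric_df = multigroup_cfa_df(28, 4, 2, equal_loadings=True)
print(configural_df, metric_df)
```

Under these assumptions the counts match the 688 and 712 degrees of freedom listed for the configural and metric models, and the difference of 24 corresponds to the 28 loadings minus one identification constraint for each of the four factors.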
Note that all CFIs in the analyses of MI indicated less than acceptable fit (< .90), whereas
all RMSEAs were acceptable (< .08; Hu & Bentler, 1999). However, these findings are consistent
with earlier analyses of the SD4 (Blötner et al., 2021; Neumann et al., 2021; Paulhus et al., 2020).
Because standard CFAs are very restrictive (i.e., they fix cross-loadings of items on non-target
factors to zero and thus neglect overlaps among the traits), and in line with Neumann et al. (2021), we
further computed Exploratory Structural Equation Models (ESEM; Asparouhov & Muthén,
2009). By allowing marginal cross-loadings, the ESEMs achieved acceptable fit, CFIs = .90 and
.91 in the US and German samples, respectively, both RMSEAs = .04. Table 3 provides the
loadings from the CFAs and ESEMs. Two out of 168 possible cross-loadings were non-trivial
(i.e., λ ≥ .30), whereas 10 out of 56 expected main-loadings were trivial (28 loadings each
estimated in two samples). As can be seen in Figure S1 in the supplement (Blötner et al., 2022),
the empirical structure of the SD4 differed from the intended one (Paulhus et al., 2020) with
slight differences concerning the Mach, narcissism, and psychopathy subscales and noticeable
differences arising for the sadism scale. The sadism items had substantial cross-loadings with
psychopathy (sixth sadism item) or their loadings were smaller than conventional cutoffs (third,
Table 3
Standardized Factor Loadings of the Short Dark Tetrad by Model Type and Sample
       Confirmatory Factor Analysis            Exploratory Structural Equation Model
Items  M         N         P         S         M          N          P          S
M1     .37/.35                                 .44/.35    -.12/.03   .01/.02    .01/-.02
M2     .60/.45                                 .51/.43    .14/.15    .14/-.03   .06/.03
M3     .64/.52                                 .54/.49    -.03/.01   -.16/-.09  .08/-.02
M4     .40/.49                                 .49/.53    -.07/-.14  .02/.01    .02/-.08
M5     .45/.41                                 .46/.36    .03/.03    .04/-.09   .00/.13
M6     .55/.42                                 .44/.46    .08/.00    .01/.05    .05/-.01
M7     .35/.47                                 .43/.36    .22/.04    -.10/.04   -.06/.13
N1               .69/.60                       -.04/-.01  .65/.63    .00/-.05   .06/-.01
N2               .68/.53                       .04/.10    .68/.48    .08/.11    .04/.03
N3               .65/.57                       .01/.03    .55/.53    .12/.13    .07/-.01
N4               .59/.58                       .04/.00    .61/.55    .04/.10    -.02/-.09
N5               .39/.64                       .12/-.03   .53/.65    -.04/-.11  .01/.12
N6               .51/.66                       .08/.02    .45/.50    .14/.03    .09/.11
N7               .27/.51                       .23/.08    .17/.37    .20/.05    -.02/.12
P1                         .56/.63             -.11/.04   .02/.14    .59/.53    .17/.07
P2                         .40/.73             -.01/.01   .06/.03    .36/.67    .08/.01
General Discussion
This study analyzed the degrees of structural equivalence of the SD4 across cultures and
genders. The findings suggest configural MI between German and US cultures. Thus, the factor
unstandardized factor loadings and item intercepts are not advisable. However, when we
compared German and US participants from the same age ranges, we found indications of metric MI,
suggesting that both the factor structure and unstandardized loadings, but not the item intercepts,
can be compared between the cultures. Furthermore, we found indications of metric MI between men
and women in the US sample, as well as scalar MI between men and women in the German
sample. Accordingly, factor loadings (item intercepts) can be compared between genders in the
US (German) sample.
Industrialized, Rich, Democratic; Henrich et al., 2010), there are moderate to large differences
between those, especially regarding individualism and indulgence (both higher for the USA),
uncertainty avoidance, and long-term orientation (both higher for Germany; Hofstede et al.,
liberal adoption of norms to promote one’s well-being. Valuing own advantages over societal
norms and the bending of rules are two outstanding features of all antagonistic traits (Paulhus,
2014). On the other hand, uncertainty avoidance and long-term orientation are crucial features of
Mach (Blötner & Bergold, 2021). We suggest that these differences accounted for limited levels
of invariance between the cultures. Likewise, social norms imposing more prosocial expectations
on women than on men and distinct expressions of antagonistic behaviors among men and
women (Muris et al., 2017) may have accounted for scalar non-invariance between men and
women in our US sample. In the German sample, we replicated Neumann et al.’s (2021) finding
of scalar MI across genders, whereas the findings from our US sample contradict extant
literature, despite stemming from the same culture as Neumann et al.’s sample. We assume that
scalar MI in Neumann et al.’s (2021) and Blötner et al.’s (2021) respective total samples was due
(2020) sample that entails only students. Blötner et al. and Neumann et al. included a wider array
of individuals from the general population, which also affected our analysis of MI across cultures
when we included all German participants. However, our samples were not sufficient to test this
assumption any further. Hence, we encourage future research to test the equivalence of the SD4
in student samples and samples from the general population by purposefully recruiting from these
Limitations
Given that we reanalyzed data from existing studies, the present work exhibits the same
limitations as the original studies. First, the studies predominantly (i.e., Blötner et al., 2021) or
exclusively recruited students (Webster & Wongsomboon, 2020). Second, the gender ratio of the
German sample was strongly imbalanced. The gender-related imbalance of the German sample
also affected our total sample (i.e., our combined sample used to examine culture-related MI),
limiting the generalizability of our results as men score higher on antagonistic traits and
behaviors than women (e.g., Muris et al., 2017). The last limitation is specific to the analytic
approach in this study: When we matched the age ranges of the two samples to test culture-
related MI, our examination involved a comparatively small German subsample. We restricted it
to ensure the best possible comparability between the German and US samples regarding crucial
characteristics of sample composition (i.e., age and student status). However, the age range is
relatively narrow and limited to students of psychology, affecting the external validity. The Dark
Tetrad refers to subclinical samples and may therefore have different properties in clinical or
forensic samples (Blötner et al., 2021; Neumann et al., 2021). In summary, we encourage future
research to test the SD4 in groups that are more heterogeneous as well as more balanced in terms
of gender.
Conclusion
Because the items of the SD4 have relatively unambiguous contents, artifacts from the
translation process should be unlikely to account for our findings on MI. Differences might rather
be due to social expectations about how men and women or Germans and US Americans should
or should not behave. Therefore, sample characteristics — especially national culture — should
be an important issue for future research on antagonistic traits and behaviors. However, given
the stark contrasts between the results obtained with and without matching the samples on age,
the SD4 should only be used in cultural comparisons if the general characteristics of the samples
agree.
References
Asparouhov, T., & Muthén, B. (2009). Exploratory Structural Equation Modeling. Structural
https://doi.org/10.1080/10705510903008204
Back, M. D., Küfner, A. C. P., Dufner, M., Gerlach, T. M., Rauthmann, J. F., & Denissen, J. J. A.
(2013). Narcissistic admiration and rivalry: Disentangling the bright and dark sides of
https://doi.org/10.1037/a0034431
Blötner, C., & Bergold, S. (2021). To be fooled or not to be fooled: Approach and avoidance
https://doi.org/10.1037/pas0001069
Blötner, C., Webster, G. D., & Wongsomboon, V. (2022, February 22). Measurement invariance
of the Short Dark Tetrad across cultures and genders. Open Science Framework.
https://osf.io/p8v3k/
Blötner, C., Ziegler, M., Wehner, C., Back, M. D., & Grosz, M. P. (2021). The nomological
network of the Short Dark Tetrad Scale (SD4). European Journal of Psychological
Chabrol, H., Van Leeuwen, N., Rodgers, R., & Séjourné, N. (2009). Contributions of
https://doi.org/10.1016/j.paid.2009.06.020
https://doi.org/10.1080/10705510701301834
Eagly, A. H., & Wood, W. (1991). Explaining sex differences in social behavior: A meta-analytic
https://doi.org/10.1177/0146167291173011
Furnham, A., & Horne, G. (2021). The Tetradic Heart of Darkness: Comparing three dark-side
https://doi.org/10.1016/j.paid.2021.110918
Grosz, M. P., Harms, P. D., Dufner, M., Kraft, L., & Wetzel, E. (2020). Reducing the overlap
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral
Hofstede, G., Hofstede, G. J., & Minkov, M. (2010). Cultures and organizations. Software of the
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis:
Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.
https://doi.org/10.1080/10705519909540118
Jonason, P. K., Luevano, V. X., & Adams, H. M. (2012). How the Dark Triad traits predict
https://doi.org/10.1016/j.paid.2012.03.007
Jonason, P. K., & Webster, G. D. (2010). The Dirty Dozen: A concise measure of the Dark Triad.
Zeigler-Hill & D. K. Marcus (Eds.), The dark side of personality: Science and practice in
Association. https://doi.org/10.1037/14854-005
Jones, D. N., & Paulhus, D. L. (2014). Introducing the Short Dark Triad (SD3). Assessment,
Jorgensen, T. D., Pornprasertmanit, S., Schoemann, A. M., Rosseel, Y., Miller, P., Quick, C.,
Garnier-Villareal, M., Selig, J., Boulton, A., Preacher, K., Coffman, D., Rhemtulla, M., …
& Ben-Shachar, M. S. (2021). semTools: Useful tools for structural equation modeling (R
project.org/web/packages/semTools/index.html
Muris, P., Merckelbach, H., Otgaar, H., & Meijer, E. (2017). The malevolent side of human
https://doi.org/10.1177/1745691616666070
Neumann, C. S., Jones, D. N., & Paulhus, D. L. (2021). Examining the Short Dark Tetrad (SD4)
https://doi.org/10.1177/1073191120986624
O’Boyle, E. H., Forsyth, D. R., Banks, G. C., & McDaniel, M. A. (2012). A meta-analysis of the
Dark Triad and work behavior: A social exchange perspective. Journal of Applied
Paulhus, D. L., Buckels, E. E., Trapnell, P. D., & Jones, D. N. (2020). Screening for dark
Paulhus, D. L., Gupta, R., & Jones, D. N. (2021). Dark or disturbed?: Predicting aggression from
https://doi.org/10.1002/ab.21990
386915-9.00020-6
Paulhus, D. L., & Williams, K. M. (2002). The Dark Triad of personality: Narcissism,
https://doi.org/10.1016/S0092-6566(02)00505-6
Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting:
The state of the art and future directions for psychological research. Developmental
Skeem, J. L., Polaschek, D. L. L., Patrick, C. J., & Lilienfeld, S. O. (2011). Psychopathic
https://doi.org/10.1177/1529100611426706
Stellwagen, K. K., & Kerig, P. K. (2013). Dark triad personality traits and theory of mind among
https://doi.org/10.1016/j.paid.2012.08.019
Webster, G. D., & Wongsomboon, V. (2020, July 6). The Hateful Eight (H8): An efficient
https://doi.org/10.31234/osf.io/pr4u6
Open Science
We report all data exclusions, all data exclusion criteria, whether exclusion criteria were
established prior to data analysis, all measures in the study, and all analyses including all tested
Open Data: We confirm that there is sufficient information for an independent researcher
Preregistration of Studies and Analysis Plans: This study was not preregistered. Data,