Hepp, Niedtfeld, Schulze (Accepted Draft) - Experimental Paradigms in Personality Disorder Research - PDTRT
a Department of Psychosomatic Medicine and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
Corresponding author: Prof. (apl.) Dr. Inga Niedtfeld, Central Institute of Mental Health,
Department of Psychosomatic Medicine and Psychotherapy, PO Box 12 21 20, 68072
Mannheim, Germany. Tel: +49-621-1703-4403, Fax: +49-621-1703-4405, E-mail:
Inga.Niedtfeld@zi-mannheim.de
Funding: This research was supported by a grant of the German Research Foundation to Inga Niedtfeld
(NI 1591/1–2).
Abstract
We review studies in personality disorder (PD) research that used experimental paradigms and that were published between 2017 and 2021 in thirteen peer-reviewed journals. We structure the study content according to the NIMH Research Domain Criteria (RDoC), and report details on demographic variables, experimental design, sample size, and statistical analyses. We identify a strong focus on borderline personality disorder in the recruited clinical groups, and a lack of sample diversity. Finally, we review issues regarding statistical power and the data analytic designs that were used. Based on the literature review, we draw implications for future experimental PD research, encouraging researchers to increase the breadth of represented RDoC constructs, the representativeness and diversity of the recruited samples, the statistical power to detect between-person effects, the reliability of estimators, the adequacy of the statistical models, and the transparency of research practices.
Introduction
Laboratory studies that include behavioral paradigms with experimental manipulations are
central to personality disorder (PD) research in that they have helped shape our understanding
of the psychopathological processes that play out in PDs (Domes et al., 2009; Jeung et al., 2016).
These studies (we will refer to them as studies using “experimental paradigms” henceforth) typically involve the manipulation of an experimental between-factor (e.g., participants are randomly assigned to complete a task with either positive or negative stimuli), or the manipulation
of an experimental within-factor (e.g., participants complete several trials with positive and
negative stimuli in random order). In contrast to more naturalistic settings, the laboratory
environment affords control over potentially confounding variables (Myers & Hansen, 2011).
The use of (computerized) experimental tasks allows for the repeated measurement of the
processes of interest and can thus afford a level of reliability that is difficult to obtain outside of the laboratory. Moreover, the systematic manipulation of experimental factors allows for causal attributions of differences in outcome measures to the
experimental condition (Myers & Hansen, 2011). While findings from experimental paradigms
have been paramount for understanding psychopathological processes in PDs at a basic level,
their utility to the field is limited when considered without further context. The importance and
reliability of the effect of a specific experimental factor can only be recognized if it is embedded
in a methodological and content-related context. In other words, studies that use experimental
paradigms typically manipulate only one specific factor in a psychopathological process and
therefore generate evidence that is highly specific to that factor and process and can be difficult
to integrate into the larger research landscape. Therefore, we herein provide a review of recent studies that used experimental paradigms in PD research and were published in thirteen peer-reviewed journals within the last five years. We selected target journals from
clinical psychology, psychiatry, and psychosomatics, as well as two open access journals that
repeatedly published PD work in the past.1 To be included, studies had to 1) sample individuals with a PD diagnosis or elevated PD features and 2) include an experimental manipulation (i.e., random assignment to conditions between participants or the inclusion of experimental conditions that vary within participants). For details on the literature screening and extraction process, see supplement 1. We include 99 manuscripts in this
review, which report 102 unique studies and 123 experimental paradigms. For a reference list of all articles included in the review, see supplement 2. We review all studies with regard to
the covered content, study design, and data analytic aspects. All data that we extracted from the
articles are accessible at https://osf.io/xmwe9/ (Hepp et al., 2022), as is the full citation list for all
reviewed articles.
We structure the content covered by the reviewed studies according to the NIMH
Research Domain Criteria (RDoC, Cuthbert, 2014; Koudys et al., 2019). RDoC were proposed
to stimulate research on mental disorders that overcomes central limitations of research relying on disorder categories, especially the focus on individuals with low levels of overall functioning who meet strict diagnostic thresholds, while ignoring the broader continuum of functioning.
RDoC is not intended as a new diagnostic system, but as a research framework that helps
structure and cluster evidence on six major domains of human functioning. The six RDoC
domains are negative valence systems, positive valence systems, cognitive systems, social
processes, arousal and regulatory systems, and sensorimotor systems. Domains form the highest
level of the RDoC matrix, followed by constructs situated within each domain. For definitions of the individual constructs, see the RDoC matrix (National Institute of Mental Health, 2022). Constructs are conceptualized as falling on a dimension of functioning from normal to abnormal, and RDoC encourages studying this full dimension rather than a purely categorical diagnostic context. The lowest level in the RDoC matrix describes units of analysis used to measure the constructs. Experimental paradigms are one possible unit of analysis within the RDoC matrix.
1 The included journals are: Personality Disorders: Theory, Research and Treatment; Journal of Personality Disorders; Borderline Personality Disorder and Emotion Dysregulation; Journal of Abnormal Psychology; Clinical Psychological Science; Behaviour Research and Therapy; Psychological Medicine; Psychiatry Research; Journal of Affective Disorders; Biological Psychiatry; Neuroimage Clinical; PLoS One; Scientific Reports.
As described above, we review 99 articles that report 123 experimental paradigms. Of these, 35
paradigms (29.41%) tapped into more than one RDoC construct. For a visualization of the constructs covered, see Figure 1.
Figure 1: The circular bar chart shows the different constructs studied with experimental paradigms in personality psychopathology. Colors of the bars reflect overarching RDoC domains. Red bars represent the ‘negative valence systems’ (NVS), blue bars ‘social processes’ (SP), green bars ‘cognitive systems’ (CS), and purple bars ‘positive valence systems’ (PVS). The available studies did not investigate the RDoC domains ‘arousal and regulatory systems’ or ‘sensorimotor systems’. Note that we included the two additional constructs emotion regulation and prosocial behavior. These are currently not part of the RDoC system but were covered by several paradigms.
Negative Valence Systems
This domain subsumes responses to aversive situations or contexts, including fear, anxiety, and
loss (Cuthbert, 2014; National Institute of Mental Health, 2022). This domain was covered 38
times (24.36%). The construct acute threat (fear) was most prominent, as it was investigated 21
times (13.46%). Studies largely focused on stress reactivity in PDs and employed stress-
inducing stimuli (various stress paradigms, aversive pictures or film clips). In nine additional
paradigms, participants were asked to regulate their emotions, mostly by cognitive reappraisal.
Although emotion regulation is not a construct of RDoC (Fernandez et al., 2016), it is frequently
studied in the context of PDs. Further underlining the importance of threat processing, 14
paradigms that tap into more than one RDoC domain combined a stress induction (via aversive
pictures, negative facial expressions, negative words, or a stress test), located on the RDoC
construct acute threat (fear), with other RDoC constructs such as reward learning, cognitive
control, or emotion recognition. In addition to these studies on acute threat, two studies focused
on the construct frustrative nonreward, using different aggression paradigms. The construct loss
was investigated six times, for instance by inducing sadness via a film clip or via social
exclusion. No studies assessed the constructs potential threat (anxiety) or sustained threat.
Positive Valence Systems
This RDoC domain describes responses to positive motivational situations or contexts, such as
reward seeking, consummatory behavior, and reward/habit learning (Cuthbert, 2014), and was
investigated 17 times (10.90%). Reward responsiveness (i.e., the anticipation of reward and the response to reward cues or the receipt of reward) and reward valuation were investigated six times each. Reward valuation taps into computational processes about the probability and
benefits of a prospective outcome. Reward learning was studied five times using social
valuation or reinforcement learning paradigms. Notably, several of the reviewed studies used
paradigms that are explicitly not recommended in the RDoC matrix (e.g., the Iowa gambling
task), because they cannot disentangle the three constructs of the positive valence systems
domain. In addition to the unmet need to use experimental paradigms that assess RDoC
constructs in this domain, positive valence systems were clearly under-researched as compared to the other domains.
Cognitive Systems
The domain cognitive systems encompasses various cognitive processes (Cuthbert, 2014;
Morris & Cuthbert, 2012) that were investigated 31 times (19.87%). Fifteen paradigms targeted
the construct cognitive control by using the go/no-go task, stop signal task, Stroop task, or task-
switching task. In addition, five paradigms focused on attention, mostly in combination with
other RDoC domains, such as negative valence systems or social processes. The construct
declarative memory was investigated four times, with two paradigms combining declarative
memory and social cognition. Five paradigms assessed working memory (primarily with the n-
back task), and two paradigms captured visual perception (lateral masking task, binocular
rivalry paradigm).
Social Processes
The domain social processes focuses on responses to interpersonal stimuli, including the
perception and interpretation of others’ actions and the interpretation of the self (Cuthbert, 2014;
Hanegraaf et al., 2021). It was by far the most researched RDoC domain, investigated 70 times
(44.87%). The construct social communication was most often investigated (29 times). For the most part, these studies used emotion recognition paradigms, particularly the reading the mind in the eyes test2 (RMET, seven times). While most paradigms
included stimuli of negative, neutral and positive valence, two paradigms used a restricted
2 Please note that the reading the mind in the eyes test was listed as a paradigm to investigate the perception and understanding of others by the RDoC taskforce (https://www.nimh.nih.gov/about/advisory-boards-and-groups/namhc/reports/behavioral-assessment-methods-for-rdoc-constructs). However, since other emotion recognition tasks are explicitly listed under the construct social communication, we decided to count the reading the mind in the eyes test under that construct as well.
stimulus set with threatening and neutral faces only, possibly also tapping into the threat
construct of the negative valence systems domain. Facial emotion recognition was further
investigated in combination with the cognitive systems domain, with two studies presenting
emotional faces and concurrently studying attention or cognitive control (i.e., emotional faces as distractor stimuli).
A group of nine studies assessed the construct affiliation and attachment, primarily by
measuring responses to social rejection in the Cyberball paradigm. Interestingly, four studies used Cyberball or a group rejection paradigm not to study affiliation and attachment primarily,
but to induce negative affect (i.e. domain negative valence systems) and investigate subsequent
behavior.
The construct perception and understanding of self was assessed with eight paradigms,
which varied considerably in their theoretical background (e.g., self-referential processing tasks,
implicit association test). The construct perception and understanding of others was studied
more frequently (19 paradigms), and subsumes research on theory of mind as well as empathy.
Again, paradigms varied substantially and included (among others) metaphor comprehension and empathy tasks. Moreover, a group of studies investigated prosocial behavior in PDs, which is currently not covered by the RDoC framework. These studies employed economic games to measure prosocial behavior (see Hepp & Niedtfeld, 2022 for a conceptual paper; Jeung et al., 2016 for a literature review on prosociality in PDs). In light of its relevance to interpersonal functioning, we included prosocial behavior as an additional construct in Figure 1.
Remaining domains
Finally, within the last five years, there were no experimental studies on the RDoC domains arousal and regulatory systems or sensorimotor systems.
Study design
The reviewed 99 articles reported a total of 102 studies3. The studies show a clear focus on
samples comprising individuals with borderline personality disorder (BPD), which were
included in 89.22% of studies. Findings from these studies have clearly helped shape and update
our concept of BPD and personality pathology in a broader sense. At the same time, the narrow
focus on BPD constitutes one of the most striking limitations to current experimental PD
research. Other PDs were strongly under-represented in the reviewed literature. Most studies compared individuals with a PD (as discussed, largely BPD) to healthy control participants, clinical controls, or a combination of both. A healthy control group was included in 82.35% of studies, with moderate group sizes (Md = 30.00, SD = 24.36, see Figure 2). Notably, 17 of the reviewed studies (16.67%) assessed PD
symptom levels or PD features. However, four of these still opted to split the sample into discrete
groups of individuals with low versus high levels of PD pathology, rather than using the
dimensional indicator to predict behavior in the paradigms. The average sample size in studies that sampled dimensionally (and did not divide the sample into groups) was substantially higher than in the case-control studies.
3 We were unable to definitively determine how many of these samples were unique and therefore report the demographic and design data averaged across all reported samples.
Figure 2: Density plots visualizing probability distributions of socio-demographic and experimental variables. Note: average number of participants refers to experimental studies with between-group designs.
The strong focus on categorical PDs also reflects the fact that, until recently, PDs were
defined categorically in DSM-5 (American Psychiatric Association, 2013) and ICD-10 (World
Health Organization, 2016). Yet, diagnostic systems are currently shifting to a dimensional
concept of PDs. This is reflected in the DSM-5 alternative model for personality disorders
(AMPD; Oldham, 2015), which outlines a dimensional approach to diagnosing PDs and was
accompanied by a call for further investigation (for a review of studies using the DSM-5 alternative model, see Zimmermann et al., 2019). In addition, ICD-11 (World Health
Organization, 2019), which came into effect in 2022, includes a fully dimensional PD diagnosis. Nonetheless, dimensional approaches remained rare among the reviewed studies. However, this may in part be due to our selection of clinical target journals, which possibly were
preferred outlets for work on categorical and clinically diagnosed PD samples. Further studies
focusing on samples of individuals with varying levels of maladaptive traits in community or
clinical samples may have been published predominantly in other outlets, for instance in the
field of personality psychology (e.g., da Costa et al., 2018; Fossati et al., 2018; Papousek et al.,
2018). Nonetheless, even considering this, there is a relatively clear picture that the majority of
reviewed studies focused on categorical PDs (and thus - by definition - pathology above a certain
threshold). As discussed, these studies may have limited utility for understanding the broader
continuum of PD pathology.
The predominant case-control design, comparing individuals diagnosed with a PD to healthy individuals, further aggravates this problem. It introduces an artificial divide between extreme
groups at the high end of the severity continuum (i.e. those meeting diagnostic thresholds for
categorical PD diagnoses) and individuals specifically selected to be entirely free of any present
or past PD symptoms (i.e. participants in the healthy control group). This way, the field has
neglected to generate evidence about milder levels of PD pathology. Additionally, one must
expect that individuals in PD and healthy control groups differ on many characteristics beyond the disorder itself, rendering an unambiguous attribution of group differences to PD pathology impossible. The reliance on healthy control groups also severely limits any
conclusions regarding diagnostic specificity. In fact, for the majority of findings, it remains
entirely unclear whether they are at all specific to a certain PD. By relying on case-control
designs, the field further missed opportunities to investigate whether the associations between
PD pathology and the processes studied in experimental paradigms follow a continuous, dose-
response-like relationship with problems increasing at the higher end of the PD severity
spectrum (or whether they are unique to those with PD and completely absent in “healthy”
individuals). This severely limits our understanding of the processes themselves. Studies that
include clinical control groups in addition to healthy individuals attenuate this problem
somewhat, but such studies were rare among the reviewed studies and considered a wide range of different clinical control conditions.
In addition to the predominance of case-control studies and BPD samples, the reviewed
studies are also highly biased with regard to age, gender, and race4. Regarding age, except for
two studies, all reviewed studies included adult participants over the age of 18. The mean age
across studies was 27.66 years (Md = 28.26, SD = 5.18), indicating a focus on young adults.
The average percentage of female participants across all reviewed studies was M = 81.01 (Md
= 95, SD = 26.68) with 46.08% of reviewed studies including only women. Men, on average,
made up 17.87% of participants across samples (Md = 0, SD = 28.69)5. Other gender identities
4 These three demographic variables were selected for the review and the authors acknowledge that they are not exhaustive and that further bias is likely with regard to other variables such as sexual orientation, socioeconomic status, country of residence, religion, and many more.
5 Several studies only reported percentages for one gender, likely implying that the remaining participants were of the other binary gender (e.g., reporting that 70% of participants were female and implying the remaining 30% were male). This reporting is problematic insofar as it promotes the idea that the gender binary is the norm and excludes individuals of other gender identities (e.g., bigender, genderfluid). So as not to perpetuate this, we only coded the percentages for genders that were explicitly described and did not make further inferences based on an assumption of a gender binary. As a result, the percentages for women and men do not sum to 100% across studies.
were assessed in only one of the reviewed studies. The third demographic variable we aimed to
extract from all studies was race. However, only 29.41% of the reviewed studies even reported
any data on race, which precluded an adequate summary of this variable. This, in and of itself,
is a grave oversight that appears to be highly prevalent in our field and constitutes a form of
implicit racism that we must urgently address (see Haeny et al., 2021 for a nomenclature for
antiracist clinical research, and the article on diversity in this special issue for a discussion
focused on the field of PD research). The overall picture suggests that the reviewed evidence is
highly restricted in its generalizability and does a poor job of representing the broad range of individuals affected by personality pathology.
Experimental manipulations
In only a minority of studies were participants randomly assigned to one of two experimental conditions. Almost all experimental paradigms instead relied on within-person manipulations, in which participants completed multiple conditions within the same task. Most commonly, studies comprised one within-factor
(70.97% of studies) with two or three discrete factor levels (see Figure 2). The remaining studies
comprised two (21.77%) or three (3.23%) within-factors. The average number of within-factor levels
(across all within-factors) was M = 4.67 (Md = 3.00) and varied substantially between studies
(SD = 6.10), with some including more than thirty levels. If this large number of levels is not
accompanied by a proportional increase in trials per level, too few trials fill each cell of the
experimental design and estimators (e.g., means) become unreliable.6 Beyond any
considerations of statistical power, this substantially limits any conclusions that can be drawn
from the data. Lastly, it is important to note that almost no study included a continuously manipulated within-variable. Even variables that could be manipulated continuously, such as the intensity of emotional stimuli, were typically divided into a small number of discrete levels.
6 Note that we tried to extract the number of trials for each within-factor combination from all reviewed articles. However, only a few articles clearly reported this information and trial numbers often varied between levels. Therefore, we decided not to report these data as, for a substantial proportion of articles, we were left unsure about the exact design that was used.
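To make the trials-per-cell concern concrete, here is a toy simulation (all values assumed purely for illustration) showing that the standard error of a cell mean shrinks only with the square root of the number of trials per cell:

```python
# Toy illustration: reliability of a single cell mean as a function of
# trials per cell. Trial noise (SD = 1.0) and trial counts are assumed.
import numpy as np

rng = np.random.default_rng(42)
trial_sd = 1.0
for n_trials in (5, 20, 80):
    # Simulate 10,000 replications of one design cell and inspect the
    # sampling spread of its mean.
    cell_means = rng.normal(0.0, trial_sd, size=(10_000, n_trials)).mean(axis=1)
    print(f"{n_trials:>2} trials/cell: empirical SE = {cell_means.std():.3f} "
          f"(theory: {trial_sd / np.sqrt(n_trials):.3f})")
```

With five trials per cell, the cell mean fluctuates roughly four times as much as with eighty, which is the unreliability of estimators referred to above.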
Data analytic aspects
Finally, we coded different variables pertaining to the reporting as well as to the analysis of the
dependent variables (see Figure 3). Particularly with regard to the statistical analysis, this review
focuses on variables that we were able to classify for all manuscripts. A more detailed analysis
of statistical models was prevented by a lack of available information (see also ‘Open data and
materials’).
Power analyses
Power analyses are of utmost importance for performing informative experimental studies.
When power is high, studies can provide clear and more replicable answers to research
questions, whereas low power directly contributes to low replicability and heterogeneous
findings (Stanley et al., 2018). Given the importance of statistical power for research in general,
this aspect is discussed in greater detail by Vize and Lynam in this special issue.
Most commonly, the reviewed articles did not report power analyses (78.79%, n = 78).
Only a minority of articles included an a-priori power analysis (13.13%, n = 13) and few
(8.08%, n = 8) used alternative approaches, such as sensitivity analyses. Sensitivity analyses are
of particular interest for clinical research, which is commonly confronted with feasibility
concerns that can result in relatively fixed maximum sample sizes. This approach allows
determining the minimum effect size that can be reliably detected given the recruited sample
sizes (Bloom, 1995; Lakens, 2014). Reporting of sensitivity analyses thus allows the reader to
put the reported results into context with the accumulated knowledge of the field (Perugini et
al., 2018). Furthermore, almost half the articles reporting power analyses powered their studies for within-between interactions (42.86%, n = 9). Accordingly, these studies were adequately powered to detect condition-by-group interactions, but may have been underpowered to detect between-group main effects. To illustrate this issue, we calculated the achieved statistical power for all reviewed studies that included case-control comparisons7. The results of these
calculations with varying numbers of participants and effect sizes are presented in Figure 4.
Most case-control studies (median n per group = 30) were only adequately powered (at 1 - β =
86.14%) to detect large effects, whereas power was low for the detection of medium effects (1
- β = 47.79%) or small effects (1 - β = 20.79%). We repeated this procedure for all reviewed studies with correlational designs (median N = 103). Power estimates showed these studies were sufficiently powered to detect large and medium-sized, but not small effects (1 - β = 17.18% for a small effect of r = .1; 1 - β = 87.46% for a medium effect of r = .3).
Figure 4: Contour plot shows power estimates for a between-group comparison with different sample sizes and effect sizes of interest. For comparison, we included power estimates for detecting small, medium, or large effects considering the median sample size of current experimental studies (dotted line). Shaded area reflects the .25 and .75 quantiles of the observed sample sizes.
7 We assumed equal sample sizes, an alpha level of .05, and a two-sided t-test.
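To illustrate the two types of calculation described above, a short Python sketch using statsmodels (not the authors' original script; the effect sizes are the conventional Cohen benchmarks, which may differ from the exact values used in the review) computes achieved power at the median group size and the minimum detectable effect in a sensitivity analysis:

```python
# Power and sensitivity calculations for a two-sided, two-sample t-test with
# equal group sizes (n = 30 per group, the median of the reviewed studies)
# and alpha = .05, mirroring the assumptions in footnote 7.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Achieved power for conventional small/medium/large effects (Cohen's d).
for d in (0.2, 0.5, 0.8):
    power = analysis.power(effect_size=d, nobs1=30, alpha=0.05, ratio=1.0,
                           alternative='two-sided')
    print(f"d = {d}: 1 - beta = {power:.2%}")

# Sensitivity analysis: smallest effect detectable with 80% power at n = 30.
d_min = analysis.solve_power(effect_size=None, nobs1=30, alpha=0.05,
                             power=0.80, ratio=1.0, alternative='two-sided')
print(f"minimum detectable effect at 80% power: d = {d_min:.2f}")
```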
Dependent variables
On average, studies reported 2.77 (SD = 2.06) different dependent variables. Most studies used ratings, followed by other behavioral outcomes (19.23%, n = 50) and response latencies (13.46%, n = 35). As can be expected in experimental
paradigms, the majority of constructs were measured through the rating of a single item
(72.80%, n = 91) as compared to the rating of a scale (27.20%, n = 34). This reflects a common
decision in experimental designs to lower participant burden related to the repeated assessments.
Accordingly, this pattern changed when focusing on studies with few assessments, such as in
experiments assessing emotional states before and after confrontation with a stressor. However,
the majority of constructs were still assessed by single items (55.17%, n = 32) as compared to
scales (44.83%, n = 26). A notable exception to this practice is the Cyberball paradigm: studies using this task more commonly relied on multi-item scales than on single items.
Aggregation of trials
Most dependent variables with repeated assessments were aggregated before statistical analysis. This practice neglects important sources of systematic variation at the trial level, such as drifts over time due to fatigue or specific associations with experimental stimuli. It is important to keep in mind that we do not only sample participants, but also aspects of the experiment such as stimuli. Without modeling stimuli as a random factor, we are unable to generalize beyond the stimuli applied in the respective experiment. For extensive
discussions of this ‘stimuli-as-fixed-effect fallacy’ see Clark (1973) and Judd et al. (2012).
8 Note that we focused on behavioral dependent variables from the manuscripts. Fixations in eye-tracking studies, psychophysiological responding, or neural activation were not extracted.
Response latencies were mostly aggregated using the mean (36.36%, n = 12) or median
(21.21%, n = 7), with a substantial number of studies not describing the chosen method of
aggregation (36.36%, n = 12). In addition, most studies did not report transformation of response
latencies (87.88%, n = 29). These procedures neglect the underlying distribution of response latencies, which is typically right-skewed.
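As one possible remedy, the brief pandas sketch below (file and column names are hypothetical) applies a log transformation to the right-skewed latencies before aggregating, and documents the number of trials entering each participant-by-condition cell:

```python
# Sketch of a distribution-aware aggregation of response latencies.
# 'trial_level_data.csv' with columns 'subject', 'condition', 'rt_ms'
# is a hypothetical example file.
import numpy as np
import pandas as pd

trials = pd.read_csv("trial_level_data.csv")
trials["log_rt"] = np.log(trials["rt_ms"])    # tame the right skew

agg = (trials
       .groupby(["subject", "condition"])
       .agg(mean_log_rt=("log_rt", "mean"),   # mean on the log scale
            median_rt=("rt_ms", "median"),    # robust alternative
            n_trials=("rt_ms", "size")))      # report trials per cell
print(agg.head())
```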
Reliability
The reliability of experimental tasks was rarely reported (12.12%, n = 12). The relations between
reliability and statistical power, but also the implications of low measurement reliability and its
detrimental impact on the interpretability and comparison of results, have been discussed in detail elsewhere (e.g., Parsons et al., 2019).
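In the spirit of Parsons et al. (2019), a permutation-based split-half coefficient is one inexpensive way to estimate and report task reliability. The sketch below (column names are hypothetical) averages Spearman-Brown-corrected split-half correlations over random splits of each participant's trials:

```python
# Permutation-based split-half reliability for a trial-level outcome.
# Expects a data frame with hypothetical columns 'subject' and 'rt'.
import numpy as np
import pandas as pd

def split_half_reliability(trials: pd.DataFrame, n_splits: int = 1000,
                           seed: int = 1) -> float:
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_splits):
        halves = []
        for _, subj in trials.groupby("subject"):
            idx = rng.permutation(len(subj))          # random trial split
            half = len(subj) // 2
            halves.append((subj["rt"].iloc[idx[:half]].mean(),
                           subj["rt"].iloc[idx[half:]].mean()))
        a, b = np.array(halves).T
        r = np.corrcoef(a, b)[0, 1]
        estimates.append(2 * r / (1 + r))             # Spearman-Brown
    return float(np.mean(estimates))
```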
Inclusion of covariates
Some studies included covariates in their main analysis9 (23.30%, n = 24), mainly adjusting for socio-demographic variables. In addition, some of these studies adjusted for psychopathological variables (e.g., depression or anxiety; 37.50%, n = 9). The inclusion of covariates may be well justified but requires an a-priori definition of theoretically relevant covariates and needs to be considered in power estimations (Kraemer, 2015). Ideally, researchers who report analyses with covariates also report the results of those same models without the covariates.
9 Note that we did not count studies repeating their main analysis while controlling for different aspects.
Open data and materials
This point has been reiterated throughout the manuscript: a lack of available information and transparent reporting prevented the classification of additional variables of interest (e.g., number of repetitions per condition, statistical models). The benefits and values of open data and
materials for research transparency, reduction of data loss, and fostering of progress have been
discussed in great detail in the last years (Gewin, 2016; Munafò et al., 2017; Nosek et al., 2015).
Still, only a minority of experimental studies (9.09%, n = 9) provided open access to their de-
identified data and/or study materials using available repositories. Even in journals with a
mandated data policy we commonly found the statement that “data are available upon
reasonable request” (for a discussion, see Tedersoo et al., 2021; Wicherts et al., 2006).
Implications for future research
Based on the above literature review, we have identified six main implications for future experimental PD research. Several of these overlap with general recommendations for good scientific practice and increasing replicability in psychology, but we try to focus on their specificity for the study of PDs as much as possible.
1. Increase the breadth of represented RDoC constructs
As outlined above, several RDoC domains have been studied extensively using experimental
paradigms, while others require further investigation in future work. While laying out this
implication in more detail, we repeatedly refer to maladaptive personality traits as they are conceptualized in the DSM-5 AMPD and ICD-11.
As our review showed, there is a large body of research within the negative valence
systems domain. This is likely due to its close relation to maladaptive personality traits that are
observed frequently in PDs, such as affective instability (e.g., Trull et al., 2008). In line with
this, current dimensional models of PD suggest that PDs are characterized by pronounced
emotional reactivity to stress (Huprich, 2020; Oldham, 2015). The AMPD and ICD-11 PD
diagnoses subsume this under the maladaptive trait negative affectivity (Bach et al., 2018;
Hopwood et al., 2012). Two constructs within the negative valence systems domain that should
be targeted further in future work are potential threat (anxiety) and loss, as those with PD tend to experience both to a pronounced degree. With regard to PD, anxiousness is subsumed under the maladaptive personality trait negative affectivity in
DSM-5 AMPD and ICD-11 (Bach et al., 2018; Hopwood et al., 2012). Studying responses to
loss within an experimental paradigm could provide insight into the marked level of loneliness
reported by those with PD (Liebke et al., 2017) and the corresponding maladaptive personality
trait detachment in DSM-5 AMPD/ ICD-11 (Bach et al., 2018; Hopwood et al., 2012).
There is also a need for additional experimental studies on the positive valence systems
domain in PDs, and future work should strive to develop paradigms that can distinguish between
the constructs reward valuation, reward responsiveness, and reward learning. The RDoC
domain cognitive systems was studied extensively in (borderline) PD, using established
experimental paradigms that are also referenced in the RDoC matrix as suitable for investigating
cognitive control and memory. In the future, these findings should be replicated and re-evaluated in dimensionally recruited samples.
Within the RDoC domain social processes, the constructs social communication,
perception and understanding of self, and perception and understanding of others deserve
continued attention in future research, because they closely reflect the new diagnostic criteria
for PDs in ICD-11 and DSM-5 AMPD. The ICD-11 PD diagnosis details “problems in functioning
of aspects of the self (e.g., accuracy of self-view), and/or interpersonal dysfunction (e.g., the
ability to understand others' perspectives)”, and thus almost verbatim references these RDoC constructs. Additional work on prosocial behavior (which was investigated in several of the reviewed studies but is not currently an RDoC
construct) would complement this research (Hepp & Niedtfeld, 2022). An additional issue that
became evident when reviewing paradigms that were used to study the construct perception and
understanding of others is that of specificity. Future studies are needed to clarify whether the
paradigms that were used measure the same construct, or whether clustering them into sub-
constructs (affective theory of mind, cognitive theory of mind, empathy) would aid a more fine-grained understanding.
2. Increase the representativeness and diversity of the recruited samples
This implication has several elements. First, there is an urgent need for studies that include
individuals with personality pathology beyond BPD. As outlined above, the current body of
experimental work almost exclusively studied BPD, with little evidence generated for other PDs.
However, we do not argue that what the field needs now is an equally large number of
experimental studies for all other categorical PDs. Following the shift to a dimensional PD
diagnosis in ICD-11 and DSM-5 AMPD, we would rather argue for a dimensional recruitment
strategy that samples individuals with different levels of maladaptive PD traits and various
levels of PD severity. This way, dimensional associations between the processes investigated in
experimental paradigms and maladaptive traits or overall PD severity could be established and
possible interventions could be tested for their effectiveness at different points of this
continuum.
Second, the reviewed studies showed marked bias for samples of younger cis-gender
women, and we deem it very likely that this bias extends to other variables that we did not
extract from all articles (e.g., education level, sexual orientation, religion). Strikingly, only a
minority of studies even assessed race or ethnicity so that we were unable to provide a
conclusive picture of the distribution of these variables across the reviewed studies. In addition
to the failure to report data on race and ethnicity, there was also only one study that assessed
gender identities other than female or male, showing that studies almost exclusively succumbed
to a concept of gender binary. Beyond the evident problem of a lack of representation, this
practice is also at odds with the distribution of PDs in the general population, where we see that
PDs are more prevalent among transgender and gender nonconforming individuals (Reisner et
al., 2016). In addition to the failure to assess diverse gender identities, the reviewed studies also markedly under-represented men. This is at odds with
findings that (except for antisocial personality disorder) most PDs show similar prevalence rates
among men and women (Grant et al., 2008; Lenzenweger et al., 2007). We would argue that the
field (explicitly including our own research) requires a strong change toward more diverse
samples, both because it is an ethical mandate, and because samples that better represent the
whole population that is affected by PDs will produce findings that are more generalizable and
applicable.
3. Increase statistical power to detect between-person effects
As discussed above, many of the reviewed studies lack the statistical power to detect between-
person effects of the group factor (in previous studies often BPD vs. HC). The easiest way to
remedy this is, of course, to increase sample size. We realize that this is easier said than done
and that the recruitment of PD samples is always effortful and time-consuming. Often, the
samples recruited for the reviewed studies were highly specific and imposed additional criteria beyond the PD diagnosis, further complicating recruitment. Collaborations between different labs and distributing recruitment efforts across several study sites could help achieve larger samples.
In line with our recommendation for sampling a wider range of PD pathology, moving
away from recruiting participants with a categorical PD may also be helpful for achieving larger
sample sizes (e.g., da Costa et al., 2018). In all likelihood, this would render the recruitment
process much easier as it automatically increases the pool of potential participants and affords
inclusion criteria that are easier to meet. For instance, participants who meet only three or four
BPD criteria have to be excluded from a study that uses the categorical BPD diagnosis as an
inclusion criterion, but could easily be included in a study measuring PD severity level and
maladaptive trait combinations dimensionally. Likewise, healthy control participants do not
have to be selected to be “super healthy” and entirely free of any psychopathology, but could
still contribute low levels of maladaptive traits in a dimensional sampling approach. Other
groups that are typically not represented in past studies, such as individuals with partially
remitted or mixed PD pathology, could also be included in studies that quantify PD severity
dimensionally. Lastly, a dimensional sampling approach can be beneficial for statistical power,
as continuous predictors generally afford higher statistical power than artificially categorized ones.
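As a toy demonstration of this power difference (all values assumed for illustration: a true correlation of r = .3, n = 100, alpha = .05), the following simulation compares testing a continuous predictor directly against testing it after a median split:

```python
# Simulated power of a continuous predictor versus the same predictor
# dichotomized at the median. All parameter values are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, r, alpha, n_sims = 100, 0.3, 0.05, 5000
hits_cont = hits_split = 0
for _ in range(n_sims):
    x = rng.standard_normal(n)
    y = r * x + np.sqrt(1 - r**2) * rng.standard_normal(n)
    hits_cont += stats.pearsonr(x, y)[1] < alpha         # continuous test
    hi = x > np.median(x)                                # median split
    hits_split += stats.ttest_ind(y[hi], y[~hi])[1] < alpha
print(f"power, continuous predictor: {hits_cont / n_sims:.2f}")
print(f"power, median split:         {hits_split / n_sims:.2f}")
```

The median split discards within-group variation in the predictor and, under these assumptions, costs a substantial share of power at the same sample size.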
In either case, studies using experimental paradigms in PD samples would benefit from
using conservatively estimated power analyses or simulations to inform future sample sizes; conducting a-priori power analyses (which very few of the reviewed studies did) should go without saying. We are sympathetic that the recruitment of large samples to increase statistical power represents a serious challenge for clinical psychological research. However, in the long run, fewer but adequately powered studies will be more informative for the field.
As of now, the replicability of the majority of the reviewed findings is questionable. A solution
for this problem might lie in a much stronger adoption of open data repositories to aggregate
primary data from multiple sites (see implication 5 for a more detailed discussion).
Alternatively, PD research needs more concerted efforts to assess data across multiple labs,
which has become more common in recent years (Klein et al., 2018; Moshontz et al., 2018).
Such efforts would not only result in higher statistical power, but also address issues of
generalizability.
4. Increase the reliability of estimators
Our review revealed that some studies tended to implement a large number of within-factor
levels, which, if not accompanied by a proportional increase in trials per level, can result in
unreliable estimators. While we were unable to determine precisely how big this problem was
in the reviewed studies (because authors tended not to report reliability data and even trial
numbers were not readily available for all articles), we would argue that studies could generally
benefit from placing greater emphasis on the number of repetitions per within-factor
combination. In the simplest sense, this could mean to increase the overall number of repetitions
in the experimental paradigm. At the same time, researchers will want to consider participant
burden and avoid overly long experimental sessions. In the tradeoff between session duration
and trial repetitions, researchers must therefore carefully consider how many factor levels they
can implement, and consider whether the design they are thinking of is too complex for the trial numbers that are realistically feasible. Relatedly, multi-item assessment was rare in the reviewed studies in general, and this is despite its considerable importance for construct validity. Single items are
unlikely to represent the breadth of the theoretical concept of interest. As stated above, in most
cases researchers have to make a trade-off between participant burden and validity of measures,
thus there cannot be a clear-cut recommendation for future research. However, we would urge
researchers to consider whether multi-item scales are feasible. Additionally, it was striking how
many of the reviewed studies developed ever-new paradigms to investigate the same constructs.
While the development of new paradigms is not inherently problematic, most studies failed to
pilot these prior to their first application. Additionally, several studies used paradigms developed outside of clinical research.
Often, these paradigms were designed and optimized for investigating within-person effects.
Thus, they tend to produce variance at the within-person and not necessarily the between-person
(or even group) level. Ideally, whenever introducing or adopting a new experimental paradigm
to PD research, researchers should first run a pilot study in a convenience sample to establish
general reliability indices and ensure that the paradigm produces substantial between-person
variance.
5. Increase transparency of research
A single study reported a pre-registration for their hypotheses and analyses, and only a handful
of studies provided open data or materials. Nonetheless, we are hopeful that most researchers in
our field want to improve on this, and that some already pre-registered studies will be published
in the next few years. The last decade has seen a scientific reform movement starting from a
serious replicability crisis in psychology. Given the issues pointed out throughout the article, we
fear the same problem affects PD research. If published findings are not replicable, then progress
for the diagnosis and treatment of personality psychopathology is ultimately set back. While the
2010s have been described as a decade of ‘active confrontation’ for psychology in this regard
(Nosek et al., 2021), we have to note that our field still seems to be in a phase of ‘active denial’.
In this review, we provided several recommendations that, while advancing scientific rigor, also
require a great deal of additional effort from researchers. Other recommendations, such as the ones regarding increased transparency, require comparatively little extra effort.
Increased transparency refers particularly to providing open data and materials as part
of the publication. The value of open data and materials has been acknowledged by most scientific organizations, commonly
resulting in a best practice recommendation that research data should be openly available
(Gewin, 2016; Wilkinson et al., 2016). Similar sentiments are reflected by TOP guidelines,
which are increasingly adopted by scientific journals (Nosek et al., 2015). It goes without saying
that these calls for ‘open data’ should be followed while considering possible ethics or privacy
constraints. Too often, however, patient privacy constraints serve as a fig leaf for not taking action
and providing open access to data. Wider adoption of open science procedures would make our
research more efficient by facilitating the reuse of data and would allow for calculating
individual patient data (IPD) meta-analyses. This, in most cases, increases statistical power and
allows for adjustment and investigation of confounding factors at the participant level (Riley et
al., 2010), analyses for which most individual studies are not adequately powered.
In parallel to this call for ‘open data’, we ask researchers to consider publication of their
materials, especially the experimental setup and the code for statistical analyses. It is widely
recognized that the traditional article is insufficient for describing all aspects of the experimental
design, data preprocessing, and analysis (Munafò et al., 2017). This makes an informed
understanding and validation of most articles next to impossible and impedes the ability to
accurately judge these aspects or build on previous work. The availability of experimental and
analysis code makes it possible to fully understand these aspects and (combined with open data)
to reproduce key findings of the literature. A wider dissemination of ‘open materials’ might thus
address the obstacles we encountered during this literature review, such as the ones regarding a lack of transparent reporting of experimental designs and statistical models.
6. Adopt statistical models that capture trial-level variation
Despite the difficulties described above, there are some main takeaways from our literature
review regarding the use of statistical models. As of now, most studies aggregate their measures.
We already pointed out the issues associated with this common practice (i.e. stimuli-as-fixed-
effect fallacy). A wider adoption of linear mixed models would not only make the most of
repeated experimental assessments (Brown, 2021), but would also allow moving away from discrete factor levels. Instead of presenting stimuli at a few discrete levels (e.g., mild, moderate, and maximum intensity) and aggregating the data for each level, researchers could opt to manipulate variables continuously (e.g., using
stimuli sampled along the full range of intensity). Randomly sampling stimuli along an intensity
continuum and modeling them as random effects would enable new insights.
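As a hedged sketch of what such a model could look like (not a prescription; the file and column names are hypothetical), Python's statsmodels can fit crossed random intercepts for participants and stimuli through its variance-components interface:

```python
# Trial-level linear mixed model with crossed random intercepts for
# participants and stimuli, addressing the stimuli-as-fixed-effect fallacy.
# 'trial_level_data.csv' and all column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

trials = pd.read_csv("trial_level_data.csv")
trials["all"] = 1   # a single group spanning the whole data set

model = smf.mixedlm(
    "response ~ intensity * pd_severity",   # continuous within x between
    data=trials,
    groups="all",                           # crossed-effects trick:
    re_formula="0",                         # no random effect for the
    vc_formula={"participant": "0 + C(participant)",   # dummy group itself
                "stimulus": "0 + C(stimulus)"},
)
print(model.fit().summary())
```

Because stimuli enter as a random factor, the fixed effects generalize beyond the particular stimulus set, and the within-factor can stay continuous instead of being collapsed into levels.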
Furthermore, a substantial number of studies adjusted their analyses for different socio-
demographic or psychopathological variables. It has been shown before, and we cannot stress
this point enough, that post-hoc inclusion of covariates to adjust for group differences increases
the likelihood of false-positive results (Simmons et al., 2011). When still doing so (or being asked to do so during the review process), transparent reporting is required to evaluate the reliance of results on the presence of covariates; thus, both the unadjusted and adjusted models should be reported.
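A minimal sketch of that reporting practice (variable and file names are hypothetical): fit the model of interest once without and once with covariates, and report the focal coefficient from both:

```python
# Report both the unadjusted and the covariate-adjusted model so readers can
# judge how strongly the focal effect depends on the covariates.
# 'study_data.csv' and all variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study_data.csv")

models = {
    "unadjusted": smf.ols("outcome ~ pd_severity", data=df).fit(),
    "adjusted": smf.ols("outcome ~ pd_severity + age + depression",
                        data=df).fit(),
}
for label, fit in models.items():
    print(f"{label}: b = {fit.params['pd_severity']:.3f}, "
          f"p = {fit.pvalues['pd_severity']:.3f}")
```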
Conclusion
We reviewed 99 articles published between 2017 and 2021 that report findings from experimental paradigms in PD research, and identified several limitations of this body of work. We would like to underline that our own work is not free of these limitations and is explicitly included in all criticism we presented. In addition, our selection of thirteen clinical target
journals might have influenced the results and conclusions of this review. Nonetheless, we
conclude that future research could benefit from: (1) an expansion in content to currently under-represented RDoC constructs, (2) the recruitment of more representative, diverse samples covering the full continuum of PD severity, (3) increasing the statistical power to detect between-person effects, (4) carefully examining and increasing the between- and within-person reliability of the employed paradigms, (5) adopting statistical tests that adequately model trial-level
variations related to experimental stimuli, and (6) embracing open science practices
(preregistration, open data, open materials, open code) to increase transparency, reproducibility,
and replicability.
References
Asendorpf, J. B., Conner, M., De Fruyt, F., De Houwer, J., Denissen, J. J., Fiedler, K., Fiedler, S., Funder, D. C., Kliegl, R., & Nosek, B. A. (2016). Recommendations for increasing replicability in psychology. European Journal of Personality. https://doi.org/10.1002/per.1919
Bach, B., Sellbom, M., Skjernov, M., & Simonsen, E. (2018). ICD-11 and DSM-5 personality trait domains capture categorical personality disorders: Finding a common ground. Australian & New Zealand Journal of Psychiatry, 52(5), 425-434. https://doi.org/10.1177/0004867417727867
Bloom, H. S. (1995). Minimum detectable effects: A simple way to report the statistical power of experimental designs. Evaluation Review, 19(5), 547-556. https://doi.org/10.1177/0193841X9501900504
Brown, V. A. (2021). An introduction to linear mixed-effects modeling in R. Advances in Methods and Practices in Psychological Science, 4(1). https://doi.org/10.1177/2515245920960351
Clark, H. H. (1973). The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12(4), 335-359. https://doi.org/10.1016/S0022-5371(73)80014-3
da Costa, H. P., et al. (2018). … personality traits are associated with the ability to understand the emotional states of others. Journal of Research in Personality. https://doi.org/10.1016/j.jrp.2018.05.001
Domes, G., Schulze, L., & Herpertz, S. C. (2009). Emotion recognition in borderline personality disorder: A review of the literature. Journal of Personality Disorders, 23(1), 6-19. https://doi.org/10.1521/pedi.2009.23.1.6
Fernandez, K. C., Jazaieri, H., & Gross, J. J. (2016). Emotion regulation: A transdiagnostic
perspective on a new RDoC domain. Cognitive Therapy and Research, 40(3), 426-440.
https://doi.org/10.1007/s10608-016-9772-2
Fossati, A., Somma, A., Borroni, S., Markon, K. E., & Krueger, R. F. (2018). Executive … Journal of Psychopathology and Behavioral Assessment. https://doi.org/10.1007/s10862-018-9645-y
Gewin, V. (2016). Data sharing: An open mind on open data. Nature, 529(7584), 117-119.
https://doi.org/10.1038/nj7584-117a
Grant, B. F., Chou, S. P., Goldstein, R. B., Huang, B., Stinson, F. S., Saha, T. D., Smith, S. M., Dawson, D. A., Pulay, A. J., & Pickering, R. P. (2008). Prevalence, correlates, disability, and comorbidity of DSM-IV borderline personality disorder: Results from the Wave 2 National Epidemiologic Survey on Alcohol and Related Conditions. Journal of Clinical Psychiatry, 69(4), 533-545.
Haeny, A. M., Holmes, S. C., & Williams, M. T. (2021). The need for shared nomenclature on racism and related terminology in psychology. Perspectives on Psychological Science, 16(5), 886-892.
Hanegraaf, L., van Baal, S., Hohwy, J., & Verdejo-Garcia, A. (2021). A systematic review and meta-analysis of ‘Systems for Social Processes’ in borderline personality and substance use disorders. Neuroscience and Biobehavioral Reviews, 127, 572-592. https://doi.org/10.1016/j.neubiorev.2021.04.013
Hepp, J., & Niedtfeld, I. (2022). Prosociality in personality disorders: Status quo and research … Current Opinion in Psychology. https://doi.org/10.1016/j.copsyc.2021.09.013
Hepp, J., Niedtfeld, I., & Schulze, L. (2022, May 13). Experimental paradigms in personality disorder research … directions. OSF. https://doi.org/10.17605/OSF.IO/XMWE9
Hopwood, C. J., Thomas, K. M., Markon, K. E., Wright, A. G., & Krueger, R. F. (2012). DSM-5 personality traits and DSM-IV personality disorders. Journal of Abnormal Psychology, 121(2), 424-432.
Huprich, S. K. (2020). Personality disorders in the ICD-11: opportunities and challenges for
advancing the diagnosis of personality pathology. Current Psychiatry Reports, 22, 1-7.
https://doi.org/10.1007/s11920-020-01161-4
Jeung, H., Schwieren, C., & Herpertz, S. C. (2016). Rationality and self-interest as economic-exchange strategy in borderline personality disorder: Game theory, social preferences, and interpersonal behavior. Neuroscience and Biobehavioral Reviews, 71, 849-864. https://doi.org/10.1016/j.neubiorev.2016.10.030
Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103(1), 54-69. https://doi.org/10.1037/a0028347
Klein, R. A., Vianello, M., Hasselman, F., Adams, B. G., Adams Jr, R. B., Alper, S., Aveyard, M., Axt, J. R., Babalola, M. T., & Bahník, Š. (2018). Many Labs 2: Investigating variation in replicability across samples and settings. Advances in Methods and Practices in Psychological Science, 1(4), 443-490.
Koudys, J. W., Traynor, J. M., Rodrigo, A. H., Carcone, D., & Ruocco, A. C. (2019). The NIMH research domain criteria (RDoC) initiative and its implications for research on personality disorders. Current Psychiatry Reports, 21, 37. https://doi.org/10.1007/s11920-019-1023-2
Kraemer, H. C. (2015). A source of false findings in published research studies: Adjusting for covariates. JAMA Psychiatry, 72(10), 961-962. https://doi.org/10.1001/jamapsychiatry.2015.1178
Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710. https://doi.org/10.1002/ejsp.2023
Lenzenweger, M. F., Lane, M. C., Loranger, A. W., & Kessler, R. C. (2007). DSM-IV personality disorders in the National Comorbidity Survey Replication. Biological Psychiatry, 62(6), 553-564.
Liebke, L., Bungert, M., Thome, J., Hauschild, S., Gescher, D. M., Schmahl, C., Bohus, M., & Lis, S. (2017). Loneliness, social networks, and social functioning in borderline personality disorder. Personality Disorders: Theory, Research, and Treatment, 8(4), 349-356. https://doi.org/10.1037/per0000208
Morris, S. E., & Cuthbert, B. N. (2012). Research Domain Criteria: Cognitive systems, neural circuits, and dimensions of behavior. Dialogues in Clinical Neuroscience, 14(1), 29-37. https://doi.org/10.31887/DCNS.2012.14.1/smorris
Moshontz, H., Campbell, L., Ebersole, C. R., IJzerman, H., Urry, H. L., Forscher, P. S., Grahe, J. E., McCarthy, R. J., Musser, E. D., & Antfolk, J. (2018). The Psychological Science Accelerator: Advancing psychology through a distributed collaborative network. Advances in Methods and Practices in Psychological Science, 1(4), 501-515. https://doi.org/10.1177/2515245918797607
Munafò, M. R., Nosek, B. A., Bishop, D. V., Button, K. S., Chambers, C. D., Du Sert, N. P., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. https://doi.org/10.1038/s41562-016-0021
Myers, A., & Hansen, C. H. (2011). Experimental psychology (7th ed.). Wadsworth Cengage
Learning.
National Institute of Mental Health. (2022). Domain: Negative valence systems. U.S. Department of Health and Human Services. https://www.nimh.nih.gov/research/research-funded-by-nimh/rdoc/constructs/negative-valence-systems
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., Buck, S., Chambers, C. D., Chin, G., Christensen, G., Contestabile, M., Dafoe, A., Eich, E., Freese, J., Glennerster, R., Goroff, D., Green, D. P., Hesse, B., Humphreys, M., … & Yarkoni, T. (2015). Promoting an open research culture. Science, 348(6242), 1422-1425. https://doi.org/10.1126/science.aab2374
Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Almenberg, A. D., Fidler, F., Hilgard, J., Kline, M., & Nuijten, M. B. (2021). Replicability, robustness, and reproducibility in psychological science. Annual Review of Psychology, 73, 719-748. https://doi.org/10.1146/annurev-psych-020821-114157
Oldham, J. M. (2015). The alternative DSM-5 model for personality disorders. World Psychiatry, 14(2), 234-236.
Papousek, I., Aydin, N., Rominger, C., Feyaerts, K., Schmid-Zalaudek, K., Lackner, H. K., Fink, A., Schulter, G., & Weiss, E. M. (2018). DSM-5 personality trait domains and … Biological Psychology. https://doi.org/10.1016/j.biopsycho.2017.11.010
Parsons, S., Kruijt, A.-W., & Fox, E. (2019). Psychological science needs a standard practice of reporting the reliability of cognitive-behavioral measurements. Advances in Methods and Practices in Psychological Science, 2(4), 378-395. https://doi.org/10.1177/2515245919879695
Perugini, M., Gallucci, M., & Costantini, G. (2018). A practical primer to power analysis for simple experimental designs. International Review of Social Psychology, 31(1), 20. https://doi.org/10.5334/irsp.181
Reisner, S. L., Poteat, T., Keatley, J., Cabral, M., Mothopeng, T., Dunham, E., Holland, C. E., Max, R., & Baral, S. D. (2016). Global health burden and needs of transgender populations: A review. The Lancet, 388(10042), 412-436. https://doi.org/10.1016/S0140-6736(16)00684-X
Riley, R. D., Lambert, P. C., & Abo-Zaid, G. (2010). Meta-analysis of individual participant data: Rationale, conduct, and reporting. BMJ, 340, c221.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359-1366.
Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325-1346. https://doi.org/10.1037/bul0000169
Tedersoo, L., Küngas, R., Oras, E., Köster, K., Eenmaa, H., Leijen, Ä., Pedaste, M., Raju, M., Astapova, A., & Lukner, H. (2021). Data sharing practices and data availability upon request differ across scientific disciplines. Scientific Data, 8, 192. https://doi.org/10.1038/s41597-021-00981-0
Trull, T. J., Solhan, M. B., Tragesser, S. L., Jahng, S., Wood, P. K., Piasecki, T. M., & Watson, D. (2008). Affective instability: Measuring a core feature of borderline personality disorder with ecological momentary assessment. Journal of Abnormal Psychology, 117(3), 647-661.
Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726-728. https://doi.org/10.1037/0003-066X.61.7.726
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., & Bourne, P. E. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018.
Zimmermann, J., Kerber, A., Rek, K., Hopwood, C. J., & Krueger, R. F. (2019). A brief but comprehensive review of research on the alternative DSM-5 model for personality disorders. Current Psychiatry Reports, 21, 92. https://doi.org/10.1007/s11920-019-1079-z