
Experimental paradigms in personality disorder research: A review of covered RDoC constructs, methodological issues, and future directions

Johanna Heppᵃ, Inga Niedtfeldᵃ, Lars Schulzeᵇ

ᵃ Department of Psychosomatic Medicine and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Germany


ᵇ Department of Clinical Psychology and Psychotherapy, Freie Universität Berlin, Germany

Access to data and materials: https://osf.io/xmwe9/

Corresponding author: Prof. (apl.) Dr. Inga Niedtfeld, Central Institute of Mental Health,
Department of Psychosomatic Medicine and Psychotherapy, PO Box 12 21 20, 68072
Mannheim, Germany. Tel: +49-621-1703-4403, Fax: +49-621-1703-4405, E-mail:
Inga.Niedtfeld@zi-mannheim.de

Funding: This research was supported by a grant of the German Research Foundation to Inga Niedtfeld (NI 1591/1–2).

Abstract

Studies using experimental paradigms have been paramount in research on psychopathological

processes in personality disorders (PDs). We review 99 articles that report experimental

paradigms and that were published between 2017 and 2021 in thirteen peer-reviewed journals.

We structure the study content according to the NIMH Research Domain Criteria (RDoC), and

report details on demographic variables, experimental design, sample size, and statistical

analyses. We discuss the unequal representation of the RDoC domains, the representativeness of the

recruited clinical groups, and a lack of sample diversity. Finally, we review issues regarding

statistical power and the data analytic designs that were used. Based on the literature review, we

draw implications for future experimental PD research, encouraging researchers to increase the

breadth of represented RDoC constructs, the representativeness and diversity of the recruited

samples, the statistical power to detect between-person effects, the reliability of estimators, the

adequacy of statistical methods, and the transparency of experimental research.

Keywords: personality disorders, experimental designs, research domain criteria, RDoC,

power, review, meta-science

Introduction

Laboratory studies that include behavioral paradigms with experimental manipulations are

central to personality disorder (PD) research in that they have helped shape our understanding

of the psychopathological processes that play out in PDs (Domes et al., 2009; Jeung et al., 2016).

These studies (we will refer to them as studies using “experimental paradigms” henceforth)

comprise either a random allocation to an experimental between-factor (e.g., participants are

randomly assigned to complete a task with either positive or negative stimuli), or manipulation

of an experimental within-factor (e.g., participants complete several trials with positive and

negative stimuli in random order). In contrast to more naturalistic settings, the laboratory

environment affords control over potentially confounding variables (Myers & Hansen, 2011).

The use of (computerized) experimental tasks allows for the repeated measurement of the

processes of interest and can thus afford a level of reliability that is difficult to obtain outside of

a controlled environment. Most importantly, given successful randomization, the manipulation

of experimental factors allows for causal attributions of differences in outcome measures to the

experimental condition (Myers & Hansen, 2011). While findings from experimental paradigms

have been paramount for understanding psychopathological processes in PDs at a basic level,

their utility to the field is limited when considered without further context. The importance and

reliability of the effect of a specific experimental factor can only be recognized if it is embedded

in a methodological and content-related context. In other words, studies that use experimental

paradigms typically manipulate only one specific factor in a psychopathological process and

therefore generate evidence that is highly specific to that factor and process and can be difficult

to integrate into the larger research landscape. Therefore, we herein provide a review of recent

studies in PD samples that implemented experimental paradigms. We review the substantive

focus of previous work and discuss methodological challenges.

More specifically, we reviewed studies using experimental paradigms published in

thirteen peer-reviewed journals within the last five years. We selected target journals from
clinical psychology, psychiatry, and psychosomatics, as well as two open access journals that

repeatedly published PD work in the past.¹ To be included, studies had to 1) sample individuals with a PD diagnosis, PD symptoms, features, or traits and 2) include a paradigm with an

experimental manipulation (i.e. random allocation to between-participant conditions, or the

inclusion of experimental conditions that vary within participants). For details on the literature screening and extraction process, see supplement 1. We include 99 manuscripts in this

review, which report 102 unique studies and 123 experimental paradigms. For a reference list of all articles included in the review, see supplement 2. We review all studies with regard to

the covered content, study design, and data analytic aspects. All data that we extracted from the

articles are accessible at https://osf.io/xmwe9/ (Hepp et al., 2022), as is the full citation list for all

reviewed articles.

We structure the content covered by the reviewed studies according to the NIMH

Research Domain Criteria (RDoC; Cuthbert, 2014; Koudys et al., 2019). RDoC was proposed to stimulate research on mental disorders that overcomes central limitations of research relying on disorder categories, especially a focus on individuals with low levels of overall functioning who meet strict diagnostic thresholds while ignoring the broader continuum of functioning.

RDoC is not intended as a new diagnostic system, but as a research framework that helps

structure and cluster evidence on six major domains of human functioning. The six RDoC

domains are negative valence systems, positive valence systems, cognitive systems, social

processes, arousal and regulatory systems, and sensorimotor systems. Domains form the highest

level of the RDoC matrix, followed by constructs situated within each domain. For definitions

of RDoC constructs and domains, please visit https://www.nimh.nih.gov/research/research-funded-by-nimh/rdoc/definitions-of-the-rdoc-domains-and-constructs. All constructs are conceptualized as falling on a dimension of functioning from normal to abnormal, and RDoC emphasizes their dependency on the individual’s environmental and neurodevelopmental context. The lowest level in the RDoC matrix describes the units of analysis used to measure the constructs. Experimental paradigms are one possible unit of analysis within the RDoC matrix.

¹ The included journals are: Personality Disorders: Theory, Research and Treatment; Journal of Personality Disorders; Borderline Personality Disorder and Emotion Dysregulation; Journal of Abnormal Psychology; Clinical Psychological Science; Behaviour Research and Therapy; Psychological Medicine; Psychiatry Research; Journal of Affective Disorders; Biological Psychiatry; NeuroImage: Clinical; PLoS ONE; Scientific Reports.

Covered RDoC constructs and paradigms

As described above, we review 99 articles that report 123 experimental paradigms. Of these, 35

paradigms (29.41%) tapped into more than one RDoC construct. For a visualization of the

domains and constructs investigated, see Figure 1.

Figure 1: The circular bar chart shows the different constructs studied with experimental paradigms in personality psychopathology. Colors of the bars reflect the overarching RDoC domains. Red bars represent the ‘negative valence systems’ (NVS), blue bars ‘social processes’ (SP), green bars ‘cognitive systems’ (CS), and purple bars ‘positive valence systems’ (PVS). The available studies did not investigate the RDoC domains ‘arousal and regulatory systems’ or ‘sensorimotor systems’. Note that we included the two additional constructs emotion regulation and prosocial behavior. These are currently not part of the RDoC system but were covered by several paradigms.
Negative Valence Systems

This domain subsumes responses to aversive situations or contexts, including fear, anxiety, and

loss (Cuthbert, 2014; National Institute of Mental Health, 2022). It was covered 38

times (24.36%). The construct acute threat (fear) was most prominent, as it was investigated 21

times (13.46%). Studies largely focused on stress reactivity in PDs and employed stress-inducing stimuli (various stress paradigms, aversive pictures, or film clips). In nine additional

paradigms, participants were asked to regulate their emotions, mostly by cognitive reappraisal.

Although emotion regulation is not a construct of RDoC (Fernandez et al., 2016), it is frequently

studied in the context of PDs. Further underlining the importance of threat processing, 14

paradigms that tap into more than one RDoC domain combined a stress induction (via aversive

pictures, negative facial expressions, negative words, or a stress test), located on the RDoC

construct acute threat (fear), with other RDoC constructs such as reward learning, cognitive

control, or emotion recognition. In addition to these studies on acute threat, two studies focused

on the construct frustrative nonreward, using different aggression paradigms. The construct loss

was investigated six times, for instance by inducing sadness via a film clip or via social

exclusion. No studies assessed the constructs potential threat (anxiety) or sustained threat.

Positive valence systems

This RDoC domain describes responses to positive motivational situations or contexts, such as

reward seeking, consummatory behavior, and reward/habit learning (Cuthbert, 2014), and was

investigated 17 times (10.90%). Reward responsiveness (i.e. the anticipation of reward and the response to reward cues or receipt of reward) and reward valuation were each investigated six times. Reward valuation taps into computational processes about the probability and

benefits of a prospective outcome. Reward learning was studied five times using social

valuation or reinforcement learning paradigms. Notably, several of the reviewed studies used

paradigms that are explicitly not recommended in the RDoC matrix (e.g., the Iowa gambling

task), because they cannot disentangle the three constructs of the positive valence systems

domain. In addition to the unmet need to use experimental paradigms that assess RDoC

constructs in this domain, positive valence systems were clearly under-researched, as compared

to negative valence systems.

Cognitive Systems

The domain cognitive systems encompasses various cognitive processes (Cuthbert, 2014;

Morris & Cuthbert, 2012) that were investigated 31 times (19.87%). Fifteen paradigms targeted

the construct cognitive control by using the go/no-go task, stop-signal task, Stroop task, or task-switching task. In addition, five paradigms focused on attention, mostly in combination with

other RDoC domains, such as negative valence systems or social processes. The construct

declarative memory was investigated four times, with two paradigms combining declarative

memory and social cognition. Five paradigms assessed working memory (primarily with the n-back task), and two paradigms captured visual perception (lateral masking task, binocular

rivalry paradigm).

Social Processes

The domain social processes focuses on responses to interpersonal stimuli, including the

perception and interpretation of others’ actions and the interpretation of the self (Cuthbert, 2014;

Hanegraaf et al., 2021). It was by far the most researched RDoC domain, investigated 70 times

(44.87%). The construct social communication was most often investigated (29 times). For the

measurement of this construct, studies commonly employed emotion recognition paradigms,

particularly the Reading the Mind in the Eyes Test² (RMET; seven times). While most paradigms included stimuli of negative, neutral, and positive valence, two paradigms used a restricted stimulus set with threatening and neutral faces only, possibly also tapping into the threat construct of the negative valence systems domain. Facial emotion recognition was further investigated in combination with the cognitive systems domain, with two studies presenting emotional faces and concurrently studying attention or cognitive control (i.e. emotional faces as stimuli within a go/no-go task or an approach-avoidance task).

² Please note that the Reading the Mind in the Eyes Test was listed as a paradigm to investigate the perception and understanding of others by the RDoC taskforce (https://www.nimh.nih.gov/about/advisory-boards-and-groups/namhc/reports/behavioral-assessment-methods-for-rdoc-constructs). However, since other emotion recognition tasks are explicitly listed under the construct social communication, we decided to count the Reading the Mind in the Eyes Test likewise.

A group of nine studies assessed the construct affiliation and attachment, primarily by

measuring responses to social rejection in the cyberball paradigm. Interestingly, four studies

used cyberball or a group rejection paradigm not to study affiliation and attachment primarily,

but to induce negative affect (i.e. domain negative valence systems) and investigate subsequent

alterations in cognitive processing, the perception of facial communication, and prosocial

behavior.

The construct perception and understanding of self was assessed with eight paradigms,

which varied considerably in their theoretical background (e.g., self-referential processing tasks,

implicit association test). The construct perception and understanding of others was studied

more frequently (19 paradigms), and subsumes research on theory of mind as well as empathy.

Again, paradigms varied substantially and included (among others) metaphor comprehension, Happé’s cartoon task, and personality judgements. Five additional paradigms investigated

prosocial behavior in PDs, which is currently not covered by the RDoC framework. These

studies employed economic games to measure prosocial behavior (see Hepp & Niedtfeld, 2022

for a conceptual paper; Jeung et al., 2016 for a literature review on prosociality in PDs). In light

of marked interpersonal problems as a hallmark of PD (and other mental disorders), we decided

to review prosocial behavior as an additional construct.

Remaining domains

Finally, within the last five years, there were no experimental studies in the target journals on the RDoC domains arousal and regulatory systems (i.e. circadian rhythms, sleep-wakefulness) or sensorimotor systems (e.g. motor actions, agency and ownership).

Study design

Between-person factors and sampling

The 99 reviewed articles reported a total of 102 studies³. The studies show a clear focus on

samples comprising individuals with borderline personality disorder (BPD), which were

included in 89.22% of studies. Findings from these studies have clearly helped shape and update

our concept of BPD and personality pathology in a broader sense. At the same time, the narrow

focus on BPD constitutes one of the most striking limitations to current experimental PD

research. Other PDs were strongly under-represented, as the reviewed literature only covered

antisocial PD/psychopathy (6.60% of studies), narcissistic PD (1.96% of studies), and

schizotypal PD (0.98% of studies) beyond BPD.

Additionally, the majority of studies (83.33%) used a between-group design, comparing

individuals with a PD (as discussed, largely BPD) to healthy control participants, clinical

controls, or a combination of both. A healthy control group was included in 82.35% of studies,

whereas only a few studies comprised a clinical control group (21.57%).

The average number of participants per group in between-group designs was M = 36.34 (Md =

30.00, SD = 24.36, see Figure 2). Notably, 17 of the reviewed studies (16.67%) assessed PD

pathology dimensionally, for instance by using self-report questionnaires or interviews on PD

symptom levels or PD features. However, four of these still opted to split the sample into discrete

groups of individuals with low versus high levels of PD pathology, rather than using the

dimensional indicator to predict behavior in the paradigms. The average sample size in studies

that sampled dimensionally (and did not divide the sample into groups) was substantially higher

than in the studies using a between-groups approach, M = 143.08, Md = 103, SD = 170.89.

³ We were unable to definitively determine how many of these samples were unique and therefore report the demographic and design data averaged across all reported samples.
Figure 2: Density plots visualizing probability distributions of socio-demographic and experimental variables. Note: the average number of participants refers to experimental studies with between-group designs.

The strong focus on categorical PDs also reflects the fact that, until recently, PDs were

defined categorically in DSM-5 (American Psychiatric Association, 2013) and ICD-10 (World

Health Organization, 2016). Yet, diagnostic systems are currently shifting to a dimensional

concept of PDs. This is reflected in the DSM-5 alternative model for personality disorders

(AMPD; Oldham, 2015), which outlines a dimensional approach to diagnosing PDs and was

accompanied by a call for further investigation (for a review of studies using the DSM-5 alternative model, see Zimmermann et al., 2019). In addition, ICD-11 (World Health

Organization, 2019), which came into effect in 2022, includes a fully dimensional PD

diagnosis, dropping the categorical approach entirely (Huprich, 2020).

As discussed above, the majority of reviewed studies focused on categorical PDs.

However, this may in part be due to our selection of clinical target journals, which possibly were

preferred outlets for work on categorical and clinically diagnosed PD samples. Further studies

focusing on samples of individuals with varying levels of maladaptive traits in community or

clinical samples may have been published predominantly in other outlets, for instance in the

field of personality psychology (e.g., da Costa et al., 2018; Fossati et al., 2018; Papousek et al.,

2018). Nonetheless, even considering this, there is a relatively clear picture that the majority of

reviewed studies focused on categorical PDs (and thus, by definition, pathology above a certain

threshold). As discussed, these studies may have limited utility for understanding the broader

continuum of PD pathology.

The reliance on case-control studies, comparing participants with a PD to healthy

individuals, further aggravates this problem. It introduces an artificial divide between extreme

groups at the high end of the severity continuum (i.e. those meeting diagnostic thresholds for

categorical PD diagnoses) and individuals specifically selected to be entirely free of any present

or past PD symptoms (i.e. participants in the healthy control group). This way, the field has

neglected to generate evidence about milder levels of PD pathology. Additionally, one must

expect that individuals in PD and healthy control groups differ on many characteristics beyond

presence or absence of a PD, rendering any causal attributions of differences in outcomes to PD

pathology impossible. The reliance on healthy control groups also severely limits any

conclusions regarding diagnostic specificity. In fact, for the majority of findings, it remains

entirely unclear whether they are at all specific to a certain PD. By relying on case-control

designs, the field further missed opportunities to investigate whether the associations between

PD pathology and the processes studied in experimental paradigms follow a continuous, dose-response-like relationship, with problems increasing at the higher end of the PD severity

spectrum (or whether they are unique to those with PD and completely absent in “healthy”

individuals). This severely limits our understanding of the processes themselves. Studies that

include clinical control groups in addition to healthy individuals attenuate this problem

somewhat, but were rare among the reviewed studies, and considered a wide range of different

clinical control groups (see Figure 3).


Figure 3: Summary of the main points of the literature review in question format, with yes (red) vs. no (grey) answers.

In addition to the predominance of case-control studies and BPD samples, the reviewed studies are also highly biased with regard to age, gender, and race⁴. Regarding age, except for two studies, all reviewed studies included adult participants over the age of 18. The mean age across studies was 27.66 years (Md = 28.26, SD = 5.18), indicating a focus on young adults. The average percentage of female participants across all reviewed studies was M = 81.01 (Md = 95, SD = 26.68), with 46.08% of reviewed studies including only women. Men, on average, made up 17.87% of participants across samples (Md = 0, SD = 28.69)⁵. Other gender identities were assessed in only one of the reviewed studies. The third demographic variable we aimed to extract from all studies was race. However, only 29.41% of the reviewed studies even reported any data on race, which precluded an adequate summary of this variable. This, in and of itself, is a grave oversight that appears to be highly prevalent in our field and constitutes a form of implicit racism that we must urgently address (see Haeny et al., 2021 for a nomenclature for antiracist clinical research, and the article on diversity in this special issue for a discussion focused on the field of PD research). The overall picture suggests that the reviewed evidence is highly restricted in its generalizability and does a poor job of representing the broad range of individuals affected by PDs.

⁴ These three demographic variables were selected for the review and the authors acknowledge that they are not exhaustive and that further bias is likely with regard to other variables such as sexual orientation, socioeconomic status, country of residence, religion, and many more.

⁵ Several studies only reported percentages for one gender, likely implying that the remaining participants were of the other binary gender (e.g., reporting that 70% of participants were female and implying that the remaining 30% were male). This reporting is problematic insofar as it promotes the idea that the gender binary is the norm and excludes individuals of other gender identities (e.g., bigender, genderfluid). So as not to perpetuate this, we only coded the percentages for genders that were explicitly described and did not make further inferences based on an assumption of gender binary. As a result, the percentages for women and men do not sum to 100% across studies.

Experimental manipulations

Five experimental paradigms comprised a between-person manipulation, where participants

were randomly assigned to one of two experimental conditions. Almost all experimental

paradigms included a within-person manipulation (n = 119), such that participants completed

multiple conditions within the same task. Most commonly, studies comprised one within-factor

(70.97% of studies) with two or three discrete factor levels (see Figure 2). The remaining studies

comprised two (21.77%) or three (3.23%) within-factors. The average number of within-factor levels

(across all within-factors) was M = 4.67 (Md = 3.00) and varied substantially between studies

(SD = 6.10), with some including more than thirty levels. If this large number of levels is not

accompanied by a proportional increase in trials per level, too few trials fill each cell of the

experimental design and estimators (e.g., means) become unreliable.⁶ Beyond any

considerations of statistical power, this substantially limits any conclusions that can be drawn

from the data. Lastly, it is important to note that almost no study included a continuously manipulated within-variable. Even variables that could be manipulated continuously, such as stimulus intensity, were typically split up into discrete factor levels.

⁶ Note that we tried to extract the number of trials for each within-factor combination from all reviewed articles. However, only a few articles clearly reported this information, and trial numbers often varied between levels. Therefore, we decided not to report these data because, for a substantial proportion of articles, we were left unsure about the exact design that was used.
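To make the trials-per-cell concern concrete, the following back-of-the-envelope sketch (illustrative numbers only, not values extracted from the reviewed studies) shows how the standard error of a condition mean grows when a fixed trial budget is spread over many factor levels:

```python
# Illustrative sketch: with a fixed trial budget per participant, adding
# factor levels shrinks the number of trials per cell, and the standard
# error of each cell mean grows with 1/sqrt(trials per cell).
import numpy as np

trial_sd = 150.0        # hypothetical trial-to-trial SD of the outcome (ms)
total_trials = 120      # hypothetical fixed trial budget per participant

for n_levels in (2, 6, 30):
    per_cell = total_trials // n_levels
    se = trial_sd / np.sqrt(per_cell)
    print(f"{n_levels:2d} levels -> {per_cell:3d} trials/cell, "
          f"SE of cell mean = {se:5.1f} ms")
```

With these illustrative numbers, moving from two to thirty levels roughly quadruples the standard error of each cell mean, which is the unreliability problem described above.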

Reporting and Analysis

Finally, we coded different variables pertaining to the reporting as well as to the analysis of the

dependent variables (see Figure 3). Particularly with regard to the statistical analysis, this review

focuses on variables that we were able to classify for all manuscripts. A more detailed analysis

of statistical models was prevented by a lack of available information (see also ‘Open data and

materials’).

Power analyses

Power analyses are of utmost importance for performing informative experimental studies.

When power is high, studies can provide clear and more replicable answers to research

questions, whereas low power directly contributes to low replicability and heterogeneous

findings (Stanley et al., 2018). Given the importance of statistical power for research in general,

this aspect is discussed in greater detail by Vize and Lynam in this special issue.

Most commonly, the reviewed articles did not report power analyses (78.79%, n = 78).

Only a minority of articles included an a-priori power analysis (13.13%, n = 13) and few

(8.08%, n = 8) used alternative approaches, such as sensitivity analyses. Sensitivity analyses are

of particular interest for clinical research, which is commonly confronted with feasibility

concerns that can result in relatively fixed maximum sample sizes. This approach allows researchers to determine the minimum effect size that can be reliably detected given the recruited sample

sizes (Bloom, 1995; Lakens, 2014). Reporting of sensitivity analyses thus allows the reader to

put the reported results into context with the accumulated knowledge of the field (Perugini et

al., 2018). Furthermore, almost half the articles reporting power analyses powered their studies

for within-between interactions (42.86%, n = 9). Accordingly, these studies were adequately

powered to detect condition-by-group interactions, but may have been underpowered to detect

simple between-group comparisons, which require more statistical power.

To provide estimations of power in current experimental PD research, we analyzed the

power for all reviewed studies that included case-control comparisons⁷. The results of these

calculations with varying numbers of participants and effect sizes are presented in Figure 4.

Most case-control studies (median n per group = 30) were only adequately powered (at 1 - β =

86.14%) to detect large effects, whereas power was low for the detection of medium effects (1

- β = 47.79%) or small effects (1 - β = 20.79%). We repeated this procedure for all reviewed

studies that included a dimensional investigation of personality pathology (median sample n =

103). Power estimates showed these studies were sufficiently powered to detect large and

medium-sized, but not small effects (1 - β = 17.18% for a small effect of r = .1; 1 - β = 87.46%

for a medium effect of r = .3; 1 - β = 99.98% for a large effect of r = .5).

Figure 4: The contour plot shows power estimates for a between-group comparison with different sample sizes and effect sizes of interest. For comparison, we included power estimates for detecting small, medium, or large effects considering the median sample size of current experimental studies (dotted line). The shaded area reflects the .25 and .75 quantiles of sample sizes.

⁷ We assumed equal sample sizes, an alpha level of .05, and a two-sided t-test.
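As an illustration (not the authors' original code), the following Python sketch reproduces calculations of this kind with statsmodels, under the assumptions stated in the footnote; the effect sizes shown are illustrative benchmarks, and the correlation power uses a Fisher z approximation that may differ slightly from the exact values reported above:

```python
# Sketch of a-priori power, sensitivity analysis, and correlation power
# for the sample sizes discussed in the text (illustrative effect sizes).
import numpy as np
from scipy.stats import norm
from statsmodels.stats.power import TTestIndPower

ttest_power = TTestIndPower()

# A-priori power of a case-control comparison with the median group size
for d in (0.3, 0.5, 0.8):
    p = ttest_power.power(effect_size=d, nobs1=30, alpha=0.05, ratio=1.0)
    print(f"d = {d:.1f}, n = 30 per group: power = {p:.2%}")

# Sensitivity analysis: minimum effect detectable with 80% power at n = 30
mde = ttest_power.solve_power(nobs1=30, alpha=0.05, power=0.80, ratio=1.0)
print(f"minimum detectable effect at 80% power: d = {mde:.2f}")

# Approximate power for a dimensional (correlational) study with n = 103
# (Fisher z approximation)
n = 103
crit = norm.ppf(0.975)
for r in (0.1, 0.3, 0.5):
    z = np.arctanh(r) * np.sqrt(n - 3)
    p = norm.sf(crit - z) + norm.cdf(-crit - z)
    print(f"r = {r:.1f}, n = {n}: power ~ {p:.2%}")
```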
Dependent variables

On average, studies reported 2.77 (SD = 2.06) different dependent variables. Most studies used

a combination of dependent variables⁸, such as ratings (48.08%, n = 125), accuracy scores

(19.23%, n = 50), and response latencies (13.46%, n = 35). As can be expected in experimental

paradigms, the majority of constructs were measured through the rating of a single item

(72.80%, n = 91) as compared to the rating of a scale (27.20%, n = 34). This reflects a common

decision in experimental designs to lower participant burden related to the repeated assessments.

Accordingly, this pattern changed when focusing on studies with few assessments, such as in

experiments assessing emotional states before and after confrontation with a stressor. However,

the majority of constructs were still assessed by single items (55.17%, n = 32) as compared to

scales (44.83%, n = 26). A notable exception to this practice is the Cyberball paradigm. Studies using this task more commonly relied on multi-item scales (single item:

42.86%, n = 9; scales: 57.14%, n = 12).

Aggregation of trials

Most dependent variables with repeated assessments were aggregated before statistical analysis

(85.41%, n = 158). Aggregation of measures has important consequences, which should be

considered more carefully in clinical experimental research. In a nutshell, this procedure

neglects important sources of systematic variation at the trial level, such as drifts over time due

to fatigue or specific associations with experimental stimuli. It is important to keep in mind that

we do not only sample participants, but also aspects of the experiment such as stimuli. Without

adequate modeling of trial-level variations related to experimental stimuli, researchers are

unable to generalize beyond the stimuli applied in the respective experiment. For extensive

discussions of this ‘stimuli-as-fixed-effect fallacy’ see Clark (1973) and Judd et al. (2012).

⁸ Note that we focused on behavioral dependent variables from the manuscripts. Fixations in eye-tracking studies, psychophysiological responding, and neural activation were not extracted.
Response latencies were mostly aggregated using the mean (36.36%, n = 12) or median

(21.21%, n = 7), with a substantial number of studies not describing the chosen method of

aggregation (36.36%, n = 12). In addition, most studies did not report any transformation of response latencies (87.88%, n = 29). These procedures neglect the underlying distribution of response latencies, with its pronounced positive skew across trials.
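As a brief illustration (simulated data, not values from the reviewed studies), the sketch below shows how the raw mean of positively skewed latencies is pulled upward by slow responses, whereas the median or a log-based summary is more robust:

```python
# Simulated response latencies with the typical positive skew (lognormal
# shape); the raw mean is inflated relative to the median and to the
# back-transformed mean of the log latencies.
import numpy as np

rng = np.random.default_rng(1)
rt = rng.lognormal(mean=6.2, sigma=0.4, size=500)   # latencies in ms

print(f"raw mean:                  {rt.mean():6.1f} ms")
print(f"median:                    {np.median(rt):6.1f} ms")
print(f"back-transformed log mean: {np.exp(np.log(rt).mean()):6.1f} ms")
```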

Reliability

The reliability of experimental tasks was rarely reported (12.12%, n = 12). The relation between reliability and statistical power, as well as the detrimental impact of low measurement reliability on the interpretability and comparability of results, has been discussed in detail by Parsons et al. (2019).
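One accessible way to estimate task reliability is permutation-based split-half reliability in the spirit of Parsons et al. (2019). The sketch below is a minimal illustration, assuming a hypothetical participants-by-trials array of trial-level scores:

```python
# Minimal sketch of permutation-based split-half reliability for a task
# measure. `scores` is a hypothetical (participants x trials) array of
# trial-level scores (e.g., per-trial ratings or difference scores).
import numpy as np

def split_half_reliability(scores, n_splits=5000, seed=0):
    """Mean Spearman-Brown-corrected split-half correlation over random splits."""
    rng = np.random.default_rng(seed)
    n_trials = scores.shape[1]
    estimates = np.empty(n_splits)
    for i in range(n_splits):
        order = rng.permutation(n_trials)
        half1 = scores[:, order[: n_trials // 2]].mean(axis=1)
        half2 = scores[:, order[n_trials // 2 :]].mean(axis=1)
        r = np.corrcoef(half1, half2)[0, 1]
        estimates[i] = 2 * r / (1 + r)      # Spearman-Brown correction
    return estimates.mean()

# Example with simulated data: 60 participants, 40 trials
rng = np.random.default_rng(42)
true_scores = rng.normal(0, 1, (60, 1))
scores = true_scores + rng.normal(0, 2, (60, 40))   # noisy trial-level data
print(f"split-half reliability: {split_half_reliability(scores):.2f}")
```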

Inclusion of covariates

Some studies included covariates in their main analysis⁹ (23.30%, n = 24), mainly adjusting for

different socio-demographic variables (79.17%, n = 19). A minority of these studies also

adjusted for psychopathological variables (e.g., depression or anxiety; 37.50%, n = 9). The

inclusion of covariates may be well justified but requires an a-priori definition of theoretically

relevant covariates and needs to be considered in power estimations (Kraemer, 2015). Ideally,

researchers who report analyses with covariates also report the results of those same models

without covariates in the main text, supplemental materials, or footnotes.

⁹ Note that we did not count studies repeating their main analysis while controlling for different aspects.
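A minimal sketch of such side-by-side reporting follows (simulated data; all variable names are hypothetical):

```python
# Fitting and reporting the unadjusted and the covariate-adjusted model
# side by side (simulated data; variable names are hypothetical).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "group": rng.integers(0, 2, n),      # 0 = control, 1 = PD (illustrative)
    "age": rng.normal(28, 5, n),
    "depression": rng.normal(0, 1, n),
})
df["outcome"] = 0.4 * df["group"] + rng.normal(0, 1, n)

unadjusted = smf.ols("outcome ~ group", data=df).fit()
adjusted = smf.ols("outcome ~ group + age + depression", data=df).fit()

for label, fit in (("unadjusted", unadjusted), ("adjusted", adjusted)):
    print(f"{label:>10}: b_group = {fit.params['group']:.2f}, "
          f"p = {fit.pvalues['group']:.3f}")
```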

Open data and materials

This point has been reiterated throughout the manuscript: a lack of available information and transparent reporting prevented the classification of additional variables of interest (e.g., number of repetitions per condition, statistical models). The benefits and value of open data and

materials for research transparency, reduction of data loss, and fostering of progress have been

discussed in great detail in recent years (Gewin, 2016; Munafò et al., 2017; Nosek et al., 2015).

Still, only a minority of experimental studies (9.09%, n = 9) provided open access to their de-identified data and/or study materials using available repositories. Even in journals with a mandated data policy, we commonly found the statement that “data are available upon

reasonable request” (for a discussion, see Tedersoo et al., 2021; Wicherts et al., 2006).

Future directions for PD research with experimental paradigms

Based on the above literature review, we have identified six main implications for future

experimental studies in PD samples. Some of these points overlap with general

recommendations for good scientific practice and increasing replicability in psychology, but we

try to focus on their specificity for the study of PDs as much as possible (American

Psychological Association, 2008; Asendorpf et al., 2016).

1. Increase the breadth of represented RDoC constructs

As outlined above, several RDoC domains have been studied extensively using experimental

paradigms, while others require further investigation in future work. While laying out this

implication in more detail, we repeatedly refer to maladaptive personality traits as they are

implemented in the dimensional PD diagnosis in ICD-11 (World Health Organization, 2019)

and the DSM-5 AMPD (Oldham, 2015).

As our review showed, there is a large body of research within the negative valence

systems domain. This is likely due to its close relation to maladaptive personality traits that are

observed frequently in PDs, such as affective instability (e.g., Trull et al., 2008). In line with

this, current dimensional models of PD suggest that PDs are characterized by pronounced

emotional reactivity to stress (Huprich, 2020; Oldham, 2015). The AMPD and ICD-11 PD
diagnosis subsume this under the maladaptive trait negative affectivity (Bach et al., 2018;

Hopwood et al., 2012). Two constructs within the negative valence systems domain that should

be targeted further in future work are potential threat (anxiety) and loss, as those with PD tend

to be characterized by increased anxiousness (Hopwood et al., 2012). In dimensional models of

PD, anxiousness is subsumed under the maladaptive personality trait negative affectivity in

DSM-5 AMPD and ICD-11 (Bach et al., 2018; Hopwood et al., 2012). Studying responses to

loss within an experimental paradigm could provide insight into the marked level of loneliness

reported by those with PD (Liebke et al., 2017) and the corresponding maladaptive personality

trait detachment in the DSM-5 AMPD/ICD-11 (Bach et al., 2018; Hopwood et al., 2012).

There is also a need for additional experimental studies on the positive valence systems

domain in PDs, and future work should strive to develop paradigms that can distinguish between

the constructs reward valuation, reward responsiveness, and reward learning. The RDoC

domain cognitive systems was studied extensively in (borderline) PD, using established

experimental paradigms that are also referenced in the RDoC matrix as suitable for investigating

cognitive control and memory. In the future, these findings should be replicated and re-evaluated

with regard to other PD categories, or dimensional assessment of maladaptive personality traits.

Within the RDoC domain social processes, the constructs social communication,

perception and understanding of self, and perception and understanding of others deserve

continued attention in future research, because they closely reflect the new diagnostic criteria

for PDs in ICD-11 and DSM-5 AMPD. The ICD-11 PD diagnosis details “problems in functioning

of aspects of the self (e.g., accuracy of self-view), and/or interpersonal dysfunction (e.g., the

ability to understand others' perspectives)”, and thus almost verbatim references these RDoC

constructs (World Health Organization, 2019). An additional focus on prosocial behavior

(which was investigated in several of the reviewed studies but is not currently an RDoC

construct) would complement this research (Hepp & Niedtfeld, 2022). An additional issue that

became evident when reviewing paradigms that were used to study the construct perception and
understanding of others is that of specificity. Future studies are needed to clarify whether the

paradigms that were used measure the same construct, or whether clustering them into sub-constructs (affective theory of mind, cognitive theory of mind, empathy) would aid a more fine-grained understanding of the alterations of social-cognitive functioning in PD.

2. Increase the representativeness of the recruited samples

This implication has several elements. First, there is an urgent need for studies that include

individuals with personality pathology beyond BPD. As outlined above, the current body of

experimental work almost exclusively studied BPD, with little evidence generated for other PDs.

However, we do not argue that what the field needs now is an equally large number of

experimental studies for all other categorical PDs. Following the shift to a dimensional PD

diagnosis in ICD-11 and DSM-5 AMPD, we would rather argue for a dimensional recruitment

strategy that samples individuals with different levels of maladaptive PD traits and various

levels of PD severity. This way, dimensional associations between the processes investigated in

experimental paradigms and maladaptive traits or overall PD severity could be established and

possible interventions could be tested for their effectiveness at different points of this

continuum.

Second, the reviewed studies showed marked bias for samples of younger cis-gender

women, and we deem it very likely that this bias extends to other variables that we did not

extract from all articles (e.g., education level, sexual orientation, religion). Strikingly, only a

minority of studies even assessed race or ethnicity so that we were unable to provide a

conclusive picture of the distribution of these variables across the reviewed studies. In addition

to the failure to report data on race and ethnicity, there was also only one study that assessed

gender identities other than female or male, showing that studies almost exclusively succumbed

to a concept of gender binary. Beyond the evident problem of a lack of representation, this

practice is also at odds with the distribution of PDs in the general population, where we see that
PDs are more prevalent among transgender and gender nonconforming individuals (Reisner et

al., 2016). In addition to the failure to assess diverse gender identities, the reviewed studies also

tended to almost exclusively sample women. This is despite well-known epidemiological

findings that (except for antisocial personality disorder) most PDs show similar prevalence rates

among men and women (Grant et al., 2008; Lenzenweger et al., 2007). We would argue that the

field (explicitly including our own research) requires a strong change toward more diverse

samples, both because it is an ethical mandate, and because samples that better represent the

whole population that is affected by PDs will produce findings that are more generalizable and

applicable.

3. Increase the statistical power to detect between-person effects

As discussed above, many of the reviewed studies lack the statistical power to detect between-person effects of the group factor (in previous studies, often BPD vs. HC). The easiest way to

remedy this is, of course, to increase sample size. We realize that this is easier said than done

and that the recruitment of PD samples is always effortful and time-consuming. Often, the

samples recruited for the reviewed studies were highly specific and imposed additional criteria

beyond presence of a categorical PD, such as absence of medication or certain comorbidities.

Collaborations between different labs and distributing recruitment efforts across several study

sites are one possible way to remedy this.

In line with our recommendation for sampling a wider range of PD pathology, moving

away from recruiting participants with a categorical PD may also be helpful for achieving larger

sample sizes (e.g., da Costa et al., 2018). In all likelihood, this would render the recruitment

process much easier as it automatically increases the pool of potential participants and affords

inclusion criteria that are easier to meet. For instance, participants who meet only three or four

BPD criteria have to be excluded from a study that uses the categorical BPD diagnosis as an

inclusion criterion, but could easily be included in a study measuring PD severity level and
maladaptive trait combinations dimensionally. Likewise, healthy control participants do not

have to be selected to be “super healthy” and entirely free of any psychopathology, but could

still contribute low levels of maladaptive traits in a dimensional sampling approach. Other

groups that are typically not represented in past studies, such as individuals with partially

remitted or mixed PD pathology, could also be included in studies that quantify PD severity

dimensionally. Lastly, a dimensional sampling approach can be beneficial for statistical power,

as continuous predictors generally afford higher statistical power than categorical ones (for

further discussion of this issue, see implication 6).

In either case, studies using experimental paradigms in PD samples would benefit from

using conservatively estimated power analyses or simulations to inform future sample sizes, and

conducting a-priori power analyses (which very few of the reviewed studies did) should go

without saying. We recognize that recruiting large samples to increase statistical power represents a serious challenge for clinical psychological research. However, in the long run, there is no viable alternative for fundamentally advancing research in personality pathology.

As of now, the replicability of the majority of the reviewed findings is questionable. A solution

for this problem might lie in a much stronger adoption of open data repositories to aggregate

primary data from multiple sites (see implication 6 for a more detailed discussion).

Alternatively, PD research needs more concerted efforts to assess data across multiple labs,

which has become more common in recent years (Klein et al., 2018; Moshontz et al., 2018).

Such efforts would not only result in higher statistical power, but also address issues of

generalizability.

4. Increase the reliability and validity of measures

Our review revealed that some studies tended to implement a large number of within-factor

levels, which, if not accompanied by a proportional increase in trials per level, can result in

unreliable estimators. While we were unable to determine precisely how big this problem was
in the reviewed studies (because authors tended not to report reliability data and even trial

numbers were not readily available for all articles), we would argue that studies could generally

benefit from placing greater emphasis on the number of repetitions per within-factor

combination. In the simplest sense, this could mean to increase the overall number of repetitions

in the experimental paradigm. At the same time, researchers will want to consider participant

burden and avoid overly long experimental sessions. In the tradeoff between session duration

and trial repetitions, researchers must therefore carefully consider how many factor levels they

can implement, and whether the design they envision is too complex for the trial

number they can afford.

Measurement of constructs is a neglected topic in experimental psychopathology in

general, despite its considerable importance for construct validity. Single items are

unlikely to represent the breadth of the theoretical concept of interest. As stated above, in most

cases researchers have to make a trade-off between participant burden and validity of measures,

thus there cannot be a clear-cut recommendation for future research. However, we would urge

researchers to consider whether multi-item scales are feasible. Additionally, it was striking how

many of the reviewed studies developed ever-new paradigms to investigate the same constructs.

While the development of new paradigms is not inherently problematic, most studies failed to

pilot these prior to their first application. Additionally, several studies used paradigms developed

in other disciplines such as behavioral economics, social psychology or general psychology.

Often, these paradigms were designed and optimized for investigating within-person effects.

Thus, they tend to produce variance at the within-person and not necessarily the between-person

(or even group) level. Ideally, whenever introducing or adopting a new experimental paradigm

to PD research, researchers should first run a pilot study in a convenience sample to establish

general reliability indices and ensure that the paradigm produces substantial between-person

variance.
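One hedged way to run such a check in pilot data is to estimate the intraclass correlation (ICC), i.e., the share of total variance located between participants, from a random-intercept model. The sketch below uses simulated data and hypothetical variable names:

```python
# Minimal sketch: estimating the between-person share of variance (ICC)
# of a task measure from a random-intercept mixed model (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_sub, n_trials = 50, 40
pilot = pd.DataFrame({"subject": np.repeat(np.arange(n_sub), n_trials)})
between_sd, within_sd = 0.4, 1.0           # illustrative variance components
pilot["score"] = (rng.normal(0, between_sd, n_sub)[pilot["subject"]]
                  + rng.normal(0, within_sd, len(pilot)))

fit = smf.mixedlm("score ~ 1", pilot, groups="subject").fit()
between_var = float(fit.cov_re.iloc[0, 0])  # random-intercept variance
icc = between_var / (between_var + fit.scale)
print(f"ICC (between-person share of variance): {icc:.2f}")
```

A paradigm with an ICC near zero produces almost no stable between-person differences and is therefore ill-suited for case-control or dimensional individual-differences questions, however well it works for within-person effects.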

5. Increase transparency of research

A single study reported a pre-registration for their hypotheses and analyses, and only a handful

of studies provided open data or materials. Nonetheless, we are hopeful that most researchers in

our field want to improve on this, and that some already pre-registered studies will be published

in the next few years. The last decade has seen a scientific reform movement starting from a

serious replicability crisis in psychology. Given the issues pointed out throughout the article, we

fear the same problem affects PD research. If published findings are not replicable, then progress

for the diagnosis and treatment of personality psychopathology is ultimately set back. While the

2010s have been described as a decade of ‘active confrontation’ for psychology in this regard

(Nosek et al., 2021), we have to note that our field still seems to be in a phase of ‘active denial’.

In this review, we provided several recommendations that, while advancing scientific rigor, also

require a great deal of additional effort from researchers. Other recommendations, such as the

following call for increased transparency, are more easily implemented.

Increased transparency refers particularly to providing open data and materials as part

of the publication. Its importance has been acknowledged by most scientific organizations, commonly

resulting in a best practice recommendation that research data should be openly available

(Gewin, 2016; Wilkinson et al., 2016). Similar sentiments are reflected by TOP guidelines,

which are increasingly adopted by scientific journals (Nosek et al., 2015). It goes without saying

that these calls for ‘open data’ should be followed while considering possible ethics or privacy

constraints. Too often, however, patient privacy constraints serve as a fig leaf for not taking action and providing open access to data. Wider adoption of open science procedures would make our

research more efficient by facilitating the reuse of data and would allow for calculating

individual patient data (IPD) meta-analyses. This, in most cases, increases statistical power and

allows for adjustment and investigation of confounding factors at the participant level (Riley et

al., 2010), analyses for which most individual studies are not adequately powered.

In parallel to this call for ‘open data’, we ask researchers to consider publication of their

materials, especially the experimental setup and the code for statistical analyses. It is widely

recognized that the traditional article is insufficient for describing all aspects of the experimental

design, data preprocessing, and analysis (Munafò et al., 2017). This makes an informed

understanding and validation of most articles next to impossible and impedes the ability to

accurately judge these aspects or build on previous work. The availability of experimental and

analysis code makes it possible to fully understand these aspects and (combined with open data)

to reproduce key findings of the literature. A wider dissemination of ‘open materials’ might thus

address the obstacles we encountered during this literature review, such as the ones regarding a

more in-depth analysis of statistical models and choices.

6. Increase adequacy of statistical methods

Despite the difficulties described above, there are some main takeaways from our literature

review regarding the use of statistical models. As of now, most studies aggregate their measures.

We already pointed out the issues associated with this common practice (i.e. stimuli-as-fixed-

effect fallacy). A wider adoption of linear mixed models would not only make the most of

repeated experimental assessments (Brown, 2021), but would also allow moving away from

categorical thinking at the within-level of experimental designs. Rather than implementing

discrete factor levels (e.g. stimuli of mild, moderate, and maximum intensity) and aggregating

the data for each level, researchers could opt to manipulate variables continuously (e.g., using

stimuli sampled along the full range of intensity). Randomly sampling stimuli along an intensity

continuum and modeling them as random effects would enable new insights.
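As a minimal sketch of this recommendation (simulated data; variable names are hypothetical), the model below regresses trial-level ratings on a continuously manipulated stimulus intensity with crossed random intercepts for participants and stimuli. Note that crossed random effects are expressed more directly in lme4 or brms in R; statsmodels emulates them via variance components within a single dummy group:

```python
# Trial-level mixed model with a continuous intensity predictor and
# crossed random intercepts for participants and stimuli (simulated
# data; statsmodels emulates crossed random effects through variance
# components within one all-encompassing group).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_sub, n_stim = 40, 30
trials = pd.DataFrame(
    [(s, k) for s in range(n_sub) for k in range(n_stim)],
    columns=["subject", "stimulus"],
)
intensity = rng.uniform(0, 1, n_stim)        # continuous, not binned
trials["intensity"] = intensity[trials["stimulus"]]
trials["rating"] = (2.0 * trials["intensity"]
                    + rng.normal(0, 0.5, n_sub)[trials["subject"]]
                    + rng.normal(0, 0.3, n_stim)[trials["stimulus"]]
                    + rng.normal(0, 1.0, len(trials)))

trials["one"] = 1                            # single dummy group
model = smf.mixedlm(
    "rating ~ intensity", trials, groups="one", re_formula="0",
    vc_formula={"subject": "0 + C(subject)",
                "stimulus": "0 + C(stimulus)"},
)
print(model.fit().summary())
```

Because stimuli enter as a random factor, the fixed intensity effect generalizes beyond the particular stimulus set, which is exactly the remedy for the stimuli-as-fixed-effect fallacy discussed above.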

Furthermore, a substantial number of studies adjusted their analyses for different socio-

demographic or psychopathological variables. It has been shown before, and we cannot stress

this point enough, that post-hoc inclusion of covariates to adjust for group differences increases

the likelihood of false-positive results (Simmons et al., 2011). When researchers still do so (or are asked to do so during the review process), transparent reporting is required to evaluate whether results depend on the presence of covariates; thus, both the unadjusted and adjusted models must be presented (Kraemer, 2015; Simmons et al., 2011).

Conclusion

We reviewed 99 articles published between 2017 and 2021 that report findings from

experimental paradigms in PD samples, identified limitations, and derived implications. Again,

we would like to underline that our own work is not free of these limitations and is explicitly included in all the criticism we presented. In addition, our selection of thirteen clinical target

journals might have influenced the results and conclusions of this review. Nonetheless, we

conclude that future research could benefit from: (1) an expansion in content to currently under-represented RDoC constructs, (2) adopting a dimensional assessment of PD pathology instead of categorical case-control designs, (3) collecting well-powered, representative, and diverse samples, (4) carefully examining and increasing the between- and within-person reliability of the employed paradigms, (5) adopting statistical tests that adequately model trial-level variations related to experimental stimuli, and (6) embracing open science practices (preregistration, open data, open materials, open code) to increase transparency, reproducibility, and replicability.

References

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental

disorders (5th ed.). https://doi.org/10.1176/appi.books.9780890425596

American Psychological Association. (2008). Responsible conduct of research. Retrieved

01/18/2022 from https://www.apa.org/research/responsible

Asendorpf, J. B., Conner, M., De Fruyt, F., De Houwer, J., Denissen, J. J., Fiedler, K., Fiedler,

S., Funder, D. C., Kliegl, R., & Nosek, B. A. (2016). Recommendations for increasing

replicability in psychology. In A. E. Kazdin (Ed.), Methodological issues and strategies

in clinical research (pp. 607–622). American Psychological Association.

https://doi.org/10.1002/per.1919

Bach, B., Sellbom, M., Skjernov, M., & Simonsen, E. (2018). ICD-11 and DSM-5 personality

trait domains capture categorical personality disorders: Finding a common ground.

Australian & New Zealand Journal of Psychiatry, 52(5), 425-434.

https://doi.org/10.1177/0004867417727867

Bloom, H. S. (1995). Minimum detectable effects: A simple way to report the statistical power

of experimental designs. Evaluation Review, 19(5), 547-556.

https://doi.org/10.1177/0193841X9501900504

Brown, V. A. (2021). An Introduction to Linear Mixed-Effects Modeling in R. Advances in

Methods and Practices in Psychological Science, 4(1), 1-19.

https://doi.org/10.1177/2515245920960351

Clark, H. H. (1973). The language-as-fixed-effect fallacy: A critique of language statistics in

psychological research. Journal of Verbal Learning and Verbal Behavior, 12(4), 335-

359. https://doi.org/10.1016/S0022-5371(73)80014-3

Cuthbert, B. N. (2014). The RDoC framework: Facilitating transition from ICD/DSM to

dimensional approaches that integrate neuroscience and psychopathology. World

Psychiatry, 13(1), 28-35. https://doi.org/10.1002/wps.20087


27
da Costa, H. P., Vrabel, J. K., Zeigler-Hill, V., & Vonk, J. (2018). DSM-5 pathological

personality traits are associated with the ability to understand the emotional states of

others. Journal of Research in Personality, 75, 1-11.

https://doi.org/10.1016/j.jrp.2018.05.001

Domes, G., Schulze, L., & Herpertz, S. C. (2009). Emotion recognition in borderline personality

disorder—A review of the literature. Journal of Personality Disorders, 23(1), 6-19.

https://doi.org/10.1521/pedi.2009.23.1.6

Fernandez, K. C., Jazaieri, H., & Gross, J. J. (2016). Emotion regulation: A transdiagnostic

perspective on a new RDoC domain. Cognitive Therapy and Research, 40(3), 426-440.

https://doi.org/10.1007/s10608-016-9772-2

Fossati, A., Somma, A., Borroni, S., Markon, K. E., & Krueger, R. F. (2018). Executive

Functioning Correlates of DSM-5 Maladaptive Personality Traits: Initial Evidence from

an Italian Sample of Consecutively Admitted Adult Outpatients. Journal of

Psychopathology and Behavioral Assessment, 40(3), 484-496.

https://doi.org/10.1007/s10862-018-9645-y

Gewin, V. (2016). Data sharing: An open mind on open data. Nature, 529(7584), 117-119.

https://doi.org/10.1038/nj7584-117a

Grant, B. F., Chou, S. P., Goldstein, R. B., Huang, B., Stinson, F. S., Saha, T. D., Smith, S. M.,

Dawson, D. A., Pulay, A. J., & Pickering, R. P. (2008). Prevalence, correlates, disability,

and comorbidity of DSM-IV borderline personality disorder: results from the Wave 2

National Epidemiologic Survey on Alcohol and Related Conditions. The Journal of

Clinical Psychiatry, 69(4), 0-0. https://doi.org/10.4088/jcp.v69n0404

Haeny, A. M., Holmes, S. C., & Williams, M. T. (2021). The need for shared nomenclature on

racism and related terminology in psychology. Perspectives on Psychological Science,

16(5), 886-892. https://doi.org/10.1177/17456916211000760

28
Hanegraaf, L., van Baal, S., Hohwy, J., & Verdejo-Garcia, A. (2021). A Systematic Review and

Meta-Analysis of ‘Systems for Social Processes’ in Borderline Personality and

Substance Use Disorders. Neuroscience & Biobehavioral Reviews, 127, 572-592.

https://doi.org/10.1016/j.neubiorev.2021.04.013

Hepp, J., & Niedtfeld, I. (2022). Prosociality in personality disorders: Status quo and research

agenda. Current Opinion in Psychology, 44, 208-214.

https://doi.org/10.1016/j.copsyc.2021.09.013

Hepp, J., Niedtfeld, I., & Schulze, L. (2022, May 13). Experimental paradigms in personality

disorder research: A review of previous evidence, methodological issues and future

directions. https://doi.org/10.17605/OSF.IO/XMWE9

Hopwood, C. J., Thomas, K. M., Markon, K. E., Wright, A. G., & Krueger, R. F. (2012). DSM-

5 personality traits and DSM–IV personality disorders. Journal of Abnormal

Psychology, 121(2), 424-432. https://doi.org/10.1037/a0026656

Huprich, S. K. (2020). Personality disorders in the ICD-11: opportunities and challenges for

advancing the diagnosis of personality pathology. Current Psychiatry Reports, 22, 1-7.

https://doi.org/10.1007/s11920-020-01161-4

Jeung, H., Schwieren, C., & Herpertz, S. C. (2016). Rationality and self-interest as economic-

exchange strategy in borderline personality disorder: Game theory, social preferences,

and interpersonal behavior. Neuroscience & Biobehavioral Reviews, 71, 849-864.

https://doi.org/10.1016/j.neubiorev.2016.10.030

Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103(1), 54-69. https://doi.org/10.1037/a0028347

Klein, R. A., Vianello, M., Hasselman, F., Adams, B. G., Adams Jr, R. B., Alper, S., Aveyard, M., Axt, J. R., Babalola, M. T., & Bahník, Š. (2018). Many Labs 2: Investigating variation in replicability across samples and settings. Advances in Methods and Practices in Psychological Science, 1(4), 443-490. https://doi.org/10.1177/2515245918810225

Koudys, J. W., Traynor, J. M., Rodrigo, A. H., Carcone, D., & Ruocco, A. C. (2019). The NIMH Research Domain Criteria (RDoC) initiative and its implications for research on personality disorder. Current Psychiatry Reports, 21(6), 1-12. https://doi.org/10.1007/s11920-019-1023-2

Kraemer, H. C. (2015). A source of false findings in published research studies: Adjusting for covariates. JAMA Psychiatry, 72(10), 961-962. https://doi.org/10.1001/jamapsychiatry.2015.1178

Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710. https://doi.org/10.1002/ejsp.2023

Lenzenweger, M. F., Lane, M. C., Loranger, A. W., & Kessler, R. C. (2007). DSM-IV personality disorders in the National Comorbidity Survey Replication. Biological Psychiatry, 62(6), 553-564. https://doi.org/10.1016/j.biopsych.2006.09.019

Liebke, L., Bungert, M., Thome, J., Hauschild, S., Gescher, D. M., Schmahl, C., Bohus, M., & Lis, S. (2017). Loneliness, social networks, and social functioning in borderline personality disorder. Personality Disorders: Theory, Research, and Treatment, 8(4), 349-356. https://doi.org/10.1037/per0000208

Morris, S. E., & Cuthbert, B. N. (2012). Research Domain Criteria: Cognitive systems, neural circuits, and dimensions of behavior. Dialogues in Clinical Neuroscience, 14(1), 29-37. https://doi.org/10.31887/DCNS.2012.14.1/smorris

Moshontz, H., Campbell, L., Ebersole, C. R., IJzerman, H., Urry, H. L., Forscher, P. S., Grahe, J. E., McCarthy, R. J., Musser, E. D., & Antfolk, J. (2018). The Psychological Science Accelerator: Advancing psychology through a distributed collaborative network. Advances in Methods and Practices in Psychological Science, 1(4), 501-515. https://doi.org/10.1177/2515245918797607

Munafò, M. R., Nosek, B. A., Bishop, D. V., Button, K. S., Chambers, C. D., Du Sert, N. P., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1(1), 1-9. https://doi.org/10.1038/s41562-016-0021

Myers, A., & Hansen, C. H. (2011). Experimental psychology (7th ed.). Wadsworth Cengage Learning.

National Institute of Mental Health. (2022). Domain: Negative valence systems. U.S. Department of Health and Human Services, National Institutes of Health. Retrieved January 18, 2022, from https://www.nimh.nih.gov/research/research-funded-by-nimh/rdoc/constructs/negative-valence-systems

Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., Buck, S., Chambers, C. D., Chin, G., Christensen, G., Contestabile, M., Dafoe, A., Eich, E., Freese, J., Glennerster, R., Goroff, D., Green, D. P., Hesse, B., Humphreys, M., ..., & Yarkoni, T. (2015). Promoting an open research culture. Science, 348(6242), 1422-1425. https://doi.org/10.1126/science.aab2374

Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Almenberg, A. D., Fidler, F., Hilgard, J., Kline, M., & Nuijten, M. B. (2021). Replicability, robustness, and reproducibility in psychological science. Annual Review of Psychology, 73, 719-748. https://doi.org/10.1146/annurev-psych-020821-114157

Oldham, J. M. (2015). The alternative DSM-5 model for personality disorders. World Psychiatry, 14(2), 234-236. https://doi.org/10.1002/wps.20232

Papousek, I., Aydin, N., Rominger, C., Feyaerts, K., Schmid-Zalaudek, K., Lackner, H. K., Fink, A., Schulter, G., & Weiss, E. M. (2018). DSM-5 personality trait domains and withdrawal versus approach motivational tendencies in response to the perception of other people’s desperation and angry aggression. Biological Psychology, 132, 106-115. https://doi.org/10.1016/j.biopsycho.2017.11.010

Parsons, S., Kruijt, A.-W., & Fox, E. (2019). Psychological science needs a standard practice of reporting the reliability of cognitive-behavioral measurements. Advances in Methods and Practices in Psychological Science, 2(4), 378-395. https://doi.org/10.1177/2515245919879695

Perugini, M., Gallucci, M., & Costantini, G. (2018). A practical primer to power analysis for simple experimental designs. International Review of Social Psychology, 31(1), 1-23. https://doi.org/10.5334/irsp.181

Reisner, S. L., Poteat, T., Keatley, J., Cabral, M., Mothopeng, T., Dunham, E., Holland, C. E., Max, R., & Baral, S. D. (2016). Global health burden and needs of transgender populations: A review. The Lancet, 388(10042), 412-436. https://doi.org/10.1016/S0140-6736(16)00684-X

Riley, R. D., Lambert, P. C., & Abo-Zaid, G. (2010). Meta-analysis of individual participant data: Rationale, conduct, and reporting. BMJ, 340, c221. https://doi.org/10.1136/bmj.c221

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359-1366. https://doi.org/10.1177/0956797611417632

Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325-1346. https://doi.org/10.1037/bul0000169

Tedersoo, L., Küngas, R., Oras, E., Köster, K., Eenmaa, H., Leijen, Ä., Pedaste, M., Raju, M., Astapova, A., & Lukner, H. (2021). Data sharing practices and data availability upon request differ across scientific disciplines. Scientific Data, 8(1), 1-11. https://doi.org/10.1038/s41597-021-00981-0

Trull, T. J., Solhan, M. B., Tragesser, S. L., Jahng, S., Wood, P. K., Piasecki, T. M., & Watson, D. (2008). Affective instability: Measuring a core feature of borderline personality disorder with ecological momentary assessment. Journal of Abnormal Psychology, 117(3), 647-661. https://doi.org/10.1037/a0012532

Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726-728. https://doi.org/10.1037/0003-066X.61.7.726

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., & Bourne, P. E. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), 1-9. https://doi.org/10.1038/sdata.2016.18

World Health Organization. (2016). International statistical classification of diseases and related health problems (10th ed.). https://icd.who.int/browse10/2016/en

World Health Organization. (2019). International statistical classification of diseases and related health problems (11th ed.). https://icd.who.int/

Zimmermann, J., Kerber, A., Rek, K., Hopwood, C. J., & Krueger, R. F. (2019). A brief but comprehensive review of research on the alternative DSM-5 model for personality disorders. Current Psychiatry Reports, 21(9), 1-19. https://doi.org/10.1007/s11920-019-1079-z
