42.

STRATEGIES FOR IMPROVING RANDOMIZED TRIAL EVIDENCE FOR TREATMENT OF BIPOLAR DISORDER

Ayşegül Yildiz, Juan Undurraga, Eduard Vieta, Dina Popovic, Sarah Wooderson, and Allan H. Young

INTRODUCTION

Presently approved treatments of bipolar disorder are, for the most part, uniformly unable to address the underlying etiology (Baldessarini, 2013; Yildiz, Guleryuz, Ankerst, Ongur, & Renshaw, 2008). Yet the contemporary era has witnessed the most advanced scientific understanding to date of the pathophysiological mechanisms underlying bipolar disorder, which has led to investigations of specific molecular targets and their neural correlates (Machado-Vieira et al., 2010; Mathew, Manji, & Charney, 2008). However, despite substantial investment in the development of target-specific, receptor-oriented central nervous system (CNS) therapeutics, including those for bipolar disorder, only a very limited number of the investigated compounds have survived to launch in recent years (Hoertel, de Maricourt, & Gorwood, 2013; Kemp et al., 2010). This result does not seem to be due solely to failure of the underlying rationale, which carried the candidate compounds to Phase I trials, or to ineffectiveness of those compounds, but also to the increased noise encountered in the context of randomized controlled trials (RCTs). For CNS therapeutics alone, failure rates at the Phase II stage reach 50%, which is about 15% more than in the previous decade (Hurko & Ryan, 2005; Kemp et al., 2010; Mallinckrodt, Zhang, Prucka, & Millen, 2010; Tarr, Herbison, de la Barra, & Glue, 2011). Considering the $15,000 cost per subject for a Phase II or III trial, it can be appreciated why such high failure rates may discourage investment in CNS drug development research (Kemp et al., 2010; Kobak et al., 2010). From the business perspective, the worst scenario (failure in Phase III of compounds that appeared to work in Phase II) is increasingly common (Vieta et al., 2014). Clearly, addressing the methodological obstacles that are impeding progress in this critical sector of therapeutic development is warranted.

Regulatory agencies such as the US Food and Drug Administration (FDA) and the European Medicines Agency have taken the position that a true appreciation of an intervention against a major psychiatric illness such as schizophrenia, depression, or bipolar disorder is only possible by employment of a placebo-controlled design (Alphs, Benedetti, Fleischhacker, & Kane, 2012). This perspective has had a tremendous impact on drug development for these common psychiatric disorders (Alphs et al., 2012). For instance, more than two-thirds of the available evidence on antimanic, antipsychotic, and antidepressant treatments is against placebo (Kirsch, 2009; Leucht, Cipriani, et al., 2013; Sidor & MacQueen, 2011; Yildiz, Vieta, Leucht, & Baldessarini, 2011). This may partially be due to the requirements of the drug-regulatory authorities but is also due to the preference of the pharmaceutical industry for testing new compounds against placebos rather than an active comparator, since the latter necessitates truly more efficacious drugs (Leucht, Heres, & Davis, 2013). Obviously, restricting progress in pharmacology to superiority studies would seriously restrict progress in other matters such as improved safety or tolerability (Vieta & Cruz, 2012), but the use of placebo as the only suitable comparator for registration studies is taking the field down a dead-end road. Difficulties in conducting placebo-controlled trials for clinically challenging severe mental disorders with frequently encountered symptomatic exacerbations have likely instigated distinct research populations. The predominantly profit-oriented nature of
funding with predictable pressure for speedy completion with positive outcomes might have further contributed to the formation of such exclusive trial populations. These placebo-controlled-trial-compatible patients are likely to be at the least severe end of the disease spectrum and to respond to placebo and to the increased attention provided in a research context via the so-called “Hawthorne effect.” A population enriched for placebo responders and deprived of true drug responders would lead to shrunken drug–placebo contrasts, which in turn would lead to larger sample sizes, a greater number of study sites, and increased variance as well as noise. Such a vicious cycle may have been operating since the cumulative introduction of “me too” drugs and may have led to the recent “anti-psychopharmacology” public acts fostered in some scientific reviews (Kirsch, 2009). Considering published as well as FDA-registered and unpublished RCTs, such critical reviews have, for instance, claimed a trivial effect of antidepressant drugs (Kirsch, 2009). In interpreting such critical evidence synthesis reports, one should take into account the potential effect of failed trials, which have typically failed to detect the true treatment contrasts; the potential mediator effect of disease severity; and the nature of source data attained in the context of RCTs involving distinct research populations with milder forms of disease presentation. In the context of RCTs it has previously been documented that patients with more severe forms of disease manifestation were more likely to benefit from drug treatment (Kirsch, 2009). Real-world clinical samples, on the other hand, for the most part uniformly involve patients with higher symptomatic severity and so would likely be enriched for drug responders. However, due to difficulties in conducting placebo-controlled experiments in such patients, clinical trials are typically left to patients with milder disease forms. Indeed, for severe mental illnesses, RCT settings may better be perceived as simulation environments, and evidence synthesis reports based on numeric data attained in these circumstances need to be translated to real-life settings. Further, sometimes even a small actual treatment effect experienced in real-life settings may have magnified impacts, as in critical clinical situations involving the line between life and death or between delusional acts and insight (Leucht, Hierl, Kissling, Dold, & Davis, 2012). Recent evidence indicates that, as with schizophrenia, there is accelerated aging in bipolar disorder with a consequent shortening in life expectancy of 10 to 13.6 years relative to the general population of the same age (Chang et al., 2011; García-Rizo et al., 2014; Laursen, 2011). Evidence also indicates that effective treatment of bipolar disorder is important not only for alleviation of symptomatic exacerbations but also for protection against premature death associated with medical causes related to disturbed immune responses and increased systemic oxidative stress, as well as suicide (Ahrens et al., 1995; Cipriani, Hawton, Stockton, & Geddes, 2013; Drexhage et al., 2011; Leboyer et al., 2012). Considering the longitudinal course and lifetime consequences, observed drug–placebo contrasts gain even greater appreciation. However, as with other major psychiatric disorders, there is certainly much room for development of new therapeutics with robust target-specific effects and no or minimal undesired effects for treatment and prophylaxis of bipolar disorder. For achieving this goal, identification of the methodological obstacles in the design and conduct of RCTs is absolutely warranted.

PLACEBO EFFECT VERSUS PLACEBO RESPONSE

By definition, the placebo effect encompasses the psychobiological phenomena attributable to the placebo, which is triggered via the perception or the symbolic meaning of the treatment (Alphs et al., 2012; Kirsch, 2009; Raz, Zigman, & de Jong, 2009). Placebo-like effects may also occur without administration of an actual placebo, highlighting the central role of expectation and suggestion in placebo-related phenomena. Some effects, especially those relating to the immune and endocrine systems, are due to conditioning types of processes. For example, Benedetti (2009) discusses studies involving rats showing that a flavored liquid had immunosuppressant effects after being repeatedly coadministered with an immunosuppressant drug. Likewise, administration of a placebo with patient expectation of pain relief typically activates the endogenous opioid system. Conversely, administering active analgesics via hidden administration (that is, without the patient’s awareness) significantly reduces their effectiveness (Benedetti, 2009; Raz et al., 2009).

Another approach to the placebo effect is associated with the context surrounding the medical encounter (Moerman, 2002). This “meaning model” attempts to explain why red placebos stimulate whereas blue placebos calm, why more placebos work better than few, and why more expensive placebos work better than cheaper ones. Psychological and behavioral evidence points to the powerful and unique role of suggestion and expectation in eliciting placebo effects (Benedetti, 2009; Raz et al., 2009).
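
The introduction's argument that shrunken drug–placebo contrasts lead to larger sample sizes can be illustrated numerically. The following is a minimal sketch, not a method taken from the trials reviewed here: it assumes the textbook normal-approximation formula for a two-arm parallel trial with equal allocation, a two-sided significance level of 0.05, and 80% power. The function name `n_per_arm` and the three SMD values are hypothetical and chosen only to show the trend.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(smd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed per arm to detect a given SMD.

    Assumption (not from the chapter): normal approximation with equal
    allocation, n = 2 * (z_{1 - alpha/2} + z_{power})**2 / smd**2.
    """
    if smd <= 0:
        raise ValueError("smd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    return ceil(2 * (z_alpha + z_power) ** 2 / smd ** 2)


# Hypothetical drug-placebo contrasts, used only for illustration.
for smd in (0.5, 0.4, 0.3):
    print(f"SMD = {smd:.1f}: about {n_per_arm(smd)} patients per arm")
```

Under these assumptions, a contrast that shrinks from 0.5 to 0.3 raises the requirement from about 63 to about 175 patients per arm, which is one way to read the vicious cycle of larger trials, more sites, and more noise described earlier.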

600 • SECTION IV: FUTURE RESEARCH
Placebo response, on the other hand, designates the improvement observed in the placebo arm of a clinical trial, which is produced by the totality of the placebo biological phenomenon combined with other potential factors contributing to symptom amelioration, such as passage of time, spontaneous remission or natural history of the disorder, regression to the mean, biases, geographical and cultural factors, and/or judgment errors (Alphs et al., 2012; Kirsch, 2009; Raz et al., 2009; Vieta, Pappadopulos, Mandel, & Lombardo, 2011). The difference between the placebo response and improvement in a no-treatment control group can be interpreted as the placebo effect (Kirsch, 2009).

As with the actual drug effect, genotype- and phenotype-related personal codes might determine individual tendencies for the placebo effect. Since expectation- and anticipation-based responses are in action, a clear level of consciousness and insight should be preserved for an actual placebo effect to occur. Given that executive functions such as insight, judgment, expectation, and anticipation would often be disturbed in patients experiencing severe mental illnesses, a true placebo effect would hardly be attainable in such patients. Consequently, as also indicated in the previous section, patient populations with more severe expressions of such mental illnesses would likely be enriched for drug response. Given ethical and clinical concerns, this aspect of the true placebo effect that is related to disease severity cannot be fully captured in clinical trial settings. However, efforts for achieving better simulation environments in the context of clinical trials can be optimized. A wider window for manipulation of the placebo-associated responses can be found in the effect of time on the establishment of partial or full remissions, regression to the mean, biases, and judgment errors.

CLINICAL TRIALS OF BIPOLAR MANIA

EFFECT SIZE FOR ANTIMANIC TREATMENTS RELATIVE TO OTHER MEDICAL ILLNESSES

Effective treatments of acute mania yield an absolute difference in responder rates of 17% and a standardized mean difference (SMD), expressed as a Hedges’ g effect size, of 0.41 (Yildiz, Vieta, Leucht, et al., 2011), which is comparable to corresponding values of 18% and 0.51 for second-generation antipsychotics in the acute treatment of schizophrenia (Leucht, Arbter, Engel, Kissling, & Davis, 2009). In a recent mega-analysis considering 94 meta-analytic reports of 48 drugs in 20 medical diseases and 16 drugs in 8 psychiatric disorders, effect sizes computed for psychotropic agents were compared to those of general medical drugs (Leucht et al., 2012). This analysis revealed a pooled SMD of 0.54 for antihypertensive treatments; 0.41 for antimigraine treatments; 0.87 for fasting glucose and 0.27 for mortality with antidiabetic treatments; 0.22 for antibiotic treatments of otitis media; 0.44 for treatments of ulcerative colitis; 0.36 for treatments increasing ventilatory volume; and 0.20 for prevention of exacerbations in chronic obstructive pulmonary disease. The authors concluded that antimanic or antipsychotic drugs are not generally less efficacious than average medical drugs (Leucht et al., 2012). Nonetheless, there is clearly much room for the development of more effective therapeutics with better efficacy and safety profiles for patients with bipolar disorder.

IS PLACEBO RESPONSE RISING IN ANTIMANIC TREATMENT TRIALS?

Increasing placebo response is a major concern for antipsychotic drug trials (Alphs et al., 2012; Kemp et al., 2010; Leucht, Heres, et al., 2013). In a comprehensive meta-analysis of 50 placebo-controlled antipsychotic drug trials of schizophrenia spectrum disorders conducted since 1970, Agid and colleagues (2013) reported that placebo response has increased over the past few decades, especially from 1993 to 2010. In this work, a standardized mean change (SMC) of –0.33 (95% confidence interval [CI] = –0.44 to –0.22) from baseline to endpoint for the placebo group was reported. The authors also reported that larger placebo mean improvements were associated with smaller drug–placebo differences. In a more recent multiple-treatments meta-analysis of antipsychotic drug trials for acute treatment of schizophrenia, the exact same SMD of –0.33 (95% credible interval [CrI] = –0.43 to –0.22) for the placebo-associated improvements was computed (Leucht, Cipriani, et al., 2013). Similar to the findings with antipsychotic drug trials in schizophrenia, substantial placebo-associated mean improvements were also documented for the trials of bipolar mania. Two recent reviews, one involving a total of 20 single-agent and add-on studies (Sysko & Walsh, 2007) and the other 38 single-agent studies (56 drug–placebo contrasts), indicated a placebo-associated response rate of 31% in trials of bipolar mania (Yildiz, Vieta, Tohen, & Baldessarini, 2011). Both reviews reported a secular trend for rising placebo
effects for industry-supported studies (Sysko & Walsh, 2007; Yildiz, Vieta, Tohen, et al., 2011).

POTENTIAL EFFECT MODIFIERS IN RANDOMIZED CONTROLLED TRIALS AND STRATEGIES TO DIMINISH THEIR UNFAVORABLE IMPACTS

In this section each of the following factors is evaluated as a potential effect modifier in the context of RCTs of acute mania.

• The number of study sites
• Sample size
• Gender
• Age
• Psychotic and mixed features
• Trial completion rates
• Baseline disease severity
• Quality of ratings
• Study and illness duration
• Financial incentives affecting investigators and candidate patients

A recent meta-regression analysis investigating the phenomenon of placebo response in bipolar mania established that a higher number of collaborating study sites was strongly associated with greater placebo-induced improvements (mean difference [MD]: 38 trials; β = –0.11, 95% CI: –0.15 to –0.06, z = –4.67, p < .0001) and diminished drug–placebo contrasts (SMD as Hedges’ g: 48 trials; β = 0.007, 95% CI: 0.003 to 0.01, z = 3.79, p = .00015; Yildiz, Vieta, Tohen, et al., 2011). This finding was missed in previous reviews owing to consideration of placebo and drug arms conjointly in the same regression model (Tarr et al., 2011). The finding indicating the number of study sites as an effect mediator has recently found support in antipsychotic treatment trials of schizophrenia spectrum disorders (Agid et al., 2013). In this comprehensive meta-analysis, the authors detected a significant increase in placebo response over the past few decades and explained this temporal effect by an increase in the number of sites per trial accompanied by a decrease in the number of academic sites (Agid et al., 2013; Leucht, Heres, et al., 2013). In light of these findings, we can propose the number of study sites as a critical factor contributing to the substantial noise encountered in the context of placebo-controlled RCTs.

Large studies involve not only a higher number of study sites but also large patient samples. For bipolar mania the meta-analysis described previously indicated that a large sample size was also associated with greater placebo-associated improvements (MD: 38 trials; β = +0.06, 95% CI: 0.04 to 0.08, z = 6.47, p < .0001) and smaller drug–placebo contrasts (SMD as Hedges’ g: 48 trials; β = –0.001, CI: –0.003 to –0.0004, z = –2.63, p = .008; Yildiz, Vieta, Tohen, et al., 2011). In a database of 27 acute schizophrenia studies conducted between 1997 and 2008 it was reported that for each 1-point increase in the placebo mean change, the drug–placebo contrast decreased by 0.4 points (Mallinckrodt et al., 2010). The authors also reported that as the percentage of patients randomized to placebo increased, mean placebo-induced improvements decreased (Mallinckrodt et al., 2010). In another meta-analysis of antipsychotic trials of schizophrenia, Agid and colleagues (2013) reported that greater placebo response was associated with a lower percentage of patients assigned to placebo when only studies published since 1998 were considered. However, the association was lost when the entire set of trials was considered (Agid et al., 2013). These opposing effect directions documented for the RCTs of bipolar mania versus schizophrenia may result from differences in trial, patient, and/or disease characteristics. For instance, for schizophrenia trials the percentage of patients randomized to placebo was reported in the range of 20% to 33% (Mallinckrodt et al., 2010), while it was about 47% for the RCTs of bipolar mania (Yildiz, Vieta, Tohen, et al., 2011). As mentioned earlier, current trends in RCT settings, by compelling speedy completion and positive outcomes, are likely to result in larger and larger trials yielding smaller contrasts through increased variance and noise.

Trial participants’ age was also reported as an effect mediator in the context of placebo-controlled RCTs of acute bipolar mania (Yildiz, Vieta, Tohen, et al., 2011). This meta-regression analysis documented that a younger mean age of the trial population was associated with smaller placebo effects (MD: 36 trials; β = +0.92, 95% CI: 0.33 to 1.15, z = 3.07, p = .002) and greater drug–placebo contrasts (SMD as Hedges’ g: 46 trials; β = –0.09, CI: –0.13 to –0.04, z = –4.03, p = .00006; Yildiz, Vieta, Tohen, et al., 2011). Again, the direction of effect was opposite to that in the antipsychotic trials of schizophrenia, for which younger age was associated with higher placebo responses (Agid et al., 2013).

Similarly, male gender was found to be associated with lower placebo responses (MD: 35 trials; β = –0.18, 95% CI: –0.29 to –0.06, z = –3.04, p = .002) and greater drug–placebo contrasts (SMD as Hedges’ g: 46 trials; β = +0.02, 95% CI: 0.007 to 0.03, z = 3.49, p = .0005) in the
context of bipolar mania trials (Yildiz, Vieta, Tohen, et al., 2011). The gender effect in schizophrenia trials is questionable, as the most comprehensive recent review did not detect any sign of gender influence on the placebo-associated responses (Agid et al., 2013), while an earlier review had documented a 3-point increase in the mean placebo-associated improvements for each 10% increase in the female trial population (Mallinckrodt et al., 2010). While disease-based specificity of the effect is questionable, given trial-level protection by randomization one would expect equal gender distribution in the treatment arms. Yet, as documented, nonsignificant differences in the distribution of male and female genders in the corresponding study arms yield a significant mediator effect when considered all together in the context of placebo-controlled RCTs. If such an effect can be verified in future regression models, trial designs for acute mania may call for more male participants but with balanced allocation across study arms.

Another proposed factor operating in the proposed vicious cycle is disease severity. Rising ethical standards and concerns about randomizing severely ill patients to placebo might have caused enrolment of progressively milder patients into clinical trials over time. The largest standard pairwise meta-analysis of antimanic treatment trials indicated that not placebo- but drug-associated improvements were predicted by increased disease severity (46 trials; β = +0.26, 95% CI: 0.13 to 0.40, z = 3.80, p = .0002; Yildiz, Vieta, Leucht, et al., 2011). The finding, being specific to active drugs and supported by the previous meta-analytic models, cannot be explained solely by regression to the mean, but rather suggests a pharmacological response component in patients with more severe forms of bipolar mania (Tarr et al., 2011; Yildiz, Vieta, Leucht, et al., 2011). In that sense, it supports the assumption that patient populations experiencing more severe forms of mania would be enriched for drug responders. Another factor related to severity of the manic episode is the presence of psychotic features. The impact of psychotic features was also specific to active drugs (MD: 40 trials; β = +0.10, 95% CI: 0.06 to 0.15, z = 4.22, p = .00002), with the effect being more influential, as it was also reflected in enhanced drug–placebo contrasts (SMD as Hedges’ g: 40 trials; β = +0.85, 95% CI: 0.43 to 1.28, z = 3.91, p = .00009; Yildiz, Vieta, Tohen, et al., 2011). In contrast with the effect of psychosis, another disease-specific modulator, mixed features, decreased both drug response and drug–placebo contrasts (MD: 45 trials; β = –0.07, 95% CI: –0.12 to –0.03, z = –3.26, p = .001; SMD as Hedges’ g: 45 trials; β = –0.59, 95% CI: –0.99 to –0.19, z = –2.92, p = .004, respectively; Yildiz, Vieta, Tohen, et al., 2011). Neither the presence of psychotic features nor that of mixed features had any impact on responses to placebo treatment. The sensitivity and specificity of the observed effects, as well as the clinically distinct features of pure mania versus mania with mixed features, may justify studying patients with prominently mixed features exclusively in independent trials. However, given recent modifications in the description of mixed features, this finding needs to be replicated in future antimanic treatment trials. Given the arm-based specificity of the effect, it is not surprising that a previous meta-regression model considering all study arms conjointly could only detect a trend toward a poorer outcome associated with mixed states (Tarr et al., 2011). Based on the evidence summarized above, we may suggest inclusion of patients with moderate to severe forms of mania with or without psychotic features but with no prominent depressive components. Special care should be taken to avoid patients who might have already entered the natural waning stage of the current episode, as well as treatment-refractory patients.

Another factor associated with lower drug–placebo contrasts in bipolar mania trials was higher dropout rates (Yildiz et al., 2011). If the drug-associated benefit and the resulting treatment effect are being modulated by disease severity, either in the form of high baseline scores or the presence of psychotic features, manic patients need to be exposed to the active drug long enough for accomplishment of the brain’s adaptive response to drug-induced pharmacological alterations. The meta-analysis of 56 comparisons found that higher trial completion rates in drug arms were associated with both greater drug-associated benefit and greater drug–placebo contrasts (MD: 46 trials; β = 0.08, 95% CI: 0.04 to 0.13, z = 3.99, p = .00007; SMD as Hedges’ g: 46 trials; β = 0.006, 95% CI: 0.002 to 0.009, z = 2.86, p = .004, respectively; Yildiz, Vieta, Tohen, et al., 2011). Considering that RCTs of acute mania already involve short study durations (three or four weeks), high rates of early dropout are likely to mask the actual magnitude of the true drug effects.

Evidence indicates that younger bipolar patients with short illness durations would be more likely to respond to pharmacologic treatment and less likely to respond to placebo (Yildiz, Vieta, Tohen, et al., 2011). Since a diagnosis of bipolar disorder can technically be made only after the establishment of the first manic episode, and accurate documentation of age at onset of the first mood episode is often barely obtainable, we did not attempt to extract data for illness duration. Nevertheless, the present findings on the age effect, as well as the clinical impression of better treatment responses in patients with more recent disease onset, support the notion that younger patients with shorter illness durations may constitute better research
samples with enhanced treatment contrasts. If this notion finds further support, a new inclusion criterion related to illness duration may be considered.

Another potentially important factor mediating drug–placebo contrasts is the quality of ratings, for both baseline and outcome assessments. A recent methodologically sound study compared the effect of site-based and centralized ratings on patient selection and placebo response in subjects with major depressive disorder (Kobak et al., 2010). That study established that 35% of the included subjects would not have been eligible to enter the study if the centralized rater’s score had been used to determine study entry. Further, the mean placebo change for the site raters (~7.5) was significantly greater than the mean placebo change for centralized raters (~3.18, p < .001; Kobak et al., 2010). Although no published study has investigated this issue for manic symptom assessments, site-based ratings, through baseline score inflation as well as expectancy effects (the tendency to see improvement over time), are likely to increase placebo-associated responses and diminish drug–placebo contrasts for the RCTs of bipolar mania as well. Given the natural vulnerability of scale-based measurements, such inflation in baseline and/or outcome assessments can easily be mediated by the site investigators, site nurses, and study patients. For instance, the so-called “professional” patients seen in the catchment area of often-used trial sites might exaggerate their symptoms in order to be eligible for a study (Alphs et al., 2012). Similarly, site investigators themselves may consciously or unconsciously inflate baseline scores in order to meet patient eligibility requirements, especially when sites are paid on a per-patient basis. To ensure quality of ratings, interrater reliability should be checked at the beginning of the study and often during its conduct. A centralized rater should synchronously watch the site investigators’ ratings, and patient inclusion should be decided by their mutual agreement. Manic symptom improvements should be assessed by standardized use of a common metric such as the Young Mania Rating Scale (YMRS; Young, Biggs, Ziegler, & Meyer, 1978). Administration of different instruments with discrepancies in target symptoms, measurement items, and score ranges may increase variance and heterogeneity (Yildiz, Vieta, Correll, Nikodem, & Baldessarini, 2014).

Finally, regulations on investigator payments may have an impact on study completion or dropout rates and, as such, on drug effects and treatment contrasts. In industry-sponsored trials site investigators are often paid on a per-patient basis upon completion of one post-baseline evaluation. In such circumstances investigators may lose their motivation for dealing with extremely energy-demanding manic patients for the subsequent two or three weeks. Considering that the brain’s adaptive response to pharmacotherapy necessitates longer exposure, and given the previously documented negative influence of early dropouts on drug-associated responses as well as drug–placebo contrasts, we propose a payment strategy facilitating constant investigator motivation throughout the study. Last but not least, employment of rescue medications in the conduct of acute mania trials may call for special attention. The protocol-guided use of rescue medications is usually left to the ward staff, since the site investigators are often only occasionally on the ward. Given that manic patients would often cause sleepless and stressful shifts, the ward staff may tend to employ rescue medications at the maximum allowed dose and rate. Given that exclusion of benzodiazepines as rescue medications from trial protocols is not feasible, their employment by the site investigators, rather than the ward staff, on an exclusively as-needed basis may facilitate drug–placebo separation and reduce the risk of failed trials. Since manic patients often need constant extra care and timely handling during repeatedly encountered crises, to enable optimized and timely site evaluation and crisis management we suggest that the site investigators stay on the ward through day and night shifts during the study conduct. Constant presence of the site investigators on the ward would optimize not only the use of rescue medications but also the systematic use of the reward system. Specific features of the manic syndrome, such as inflated self-esteem, high energy, and an increased tendency for pleasurable activities with diminished or no sleep, make compatibility with a protocol-guided treatment in an inpatient unit very challenging for both patients and staff. To minimize early dropouts with limited use of pharmacological restraints while optimizing patients’ comfort and compliance, employment of some recreational incentives with a negotiation-based, reward-promoting approach may help patients stay calm on the ward without harming themselves or others. Such incentives may include special permissions for immediate family members’ visits to the ward, occupation with safety-guarded energy-consuming games or activities, day and night television/video entertainment, Wi-Fi access, and special meals. One may argue against the use of such recreational activities out of concern for enhanced placebo responses. However, our experience with the largest proof-of-concept academic trial in a manic patient sample, with a mean syndromal severity of 63% of the maximum possible YMRS score and accompanying psychotic features at a rate of 67%, implied that employment of some recreational incentives increased study completion rates and enhanced drug–placebo contrasts (Yildiz et al., 2008).

In contrast, without such sensitive considerations, RCTs are left to patients with softer or already alleviated forms of manic syndrome or to the so called “professional” patients, yielding a group enriched for placebo responders. It is likely that a research population with more severe forms of manic syndrome would be enriched for real drug responders, but we need them to stay in the trial long enough to let the brain accomplish an adaptive response. The optimum and safe employment of such award mechanisms can be achieved only by the constant supervision of the site investigators. Quality and safety control by an independent monitor may also be considered. On condition that any indications of poor patient care and/or study conduct are identified, an action plan for further training or change of the investigators or research nurses or safety guards may be considered, and a central backup team may take position on the site in the meantime.

CONSIDERATION OF ALTERNATIVE DESIGNS

Placebo-controlled RCTs of mania constitute A-plus evidence, and they are actually the designs requested by the regulatory agencies. While we encourage the aforementioned approaches for improvement of quality and hopefully quantity of placebo-controlled RCTs of bipolar mania, in situations with limited sources alternative trial designs such as add-on or head to head trials may provide complementary clinical information if planned appropriately. As in the case of add-on trials, positive head to head trials designed for testing noninferiority may serve to provide rationale for a placebo-controlled monotherapy trial and accumulate data for future evidence synthesis. Head to head trials proving superiority over standard treatments might be another option although this approach, even more so, requires drugs that are truly more efficacious (Leucht, Heres, et al., 2013).

The use of an early-response/nonresponse paradigm is another study design worth considering in order to enhance the selection of true drug responders, who can then participate in a double-blind, placebo-controlled discontinuation trial (Alphs et al., 2012; Correll, Malhotra, Kaushik, McMeniman, & Kane, 2003; Kinon et al., 2010). This strategy involves treating all patients with the experimental drug with or without an active control and then identifying those subjects who have at least a minimal prespecified level of improvement after two weeks of treatment. Because such patients represent a group enriched for drug responsiveness, they constitute an ideal subgroup to enter into a double-blind discontinuation study, where, following stabilization, patients are randomized to continued treatment or placebo and time to destabilization is the endpoint (Alphs et al., 2012; Correll et al., 2003). The assumption underlying such a design would be that the early responders would include a greater proportion of patients who are truly “drug responsive,” although it would also include some proportion of “placebo responders” (Alphs et al., 2012).

Finally, investigator-initiated, large international trials to be funded by the National Institute of Mental Health, the FDA, the Department of Veterans Affairs, European Union Human Research Programs, and other medical research agencies by employing adaptive or sequential treatment strategies involving acute as well as maintenance phases would provide clinically very useful answers for the most critical research questions. Improvement of industry- or academia-initiated placebo-controlled RCTs together with employment of such alternative designs would likely change the face of placebo-associated responses, as well as the magnitude of the observed treatment contrasts in the trials of bipolar mania.

Finally, improvements in the individual RCTs would have extended impacts when combined via analytic evidence synthesis approaches. To enable the most informative and valid future evidence synthesis, each RCT effort should be registered before conduct and published upon completion by reporting quantifiable data on all outcome measures in a standardized manner. Further, by considering future evidence syntheses, each RCT report should report on the subgroups, such as those with or without psychotic or mixed features. In addition, outcome measures should involve important side effects encountered with use of antimanic drugs, again via a standardized approach: for example, data on patient’s weight at baseline and end point, change in glucose or lipid levels, number of subjects who experience akathisia and those who need anti-Parkinsonian drugs, quantifiable data on extrapyramidal side-effect assessments, and suicidal thoughts or acts. Without accumulation of such data neither considerate evidence-based clinical decisions nor an evidence-based action plan for future research investments would be possible.

CLINICAL TRIALS OF BIPOLAR DEPRESSION

As reported in the previous chapters, available evidence on bipolar depression (BPD) is strikingly scarce. There are many possible reasons for this: The first is related to the fact that many clinicians and researchers have long assumed that major depressive episodes in bipolar or unipolar disorders share some common clinical characteristics and treatment
response. This view also explains why antidepressants have long been the most common clinically employed treatment for BPD (Baldessarini et al., 2007). Moreover, this view explains the limited commercial interest in extending regulatory indications of antidepressants beyond “major depression” and the lack of specific trials aimed to establish antidepressant efficacy for BPD. The second reason may be related to safety issues. There are concerns that some drugs used to treat BPD (i.e., antidepressants), might induce mood switching, suicidality (i.e., suicidal thoughts or acts), or mood destabilization and rapid cycling (Bauer, Beaulieu, Dunner, Lafer, & Kupka, 2008; Undurraga et al., 2012). These concerns have led to routine exclusion of bipolar patients from most RCTs in major depression in the past 20 years (Undurraga & Baldessarini, 2012). Those allowing inclusion of patients with BPD did not usually select or stratify according to polarity (Geddes & Miklowitz, 2013). The third reason is related to clinical features of mood disorders (Baldessarini, 2013), which are typically characterized by clinical complexity, daily or weekly fluctuations in mood and behavior, and high risk of relapse or recurrence. Moreover, some of the available treatments can even worsen clinical course when prescribed or when discontinued, especially when discontinued abruptly (Baldessarini, Vieta, Calabrese, Tohen, & Bowden, 2010). This clinical complexity and lack of available RCT–based homogenous evidence on the treatment options for BPD address the importance of establishing standards for future RCTs with reliable and consistent diagnostic and clinical assessments, as well as outcome definitions. In addition, not only do the trial design factors have to meet high quality standards but also the trial conduct, as well as data reporting.

EVIDENCE ON EFFICACY AND SOURCES OF OUTCOME HETEROGENEITY

Meta-analytic reports on antidepressant use for BPD are often weak and inconsistent due to the paucity (involving only 4 to 12 trials) and substantial heterogeneity of the available RCTs (Gijsman, Geddes, Rendell, Nolen, & Goodwin, 2004; Sidor & MacQueen, 2012; Vázquez, Tondo, Undurraga, & Baldessarini, 2013). For this reason, their value in the context of BPD at present is highly questionable, as pointed out in the International Society for Bipolar Disorders (ISBD) task force report on the use of antidepressants in bipolar disorder, where the highest level of evidence was rated as B (Pacchiarotti et al., 2013). The situation is similar regarding other available treatments of bipolar depression, such as anticonvulsants, lithium, and antipsychotics (Cipriani et al., 2013; De Fruyt et al., 2012; Reinares et al., 2013; Vieta & Valentí, 2013). As reflected on the attempts for evidence synthesis, available RCTs for BPD are for the most part weak, and the methodological quality of study designs and their reporting is highly heterogeneous. A recently published systematic review analyzed the methodology used in the individual RCTs of BPD in the past 20 years (Spanemberg et al., 2012). The authors included 30 RCTs in their analysis (all of them published in journals with an impact factor >3) and found striking results indicating the uneven and, in general, poor quality of available evidence. For example, almost half of considered RCTs were conducted with fewer than 50 patients; 70% of them failed to describe the method used in determining their sample sizes; 50% did not assess remission of depression as an outcome (they limited their analysis to response, usually defined as 50% reduction in a depressive symptoms scale); half had attrition rates over 20%; and last observation carried forward analysis was used in only two-thirds of the published reports (see Table 42.1 for further details).

The observed heterogeneity in BPD trials makes the comparison between outcomes (e.g., efficacy measures, tolerability) for different treatments and placebo difficult, as there is a lot of statistical noise. Moreover, high variability is associated with less statistical power (i.e., more false negative results; Ghaemi, 2009).

IS PLACEBO RESPONSE RISING IN ANTIDEPRESSIVE TREATMENT TRIALS?

There has been a steady increase in placebo-associated responses and an accompanying decrease in drug–placebo contrasts in antidepressant trials over the past three decades (Khan, Bhat, Kolts, Thase, & Brown, 2010; Undurraga & Baldessarini, 2012). Further, certain trial characteristics such as sample sizes or number of study sites were significantly associated with the higher placebo responses in the antidepressant trials (Undurraga & Baldessarini, 2012; Walsh, Seidman, Sysko, & Gould, 2002). This association has been observed particularly in the context of unipolar depression; however, unipolar and bipolar differentiation has been a matter of large debate in the past century, and most studies until mid-1980s did not differentiate between the two (Baldessarini et al., 2010; Benazzi, 2007; Khan et al., 2010).

POTENTIAL FACTORS CONTRIBUTING TO STATISTICAL HETEROGENEITY AND NOISE FOR BIPOLAR DEPRESSION TRIALS

Certain methodological characteristics of BPD trials and trial participants may contribute to heterogeneity, noise
effect, and lower effect sizes (Henkel et al., 2012; Khan, Schwartz, Kolts, Ridgway, & Lineberry, 2007; Schalkwijk, Undurraga, Tondo, & Baldessarini, 2014; Undurraga & Baldessarini, 2012):

• Trial durations
• Involved study arms: placebo versus active comparators
• Number of involved active treatment arms
• Sample sizes
• Degree of baseline disease severity
• Comparability in dosing strategies
• Involvement of ethnically distinct patient populations
• Number of study sites
• Frequency of assessments
• Employment or length of a wash-out phase via placebo or no treatment

Table 42.1. METHODOLOGICAL QUALITY OF PUBLISHED RANDOMIZED-CONTROLLED TRIALS IN BIPOLAR DEPRESSION

ITEM EVALUATED/DESCRIBED                            NO. STUDIES (%)

Follow-up time (weeks)
  ≤6                                                9 (30%)
  6 to 8                                            13 (43.3%)
  >8 weeks                                          8 (26.7%)
Placebo use
  Yes                                               20 (66.7%)
  No                                                10 (33.3%)
Diagnostic assessment
  Structured diagnosis (SCID/MINI)                  20 (66.6%)
  Semistructured diagnosis                          1 (3.3%)
  Clinical diagnosis only                           9 (30%)
Washout period
  Yes                                               11 (36.7%)
  No                                                18 (60%)
  Not described                                     1 (3.3%)
Inclusion criteria
  Cutoff score defined (HDRS, MADRS, etc.)          27 (90%)
  Cutoff score is not defined                       3 (10%)
Number of patients (Total)
  ≤50                                               14 (46.7%)
  50 to 100                                         5 (16.7%)
  100 to 200                                        3 (10%)
  >200 to 500                                       3 (10%)
  ≥500                                              5 (16.7%)
Sample size calculation
  Described                                         12 (40%)
  Not described or not performed                    18 (60%)
BD Subtypes
  BD Type I                                         6 (20%)
  BD Type II                                        1 (3.3%)
  Both                                              18 (60%)
  Unclear                                           5 (16.7%)
Type of statistical analysis
  Only LOCF                                         20 (66.7%)
  LOCF + OC                                         2 (6.7%)
  Other or unclear                                  8 (26.7%)
Assessed remission
  Yes                                               13 (43.3%)
  No                                                17 (56.7%)
Manic switch
  Not assessed or not described                     3 (10%)
Dropouts (follow-up)
  ≤20%                                              13 (43.3%)
  20% to 50%                                        14 (46.7%)
  ≥50%                                              2 (6.7%)
  Unclear                                           1 (3.3%)
Type of sponsorship for the study
  Corporate sponsored                               14 (46.7%)
  Sponsored by institutions/foundations/government  10 (33.3%)
  Not declared/unclear/unsponsored                  6 (20%)

NOTE: Adapted from Spanemberg et al. (2012).
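Sample size is one of the factors listed above, and Table 42.1 indicates that nearly half of the reviewed trials enrolled 50 or fewer patients while most did not describe a sample size calculation. As a rough illustration of why this matters, the following Python sketch approximates the power of a two-arm comparison with a normal approximation. It is a simplified sketch under stated assumptions, not the method used in any cited trial: the function names, the standardized effect size of 0.3, and the arm sizes are all hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def approx_power(effect_size: float, n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is a standardized mean difference (difference in means
    divided by the outcome SD); the value passed in is an assumption.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_arm / 2)
    # Probability of rejecting in the direction of the true effect; the
    # negligible chance of rejecting in the wrong direction is ignored.
    return _STD_NORMAL.cdf(noncentrality - z_crit)


def approx_n_per_arm(effect_size: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate patients needed per arm to reach the requested power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical trial with 50 patients in total (25 per arm).
    print(f"power with 25 per arm:  {approx_power(0.3, 25):.2f}")
    print(f"power with 175 per arm: {approx_power(0.3, 175):.2f}")
    print(f"patients per arm for 80% power: {approx_n_per_arm(0.3)}")
```

Because the standardized effect size is the mean difference divided by the outcome standard deviation, greater outcome variability shrinks the effect size for the same raw difference and lowers power further, which is consistent with the point made earlier about high variability and false negative results.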

STRATEGIES FOR DIMINISHING NOISE AND ENHANCING TREATMENT CONTRASTS IN BIPOLAR DEPRESSION TRIALS

Pragmatic trials in addition to placebo-controlled randomized controlled trials

The placebo-controlled, double-blind, parallel-group RCTs are the gold standard for establishment of efficacy for an experimental treatment, but they have an important limitation. That is, the patients are highly selected and homogeneous and therefore are unrepresentative of the typical bipolar patients who tend to have multiple comorbidities with heterogeneous presentations and are typically on multiple treatments (Post, 2009). The fact that rate of exclusions for potential subjects in a RCT is as high as 80% to 90% underscores how selective RCT patient populations can be (Post, 2010). In addition, for technical reasons most trials use monotherapy in their design, which is far from the usual clinical care. This situation makes outcome generalizability (i.e., external validity) of these types of studies a difficult and arguable exercise. On the other hand, they have good internal validity, and valid conclusions can be made regarding treatment efficacy.

To deal with the problem of generalizability, effectiveness trials (otherwise known as pragmatic or practical trials) have been proposed (March et al., 2005). This type of “clinically useful” oriented design compares clinically important interventions and allows the inclusion of more complex patients with comorbid psychiatric problems, as such being more akin to everyday clinical practice (though heterogeneity could reduce internal validity). This kind of design has been used in recent clinically useful studies such as STAR-D, STEP-BD, and CATIE (Post, 2010). A more comprehensive description of the problem with generalizability and judicious consideration of different trial designs are provided in Chapter 43 of this book.

Bipolar disorder subgroups

Bipolar type I and II have different clinical characteristics and treatment responses (Judd et al., 2003; Pacchiarotti et al., 2013). Consequently, clinical trials of bipolar depression may either be conducted for a specific bipolar type or may involve patients with both bipolar subtypes; however, in such circumstances a subgroup analysis of outcomes should be well documented.

Evaluation instruments

The most frequently used clinical scales to assess depressive symptom severity and treatment response are the Hamilton Rating Scale for Depression (HRSD), the Montgomery-Åsberg Depression Rating Scale (MADRS), and the Inventory for Depression Symptomatology (IDS), which were initially developed to evaluate unipolar major depressive disorder. Although the HRSD, with 23 items, considers atypical features, the HRSD–17 Items, MADRS, and IDS lack sensibility to correctly estimate “atypical” depressive symptoms such as anergy, hypersomnia, or hyperphagia, which are common in bipolar depression and partly encountered also in mixed manic-depressive states (Baldessarini et al., 2010). Moreover, most of them were developed in nonblind studies of hospitalized patients and are likely to be less sensitive to milder or subsyndromal forms (Tohen et al., 2009). Another scale recently developed for the dimensional measurement of bipolar depression is named as the Bipolar Depression Rating Scale (BDRS), which comprises items for assessment of atypical depressive and mixed symptoms (Berk et al., 2007).

Outcome definitions

Bipolar disorder is a complex illness, with variable course between and within life cycles of affected individuals. Clinical presentations such as mania, hypomania, depression, presence of psychotic and/or mixed manic-depressive features, subsyndromal depressive states, and rapid cycling course with variable interepisodic functionality and cognitive impairments, make clear-cut outcomes, which would cover all various forms of disease manifestation and course, difficult to define. The problem is even more significant if we consider that many of these clinical presentations can remit or change both in relation to treatment as well as spontaneously, as most are time limited.

In the light of this clinical complexity, the definition of clear-cut outcomes and treatment objectives (i.e., clinically meaningful outcomes) is mandatory. An expert consensus and standardization of outcome definitions and terminology is essential to make meaningful comparisons across studies, as well as combined analysis. Some commonly used outcomes in BPD trials based on available evidence and expert consensus are defined next (Martinez-Aran et al., 2008; Rush et al., 2006; Tohen et al., 2009):

RESPONSE

Response is typically defined as ≥50% reduction in the initial depression severity scores, as commonly measured via the HRSD, MADRS, IDS, or BDRS. It is a clinically useful definition, used to indicate whether to continue
with the same drug, stop and change treatment, or adjust doses. The main problem of response is that it greatly depends on the baseline measure of symptom severity, and regression to the mean may contribute to an invalid impression of symptomatic improvement (Fava, Evins, Dorer, & Schoenfeld, 2003). Importantly, a time criterion in its definition should be employed to avoid misinterpretations by random mood fluctuations (two to four weeks has been recommended by a recent ISBD task force on the course and outcome nomenclature in bipolar disorders; Tohen et al., 2009). In addition, during assessment of response in the context of BPD trials, manic symptoms should also be evaluated using standardized scales to demonstrate that no switching or manic worsening has occurred, which would invalidate response as a clinically significant outcome.

REMISSION

Remission is typically based on the posttreatment depression severity scores. An upper limit for the depression score is defined under which the patient is defined as remitted (often if sustained for a period of time). Remission implies that the signs and symptoms of depression are absent or nearly absent. Differentiating between response and remission is important, as a patient could respond to a treatment without remitting (i.e., patients with residual depressive symptoms). These patients (responders but not remitters) experience more psychosocial impairment and have a higher likelihood of recurrence (Thase, 2003; Zimmerman et al., 2004). The definition of remission (i.e., cut-off scores and time intervals) is still a matter of debate. Unsurprisingly, lower cut-offs in depression severity rating scales scores have been associated with better functionality and fewer recurrences in depression (Zimmerman et al., 2004). Remission cut-offs with the most commonly used depression rating scales have been suggested by expert panels (Rush et al., 2006; Tohen et al., 2009). One of the main limitations of this concept is that it may not reflect the quantitative change in symptoms (i.e., patients who enter a study with low scores could have little change in their symptoms and still be counted or classified among the remitters; Tohen et al., 2009). Remission has implications for functional outcome, prognosis, and course (Zimmerman et al., 2004); as such, it should be employed as a primary outcome in BPD trials. Nevertheless, functionality should be measured as a separate outcome, as symptomatic remission in bipolar depression is not necessarily associated with a return to premorbid day-to-day functioning (Tohen et al., 2009).

RECOVERY

Recovery implies remission for an extended period of time, such that the possibility of relapse or roughening is no longer a concern and another affective episode is unlikely to occur in the near future. (The Diagnostic and Statistical Manual of Mental Disorders [fifth edition] defines it as two months, the same as the 2009 ISBD task force, which recommends 8 weeks as the sustained remission criteria).

RELAPSE AND RECURRENCE

Both concepts imply the clinical manifestation of the episode within the predefined time limits. The definition of relapse and recurrence depends on the natural course of the illness (in this case, the bipolar depressive episode). The ISBD has defined relapse as the early return of the syndrome (≤ 8 weeks), that is, before recovery, and recurrence as a late return of the syndrome (>8 weeks), in other words, during or after recovery.

SWITCHING

The course of illness for a bipolar patient may involve immediate change or switching to the opposite pole (i.e., depression to mania or vice versa) or a return to a normal mood state (i.e., euthymia) prior to the next episode. Even though some drugs could induce switching, the attribution of treatment as the cause of this complication could be misleading as it could also correspond to the natural course of the illness. Tohen and colleagues (2009) have proposed the nomenclature “treatment-emergent affective switch” (TEAS) to avoid attributing causality and propose a definition of definite TEAS if occurring in less than eight weeks. In addition, they propose including the specific treatment if the switch emerges within less than two weeks from the beginning of treatment (e.g., antidepressant-associated TEAS).

SUBSYNDROMAL

The term subsyndromal is used to describe patients who fail to meet the full diagnostic criteria for a mood episode. Subsyndromal symptoms are associated with worse social and occupational functioning and may increase the risk of relapse (Judd et al., 2008; Marangell, 2004). Considering their prognostic relevance, subsyndromal symptoms should be measured. However, the symptom severity scales for depression have strong limitations regarding this point, as discussed later.
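The response, remission, and relapse/recurrence definitions given in this section can be written as simple rules. The following Python sketch is one possible encoding, offered only as an illustration: it assumes a ≥50% reduction from baseline for response, a scale-specific upper cutoff for remission (the default of 7 is one of the MADRS values listed in Table 42.2 and is passed as a parameter), and the 8-week boundary between relapse and recurrence as stated in the text. All function and parameter names are hypothetical, and the time criteria recommended for response and remission are omitted for brevity.

```python
def percent_reduction(baseline: float, endpoint: float) -> float:
    """Percentage reduction in a depression severity score from baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - endpoint) / baseline


def is_responder(baseline: float, endpoint: float, threshold_pct: float = 50.0) -> bool:
    """Response: at least threshold_pct reduction from the baseline score."""
    return percent_reduction(baseline, endpoint) >= threshold_pct


def is_remitter(endpoint: float, cutoff: float = 7) -> bool:
    """Remission: endpoint score at or below a scale-specific cutoff.

    The default cutoff is an assumption chosen for illustration.
    """
    return endpoint <= cutoff


def classify_return(weeks_since_remission: float, boundary_weeks: float = 8) -> str:
    """Relapse if the syndrome returns within the boundary, else recurrence.

    Follows the text (relapse <= 8 weeks, recurrence > 8 weeks); Table 42.2
    places the boundary slightly differently (< 8 versus >= 8 weeks).
    """
    return "relapse" if weeks_since_remission <= boundary_weeks else "recurrence"


if __name__ == "__main__":
    # Hypothetical patient A: score falls from 30 to 14.
    print(is_responder(30, 14), is_remitter(14))  # responder, not remitter
    # Hypothetical patient B: enters with a low score of 10 and ends at 7.
    print(is_responder(10, 7), is_remitter(7))    # remitter, not responder
```

The two hypothetical patients illustrate the limitations noted above: patient A responds without remitting, whereas patient B is counted as a remitter despite little quantitative change because of a low baseline score.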

Table 42.2. RECOMMENDATIONS FROM THE INTERNATIONAL SOCIETY FOR BIPOLAR DISORDERS TASK FORCE REPORT ON THE NOMENCLATURE OF COURSE AND OUTCOME IN BIPOLAR DISORDERS

ITEM                             MEASURE                                    QUANTIFICATION (SCORE, WEEKS, PERCENTAGE)

Symptomatic Response             HDRS, MADRS, IDS, BDRS                     % improvement: <25%, 25–49%, 50–74%, 75–100%
Symptomatic Remission            HDRS-17                                    ≤5 or ≤7 points
                                 MADRS                                      ≤5 or ≤7 points
                                 BDRS                                       ≤8 points
Recovery                         Time (weeks)                               Remission ≥8 weeks
Relapse                          Time (weeks)                               New episode <8 weeks from remission
Recurrence                       Time (weeks)                               New episode ≥8 weeks from remission
Subsyndromal Depression          HDRS-17                                    8 to 14 points
                                 MADRS                                      8 to 14 points
                                 BDRS                                       9 to 16 points
Predominant polarity             Depressive episodes                        2/3 total episodes being depressive
TEAS
  Definite & treatment specific  2 consecutive days (>50% time each day)    Full episode ≤ 2 weeks
  Definite                       2 consecutive days (>50% time each day)    Full episode ≤ 8 weeks
  Likely                         2 consecutive days (>50% time each day)    (>2 symptoms + YMRS >12) ≤ 12 weeks
  Possible                       2 consecutive days (>4 hrs each day)       (change mood or energy + YMRS >8) ≤ 12 weeks
  Unlikely                       Fleeting symptoms. Environmental or        > 16 weeks
                                 exogenous contribution

NOTE: Adapted from Tohen et al. (2009).

RAPID CYCLING

Rapid cycling was originally described by Dunner and Fieve (1974), who arbitrarily defined it as patients presenting four or more distinct episodes in one year. It has prognostic importance, as it has been associated with more severe depressive episodes, poorer response to treatment, worse functioning, higher substance abuse comorbidity, and greater suicidal risk (Cruz et al., 2008). Clinical trials aimed at evaluating a rapid cycling pattern should employ an adequate extension phase.

PREDOMINANT POLARITY

Predominant polarity is defined as patients having at least two-thirds of their lifetime episodes at one polarity or the other (Colom, Vieta, Daban, Pacchiarotti, & Sanchez-Moreno, 2006), that is, either predominantly depressed or predominantly manic. This course specifier has strong prognostic value, especially relevant to long-term therapeutic decisions and prediction of outcome. Predominant depression has been associated with depressive or mixed onset, occurrence of more mixed episodes, rapid cycling course, accompanying psychotic features, and higher suicidal risk (Baldessarini et al., 2012).

FUNCTIONAL OUTCOMES

Bipolar disorder can have important consequences in social, cognitive, and occupational functioning. These functional deficits have been traditionally associated with affective episodes but may also present during subsyndromal states and even during euthymia (Jaeger & Vieta, 2007). As previously stated, better functionality has been associated with symptomatic remission, but remission does not mean returning to a premorbid state of functioning, hence it should be measured as a separate outcome. There is a paucity of methods for measuring disability. Some of the most important ones are described next.

The recently developed Functioning Assessment Short Test measures six domains of functioning (autonomy, occupational and cognitive functioning, financial issues, interpersonal relationships, and leisure time) and may be a good instrument for clinical and research settings, since it is brief, has high reliability, and is intended specifically for bipolar patients (Rosa et al., 2007). In addition, the World Health Organization designed a tool aimed at identifying and classifying relevant domains of human experience affected by health conditions (Ayuso-Mateos et al., 2013). The International Classification of Functioning, Disability
and Health has two core sets especially developed for bipolar disorder, which describe illness-associated functional problems (personal and environmental factors). Its main limitation is that it lacks validation in clinical settings. Another relevant tool is the World Health Organization Disability Assessment Schedule, which evaluates different domains of functioning and is a helpful research tool, although it may be too long to use in daily clinical practice (World Health Organization, 2013).

DATA ANALYSIS AND REPORTING

OUTCOME ANALYSIS AS INTENTION TO TREAT AND DROPOUT RATES

After defining the primary and secondary outcomes to be measured, the strategy for their analysis should be decided. Completer analyses have major limitations that include loss of power and loss of initial randomization as a consequence of dropouts (i.e., people who do not complete the study) not being aleatory (e.g., dropouts in one group as a consequence of adverse effects of medication or lack of efficacy). Therefore, samples of remaining subjects (i.e., completers) might be unrepresentative (Tierney & Stewart, 2005).

Intention to treat analysis includes analysis of data of all randomized participants, irrespective of how much of the treatment they received. In other words, it evaluates the treatment offer. This is intended to equalize the potential confounding factors for the entire sample but presents problems too. To account for incomplete data in this type of analysis, one of the most common strategies is to use the last observation carried forward (LOCF), where endpoint data is substituted with previous results. The LOCF imputation method has limitations as well, because it assumes that dropout is not related to treatment or to outcome and that a subject who discontinues treatment would have retained constant clinical status from the time of dropout to the planned endpoint (Schalkwijk et al., 2014; Siddique et al., 2008). Other analytical methods have been proposed to account for missing data, such as the mixed-effect modeling of repeated measures, or defined outcomes have been assessed even after subjects have dropped out of trials (Schalkwijk et al., 2014; Siddique et al., 2008).

Finally, dropout rates of over 20% pose serious threats to the validity of results obtained in trials (Schulz & Grimes, 2002). Regrettably, according to a recent systematic review, this seems to be the case in approximately half of the bipolar depression trials (Spanemberg et al., 2012).

LEVEL OF SIGNIFICANCE: P VALUE

As stated earlier, high variability observed in bipolar depression trials could lead to false-negative outcomes. On the other hand, making repeated analysis of data greatly augments the possibility of obtaining significant p values (typically p < .05), that is, false positives. For example, the chance of obtaining false positives (obtaining significance by chance) for a p < .05 will be 5% when making one comparison, 10% for two comparisons, 23% for five comparisons, and so on (Ghaemi, 2009). This could be corrected using different methods, such as the Bonferroni correction or the Holm-Bonferroni correction, but the importance of taking this into account is that, in order to avoid such kind of errors, researchers should choose one or a few primary outcome measures for which the study should be adequately powered.

On the other hand, in secondary analysis of infrequent events (such as subgroup analysis or analysis on adverse effects of a drug in a trial designed for efficacy outcomes), p values may show false negative results, owing to diminished statistical power by smaller groups (Ghaemi, 2009).

QUALITY OF REPORTING

A recently published systematic review on quality of reporting of RCTs on pharmacologic treatments of bipolar disorders stated that “a good part of the reporting quality . . . falls well below the required standards and also the practically feasible levels for many aspects, essential for adequate interpretation of methodological quality and clinical relevance” (Strech, Soltmann, Weikert, Bauer, & Pfennig, 2011, p. 1220). The lack of standardisation in RCTs reporting may influence their correct interpretation. Besides, vital information in guiding clinical decisions and for data pooling in the context of meta-analysis could be missing (Undurraga & Baldessarini, 2012). Furthermore, inadequate reporting and design have been associated with biased treatment effect estimates (Moher et al., 2010). As a result of this problem, the CONSORT statement was made in the 1990s and has been revised periodically ever since (Moher et al., 2010). It consists of a checklist of essential items that should be included in reports of RCTs and a diagram for documenting the flow of participants through a trial, intended to provide guidance to researchers on reporting of their RCTs. Importantly, many leading medical journals and major international editorial groups have supported this initiative.

Strech et al. (2011) systematically reviewed the literature to identify all RCTs involving pharmacologic
treatment of bipolar disorder published between 2000 and 2008. They included 105 RCTs and assessed their quality of reporting based on the CONSORT statement (Moher et al., 2010): 25% of trials reported inadequately and 42% reported adequately. Reporting was especially poor for randomization procedures: only 16% of trials defined the generation of the random allocation sequence and 15% defined the method of allocation concealment. In addition, only 41% of trials reported on blinding measures for care providers and 46% for outcome assessors, and only 2% reported how the success of blinding was evaluated. Only 32% described their sample size calculations. Furthermore, important variables such as education, duration of illness, and number of previous episodes were underreported (7%, 32%, and 42%, respectively). Finally, only 6% of trials reported number needed to treat (NNT) and 15% reported on the CI for the contrasts between groups.

When reporting outcomes, clinical applicability and ease of interpretation are important goals. Effect size estimations in general are easier to interpret and provide more information than hypothesis testing (p values), though they can be complementary. As reported previously, a p value considered alone is not helpful to guide clinical decisions, as it can be clinically irrelevant and depend on design variables such as the sample size or on statistical analysis (e.g., repeated analysis leading to false positives). In addition, total scores assessed by the symptom severity scales are difficult to interpret if the clinician is not familiar with them, which is frequently the case. Alternative methods like simple effect size estimations such as Cohen’s d have been proposed (Ghaemi, 2010; Martinez-Aran et al., 2008). Cohen’s d is simply calculated as the difference between the study arms for change in scores divided by their pooled standard deviation. It is useful, as it corrects for the variation within the sample and gives values between zero and 1 or higher. It is typically described as “small, 0.2,” “medium, 0.5,” or “large, 0.8” (Martinez-Aran et al., 2008). Other examples of effect size estimations are SMDs, risk ratio, odds ratio, absolute responder or remitter rates, NNT, and number needed to harm (NNH), each being described in a preceding chapter of this book. NNT and NNH are other clinically useful effect estimates that inform clinicians about how many patients one would need to treat with one intervention versus another to see a difference in an outcome (efficacy outcomes, such as response, remission, etc. or adverse outcomes such as tolerability, suicidal behaviors, etc.). In other words, they can quantify the clinical relevance of a statistically significant study result. Their main limitation is that they can only be calculated for binary variables. Continuous variables should use other effect size measures (or be converted to binary variables, e.g., yes/no response; Citrome, 2008). Effect sizes should routinely be accompanied by their corresponding CI or CrI, which is the estimate of precision. It is “the range of plausible values for the effect size” or the “likelihood that the real value for the variable would be captured in 95% of the trials” (Ghaemi, 2010). Of note, a 95% CI is equivalent to a p value of .05 (but gives more information).

In conclusion, evidence on the efficacy and safety of available treatments for bipolar depression is scarce and highly heterogeneous. Many factors contributing toward heterogeneity and statistical noise should be taken into account when interpreting existing trials and designing new clinical trials: longer trial durations, higher number of study arms, larger sample sizes, involvement of more study sites, lower baseline disease severity, possibility of dose adaptation, diversity of study populations and ethnic factors, high dropout rates, less placebo washout of previous drugs, and low frequency of symptomatic assessments. Moreover, bipolar disorder has complex clinical presentations that make establishing reliable and consistent diagnosis, clinical assessments, and (clinically relevant) outcome definitions extremely important. Notably, in light of current evidence, outcome assessments in bipolar depression should also consider functional status.

Regarding assessment methods, the most frequently used scales to assess depressive symptom severity and treatment response (i.e., HDRS, MADRS, IDS) were initially developed to evaluate unipolar major depressive disorder. Newer depression scales developed specifically for bipolar depression (e.g., BDRS) may be suitable for future trials of bipolar depression. Indeed, the only way to overcome the challenges posed by decreasing signal detection in depression RCTs may be the use of harder outcomes, such as specific biomarkers, which are yet to come (Vieta, 2014).

In addition, for statistical analysis of primary and secondary outcomes, a standard intention-to-treat approach should be adopted, as it offers a better model than the completer analysis but nevertheless is far from being unbiased. It would be even better if new analytical methods such as mixed-effect modelling of repeated measures could be incorporated.

The last stage of research, reporting the results, is as important as methodological planning and statistical analysis. Regrettably, the reporting of results from bipolar depression trials is generally of poor quality and associated with biased estimates of treatment effects. Guidelines such as the CONSORT statement should be used as an aid when reporting. Clinical applicability and ease of interpretation, as well as possibility of future evidence synthesis, should be considered. In that sense, change in scores and

their standard deviations for the entire sample, as well as subgroups, should consistently be reported and may better be accompanied by the standardized effect size estimations such as SMD or Cohen’s d and responder and remitter rates, as these measures are easier to interpret and provide more information than hypothesis testing. NNT and NNH are also clinically oriented and easy-to-interpret outcome measures that should be incorporated into reporting whenever possible. Finally, as with bipolar mania trials, quantifiable data on the special subgroups such as bipolar I or II, psychotic versus nonpsychotic, with mixed features or rapid cycling course, as well as individual adverse effects, should be reported regularly with a standardized approach.

CLINICAL TRIALS OF BIPOLAR MAINTENANCE (OR PROPHYLAXIS) TREATMENT

Standardized designs have been established for studies of acute mania (and acute depression); however, no core design for maintenance therapy for BD has been agreed on by researchers in the field (Gitlin, Abulseoud, & Frye, 2010). Therefore, it is uncertain whether the results reported may be the product of differential efficacy between agents or between different designs, thus creating inconsistent results. Given the paucity of available maintenance trials and the diversity of their designs due to the patient populations examined and pursued outcome (i.e., antimanic or antidepressive prophylaxis as described by Grunze et al., 2013), more trials providing homogeneous data on the measures of efficacy, as well as side effects, functionality, quality of life, and prevention of suicide are needed. Such studies will aid clinicians in providing more successful treatments for bipolar prophylaxis.

EARLY LITHIUM TRIALS

The early lithium trials have been heavily criticized due to the discrepancies between their strikingly positive results compared to those of subsequent comparative trials and naturalistic data. In 1995 Moncrieff published an article proposing that these discrepancies originate from the design of these early studies (i.e., discontinuation studies in which patients who had been receiving lithium were allocated to either continue receiving lithium or to placebo substitution; Moncrieff, 1995). Such studies fail to examine the efficacy of lithium prophylaxis in view of the substantial evidence that lithium withdrawal induces manic relapse (Dunner, 1998; Schou, 1993). Irrefutably, special care is imperative regarding the management of lithium withdrawal (Mander & Loudon, 1988).

PLACEBO-CONTROLLED MAINTENANCE TRIALS

In 1997 Bowden et al. reported data from the first placebo-controlled maintenance trial in bipolar I disorder to have been conducted since 1973. Bipolar maintenance trial methodologies have evolved substantially during this time. Bowden et al. argued that the fact that this particular study was designed for submission to the regulatory agencies, with consideration of country-specific requirements, had a strong influence on the study design. They also suggest that maintenance studies should be designed and executed to enroll patients with acute, severe forms of bipolar depression or bipolar mania, rather than recruiting patients with remission or milder forms of BD. They suggested that by paying greater attention to the frequency of manic and depressive episodes and the severity of an index episode, the statistical power of the study could be enhanced (Bowden et al., 1997).

Unlike the acute treatment trials, with over 50 trials for only acute bipolar mania being published, there is a surprising paucity of maintenance studies. In a recent review by Popovic, Reinares, Amann, Salamero, and Vieta (2011), which included all the RCTs assessing the effectiveness of drugs in the prophylactic treatment of BD compared to placebo, only 15 trials satisfied the inclusion criteria. (Inclusion criteria were: a minimal duration of six months; patients over 18; while exclusion criteria were: small sample size (i.e., fewer than 17 subjects per arm); a study sample not exclusively composed of bipolar patients; those using rating scales not validated in patients with bipolar disorder.) Since then, two further trials satisfying the same inclusion criteria have been published (Berwaerts, Melkote, Nuamah, & Lim, 2012; Marcus et al., 2011). It is noteworthy that some trials (e.g., McIntyre et al., 2010) were excluded because they did not have a long-term placebo group. Some of the RCTs used a three-arm design. Among the available 17 trials, there were 11 RCTs assessing drugs in monotherapy and 6 for combined treatment with mood stabilizers such as lithium and valproate. The studies were not homogeneous with respect to clinical characteristics of the sample (rapid-cycling course, manic/mixed states or depression, refractory patients or unbiased samples), sample size, and rates of study completion. In that regard, as discussed by Popovic et al. (2012), most studies enrolled enriched populations of patients who were currently or recently manic or mixed. Missing from most

study designs was the recruitment of patients with index depressive episodes. Exclusion of depressed patients at enrollment may affect the polarity of mood episodes during the blinded relapse prevention phase since the study design was primarily configured to demonstrate efficacy in the delay or prevention of manic recurrence.

The absence of depressive index episodes in a compound whose primary spectrum of efficacy is in depression biases outcome against the drug, or vice versa. However, the main reason why some compounds have been studied only in the context of index mania is their failure to separate from placebo in acute bipolar depression trials; hence, the bias against index depression is actually caused by the high polarity index of the drug, indicating a stronger antidopaminergic action, which makes it more suitable for the treatment of mania and the prevention of subsequent manic episodes (Popovic et al., 2012).

The polarity index is a novel metric indicating the relative antimanic versus antidepressive preventive efficacy of drugs; it is derived by calculating the ratio of the NNT for prevention of depression to the NNT for prevention of mania, as emerging from the results of the RCTs. The metric aims to help clinicians to understand the prophylactic efficacy profile of drugs used for the treatment of bipolar disorder by translating results of clinical trials into real-world clinical practice (Popovic et al., 2014). According to Vieta (2014), in recent years we have started to measure the relative efficacy of drugs for bipolar disorder by means of the NNT (Popovic et al., 2011) and their profile according to the polarity index (Popovic et al., 2012), which is a useful measure that can be easily misunderstood (Alphs, Berwaerts, & Turkoz, 2013) since it actually provides advice on what not to use, rather than what to use, depending on the predominant polarity of manic and depressive episodes in a given patient (Baldessarini et al., 2012).

In fact, RCTs for maintenance treatment of bipolar disorder are not only scarce; they are completely lacking for various agents. Carbamazepine is a clear example; although it was the first agent after lithium to be advocated for long-term treatment of BD and two lithium-controlled studies indicate the drug’s efficacy in relapse prevention, no placebo-controlled maintenance trials exist. As for valproate and oxcarbazepine, the only existing maintenance trials are negative/failed; thus further studies are clearly needed. Future studies need to address the long-term effectiveness of agents such as carbamazepine, oxcarbazepine, and valproate, as well as some antipsychotics, which have not been assessed in long-term placebo-controlled studies. In fact, successful long-term management often requires combination treatment. Lack of evidence base for this strategy (e.g., lamotrigine as add-on treatment) should also be an object of future research.

DISCONTINUATION STUDIES

The methodological caveat of discontinuation studies is that findings may demonstrate “discontinuation effects” rather than prevention of relapse. Additionally, because the rate of relapse is low in these studies, they require a very large number of participants to reach adequate statistical power.

CONTINUATION STUDIES

Continuation trials involve patients who have responded to the relevant treatment in the acute phase and thus are enriched for the possibility of relapse in the subsequent continuation period. Since fewer participants are required, maintenance trials have shifted to continuation studies.

The lack of consistency in the designs of the placebo-controlled maintenance continuation studies listed in Table 42.3 has weakened the reliability and validity of these studies. Therefore the available evidence for the long-term treatment of bipolar disorder is lacking. More rigorous approaches should involve analyses of bipolar I and bipolar II subtypes or consideration of them in different trials as described in the studies outlined here. For example, quetiapine has recently been indicated by the FDA for monotherapy of bipolar I and bipolar II depression. Although no specific controlled trial addressed the efficacy and safety of quetiapine monotherapy in bipolar II, BOLDER (BipOLar DEpRession) studies I and II, and the EMBOLDEN continuation studies (described later) included sufficient bipolar II patients (Parker, 2012).

The two EMBOLDEN studies constitute a good example of the continuation of maintenance trials with an index episode of bipolar depression. These two RCTs are similar in design (i.e., multicenter, randomized, double-blind comparisons of the efficacy and safety of quetiapine monotherapy [300 mg or 600 mg daily] versus placebo in bipolar I or II disorder adults). The active comparator arm was lithium in EMBOLDEN I and paroxetine in EMBOLDEN II. The 26- to 52-week continuation phase consisted of patients who had achieved remission and were continued on the same dose of quetiapine or were switched to placebo. In their combined analysis, the authors demonstrated that both doses of quetiapine significantly increased the time to recurrence of any mood event. The subgroup analysis on bipolar II disorder was available, and the risk of recurrence of any mood event compared with placebo was significantly reduced in this subpopulation of patients with bipolar II

Table 42.3. RANDOMIZED CONTROLLED TRIALS ASSESSING THE EFFECTIVENESS OF DRUGS IN THE PROPHYLACTIC TREATMENT OF BIPOLAR DISORDER COMPARED TO PLACEBO (a)

| TRIAL (IN ORDER OF APPEARANCE IN TEXT) | PATIENT INCLUSION CRITERIA (MAINTENANCE PHASE) | DURATION (WEEKS) | NUMBER RANDOMIZED | DOSAGE (MG/DAY) OR PLASMA LEVELS/MEAN DOSAGE |
| --- | --- | --- | --- | --- |
| Keck et al., 2007 | Bipolar I; ≥18 years; YMRS ≤10; MADRS ≤13; no hospitalization in previous 3 months | 100 | ARI = 78; PLA = 83 | ARI: 15–30 mg/day; mean: 23.8 mg/day |
| Tohen et al., 2006 | Bipolar I; ≥18 years; YMRS ≤12; HDRS ≤8; 2 prior mixed or manic episodes in past 6 years | 48 | OLZ = 225; PLA = 136 | OLZ: 5–20 mg/day |
| Tohen et al., 2004 | Bipolar I; 18–70 years; YMRS ≤12; HRSD-21 ≤8 | 72 | LI/VPA + PLA = 48; LI/VPA + OLZ = 51 | OLZ: 5–20 mg/day; mean: 12.5 mg/day; LI: 0.66–0.86 mEq/L; VPA: 60.1–73.8 μg/mL |
| Vieta et al., 2008a | Bipolar I; ≥18 years; YMRS ≤12; HDRS ≤12 | 104 | QUE + LI/VPA = 336; PLA + LI/VPA = 367 | QUE: 400–800 mg/day; mean: 497 mg/day; LI: 0.5–1.2 mEq/L; VPA: 50–125 μg/mL |
| Suppes et al., 2009 | Bipolar I; ≥18 years; YMRS ≤10; MADRS ≤13 | 104 | QUE + LI/VPA = 310; PLA + LI/VPA = 313 | QUE: 400–800 mg/day; mean: 519 mg/day; LI: 0.5–1.2 mEq/L; mean: 0.71–0.74 mEq/L; VPA: 50–125 μg/mL; mean: 68.91–71.38 μg/mL |
| Weisler et al. | Bipolar I; YMRS ≤12; MADRS ≤12; acute current or recent (past 26 weeks) manic, depressive, or mixed index episode treated with QUE | 104 | QUE = 404; LI = 364; PLA = 404 | QUE: 300–800 mg/day; LI: 0.6–1.2 mEq/L |
| Quiroz et al., 2010 | Bipolar I; 18–65 years; recent manic/mixed episode or stable patients with ≥1 mood episode in past 4 months | 96 | RLAI = 140; PLA = 136 | RIS: 12.5–50 mg i.m.; mean: 25 mg |
| Macfadden et al., 2009 | Bipolar I; 18–70 years; ≥4 episodes in the past year | 52 | RLAI + TAU = 65; PLA + TAU = 59 | RLAI: 25–50 mg/2 weeks |
| Bowden et al., 2010 | Bipolar I; ≥18 years; current or recent manic/mixed episode; MRS ≥14 | 24 | ZIP + LI/VPA = 127; PLA + LI/VPA = 113 | ZIP: 80–160 mg/day; LI: 0.6–1.2 mEq/L; mean: 0.7–0.9 mEq/L; VPA: 50–125 μg/mL; mean: 67.4–72.8 |
| Bowden et al., 2003 | Bipolar I; ≥18 years; current or recent (hypo)mania; ≥1 additional (hypo)manic and 1 depressive episode in the past 3 years | 76 | LAM: 59; LI: 46; PLA: 70 | LAM: 100–400 mg/day; LI: 0.8–1.1 mEq/L |
| Calabrese et al., 2003 | Bipolar I; ≥18 years; current or recent MDE; ≥1 additional (hypo)manic and 1 depressive episode in the past 3 years | 72 | LAM: 221; LI: 121; PLA: 121 | LAM: 50–400 mg/day; mean: 200 mg/day; LI: 0.8–1.1 mEq/L; mean: 0.8±0.3 mEq/L |
| Calabrese et al., 2000 | Bipolar I and II; rapid cycling; ≥18 years; HDRS ≤14; MRS ≤12; <3 on item 3 HDRS; stable for 4 weeks | 26 | LAM: 90; PLA: 87 | LAM: 100–300 mg/day |
| Prien et al., 1973 | Manic-depressive, manic type | 24∗ | LI: 101; PLA: 104 | LI: 0.5–1.4 mEq/L |
| Bowden et al., 2000 | Bipolar I; 18–70 years; manic episode ≤3 months before randomization; MRS ≤11; DSS ≤13; GAS >60; no serious suicidal risk | 52 | VPA: 187; LI: 90; PLA: 92 | VPA: 71–125 μg/mL; LI: 0.8–1.2 mmol/L |
| Vieta et al., 2008b | Bipolar I or II; ≥18 years; YMRS ≤12; MADRS ≤20; no acute phases in 6 months | 52 | OXC + LI = 26; PLA + LI = 29 | OXC: 1200 mg/day; LI: 0.6 mEq/L |
| Marcus et al., 2011 | Bipolar I; YMRS ≥16; current or recent manic/mixed episode; inadequate response to lithium or valproate (YMRS ≥16 and ≤35% decrease from baseline at 2 weeks) | 52 | ARI + LI/VPA = 168; PLA + LI/VPA = 169 | ARI: 15–30 mg/day |
| Berwaerts, 2012 | Bipolar I; 18–65 years; current manic or mixed episodes; at least 2 previous mood episodes (1 of which had to be a manic or mixed episode) within 3 years before screening; YMRS ≥20 at baseline; no significant risk for suicidal or violent behavior; no borderline or antisocial personality disorder | Until the patients experienced recurrence | PALI = 152; PLA = 148; OLZ = 83 | PALI: 3–12 mg/day; OLZ: 5–20 mg/day |

NOTE: Adapted from: Popovic et al. (2011).
(a) With a minimal duration of 6 months and in patients aged over 18.
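
The effect-size measures discussed in this chapter (Cohen’s d, the NNT, and the polarity index) reduce to short arithmetic, which the following minimal Python sketch illustrates. The function names and all input numbers are hypothetical and chosen only for illustration; they are not taken from any trial in Table 42.3, and the sketch assumes the textbook definitions (pooled standard deviation for Cohen’s d, NNT as the reciprocal of the absolute risk reduction, polarity index as NNT for depression divided by NNT for mania).

```python
from math import sqrt


def cohens_d(mean_change_a, mean_change_b, sd_a, sd_b, n_a, n_b):
    """Difference in mean change scores divided by the pooled standard deviation."""
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_change_a - mean_change_b) / sqrt(pooled_variance)


def number_needed_to_treat(relapse_rate_placebo, relapse_rate_drug):
    """NNT = 1 / absolute risk reduction, for a binary outcome such as relapse.

    A negative result means the drug arm did worse (a number needed to harm).
    """
    absolute_risk_reduction = relapse_rate_placebo - relapse_rate_drug
    if absolute_risk_reduction == 0:
        raise ValueError("NNT is undefined when both arms have the same event rate")
    return 1.0 / absolute_risk_reduction


def polarity_index(nnt_depression, nnt_mania):
    """Ratio of the NNT for preventing depression to the NNT for preventing mania."""
    return nnt_depression / nnt_mania


# Hypothetical illustrative numbers only (assumptions, not trial data).
d = cohens_d(mean_change_a=-12.0, mean_change_b=-8.0, sd_a=8.0, sd_b=8.0, n_a=100, n_b=100)
nnt_mania = number_needed_to_treat(relapse_rate_placebo=0.40, relapse_rate_drug=0.20)
nnt_depression = number_needed_to_treat(relapse_rate_placebo=0.40, relapse_rate_drug=0.30)
pi = polarity_index(nnt_depression, nnt_mania)
print(f"Cohen's d = {d:.2f}, NNT (mania) = {nnt_mania:.1f}, "
      f"NNT (depression) = {nnt_depression:.1f}, polarity index = {pi:.1f}")
```

Under these assumed inputs the drug prevents manic relapse with a smaller NNT than depressive relapse, so the polarity index exceeds 1, which corresponds to the predominantly antimanic prophylactic profile described in the text.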
with both doses of quetiapine (McElroy et al., 2010; Young et al., 2010). The strengths of these two EMBOLDEN studies are (a) the large patient population (n = 1542; 565 bipolar II) and (b) the continued treatment of responsive patients with quetiapine or placebo (using a randomized withdrawal design). Thus the authors were able to investigate the maintenance of effect associated with ongoing quetiapine treatment. Additionally, subgroup analyses for the bipolar subtypes were included and the study involved the largest bipolar II patient population studied to date in a continuation treatment study (McElroy et al., 2010).

OUTCOMES ASSESSMENTS AND RECOMMENDATIONS FOR FUTURE MAINTENANCE TRIALS

The primary outcome measures in bipolar disorder maintenance trials vary greatly, making the comparison of efficacy of medication across studies difficult (Grunze et al., 2013). The majority of long-term studies use the results of Kaplan–Meier (KM) survival analyses based on time to intervention as the primary outcome. Other studies have used “any reason of failure” (i.e., new mood episodes, need for additional treatments, hospitalization, adverse events, withdrawal of consent, lost to follow-up) as primary outcomes. Some previous studies have used dropout for emerging new episodes (using Diagnostic and Statistical Manual of Mental Disorders [fourth edition] criteria/clinical rating scale thresholds). The problem with KM survival analytic techniques is that they measure the occurrence of a predefined event, for example, treatment emergent episode intervention, discontinuation, at baseline (absence of the event) and endpoint (occurrence of event) only (Grunze et al., 2013). Grunze et al. state that KM survival analytic techniques are not entirely appropriate considering that “completely healthy” between-episode states are unlikely given the subsyndromal fluctuations of mood, impaired functioning, and quality of life associated with bipolar disorder and also due to the failure to capture tolerability and impact on health data. Grunze et al. suggest that the reason for the popularity of the KM techniques is that they are more sensitive to measuring differences than the more traditional counting of failures. Instead of KM analyses, some studies have used mean change over time of symptomatic rating scales such as the YMRS and MADRS (McIntyre, 2010; Tohen et al., 2003). The limitation of the change over time analyses is that only minor shifts of statistical means for all patients are captured using this method.

As indicated in Chapter 43 of this book, one option for assessing daily or weekly subsyndromal mood fluctuations versus full clinical relapse (in addition to repeated use of cross-sectional scales) is the National Institute of Mental Health Life Chart Method™ (NIMH-LCM™), which has both a clinician-rated and a self-rated version for assessing severity of mania and depression on a daily basis (Post & Yildiz, 2014). This scale can also track extremes of cycling, presence of dysphoric mania, and comorbid symptoms. In addition, it has a running tally of medications and a space for rating side effects with mild, moderate, or severe impact. Reliability is excellent, as severity of mania and depression is rated on the degree of functional impairment with which each phase is associated, making recall of severity relatively easy even at intervals of several weeks to a month between rating sessions. The scale has been validated against other measures and used productively in a number of studies, including long-term comparisons of lithium versus carbamazepine versus the combination for one year of prophylaxis for each phase; comparisons of lamotrigine, gabapentin, and placebo; and, most recently, detailed analyses of lamotrigine’s long-term effects on mood stability (Post & Yildiz, 2014). Related longitudinal ratings that have been employed by the STEP-BD program are also endorsed, as such detailed description of the precise course of illness is particularly important in instances of extreme rapidity of cycling, which can even occur in bipolar outpatients in demanding professions.

One of the major assets of such detailed longitudinal ratings is the ability to simultaneously assess different thresholds for what one might consider as mild, moderate, or severe relapse, as well as employ modal measures such as that of area under the curve, signifying the magnitude and duration of mania and depression. Such a measure of area under the curve allows for precise intraindividual comparisons of the degree of symptomatology observed prior to and after a given experimental manipulation as a continuous variable, which should vastly increase the power to detect treatment difference compared with a single endpoint of percentage of relapse into a new episode.

Traditional designs in assessment of the efficacy of a prophylactic treatment have typically required patients to achieve substantial improvement or remission for a given period of time, after which they are randomized and followed up until the occurrence of a new episode or need for clinical intervention. While this has merits in patient populations in whom this degree of wellness is readily achieved, it is far from ideal for those with highly treatment-resistant illness.

In order to enable future pooled analyses of bipolar maintenance trials’ results, besides a standardized approach on the efficacy measures in terms of prophylaxis from emerging manic or depressive episodes, future RCTs

should also quantify data on the important side effects such as weight gain or metabolic syndrome, undesirable neurological or cognitive effects, and TEAS, as well as quality of life, total life years gained, and functionality, in a standardized approach.

Given that BD is an especially heterogeneous condition, it might be wise to allow a limited degree of “preplanned diversity” into the maintenance trials, provided that equal allocation into study arms is maintained, objective outcome assessments are utilized, and, most important, similarity and transitivity of the trials are not jeopardized. More RCTs and more secondary outcomes in wisely planned designs may enable more effective clinical decision-making by utilizing advanced meta-analyses methods involving direct and indirect comparisons, sensitivity analysis, and meta-regressions. In order to achieve this we suggest (a) employment of the NIMH-LCM™ for detection of daily or weekly subsyndromal mood fluctuations in addition to weekly or biweekly assessments of mania and depression cross-sectionally via employment of standardized scales such as the YMRS and MADRS as appropriate; (b) employment of 52 weeks of study duration; (c) study designs that avoid withdrawal or discontinuation effects; (d) regular and standardized assessment and reporting of all important side effects such as weight gain or metabolic syndrome, undesirable neurological or cognitive effects, and TEAS, as well as quality of life, life years gained, and functionality; (e) regular reporting on the lifetime occurrence of depressive versus manic/mixed episodes and subgroup analysis on patients with predominantly depressive versus manic/mixed polarity; (f) regular reporting on the lifetime occurrence of psychotic features and subgroup analysis on patients with psychotic versus nonpsychotic features; and (g) regular reporting on the occurrence of rapid cycling features and subgroup analysis on patients with rapid cycling versus nonrapid cycling features.

Disclosure statement: Dr. Ayşegül Yildiz has received research grants from or served as a consultant to, or speaker for, Abdi Ibrahim, Actavis, AliRaif, AstraZeneca, Bristol-Myers Squibb, Janssen-Cilag, Lundbeck, Pfizer, Sanofi-Aventis, and Servier.

Dr. Eduard Vieta has received grants and served as consultant, advisor or CME speaker for the following entities: AstraZeneca, Bristol-Myers Squibb, Elan, Eli Lilly, Ferrer, Forest Research Institute, Gedeon Richter, Glaxo-Smith-Kline, Janssen-Cilag, Jazz, Johnson & Johnson, Lundbeck, Merck, Novartis, Otsuka, Pfizer, Roche, Rovi, Sanofi-Aventis, Servier, Shire, Solvay, Sunovion, Takeda, Teva, the Spanish Ministry of Science and Innovation (CIBERSAM), the Seventh European Framework Programme (ENBREC), and the Stanley Medical Research Institute.

Dr. Dina Popovic’s work is supported by a Sara Borrell post-doctoral grant, provided by Carlos III Institute (CD13/00149), Spanish Ministry of Science and Innovation. Dr. Popovic has received research grants from, or served as a speaker for, Bristol-Myers Squibb, Merck Sharp & Dohme, and Janssen-Cilag.

Dr. Sarah Wooderson is employed by King’s College London. This chapter presents independent research part-funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the author and not necessarily those of the NHS, the NIHR or the Department of Health. No external funding was used for this study.

Dr. Allan Young is employed by King’s College London; Honorary Consultant SLaM. He has given paid lectures and been on advisory boards for all major pharmaceutical companies with drugs used in affective and related disorders. He has no share holdings in pharmaceutical companies. He was lead investigator for Embolden Study (AZ), BCI Neuroplasticity Study, and Aripiprazole Mania Study and has participated in investigator-initiated studies from AZ, Eli Lilly, Lundbeck, and Wyeth. Grant funding (past and present) includes: USA: National Institute of Mental Health, Brain and Behavior Research Foundation (NARSAD), Stanley Medical Research Institute; Canada: Canadian Institutes of Health Research, UBC-VGH Foundation, WEDC, CCS Depression Research Fund, MSFHR; UK: MRC; Wellcome Trust; Royal College of Physicians; BMA; NIHR.

This chapter presents independent research part-funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the author and not necessarily those of the NHS, the NIHR or the Department of Health.

Dr. Juan Undurraga has not disclosed any conflicts of interest.

REFERENCES

Agid, O., Siu, C. O., Potkin, S. G., Kapur, S., Watsky, E., Vanderburg, D., . . . Remington, G. (2013). Meta-regression analysis of

placebo response in antipsychotic trials, 1970-2010. The American Journal of Psychiatry, 170(11), 1335–1344. doi:10.1176/appi.ajp.2013.12030315

Ahrens, B., Muller-Oerlinghausen, B., Schou, M., Wolf, T., Alda, M., Grof, E., . . . et al. (1995). Excess cardiovascular and suicide mortality of affective disorders may be reduced by lithium prophylaxis. Journal of Affective Disorders, 33(2), 67–75.

Alphs, L., Benedetti, F., Fleischhacker, W. W., & Kane, J. M. (2012). Placebo-related effects in clinical trials in schizophrenia: What is driving this phenomenon and what can be done to minimize it? International Journal of Neuropsychopharmacology, 15(7), 1003–1014. doi:10.1017/S1461145711001738

Alphs, L., Berwaerts, J., & Turkoz, I. (2013). Limited utility of number needed to treat and the polarity index for bipolar disorder to characterize treatment response. European Neuropsychopharmacology, 23(11), 1597–1599.

Ayuso-Mateos, J. L., Avila, C. C., Anaya, C., Cieza, A., Vieta, E., & Bipolar Disorders Core Sets Expert Group. (2013). Development of the international classification of functioning, disability and health core sets for bipolar disorders: Results of an international consensus process. Disability and Rehabilitation, 35(25), 2138–2146. doi:10.3109/09638288.2013.771708

Baldessarini, R. J. (2013). Chemotherapy in psychiatry (3rd ed.). New York: Springer Press.

Baldessarini, R. J., Leahy, L., Arcona, S., Gause, D., Zhang, W., & Hennen, J. (2007). Patterns of psychotropic drug prescription for U.S. patients with diagnoses of bipolar disorders. Psychiatric Services, 58(1), 85–91. doi:10.1176/appi.ps.58.1.85-a

Baldessarini, R. J., Undurraga, J., Vázquez, G. H., Tondo, L., Salvatore, P., Ha, K., . . . Vieta, E. (2012). Predominant recurrence polarity among 928 adult international bipolar I disorder patients. Acta Psychiatrica Scandinavica, 125(4), 293–302. doi:10.1111/j.1600-0447.2011.01818.x

Citrome, L. (2008). Compelling or irrelevant? Using number needed to treat can help decide. Acta Psychiatrica Scandinavica, 117(6), 412–419. doi:10.1111/j.1600-0447.2008.01194.x

Colom, F., Vieta, E., Daban, C., Pacchiarotti, I., & Sanchez-Moreno, J. (2006). Clinical and therapeutic implications of predominant polarity in bipolar disorder. Journal of Affective Disorders, 93(1–3), 13–17. doi:10.1016/j.jad.2006.01.032

Correll, C. U., Malhotra, A. K., Kaushik, S., McMeniman, M., & Kane, J. M. (2003). Early prediction of antipsychotic response in schizophrenia. The American Journal of Psychiatry, 160(11), 2063–2065.

Cruz, N., Vieta, E., Comes, M., Haro, J. M., Reed, C., Bertsch, J., & Emblem Advisory Board. (2008). Rapid-cycling bipolar I disorder: Course and treatment outcome of a large sample across Europe. Journal of Psychiatric Research, 42(13), 1068–1075. doi:10.1016/j.jpsychires.2007.12.004

De Fruyt, J., Deschepper, E., Audenaert, K., Constant, E., Floris, M., Pitchot, W., . . . Claes, S. (2012). Second generation antipsychotics in the treatment of bipolar depression: A systematic review and meta-analysis. Journal of Psychopharmacology, 26(5), 603–617. doi:10.1177/0269881111408461

Drexhage, R. C., Hoogenboezem, T. H., Versnel, M. A., Berghout, A., Nolen, W. A., & Drexhage, H. A. (2011). The activation of monocyte and T cell networks in patients with bipolar disorder. Brain Behaviour and Immunity, 25(6), 1206–1213. doi:10.1016/j.bbi.2011.03.013

Dunner, D. L. (1998). Lithium carbonate: Maintenance studies and consequences of withdrawal. Journal of Clinical Psychiatry, 59(6), 48–55; discussion 56.

Dunner, D. L., & Fieve, R. R. (1974). Clinical factors in lithium carbonate prophylaxis failure. Archives of General Psychiatry, 30(2), 229–233.

Fava, M., Evins, A. E., Dorer, D. J., & Schoenfeld, D. A. (2003). The problem of the placebo response in clinical trials for psychiatric disorders: Culprits, possible remedies, and a novel study design approach. Psychotherapy and Psychosomatics, 72(3), 115–127.

Baldessarini, R. J., Vieta, E., Calabrese, J. R., Tohen, M., & Bowden, C. L. (2010). Bipolar depression: Overview and
commentary. Harvard Review of Psychiatry, 18(3), 143–157. doi:69738
doi:10.3109/10673221003747955 Gitlin, M.  J., Abulseoud, O., & Frye, M.  A. (2010). Improving the
Bauer, M., Beaulieu, S., Dunner, D.  L., Lafer, B., & Kupka, R. design of maintenance studies for bipolar disorder. Current Medical
(2008). Rapid cycling bipolar disorder—Diagnostic con- Research and Opinion, 26(8), 1835–1842.
cepts. Bipolar Disorders, 10(1 Pt. 2), 153–162. doi:10.1111/ Grunze, H., Vieta, E., Goodwin, G.  M., Bowden, C., Licht, R.  W.,
j.1399-5618.2007.00560.x Möller, H.  J., . . . WFSBP Task Force on Treatment Guidelines for
Benazzi, F. (2007). Is there a continuity between bipolar and depres- Bipolar Disorders. (2013). The World Federation of Societies of
sive disorders? Psychotherapy and Psychosomatics, 76(2), 70–76. Biological Psychiatry (WFSBP) guidelines for the biological treat-
doi:10.1159/000097965 ment of bipolar disorders: Update 2012 on the long-term treatment
Benedetti, F. (2009). Understanding the mechanisms in health and dis- of bipolar disorder. World Journal of Biological Psychiatry, 14(3),
ease. New York: Oxford University Press. 154–219.
Berk, M., Malhi, G. S., Cahill, C., Carman, A. C., Hadzi-Pavlovic, D., García-Rizo, C., Kirkpatrick, B., Fernandez-Egea, E., Oliveira, C.,
Hawkins, M. T., . . . Mitchell, P. B. (2007). The Bipolar Depression Meseguer, A., Grande, I., . . . Bernardo, M. (2014). “Is bipolar disor-
Rating Scale (BDRS): Its development, validation and utility. Bipo- der an endocrine condition?” Glucose abnormalities in bipolar dis-
lar Disorders, 9(6), 571–579. doi:10.1111/j.1399-5618.2007.00536.x order. Acta Psychiatrica Scandinavica, 129(1), 73–74. doi:10.1111/
Berwaerts, J., Melkote, R., Nuamah, I., & Lim, P. (2012). A ran- acps.12194
domized, placebo- and active-controlled study of paliperidone Geddes, J. R., & Miklowitz, D. J. (2013). Treatment of bipolar disorder.
extended-release as maintenance treatment in patients with bipolar Lancet, 381(9878), 1672–1682.
I disorder after an acute manic or mixed episode. Journal of Affective Ghaemi, S. N. (2009). A clinician’s guide to statistics and epidemiology in
Disorders, 138(3), 247–258. mental health: Measuring truth and uncertainty. New York: Cam-
Bowden, C. L., Swann, A. C., Calabrese, J. R., McElroy, S. L., Morris, D., bridge University Press.
Petty, F., . . . Gyulai, L. (1997). Maintenance clinical trials in bipolar Gijsman, H. J., Geddes, J. R., Rendell, J. M., Nolen, W. A., & Goodwin,
disorder:  Design implications of the divalproex-lithium-placebo G. M. (2004). Antidepressants for bipolar depression: A systematic
study. Psychopharmacology Bulletin, 33(4), 693–699. review of randomized, controlled trials. The American Journal of
Chang, C. K., Hayes, R. D., Perera, G., Broadbent, M. T., Fernandes, Psychiatry, 161(9), 1537–1547. doi:10.1176/appi.ajp.161.9.1537
A. C., Lee, W. E., . . . Stewart, R. (2011). Life expectancy at birth for Henkel, V., Casaulta, F., Seemuller, F., Krahenbuhl, S., Obermeier,
people with serious mental illness and other major disorders from M., Husler, J., & Moller, H. J. (2012). Study design features affect-
a secondary mental health care case register in London. PLoS One, ing outcome in antidepressant trials. Journal of Affective Disorders,
6(5), e19590. doi:10.1371/journal.pone.0019590 141(2–3), 160–167. doi:10.1016/j.jad.2012.03.021
Cipriani, A., Hawton, K., Stockton, S., & Geddes, J. R. (2013). Lithium Hoertel, N., de Maricourt, P., & Gorwood, P. (2013). Novel routes to
in the prevention of suicide in mood disorders: Updated systematic bipolar disorder drug discovery. Expert Opinion on Drug Discovery,
review and meta-analysis. BMJ, 346, f3646. doi:10.1136/bmj.f3646 8(8), 907–918. doi:10.1517/17460441.2013.804057

Hurko, O., & Ryan, J. L. (2005). Translational research in central nervous system drug discovery. NeuroRx, 2(4), 671–682. doi:10.1602/neurorx.2.4.671
Jaeger, J., & Vieta, E. (2007). Functional outcome and disability in bipolar disorders: Ongoing research and future directions. Bipolar Disorders, 9(1–2), 1–2. doi:10.1111/j.1399-5618.2007.00441.x
Judd, L. L., Schettler, P. J., Akiskal, H. S., Coryell, W., Leon, A. C., Maser, J. D., & Solomon, D. A. (2008). Residual symptom recovery from major affective episodes in bipolar disorders and rapid episode relapse/recurrence. Archives of General Psychiatry, 65(4), 386–394. doi:10.1001/archpsyc.65.4.386
Judd, L. L., Schettler, P. J., Akiskal, H. S., Maser, J., Coryell, W., Solomon, D., . . . Keller, M. (2003). Long-term symptomatic status of bipolar I vs. bipolar II disorders. International Journal of Neuropsychopharmacology, 6(2), 127–137. doi:10.1017/S1461145703003341
Kemp, A. S., Schooler, N. R., Kalali, A. H., Alphs, L., Anand, R., Awad, G., . . . Vermeulen, A. (2010). What is causing the reduced drug-placebo difference in recent schizophrenia clinical trials and what can be done about it? Schizophrenia Bulletin, 36(3), 504–509. doi:10.1093/schbul/sbn110
Khan, A., Bhat, A., Kolts, R., Thase, M. E., & Brown, W. (2010). Why has the antidepressant-placebo difference in antidepressant clinical trials diminished over the past three decades? CNS Neuroscience and Therapeutics, 16(4), 217–226. doi:10.1111/j.1755-5949.2010.00151.x
Khan, A., Schwartz, K., Kolts, R. L., Ridgway, D., & Lineberry, C. (2007). Relationship between depression severity entry criteria and antidepressant clinical trial outcomes. Biological Psychiatry, 62(1), 65–71. doi:10.1016/j.biopsych.2006.08.036
Kim, J. H., Jung, H. Y., Kang, U. G., Jeong, S. H., Ahn, Y. M., Byun, H. J., . . . Kim, Y. S. (2002). Metric characteristics of the drug-induced extrapyramidal symptoms scale (DIEPSS): A practical combined rating scale for drug-induced movement disorders. Movement Disorders, 17(6), 1354–1359.
Kinon, B. J., Chen, L., Ascher-Svanum, H., Stauffer, V. L., Kollack-Walker, S., Zhou, W., . . . Kane, J. M. (2010). Early response to antipsychotic drug therapy as a clinical marker of subsequent response in the treatment of schizophrenia. Neuropsychopharmacology, 35(2), 581–590. doi:10.1038/npp.2009.164
Kirsch, I. (2009). Antidepressants and the placebo response. Epidemiologia e Psichiatria Sociale, 18(4), 318–322.
Kobak, K. A., Leuchter, A., DeBrota, D., Engelhardt, N., Williams, J. B., Cook, I. A., . . . Alpert, J. (2010). Site versus centralized raters in a clinical depression trial: Impact on patient selection and placebo response. Journal of Clinical Psychopharmacology, 30(2), 193–197. doi:10.1097/JCP.0b013e3181d20912
Laursen, T. M. (2011). Life expectancy among persons with schizophrenia or bipolar affective disorder. Schizophrenia Research, 131(1–3), 101–104. doi:10.1016/j.schres.2011.06.008
Leboyer, M., Soreca, I., Scott, J., Frye, M., Henry, C., Tamouza, R., & Kupfer, D. J. (2012). Can bipolar disorder be viewed as a multi-system inflammatory disease? Journal of Affective Disorders, 141(1), 1–10. doi:10.1016/j.jad.2011.12.049
Leucht, S., Arbter, D., Engel, R. R., Kissling, W., & Davis, J. M. (2009). How effective are second-generation antipsychotic drugs? A meta-analysis of placebo-controlled trials. Molecular Psychiatry, 14(4), 429–447. doi:10.1038/sj.mp.4002136
Leucht, S., Cipriani, A., Spineli, L., Mavridis, D., Orey, D., Richter, F., . . . Davis, J. M. (2013). Comparative efficacy and tolerability of 15 antipsychotic drugs in schizophrenia: A multiple-treatments meta-analysis. Lancet, 382(9896), 951–962. doi:10.1016/S0140-6736(13)60733-3
Leucht, S., Heres, S., & Davis, J. M. (2013). Increasing placebo response in antipsychotic drug trials: Let’s stop the vicious circle. The American Journal of Psychiatry, 170(11), 1232–1234. doi:10.1176/appi.ajp.2013.13081129
Leucht, S., Hierl, S., Kissling, W., Dold, M., & Davis, J. M. (2012). Putting the efficacy of psychiatric and general medicine medication into perspective: Review of meta-analyses. The British Journal of Psychiatry, 200, 97–106.
Machado-Vieira, R., Salvadore, G., DiazGranados, N., Ibrahim, L., Latov, D., Wheeler-Castillo, C., . . . Zarate, C. A. (2010). New therapeutic targets for mood disorders. TheScientificWorldJournal, 10, 713–726. doi:10.1100/Tsw.2010.65
Mallinckrodt, C. H., Zhang, L., Prucka, W. R., & Millen, B. A. (2010). Signal detection and placebo response in schizophrenia: Parallels with depression. Psychopharmacology Bulletin, 43(1), 53–72.
Mander, A. J., & Loudon, J. B. (1988). Rapid recurrence of mania following abrupt discontinuation of lithium. Lancet, 2(8601), 15–17.
Marangell, L. B. (2004). The importance of subsyndromal symptoms in bipolar disorder. Journal of Clinical Psychiatry, 65, 24–27.
March, J. S., Silva, S. G., Compton, S., Shapiro, M., Califf, R., & Krishnan, R. (2005). The case for practical clinical trials in psychiatry. The American Journal of Psychiatry, 162(5), 836–846. doi:10.1176/appi.ajp.162.5.836
Marcus, R., Khan, A., Rollin, L., Morris, B., Timko, K., Carson, W., & Sanchez, R. (2011). Efficacy of aripiprazole adjunctive to lithium or valproate in the long-term treatment of patients with bipolar I disorder with an inadequate response to lithium or valproate monotherapy: A multicenter, double-blind, randomized study. Bipolar Disorders, 13(2), 133–144.
Martinez-Aran, A., Vieta, E., Chengappa, K. N. R., Gershon, S., Mullen, J., & Paulsson, B. (2008). Reporting outcomes in clinical trials for bipolar disorder: A commentary and suggestions for change. Bipolar Disorders, 10(5), 566–579. doi:10.1111/j.1399-5618.2008.00611.x
Mathew, S. J., Manji, H. K., & Charney, D. S. (2008). Novel drugs and therapeutic targets for severe mood disorders. Neuropsychopharmacology, 33(9), 2080–2092. doi:10.1038/sj.npp.1301652
McElroy, S. L., Weisler, R. H., Chang, W., Olausson, B., Paulsson, B., Brecher, M., . . . Investigators, E. I. (2010). A double-blind, placebo-controlled study of quetiapine and paroxetine as monotherapy in adults with bipolar depression (EMBOLDEN II). Journal of Clinical Psychiatry, 71(2), 163–174.
McIntyre, R. S. (2010). Aripiprazole for the maintenance treatment of bipolar I disorder: A review. Clinical Therapeutics, 32(1), S32–38.
McIntyre, R. S., Cohen, M., Zhao, J., Alphs, L., Macek, T. A., & Panagides, J. (2010). Asenapine in the treatment of acute mania in bipolar I disorder: A randomized, double-blind, placebo-controlled trial. Journal of Affective Disorders, 122(1–2), 27–38.
Moerman, D. (2002). Meaning, medicine and the placebo effect. New York: Cambridge University Press.
Moher, D., Hopewell, S., Schulz, K. F., Montori, V., Gotzsche, P. C., Devereaux, P. J., . . . Altman, D. G. (2010). CONSORT 2010 explanation and elaboration: Updated guidelines for reporting parallel group randomised trials. BMJ: British Medical Journal, 340. doi:10.1136/bmj.c869
Moncrieff, J. (1995). Lithium revisited: A re-examination of the placebo-controlled trials of lithium prophylaxis in manic-depressive disorder. The British Journal of Psychiatry, 167(5), 569–573; discussion 573–564.
Pacchiarotti, I., Bond, D. J., Baldessarini, R. J., Nolen, W. A., Grunze, H., Licht, R. W., . . . Vieta, E. (2013). The International Society for Bipolar Disorders (ISBD) task force report on antidepressant use in bipolar disorders. The American Journal of Psychiatry, 170(11), 1249–1262. doi:10.1176/appi.ajp.2013.13020185
Parker, G. (2012). Bipolar II disorder: Modelling, measuring and managing (2nd ed.). New York: Cambridge University Press.
Popovic, D., Reinares, M., Amann, B., Salamero, M., & Vieta, E. (2011). Number needed to treat analyses of drugs used for maintenance treatment of bipolar disorder. Psychopharmacology (Berlin), 213(4), 657–667.
Popovic, D., Reinares, M., Goikolea, J. M., Bonnin, C. M., Gonzalez-Pinto, A., & Vieta, E. (2012). Polarity index of

pharmacological agents used for maintenance treatment of bipolar disorder. European Neuropsychopharmacology, 22(5), 339–346.
Popovic, D., Torrent, C., Goikolea, J. M., Cruz, N., Sánchez-Moreno, J., González-Pinto, A., & Vieta, E. (2014). Clinical implications of predominant polarity and the polarity index in bipolar disorder: A naturalistic study. Acta Psychiatrica Scandinavica, 129(5), 366–374.
Post, R. M. (2009). Myth of evidence-based medicine for bipolar disorder. Expert Review of Neurotherapeutics, 9(9), 1271–1273.
Post, R. M. (2010). Special issues of research methodology in bipolar disorder clinical treatment trials. In M. Hertzman & L. Adler (Eds.), Clinical trials in psychopharmacology: A better brain (2nd ed.). Hoboken, NJ: John Wiley.
Raz, A., Zigman, P., & de Jong, V. (2009). Placebo effects and responses: Filling the interstices with meaning. PsycCRITIQUES, 54, 33.
Reinares, M., Rosa, A. R., Franco, C., Goikolea, J. M., Fountoulakis, K., Siamouli, M., . . . Vieta, E. (2013). A systematic review on the role of anticonvulsants in the treatment of acute bipolar depression. International Journal of Neuropsychopharmacology, 16(2), 485–496. doi:10.1017/S1461145712000491
Rosa, A. R., Sanchez-Moreno, J., Martinez-Aran, A., Salamero, M., Torrent, C., Reinares, M., . . . Vieta, E. (2007). Validity and reliability of the Functioning Assessment Short Test (FAST) in bipolar disorder. Clinical Practice and Epidemiology in Mental Health, 3, 5. doi:10.1186/1745-0179-3-5
Rush, A. J., Kraemer, H. C., Sackeim, H. A., Fava, M., Trivedi, M. H., Frank, E., . . . ACNP Task Force. (2006). Report by the ACNP Task Force on response and remission in major depressive disorder. Neuropsychopharmacology, 31(9), 1841–1853. doi:10.1038/sj.npp.1301131
Schalkwijk, S., Undurraga, J., Tondo, L., & Baldessarini, R. J. (2014). Declining efficacy in controlled trials of antidepressants: Effects of placebo dropout. The International Journal of Neuropsychopharmacology / Official Scientific Journal of the Collegium Internationale Neuropsychopharmacologicum (CINP), 17(8), 1343–1352.
Schou, M. (1993). Is there a lithium withdrawal syndrome? An examination of the evidence. The British Journal of Psychiatry, 163, 514–518.
Schulz, K. F., & Grimes, D. A. (2002). Sample size slippages in randomised trials: Exclusions and the lost and wayward. Lancet, 359(9308), 781–785. doi:10.1016/S0140-6736(02)07882-0
Siddique, J., Brown, C. H., Hedeker, D., Duan, N., Gibbons, R. D., Miranda, J., & Lavori, P. W. (2008). Missing data in longitudinal trials—Part B, analytic issues. Psychiatric Annals, 38(12), 793–801.
Sidor, M. M., & MacQueen, G. M. (2011). Antidepressants for the acute treatment of bipolar depression: A systematic review and meta-analysis. Journal of Clinical Psychiatry, 72(2), 156–167. doi:10.4088/JCP.09r05385gre
Sidor, M. M., & MacQueen, G. M. (2012). An update on antidepressant use in bipolar depression. Current Psychiatry Reports, 14(6), 696–704. doi:10.1007/s11920-012-0323-6
Spanemberg, L., Massuda, R., Lovato, L., Paim, L., Vares, E. A., Sica da Rocha, N., & Cereser, K. M. (2012). Pharmacological treatment of bipolar depression: Qualitative systematic review of double-blind randomized clinical trials. Psychiatric Quarterly, 83(2), 161–175. doi:10.1007/s11126-011-9191-1
Strech, D., Soltmann, B., Weikert, B., Bauer, M., & Pfennig, A. (2011). Quality of reporting of randomized controlled trials of pharmacologic treatment of bipolar disorders: A systematic review. Journal of Clinical Psychiatry, 72(9), 1214–1221. doi:10.4088/JCP.10r06166yel
Sysko, R., & Walsh, B. T. (2007). A systematic review of placebo response in studies of bipolar mania. Journal of Clinical Psychiatry, 68(8), 1213–1217.
Tarr, G. P., Herbison, P., de la Barra, S. L., & Glue, P. (2011). Study design and patient characteristics and outcome in acute mania clinical trials. Bipolar Disorders, 13(2), 125–132. doi:10.1111/j.1399-5618.2011.00904.x
Thase, M. E. (2003). Evaluating antidepressant therapies: Remission as the optimal outcome. Journal of Clinical Psychiatry, 64(13), 18–25.
Tierney, J. F., & Stewart, L. A. (2005). Investigating patient exclusion bias in meta-analysis. International Journal of Epidemiology, 34(1), 79–87. doi:10.1093/ije/dyh300
Tohen, M., Frank, E., Bowden, C. L., Colom, F., Ghaemi, S. N., Yatham, L. N., . . . Berk, M. (2009). The International Society for Bipolar Disorders (ISBD) Task Force report on the nomenclature of course and outcome in bipolar disorders. Bipolar Disorders, 11(5), 453–473. doi:10.1111/j.1399-5618.2009.00726.x
Tohen, M., Ketter, T. A., Zarate, C. A., Suppes, T., Frye, M., Altshuler, L., . . . Baker, R. W. (2003). Olanzapine versus divalproex sodium for the treatment of acute mania and maintenance of remission: A 47-week study. The American Journal of Psychiatry, 160(7), 1263–1271.
Undurraga, J., & Baldessarini, R. J. (2012). Randomized, placebo-controlled trials of antidepressants for acute major depression: Thirty-year meta-analytic review. Neuropsychopharmacology, 37(4), 851–864. doi:10.1038/npp.2011.306
Undurraga, J., Baldessarini, R. J., Valentí, M., Pacchiarotti, I., Tondo, L., Vázquez, G., & Vieta, E. (2012). Bipolar depression: Clinical correlates of receiving antidepressants. Journal of Affective Disorders, 139(1), 89–93. doi:10.1016/j.jad.2012.01.027
Vázquez, G. H., Tondo, L., Undurraga, J., & Baldessarini, R. J. (2013). Overview of antidepressant treatment of bipolar depression. International Journal of Neuropsychopharmacology, 16(7), 1673–1685. doi:10.1017/S1461145713000023
Vieta, E. (2014). The bipolar maze: A roadmap through translational psychopathology. Acta Psychiatrica Scandinavica, 129(5), 323–327.
Vieta, E., & Cruz, N. (2012). Head to head comparisons as an alternative to placebo-controlled trials. European Neuropsychopharmacology, 22(11), 800–803.
Vieta, E., Grunze, H., Azorin, J. M., & Fagiolini, A. (2014). Phenomenology of manic episodes according to the presence or absence of depressive features as defined in DSM-5: Results from the IMPACT self-reported online survey. Journal of Affective Disorders, 156, 206–213.
Vieta, E., Pappadopulos, E., Mandel, F. S., & Lombardo, I. (2011). Impact of geographical and cultural factors on clinical trials in acute mania: Lessons from a ziprasidone and haloperidol placebo-controlled study. International Journal of Neuropsychopharmacology, 14(8), 1017–1027.
Vieta, E., Thase, M. E., Naber, D., D’Souza, B., Rancans, E., Lepola, U., . . . Eriksson, H. (2014). Efficacy and tolerability of flexibly-dosed adjunct TC-5214 (dexmecamylamine) in patients with major depressive disorder and inadequate response to prior antidepressant. European Neuropsychopharmacology: The Journal of the European College of Neuropsychopharmacology, 24(4), 564–574.
Vieta, E., & Valentí, M. (2013). Pharmacological management of bipolar depression: Acute treatment, maintenance, and prophylaxis. CNS Drugs, 27(7), 515–529. doi:10.1007/s40263-013-0073
Walsh, B. T., Seidman, S. N., Sysko, R., & Gould, M. (2002). Placebo response in studies of major depression: Variable, substantial, and growing. JAMA, 287(14), 1840–1847.
World Health Organization. (n.d.). WHO Disability Assessment Schedule 2.0: WHODAS 2.0. Retrieved from http://www.who.int/classifications/icf/whodasii/en/
Yildiz, A., Guleryuz, S., Ankerst, D. P., Ongur, D., & Renshaw, P. F. (2008). Protein kinase C inhibition in the treatment of mania: A double-blind, placebo-controlled trial of tamoxifen.

Archives of General Psychiatry, 65(3), 255–263. doi:10.1001/archgenpsychiatry.2007.43
Yildiz, A., Vieta, E., Correll, C. U., Nikodem, M., & Baldessarini, R. J. (2014). Critical Issues on the Use of Network Meta-analysis in Psychiatry. Harvard Review of Psychiatry, 22(6), 367–372.
Yildiz, A., Vieta, E., Leucht, S., & Baldessarini, R. J. (2011). Efficacy of antimanic treatments: Meta-analysis of randomized, controlled trials. Neuropsychopharmacology, 36(2), 375–389. doi:10.1038/npp.2010.192
Yildiz, A., Vieta, E., Tohen, M., & Baldessarini, R. J. (2011). Factors modifying drug and placebo responses in randomized trials for bipolar mania. International Journal of Neuropsychopharmacology, 14(7), 863–875. doi:10.1017/S1461145710001641
Young, A. H., McElroy, S. L., Bauer, M., Philips, N., Chang, W., Olausson, B., . . . Investigators, E. I. (2010). A double-blind, placebo-controlled study of quetiapine and lithium monotherapy in adults in the acute phase of bipolar depression (EMBOLDEN I). Journal of Clinical Psychiatry, 71(2), 150–162.
Young, R. C., Biggs, J. T., Ziegler, V. E., & Meyer, D. A. (1978). A rating scale for mania: Reliability, validity and sensitivity. The British Journal of Psychiatry, 133, 429–435.
Zimmerman, M., Posternak, M. A., & Chelminski, I. (2004). Implications of using different cut-offs on symptom severity scales to define remission from depression. International Clinical Psychopharmacology, 19(4), 215–220.
