
European Neuropsychopharmacology (2008) 18, 623–627

www.elsevier.com/locate/euroneuro

A regulatory Apologia — A review of placebo-controlled studies in regulatory submissions of new-generation antidepressants☆
Hans Melander a,⁎, Tomas Salmonson a,b, Eric Abadie b,c, Barbara van Zwieten-Boot b,d
a Medical Products Agency, P.O. Box 26, SE-751 03 Uppsala, Sweden
b CHMP, EMEA, London, UK
c Agence Française de Sécurité Sanitaire des Produits de Santé, Saint-Denis, France
d Medicines Evaluation Board, The Hague, Netherlands

Received 2 May 2008; received in revised form 11 June 2008; accepted 16 June 2008

KEYWORDS
Major depression; Placebo; Clinical relevance

Abstract

Data on percentage of patients experiencing a relevant response (≥50% reduction of the baseline Hamilton Depression Scale (HAMD) score), average baseline severity and sample size were retrieved for all placebo-controlled studies in regulatory submissions of SSRIs and SNRIs between 1984 and 2003. Overall there was 16%-units (95% CI: 12; 20) more responders on active drug compared to placebo. There was no evidence of a diminishing magnitude of effect with lower severity at baseline. With one exception significant differences varying between 13.5 and 19.3%-units were demonstrated for the individual antidepressants. Statistically significant mean differences versus placebo in change in HAMD are not a proper basis for evaluation of clinical relevance and are not sufficient for approval. Differences in the percentage of patients experiencing a clinically relevant response should also be demonstrated. In this respect, the approved SSRIs and SNRIs were found superior to placebo, independent of severity of depression.
© 2008 Elsevier B.V. and ECNP. All rights reserved.

☆ Disclaimer: The views presented are those of the individual and may not be understood or quoted as being made on behalf of EMEA or reflecting the position of EMEA.
⁎ Corresponding author. Tel.: +46 18174600; fax: +46 18508730.
E-mail address: hans.melander@mpa.se (H. Melander).

0924-977X/$ - see front matter © 2008 Elsevier B.V. and ECNP. All rights reserved.
doi:10.1016/j.euroneuro.2008.06.003

1. Introduction

The efficacy of antidepressant medicinal products is continuously discussed (Kirsch and Saperstein, 1998; NICE, 2004; Kirsch et al., 2002, 2008). In particular the clinical relevance of the effects demonstrated for new-generation drugs such as selective serotonin reuptake inhibitors (SSRIs) and serotonin noradrenalin reuptake inhibitors (SNRIs) is questioned. It has been shown that meta-analyses based on published data only are bound to overestimate the magnitude of effect due to selective publication and selective reporting, and with the inclusion of unpublished data submitted to the drug regulatory authorities the modest effects seem to diminish further (Melander et al., 2003, Turner et al., 2008). Some authors have discussed the impact of severity of depression and argue that antidepressants are effective for severely depressed patients but not for patients with mild to moderate depression (Angst, 1993, Kahn et al., 2002). In a recent meta-analysis based on placebo-controlled studies submitted to the US Food and Drug Administration this hypothesis is claimed to be confirmed (Kirsch et al., 2008). Against this background one can ask why the new-generation antidepressant medicinal products were ever approved.

In the above meta-analyses questioning the clinical relevance of antidepressant effects, the focus has been on the absolute average difference in change from baseline on the Hamilton Depression Rating Scale (HAMD) between active drug and placebo. Indeed, HAMD is the primary outcome in most antidepressant studies and it is used by the regulatory authorities to evaluate whether statistically significant differences between treatments have been shown. However, mean difference on any rating scale is not an appropriate outcome for the evaluation of clinical relevance. Once statistical significance has been established, the clinical value is judged on the basis of other outcomes, the most important being the percentage of patients achieving a clinically meaningful response. Thus, while the primary statistical analysis operates on comparison on the group level, this responder analysis rather focuses on the individual patient.

The purpose of this paper is to discuss the clinical relevance of SSRIs and SNRIs, the two most common classes of new-generation antidepressants, in terms of an outcome that is based on what is considered to be a relevant effect for an individual patient. The discussion will be based on analyses of the total placebo-controlled database available when these medicinal products were approved. In addition, the hypothesis that these antidepressants are effective only in severely depressed patients will be tested in this context.

2. Experimental procedures

2.1. Selection of studies

The approval of antidepressants by competent authorities in Europe as well as in other regions is based on more or less identical documentation. We chose to base our analyses on the clinical documentation for the six SSRIs and two SNRIs that have been approved in Sweden for treating major depressive episodes. The market authorisation applications for these medicinal products were submitted between 1984 and 2003. The two most recent substances, escitalopram and duloxetine, have a European approval, while the earlier substances are nationally approved in Sweden as well as in most other European countries. To be included in the analyses the studies should be randomised, double-blind, placebo-controlled short-term studies of at least 4 weeks duration with at least one treatment group with a dose in the dose range that was later approved. In studies with more than one dose in the approved dose range the results for the dose closest to the recommended dose were used in the analysis. With respect to severity of depression the studies should have an inclusion criterion requiring a score of at least 15 on the 17 item version of HAMD.

Table 1 The placebo-controlled database at approval for SSRIs and SNRIs

Substance      Year of submission   Number of studies   Total number of patients   Average baseline HAMD17 score⁎
                                                                                   Mean    Range
Paroxetine     1990                 15⁎⁎                1142                       21.4    19.6; 24.1
Fluvoxamine    1989                 5                   408                        22.6    19.7; 26.3
Citalopram     1992                 5                   420                        23.8    22.7; 26.4
Sertraline     1993                 4                   914                        23.2    20.5; 25.5
Fluoxetine     1984                 8                   570                        22.0    19.3; 28.4
Escitalopram   2001                 4                   1181                       21.2    19.5; 21.4
Venlafaxine    1992, 1996           9⁎⁎⁎                1493                       21.4    19.5; 28.4
Duloxetine     2003                 6                   1246                       19.8    17.6; 21.3
Total                               56                  7374                       21.6    17.6; 28.4

⁎) Converted from average baseline MADRS score for one sertraline study and two escitalopram studies.
⁎⁎) One additional paroxetine study with 29 patients in total did not report responder results.
⁎⁎⁎) Immediate release formulation: 6 studies; extended release formulation: 3 studies.

2.2. Data retrieval

The primary endpoint for this analysis was the percentage of responders with response defined as at least 50% reduction of the baseline HAMD score at the end of the study. The calculation of percentage of responders was based on the randomised set, and patients discontinuing prior to end of study without evaluable data were considered as non-responders. Percentage of responders, number of patients and average HAMD score for each treatment were retrieved from the original study reports. In case of missing or inappropriately calculated responder figures, appropriate figures were collected from the authority assessment reports. In most studies the 17 item version of HAMD was used, and for studies using other versions of HAMD the results were converted to the 17 item version (HAMD17). In a few studies the Montgomery Åsberg Depression Rating Scale
(MADRS) was the primary endpoint, and for these studies response was defined as at least 50% reduction of the baseline MADRS score. In three of these studies the HAMD rating scale was not used at all, and the average baseline MADRS score was converted to a HAMD17 score using the relation between HAMD17 and MADRS derived from 45 studies using both scales. The data was retrieved by two reviewers independently, and any discrepancies were discussed and resolved in consensus.
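
The responder rule described in this section can be illustrated with a short Python sketch. It is not the authors' code: the `Patient` record, its field names and the example scores are hypothetical and chosen only for illustration. The rules themselves are taken from the text: response means at least 50% reduction of the baseline score, the denominator is the randomised set, and patients without evaluable end-of-study data count as non-responders.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Patient:
    """Hypothetical per-patient record; field names are illustrative only."""
    baseline: float             # baseline HAMD (or MADRS) total score
    endpoint: Optional[float]   # score at end of study; None = no evaluable data


def is_responder(p: Patient) -> bool:
    """Response = at least 50% reduction of the baseline score."""
    if p.endpoint is None:      # discontinued without evaluable data
        return False            # counted as a non-responder
    return p.endpoint <= 0.5 * p.baseline


def responder_rate(randomised: Sequence[Patient]) -> float:
    """Percentage of responders with all randomised patients as denominator."""
    if not randomised:
        raise ValueError("randomised set is empty")
    n_resp = sum(is_responder(p) for p in randomised)
    return 100.0 * n_resp / len(randomised)


# Made-up example arm with four randomised patients
arm = [
    Patient(baseline=22, endpoint=10),    # 12-point drop (more than 50%): responder
    Patient(baseline=22, endpoint=11),    # exactly 50% reduction: responder
    Patient(baseline=24, endpoint=15),    # 9-point drop (less than 50%): non-responder
    Patient(baseline=20, endpoint=None),  # dropout without data: non-responder
]
print(responder_rate(arm))  # 50.0
```

Because the dropout stays in the denominator, the example arm has a responder rate of 50% rather than the 67% that would result from analysing only the three patients with evaluable data.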

2.3. Statistical methods

The relation between percentage of responders and severity of depression, measured as average baseline HAMD17 score, was evaluated with a linear regression model. The analysis was performed with sample size as weight. In the meta-analyses of response rates for the different substances the estimates from the individual studies were combined, with the inverse of the variance of the estimates as weights (DerSimonian and Laird, 1986).

3. Results

In total 57 studies fulfilled the inclusion criteria, but for one of these studies responder data was not available, neither in the company study report nor the authority assessment report. Since the sample size of this paroxetine study constitutes only 2.5% of the paroxetine placebo-controlled database and 0.4% of the entire SSRI and SNRI placebo-controlled database, the omission of this study was considered not to influence the paroxetine or the overall results. The distribution over substances for the remaining 56 studies is presented in Table 1. The amount of placebo-controlled data varies between the substances with a clear tendency for smaller databases for the drugs that were approved early, with the exception of paroxetine. Considering all studies the average baseline HAMD17 score ranged between 17.6 and 28.4. The fluoxetine and venlafaxine studies covered most of this total range while the studies with escitalopram and especially duloxetine had average baseline HAMD17 scores predominantly in the lower part of the total range.

The placebo response as well as the response to active drug showed large variability between studies (Fig. 1). The percentage of responders varied between 13.6% and 67.9%, and 7.5% and 55.4% for active drug and placebo, respectively. A similar variability was observed for the difference between treatments, from −3.3% to 49.6%-units (Fig. 2). The most extreme differences in either direction occurred in fairly small studies.

There was no statistical evidence of a relation between average baseline HAMD17 score and difference in percentage of responders (Fig. 1, p = 0.51, test of difference in slopes). Neither was there any evidence of a relation between baseline HAMD17 score and percentage of responders independent of treatment (p = 0.98 and p = 0.31 for active treatment and placebo, respectively, test of slope = 0). In the absence of a difference in slopes the overall difference in percentage of responders was estimated to 16.0%-units (95% CI: 12.0; 20.0).

The overall magnitude of effect for the different substances is illustrated in Fig. 3. With the exception of fluvoxamine significant differences varying between 13.5 and 19.3%-units were demonstrated for all SSRIs and SNRIs.

Figure 1 Scatter plot of average baseline HAMD17 score versus percentage of responders with regression lines for active treatment (solid line) and placebo (dashed line).

Figure 2 Funnel plot. Estimate of the difference in percentage of responders between active drug and placebo in relation to the total number of patients (active plus placebo) on which the estimate is based.

Figure 3 Overall difference in percentage of responders between active drug and placebo with 95% confidence intervals for SSRIs and SNRIs.

4. Discussion

In addition to being overall statistically superior to placebo with respect to average change in HAMD score (although
some individual studies might fail), we have shown that SSRIs and SNRIs overall are superior to placebo in providing a clinically relevant benefit to depressed patients (48% versus 32%, respectively, responding to treatment). There is no evidence that this excess benefit is limited to severely depressed patients.

Our analyses are based on the entire placebo-controlled documentation for all SSRIs and SNRIs available when these medicinal products were approved. In contrast to meta-analyses based on statistical and medical reviews of studies submitted to the FDA we have had access to full study reports, and thus have been able to perform more comprehensive data retrieval. In our study responder data was missing in only one out of 57 studies, while in a recent study based on FDA submissions (Kirsch et al., 2008) twelve out of 47 studies were excluded due to missing information on crucial variables. When applying for marketing authorisation, the applicant is obliged to submit full reports of all studies performed by the applicant as well as all available information on any study performed by others than the applicant. Thus, it is reasonable to conclude that the basis for approval was not subjected to selection bias. Neither should there have been any risk for selective reporting since missing appropriately calculated responder figures are routinely asked for or calculated by the authority. These conclusions are supported by the symmetrical funnel plot (Fig. 2). We did not include placebo-controlled studies after the approval, essentially studies where an approved antidepressant is included as an active control in a placebo-controlled study for a forthcoming competitor. First, these studies were not available at the approval, and second, a re-evaluation including these studies is doubtful as usually only one dose is used with no or limited options for further titration, which might be sub-optimal.

The purpose of this paper was not to compare the different SSRIs and SNRIs, and the absence of direct comparisons should preclude any comparison between the substances. However, it seems reasonable to conclude that the documentation for fluvoxamine is less convincing. Comparisons between the different substances should be based on head-to-head comparisons predominantly performed after the approval. However, these studies are not always submitted to the regulatory authorities, and the decision to make them publicly available is left to the sponsor and selective publication as well as selective reporting is to be expected.

The responder criterion used in this investigation, at least 50% reduction of the baseline HAMD score, is well recognized and used in almost every antidepressant study. Since a HAMD17 score of at least 18 is required for inclusion in most studies, at least 9 points improvement is required for an individual patient to be counted as a responder. This should be compared to the average difference in change of about 2 points usually observed. The responder criterion is a relative measure, and thus floor effects, which might contribute to the diminishing effect size observed with lower average HAMD score at baseline, are avoided. Finally, counting discontinuing patients with no evaluable data as non-responders is conservative, i.e. not likely to favour active treatment, since discontinuations are usually less frequent in the placebo groups. An alternative outcome based on individual patient benefit could have been the percentage of patients in remission, usually defined as a HAMD17 score of 7 or lower. However, remission rate is not always a pre-specified endpoint and remission results have been less frequently, and thus potentially selectively, reported. Furthermore, to establish whether a sustained remission has occurred longer studies are required.

One can always discuss whether a difference in response rate of 16%-units is large, modest or only marginal. In analyses of average absolute change from baseline it has been estimated that about 80% of the drug effect is attributable to placebo. Similarly, with 49% responders on active treatment and 33% on placebo one can argue that two thirds of the drug effect is attributable to placebo. However, such arguments are based on the doubtful assumption that the placebo effect and the pharmacological effect are additive. Furthermore, some of the placebo effects are probably due to study specific procedures (increased attention, therapeutic impact of weekly rating sessions) that are not present in clinical practice. Hence, the 16%-units difference observed in placebo-controlled studies could be considered as a lower limit of the pharmacological effect that could be expected in clinical practice.

In conclusion, the approval of the SSRIs and SNRIs was based on data demonstrating that they provide clinically meaningful benefits to a non-negligible percentage of the patients.

Role of the funding source

There has been no funding of the research reported in this paper.

Contributors

HM and TS designed the study. HM was responsible for data retrieval and statistical analysis. All authors interpreted the results and contributed to the writing of the paper.

Conflict of interest

None of the authors has any conflict of interest apart from being an employee of a regulatory authority.

Acknowledgements

We thank Ms Therese Gester for assisting in the data retrieval process.

References

Angst, J., 1993. Severity of depression and benzodiazepine co-medication in relationship to efficacy of antidepressants in acute trials: a meta-analysis of moclobemide trials. Hum. Psychopharmacol. 8, 401–407.
DerSimonian, R., Laird, N., 1986. Meta-analysis in clinical trials. Control Clin. Trials 7, 177–188.
Kahn, A., Leventhal, R., Kahn, S., et al., 2002. Severity of depression and response to antidepressants and placebo: an analysis of the Food and Drug Administration database. J. Clin. Psychopharmacol. 22, 40–45.
Kirsch, I., Saperstein, G., 1998. Listening to Prozac but hearing placebo: a meta-analysis of antidepressant medication. Prev. Treat. 1 Article 0002a. Available: http://www.journals.apa.org/prevention/volume1/pre001002a.html.
Kirsch, I., Moore, T.J., Scoboria, A., Nicholls, S.S., 2002. The emperor's new drugs: an analysis of antidepressant medication data submitted to the U.S. Food and Drug Administration. Prev. Treat. 5 Article 23. Available: http://www.journals.apa.org/prevention/volume5/pre0050023a.html.
Kirsch, I., Deacon, B.J., Huedo-Medina, T.B., Scoboria, A., Moore, T.J., et al., 2008. Initial severity and antidepressant benefits: a meta-analysis of data submitted to the Food and Drug Administration. PLoS Med. 5 (2), e45. doi:10.1371/journal.pmed.0050045.
Melander, H., Ahlqvist-Rastad, J., Meijer, G., Beermann, B., 2003. Evidence b(i)ased medicine — selective reporting from studies sponsored by pharmaceutical industry: review of studies in new drug applications. BMJ 326, 1171–1173.
National Institute for Clinical Excellence, 2004. Depression: Management of Depression in Primary and Secondary Care. Clinical practice guideline, vol 23. National Institute for Clinical Excellence, London.
Turner, E.H., Matthews, A.M., Linardatos, E., Tell, R.A., Rosenthal, R., 2008. Selective publication of antidepressant trials and its influence on apparent efficacy. N. Engl. J. Med. 358, 252–260.
