

Noninferiority Clinical Trials: The Good, the Bad, and the Ugly
Emmanuel Lesaffre, Dr. Sc.1

1 L-Biostat School of Public Health, KU Leuven, University of Leuven, Leuven, Belgium

Address for correspondence: Emmanuel Lesaffre, Dr. Sc., L-Biostat School of Public Health, KU Leuven University of Leuven, Kapucijnenvoer 35, Leuven 3000, Belgium (e-mail: emmanuel.lesaffre@kuleuven.be).

Semin Liver Dis 2018;38:97–102.

Abstract For decades, the superiority trial has been the most popular design to assess the
efficacy of newly developed drugs in a randomized controlled clinical trial. In a
superiority trial, the aim is to show that the new (experimental) treatment is better
than the standard treatment or placebo. However, it becomes increasingly difficult to

Downloaded by: University of Massachusetts - Amherst. Copyrighted material.


improve the efficacy upon that of existing drugs. For this reason, noninferiority designs
have been suggested. In a noninferiority study, one aims to show that the experimental
treatment does not lower the efficacy of the standard treatment too much, but this loss
of efficacy should be compensated by other better properties. In this article, the
design, aims, and properties of the superiority and the noninferiority trial are
contrasted and illustrated on recently published studies to treat patients with advanced
hepatocellular carcinoma. The author discusses the reasons why noninferiority studies
are becoming popular, but also why the results of noninferiority studies may be difficult
to interpret and can be easily misused. Since only a few noninferiority studies in
hepatocellular cancer have been organized, examples from other therapeutic areas
are also used. Finally, it is indicated how to appreciate the qualities of published
noninferiority studies.

Keywords: equivalence trial; noninferiority trial; nonsignificance; superiority trial

Sorafenib was the first and the only drug that improved overall survival (OS) in patients with advanced hepatocellular carcinoma (HCC).1 The drug is globally approved for this indication, but is also associated with major toxicities and 30% of patients stop the treatment because of intolerance.2 Therefore, there is still the need for developing new drugs that may be better suited for those who do not tolerate sorafenib or for whom the drug is inefficacious. Two strategies have been developed when evaluating newly developed drugs to treat HCC patients.

In the first strategy, the efficacy of the new agent is evaluated in a placebo-controlled trial for the second line treatment of those patients for whom sorafenib turned out not to be appropriate. It is then hoped that the new agent has greater efficacy than placebo. In the second strategy, the efficacy and safety of the new agent is compared with that of sorafenib as the first line treatment of HCC patients. In this case, a cautious approach could then be to demonstrate at least as high efficacy of the experimental treatment as sorafenib, with perhaps some additional benefits.

In the first scenario, one uses a superiority test to verify the greater efficacy than placebo in second line treatment. In the second scenario, a noninferiority (NI) test evaluates the new agent against sorafenib in first line treatment.

In this article, superiority testing is contrasted with NI testing. Both statistical as well as clinical aspects are discussed. Focus is on the NI test, which has become increasingly popular in drug research in the last two decades. Experience with NI tests revealed their positive and negative aspects.

To introduce the theoretical concepts, one generically speaks of the experimental (E) treatment (new agent) and the control (C) treatment (standard treatment or placebo). Illustrations will be based on the results from recent clinical trials on advanced liver cancer. More specifically, the superiority studies like BRISK-PS,3 EVOLVE-1,4 and REACH2 for

Copyright © 2018 by Thieme Medical DOI https://doi.org/


Publishers, Inc., 333 Seventh Avenue, 10.1055/s-0038-1655777.
New York, NY 10001, USA. ISSN 0272-8087.
Tel: +1(212) 584-4662.
98 Noninferiority Clinical Trials Lesaffre

treating HCC patients in second line will be considered. To illustrate NI concepts, use is made of the studies such as BRISK-FL,5 LIGHT,6 and the recently published study7 comparing lenvatinib with sorafenib in the first line treatment of HCC patients. But, also NI studies from other therapeutic areas will be taken to illustrate concepts.

The Superiority Trial

For many years, randomized controlled trials (RCTs) were designed to show that the experimental treatment is superior to a control treatment. Such a RCT is called a superiority trial and the associated statistical test a superiority test. With a statistically significant result based on either a p-value or a confidence interval (CI) one concludes that E has a different effect than C. When the observed result is in favor of E, it is claimed that the experimental treatment has a statistically significantly higher efficacy than the control treatment. What the meaning is of “statistical significance” will become clear below.

We now discuss the superiority tests used in the BRISK-PS, EVOLVE-1, and REACH studies. The primary analysis of the BRISK-PS study is based on a comparison of the estimated hazard ratio (HR) for OS of brivanib-treated HCC patients versus placebo-treated HCC patients. A formal statistical procedure is based on the log rank test. Note that the hazard is an expression of the instantaneous risk, here, for overall mortality. The HR is then the ratio of the hazards for death (here of brivanib/placebo). The log rank test formally evaluates the HR, and implicitly assumes that it is constant over time. Based on this test and assuming that the true HR is equal to 0.67, 282 deaths were required in the BRISK-PS study to obtain 90% power and a statistically significant result at a two-sided significance level of 0.05. A statistically significant result is claimed when p-value is < 0.05. For the EVOLVE-1 study, the sample size computation was based on HR = 0.714, while for the REACH study HR = 0.75 was assumed. In each of these studies, the aim is to reject the null hypothesis given by:

$$H_0:\ \mathrm{HR}\ (= \delta_{\mathrm{SUP}}) = 1 \qquad (1)$$

The null hypothesis assumes that the two treatments have equal hazard rates and hence the same median survival.

It is well-known that a two-sided p-value (sometimes denoted as 2p) is smaller than 0.05 if and only if the two-sided 95% CI of HR does not include the value of 1. It is also useful to remember that a two-sided p-value is the sum of the two one-sided p-values (denoted here as 1p): one in the direction of small HR values (here brivanib better than placebo), and one in the direction of large HR values (here brivanib worse than placebo). In the REACH study, one required that the 1p-value (in the good direction) is smaller than 0.025. With a result in the good direction, a 2p-value smaller than 0.05 is equivalent to a 1p-value smaller than 0.025. Correspondingly, one can compute a one-sided 97.5% CI, which is in fact the two-sided 95% CI but at one end unlimited, i.e., 0 or ∞. The estimated HR (with two-sided 95% CI) in the BRISK-PS study was: 0.89 (0.69, 1.15) with 2p-value = 0.3307. For the EVOLVE-1 study, the results were: HR = 1.05 (0.86, 1.27) and 2p-value = 0.68. In the REACH study, the HR = 0.87 (0.79, 1.05). Since in the REACH study, one reported the one-sided p-value, we could also report the corresponding one-sided 97.5% CI extending here from 0 to 1.05. The three CIs are displayed in ►Fig. 1.
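
The link between the two-sided p-value, the one-sided p-value, and the CI can be sketched numerically. The snippet below is a minimal illustration, assuming a normal (Wald) approximation on the log-HR scale, with the standard error backed out of the reported two-sided 95% CI. The function name `wald_summary` and this approximation are assumptions made here for illustration; they are not the log rank analyses used in the trials, so the computed p-values only roughly approximate the published ones.

```python
from math import erf, log, sqrt

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def wald_summary(hr, ci_low, ci_high, null_hr=1.0):
    """Approximate p-values for H0: HR = null_hr from an HR and its 95% CI.

    Assumption: log(HR) is approximately normal, so the standard error can
    be recovered from the width of the CI on the log scale.
    """
    se = (log(ci_high) - log(ci_low)) / (2.0 * Z_975)
    z = (log(hr) - log(null_hr)) / se
    # One-sided p-value in the direction of small HR (E better than C).
    p_one_sided = normal_cdf(z)
    # Two-sided p-value: both tails beyond |z|.
    p_two_sided = 2.0 * (1.0 - normal_cdf(abs(z)))
    return {
        "z": z,
        "p_one_sided": p_one_sided,
        "p_two_sided": p_two_sided,
        "ci_excludes_null": not (ci_low <= null_hr <= ci_high),
    }


# HRs and two-sided 95% CIs as reported in the text.
reported = {
    "BRISK-PS": (0.89, 0.69, 1.15),
    "EVOLVE-1": (1.05, 0.86, 1.27),
}

for study, (hr, low, high) in reported.items():
    s = wald_summary(hr, low, high)
    print(
        f"{study}: 2p ~ {s['p_two_sided']:.2f}, "
        f"1p ~ {s['p_one_sided']:.2f}, "
        f"95% CI excludes 1: {s['ci_excludes_null']}"
    )
```

For BRISK-PS, where the estimate lies in the favorable direction, the one-sided value is half the two-sided one. For EVOLVE-1 the estimate lies above 1, so the one-sided p-value in the favorable direction exceeds 0.5.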

Fig. 1 Observed hazard ratios and two-sided 95% confidence intervals (CIs)/one-sided 97.5% CIs published in studies comparing experimental
treatments for hepatocellular carcinoma (HCC) patients with sorafenib in the first line and with placebo in the second line. Superiority (bottom 3) and
noninferiority (NI) (top 2) results are reported. The dotted horizontal lines extend the two-sided 95% CI to the one-sided 97.5% CI. The dotted
vertical lines define the NI margins chosen in the respective studies.
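
The way such a figure is read can be sketched in code: a result is classified by where the upper limit of the two-sided 95% CI (equivalently, of the one-sided 97.5% CI) falls relative to HR = 1 and to the NI margin. The sketch below is illustrative only. The function names `classify` and `half_rule_margin` are hypothetical, and the inputs are the CIs and margins quoted in the text of this article, including its two hypothetical examples with a margin of 1.11. The second helper follows the arithmetic the article describes for the half rule and is not a regulatory recipe.

```python
def classify(ci_upper, ni_margin):
    """Classify a result from the upper limit of the 95% CI of the HR (E vs. C).

    An HR below 1 favors the experimental treatment E.
    """
    if ci_upper < 1.0:
        return "superior"
    if ci_upper < ni_margin:
        return "noninferior"
    return "noninferiority not shown"


def half_rule_margin(upper_ci_standard_vs_placebo):
    """NI margin from the half rule, following the arithmetic in this article.

    Halve the distance between the upper 95% CI bound of the HR of the
    standard drug versus placebo and HR = 1, then invert the midpoint to
    express the margin as an HR of E versus the standard drug.
    """
    midpoint = (upper_ci_standard_vs_placebo + 1.0) / 2.0
    return 1.0 / midpoint


# (upper limit of the 95% CI, NI margin), values quoted in the text.
results = {
    "BRISK-FL": (1.22, 1.08),
    "LIGHT": (1.221, 1.0491),
    "hypothetical trial A": (1.08, 1.11),
    "hypothetical trial B": (1.15, 1.11),
}

for name, (upper, margin) in results.items():
    print(f"{name}: {classify(upper, margin)}")

margin = half_rule_margin(0.74)
print(f"Half-rule margin from an upper bound of 0.74: {margin:.2f}")
```

Only the upper CI limit enters the classification, which is why a trial with a smaller observed HR but a wider CI can fail where a trial with a larger observed HR succeeds.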

Seminars in Liver Disease Vol. 38 No. 2/2018



In ►Fig. 1, a hypothetical interval of equivalence is indicated, which runs from 0.9 to 1/0.9 = 1.11. An alternative to the standard superiority test is to assess whether the HR is at most 0.9. The null hypothesis for this test becomes

$$H_0:\ \mathrm{HR} \geq \delta_{\mathrm{SUP}} = 0.9 \qquad (2)$$

The aim of this superiority test is in fact to show that the true HR is less than 0.9, i.e., that the experimental treatment provides a clinically attractive benefit to the patients. This implies that the two-sided 95% CI or the one-sided 97.5% CI lies completely to the left of HR = 0.9. This kind of superiority test was advocated already in 1986 by Spiegelhalter and Freedman,8 and is considered a more clinically useful superiority test, but has not been followed up by the medical community. Of course, one practical problem is the choice of δSUP, which may be subject of discussion.

The Noninferiority Trial

In a NI study one aims to show that the new and the standard drug have almost the same efficacy, but that the new drug exhibits some other benefits to the patients, e.g., causes fewer adverse events. One needs to reflect on the meaning of “almost the same efficacy,” which in turn comes down to choosing a NI margin.

Let us, purely hypothetically, assume that a HR of 1.11 is still clinically acceptable as long as the experimental drug offers some other, important, clinical benefit to the patient. The new drug will then be called noninferior to the standard drug if not only the observed HR but also the one-sided 97.5% CI are located left of the margin of 1.11. For instance, suppose that the observed HR for E versus C is equal to 0.95 with one-sided 97.5% CI running from 0 to 1.08, then E is called noninferior to C. However, with an observed HR of 0.83 and a one-sided 97.5% CI running from 0 to 1.15, E cannot be called noninferior to C. Hence, although the observed HR for the new agent is smaller in the second study, the uncertainty around the observed result is too high and crosses the boundary. For a NI test, the null hypothesis is

$$H_0:\ \mathrm{HR} \geq \delta_{\mathrm{INF}} = 1.11 \qquad (3)$$

Rejection of this null hypothesis occurs when the one-sided 97.5% CI (and thus the corresponding two-sided 95% CI) is located completely left of 1.11, and then one claims that E is noninferior to C for δINF = 1.11. Note that there is a corresponding significance test, which is an adapted version of the log rank test that now tests the null hypothesis given in (3). A significant 1p-value (say < 0.025) then corresponds to a claim of NI of E versus C.

The BRISK-FL study is a NI trial comparing the efficacy of brivanib to sorafenib in first line HCC patients. Given that not all patients benefit from sorafenib, having more than one drug to treat HCC patients would help the treating oncologists in their struggle against the disease. Hence, if brivanib had about the same overall efficacy as sorafenib, but better tolerated by some HCC patients, brivanib could be added to the palette of treatments and may open other treatment options for the oncologists. The same is true for linifanib, which was compared with sorafenib in the LIGHT study.

The NI margin for the BRISK-FL study was set at δINF = 1.08, while for the LIGHT study δINF = 1.0491 was chosen. At the end of the study, NI could not be claimed for either of the two studies. In the BRISK-FL trial the observed HR was 1.06, with a two-sided 95% CI equal to (0.93, 1.22), and hence crossing the boundary of 1.08. For the LIGHT study, the results are HR = 1.046 with a two-sided 95% CI = (0.896, 1.221), again crossing the boundary now equal to 1.0491. In ►Fig. 1, the results of the two trials together with their chosen NI margins are displayed.

Three Classical Designs

When the NI design was introduced into clinical trials, it was sometimes wrongly referred to as an equivalence design. To fix ideas, we therefore briefly review the characteristics of the three classical designs in RCTs, but applied in the current context of comparing hazards.

• Superiority design: In a classical superiority test, the aim is to reject that the true HR is greater than δSUP = 1. To obtain a more clinically meaningful superiority test, δSUP = 1 could be replaced with a value smaller than 1. When 2p < 0.05 or 1p < 0.025 (in the correct direction), the new drug is claimed superior to the control drug. This is equivalent to the two-sided 95% (or one-sided 97.5%) CI not including δSUP. A superiority design is the most common design for comparisons against placebo.
• Equivalence design: This design is typically used to show bioequivalence of a generic drug to the corresponding commercial drug. For the equivalence test, two values defining an interval of clinical equivalence are needed. In ►Fig. 1, the interval (0.9, 1.11) could define clinical equivalence between the two drugs in terms of hazard rates. Significance at 0.05 is realized when the two-sided 95% CI of HR is completely inside that interval. This design is, however, not used for therapeutic trials since equivalence is rejected when the 95% CI crosses the left boundary of the interval, indicating that the HR is going in the good direction.
• Noninferiority design: For the NI design, one specifies that the NI margin δINF defines the interval of NI, which runs here from 0 to δINF. A (one-sided) significant result at 0.025 is obtained when the one-sided 97.5% (but also two-sided 95%) CI does not contain δINF. Note, however, that despite what the term “noninferior” might suggest, one can only show with a NI design that the experimental drug is not much less efficacious than the control drug.

A Critical Discussion of the Noninferiority Trial

NI designs have become increasingly popular in drug research in the last two decades, but have not penetrated all therapeutic areas equally. Indeed, NI designs are still


scarce in HCC trials. A critical reflection about the positive and negative aspects of NI designs is given below.

Reasons for Choosing a Noninferiority Design

The standard design for RCTs is still the superiority trial, but there may be reasons why this design is not to be preferred. Here are two main reasons:

• Placebo-controlled trials are not acceptable. Since sorafenib has been shown to be an effective and established drug to treat HCC patients, placebo-controlled RCTs to test first line treatments in this indication are not ethically justifiable. The ASSENT II study9,10 was one of the first NI trials to test the efficacy of thrombolytics for treating acute myocardial infarct patients. The aim of the ASSENT II study was to test the efficacy of tenecteplase based on 30-day mortality rate. At the planning of the trial, alteplase (manufactured by the same drug company) was the standard drug and had shown its effect in different studies. Therefore, the ASSENT II study could not have placebo as control treatment.
• The experimental drug may not be more efficacious, but has other merits. Sorafenib is not the preferred treatment for all HCC patients. The search for new drugs that may improve efficacy over sorafenib is therefore still needed. It may not be possible to achieve higher overall efficacy than sorafenib, but the new drug may show greater efficacy than sorafenib for a subgroup of patients or exhibit fewer adverse events. Such a new drug may then open new therapeutic possibilities for the treating oncologist. Two examples from other therapeutic areas serve as an illustration. Celecoxib is an analgesic and nonsteroidal anti-inflammatory drug (NSAID) administered to osteoarthritis patients, but with increased risk for gastropathy. On the other hand, COX-2 inhibitors, such as etoricoxib, have been shown in RCTs to have similar efficacy to NSAIDs in the treatment of osteoarthritis pain, but with fewer gastrointestinal adverse effects. A NI design to compare etoricoxib with celecoxib was therefore conducted and described by Bingham et al.11 The motivation to opt for a NI design in the ASSENT II study was that tenecteplase is administered as a single bolus, while the standard treatment with alteplase needs (90 minutes) infusion. While a faster administration of the thrombolytic can save lives, the drug company did not expect a priori much greater efficacy than the standard treatment and thus one opted for a NI design.

Other reasons for choosing a NI design in the absence of improved efficacy could be that the new drug is cheaper to produce and also cheaper for the patients, or the new drug may exhibit better compliance, etc. Note, however, that in the publications on the BRISK-FL study and the LIGHT study, no mention is made of additional benefits of the experimental drugs.

Determining the Noninferiority Boundary

The Achilles heel of the NI design is the choice of the margin δNI. The NI margin may be derived from pure clinical reasoning or may be obtained from a combination of clinical and statistical arguments. Choosing a NI margin using only clinical arguments may be tough, especially for OS in life-threatening diseases as with hepatocellular cancer since allowing increased mortality for the new drug may be difficult to justify. This approach is an example of a direct comparison between E and C.

One may motivate the choice of the NI margin using an indirect comparison with a (putative) placebo treatment. This approach has been used in the comparison of novel oral anticoagulants (NOACs) with warfarin for the treatment of patients suffering from atrial fibrillation. Indeed, several NI trials were recently conducted to compare the efficacy of these NOACs with warfarin in preventing (ischemic and hemorrhagic) stroke or systemic embolism, see, e.g., the ARISTOTLE study.12 These studies make use of the half rule. Loosely speaking, this rule defines the NI margin for the HR such that the efficacy of the new drug is likely to be greater than that of an imaginary placebo arm (called putative placebo arm). Results from past studies comparing the standard drug versus placebo (say via HRs) are used to define the NI margin. We now describe how this rule can be implemented in the BRISK-FL and LIGHT studies.

In the SHARP trial, the HR for OS of sorafenib versus placebo (S/P) equals 0.58 with two-sided 95% CI = (0.45, 0.74).1 Suppose that this is a stable result, i.e., that this HR is a good estimate of the true HR comparing sorafenib with placebo (say obtained from a meta-analysis). The half rule would then consist in halving the interval between 0.74 (upper bound of the 95% CI) and HR = 1 to yield 0.87. This value could be taken as the lower bound for the HR of sorafenib versus the experimental drug, i.e., (S/E). This results in a margin δNI = 1/0.87 = 1.15.

We note that no justification is given for the chosen NI margins in the BRISK-FL and the LIGHT studies. Finally, we note that δNI needs to be set in agreement with the regulatory agencies, if drug registration is aimed at.

Intention-To-Treat or Per-Protocol Analysis?

For a superiority trial, it is recommended (by regulatory agencies) to use the intention-to-treat (ITT) population as the basis for the primary analysis, called the ITT analysis. By definition, the ITT population consists of all randomized patients. In other words, once enrolled in the study the patient will be included in the ITT analysis. Protocol violators, patients that miss one or more visits, dropouts, patients randomized into the wrong group, etc. are analyzed according to the planned treatment. Hence, for a badly conducted RCT the ITT analysis will provide a conservative estimate of the treatment effect (E vs. C). Since the ITT analysis biases the treatment effect toward zero, such an analysis helps to prove NI. Hence, a poor conduct of the study helps in claiming NI.

Using the Per-Protocol (PP) population as basis for the primary analysis is not a solution. The PP population consists of a subset of patients excluding protocol violators, wrongly randomized patients for whatever reason, noncompliers, etc. Since restricting to this subset does not yield a result that is representative for the entire population, it is not clear that a PP analysis is the way to go. There is no good solution to this


problem, and that is why the recommended approach for a NI trial is to perform the ITT and PP analysis and hope that the two analyses will confirm each other.

Sample Size Calculations

In a superiority trial, the power and sample size depend on the assumed clinically important treatment effect δCLIN. In a NI trial, the power and sample size depend on the choice of the NI margin δNI. When δCLIN = δNI, the necessary sample size in the superiority and the NI trial are equal when the sample size in the NI trial is determined assuming that the two drugs have equal efficacy. However, because δCLIN is typically (much) larger than δNI, the calculated sample size for a NI trial is often much larger than that of a superiority trial.

Combining Noninferiority and Superiority in One Trial

Suppose that, at the analysis stage of a NI trial, the results look unexpectedly good in the sense that the two-sided 95% CI lies completely to the left of HR = 1. Hence, not only NI was established, but it seems that one can make the stronger conclusion of superiority. On the other hand, suppose one plans a superiority trial but the RCT fails to show superiority, under which conditions is a claim for NI justified?

Definitely, moving from NI to superiority or vice versa can only be done if included in the protocol of the study. On the other hand, from statistical theory we conclude that a NI claim after a superiority test requires a correction for multiple testing. In the other direction, i.e., claiming superiority after NI was obtained does not require a statistical penalty if applied to the same population (ITT or PP).13 This is due to the closed testing principle.14 When NI is tested first in the PP population and then superiority in the ITT population, again no penalty is applied, but the reverse order in testing requires an adjustment for multiplicity.

The Good, the Bad, and the Ugly

Drug research is a time-consuming activity that requires increasingly more resources due to more stringent regulations. In addition, it becomes increasingly difficult to improve upon the efficacy of current drugs. Hence, the times that only classical superiority trials were sufficient are over. A great variety of new designs have been proposed lately. NI trials have entered the clinical trial arena to test new agents that could be attractive to the patient by being cheaper, easier to administer, exhibiting fewer adverse events, etc. with about the same efficacy. NI designs thus filled a definite need, which constitutes “the good” aspect of NI studies.

The choice of the NI margin could be difficult to motivate. The choice of the NI margin in the BRISK-FL and the LIGHT trials was not justified. But, an appropriate choice of that margin is central in the NI design. Indeed, in contrast to the classical superiority test, the NI margin is an essential part of the NI significance test. Without a motivated choice of the NI margin or NI margins that change from study to study, unjustified conclusions are lurking. This is definitely the “bad” aspect of NI studies.

Finally, as with superiority trials, statistical tests or conclusions can be twisted such that one “shows” what one wishes to show. This is even more the case for NI studies. This is the “ugly” part of NI studies. For instance, it is not uncommon to see publications where a nonsignificant result is a posteriori interpreted as “no difference between treatment groups” or is turned into a NI claim with a NI margin defined after having seen the data. This is done in the SARAH trial where HCC patients were randomized to either selective internal radiotherapy (SIRT) or sorafenib.15 This open-label study was designed as a superiority trial, but failed to show a significant difference between OS of the two treatments. The authors did not make an explicit NI claim, but interpreted (between the lines, as often happens) their nonsignificant result as “no difference.” Because the patients appear to tolerate SIRT better, the authors concluded that SIRT may have a lot to offer to future HCC patients. This conclusion was drawn despite the open-label character of the study, which made it easier for several patients randomized to SIRT to receive sorafenib. In addition, the waiting list for SIRT was considerably longer than with sorafenib. For those patients who urgently needed treatment, sorafenib was given, thereby reducing the difference in treatment effects when based on an ITT population as in the SARAH trial. Furthermore, despite these protocol violations, the observed overall median survival was slightly longer for sorafenib.

The fact that there is no preferred population (neither ITT nor PP) to report the results of a NI study also complicates the comparison of results obtained from different NI studies. The same is true when studies use different NI margins. Finally, we can imagine that several NI studies are planned sequentially in the future comparing first E1 with C, then E2 with E1, then E3 with E2, etc. Such a series of NI studies may end up in an experimental drug that is worse in efficacy than placebo. This is referred to as the biocreep phenomenon.

Some Guidelines

Le Henanff et al16 critically reviewed noninferiority and equivalence trials published between January 1, 2003, and December 31, 2004, and concluded that the reporting of such trials shows important deficiencies. Personal experience shows that authors still confuse nonsignificant results with noninferiority claims. Also, the computed p-values reported in a NI trial are often based on superiority tests. When noninferiority can be claimed, the NI p-value should show significance (p < 0.05). Nevertheless, it is not uncommon to see nonsignificant (superiority) p-values reported in that case. We refer the reader to the extended CONSORT guidelines for noninferiority and equivalence trials.17

Finally, here are some simple rules that allow the reader to verify the quality of a reported noninferiority trial:

• Look carefully at the definition of noninferiority. This is of crucial importance for the appreciation of the result.
• Check if the definition of noninferiority is well justified from a clinical viewpoint.
• When comparing different noninferiority studies, check that the definition of NI is the same.


• Check the conduct of the trial: All aspects which reduce the quality of the trial will help to “show” noninferiority.
• Noninferiority CANNOT be defined/claimed a posteriori.

References

1 Ribeiro de Souza A, Reig M, Bruix J. Systemic treatment for advanced hepatocellular carcinoma: the search of new agents to join sorafenib in the effective therapeutic armamentarium. Expert Opin Pharmacother 2016;17(14):1923–1936
2 Zhu AX, Park JO, Ryoo B-Y, et al; REACH Trial Investigators. Ramucirumab versus placebo as second-line treatment in patients with advanced hepatocellular carcinoma following first-line therapy with sorafenib (REACH): a randomised, double-blind, multicentre, phase 3 trial. Lancet Oncol 2015;16(07):859–870
3 Llovet JM, Decaens T, Raoul J-L, et al. Brivanib in patients with advanced hepatocellular carcinoma who were intolerant to sorafenib or for whom sorafenib failed: results from the randomized phase III BRISK-PS study. J Clin Oncol 2013;31(28):3509–3516
4 Zhu AX, Kudo M, Assenat E, et al. Effect of everolimus on survival in advanced hepatocellular carcinoma after failure of sorafenib: the EVOLVE-1 randomized clinical trial. JAMA 2014;312(01):57–67
5 Johnson PJ, Qin S, Park J-W, et al. Brivanib versus sorafenib as first-line therapy in patients with unresectable, advanced hepatocellular carcinoma: results from the randomized phase III BRISK-FL study. J Clin Oncol 2013;31(28):3517–3524
6 Cainap C, Qin S, Huang W-T, et al. Linifanib versus sorafenib in patients with advanced hepatocellular carcinoma: results of a randomized phase III trial. J Clin Oncol 2015;33(02):172–179
7 Kudo M, Finn RS, Qin S, et al. Lenvatinib versus sorafenib in first-line treatment of patients with unresectable hepatocellular carcinoma: a randomised phase 3 non-inferiority trial. Lancet 2018;391(10126):1163–1173
8 Spiegelhalter DJ, Freedman LS. A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion. Stat Med 1986;5(01):1–13
9 Van De Werf F, Adgey J, Ardissino D, et al; Assessment of the Safety and Efficacy of a New Thrombolytic (ASSENT-2) Investigators. Single-bolus tenecteplase compared with front-loaded alteplase in acute myocardial infarction: the ASSENT-2 double-blind randomised trial. Lancet 1999;354(9180):716–722
10 Lesaffre E, Bluhmki E, Wang-Clow F, et al. The general concepts of an equivalence trial, applied to ASSENT-2, a large-scale mortality study comparing two fibrinolytic agents in acute myocardial infarction. Eur Heart J 2001;22(11):898–902
11 Bingham CO III, Sebba AI, Rubin BR, et al. Efficacy and safety of etoricoxib 30 mg and celecoxib 200 mg in the treatment of osteoarthritis in two identically designed, randomized, placebo-controlled, non-inferiority studies. Rheumatology (Oxford) 2007;46(03):496–507
12 Granger CB, Alexander JH, McMurray JJV, et al; ARISTOTLE Committees and Investigators. Apixaban versus warfarin in patients with atrial fibrillation. N Engl J Med 2011;365(11):981–992
13 Moyé LA. Multiple Analyses in Clinical Trials: Fundamentals for Investigators. New York: Springer; 2003
14 Lesaffre E. Use and misuse of the p-value. Bull NYU Hosp Jt Dis 2008;66(02):146–149
15 Vilgrain V, Pereira H, Assenat E, et al; SARAH Trial Group. Efficacy and safety of selective internal radiotherapy with yttrium-90 resin microspheres compared with sorafenib in locally advanced and inoperable hepatocellular carcinoma (SARAH): an open-label randomised controlled phase 3 trial. Lancet Oncol 2017;18(12):1624–1636
16 Le Henanff A, Giraudeau B, Baron G, Ravaud P. Quality of reporting of noninferiority and equivalence randomized trials. JAMA 2006;295(10):1147–1151
17 Piaggio G, Elbourne DR, Pocock SJ, Evans SJ, Altman DG; CONSORT Group. Reporting of noninferiority and equivalence randomized trials: extension of the CONSORT 2010 statement. JAMA 2012;308(24):2594–2604
