Published Ahead of Print on October 14, 2008 as 10.1200/JCO.2008.18.3996
The latest version is at http://jco.ascopubs.org/cgi/doi/10.1200/JCO.2008.18.3996

JOURNAL OF CLINICAL ONCOLOGY EDITORIAL

Registries That Show Efficacy: Good, but Not Good Enough
Mark N. Levine and Jim A. Julian, Department of Oncology, McMaster University; Juravinski Cancer Centre, Hamilton,
Ontario, Canada

When readers of Journal of Clinical Oncology (JCO) scan the Table of Contents of an issue, they are likely to recognize designs of interventional research studies such as phase I trials, phase II trials, and randomized phase II and III trials. We suspect, however, that the reader has much less familiarity with research involving observational study designs, administrative databases, phase IV studies, or registries. We have also noticed that these latter terms are often used interchangeably. Our goal in the present commentary is not to provide a primer on epidemiology but to focus on registries and address such questions as “What are they?”, “What can they be used for?”, and “What are some of their limitations?”

In recent years, there has been a marked increase in the number of medical registries and in publications using registry data. The word registry is derived from the Latin word registrum, which means list or catalogue. A medical registry can be defined as a systematic collection of a set of health and demographic data for patients with specific health characteristics held in a defined database for a predefined purpose.1-3 Registries were first used to describe disease incidence and as a resource for epidemiologic research. Subsequently, with advances in computer technology making it possible to store large amounts of data concerning treatment of patients with medical disorders, registries have been used to describe patterns of clinical care that generate inferences on quality of care and even effectiveness of therapy. They have even been used to assess the safety of a drug after its approval by a regulatory agency following randomized trials (phase IV or postmarketing studies). Some registries contain data on all cases of a particular disease in a defined population such that the cases can be related to a population base. With this information, incidence rates can be calculated. Information on outcomes, such as remission and death, can be obtained if the patients are followed up routinely, either directly or through linkage with other registries or administrative databases (eg, cancer or death registries). There are also registries (such as hospital-based or disease-specific registries) that are not population based.

The intended purpose of a registry should always be prespecified and should define the necessary properties of the data to be collected. Different types of data sources are available.4-6 For example, administrative data are data that were originally collected for reasons other than research. Encounter data sets maintain a record of health care encounters and typically are maintained by payers to track reimbursement (eg, discharge database). Enrollment data allow the determination of a denominator population from which the encounter numerator is drawn (for example, census data).

The value of a registry greatly depends on the quality of its data. A primary concern is whether the data are accurate and complete. There can be many sources of data errors. Two examples are errors in programming and inaccurate transcription. To ensure the integrity of a registry’s data, a set of procedures should be in place both before data collection, to ensure the highest quality data at the time of collection, and after collection, to identify and correct sources of error.1,4 Examples of the former are a clear definition of data characteristics and study design and training of data collectors, and examples of the latter are data checks and site visits.

One example of a registry that has been used as a platform for a number of research publications in JCO is the Surveillance, Epidemiology, and End Results Program (SEER) database.7 The primary purpose of this registry is to provide information on cancer incidence and survival. It covers approximately 26% of the US population. Since January 2006, there have been 16 papers published in JCO that have involved research using SEER. SEER has also been used to study the cost of cancer care at a population level.8 This database is large and comprehensive, with information collected on tumor characteristics including stage, demographic data, surgical intervention, and whether radiation therapy was administered. SEER includes follow-up for vital status and cause of death. Data are collected primarily from an institutional source and not from physicians’ offices, so that information on chemotherapy and hormonal therapy is under-reported. A quality control program is conducted each year by the National Cancer Institute to evaluate the quality and completeness of the SEER data. The strengths and limitations of the SEER registry in general were discussed in a 2003 editorial in JCO by Linda Harlan,9 and the limitations of making inferences on therapy using this registry were also highlighted in more recent editorials.10,11

A suggested guide to the reader when reading a registry article is presented in Table 1. Answering the questions in Table 1 will help in critically appraising the quality of the article.

Table 1. Methodology Criteria for Critically Appraising the Quality of a Registry Study

Criteria: Are the study results generalizable to my patients?
Questions: What is the population base of the registry? Is it well described? Are the patients highly selected?

Criteria: Is the purpose of the registry clearly stated?
Questions: Can data from the registry answer the question being asked?

Criteria: Are the data in the registry of high quality?
Questions: Are procedures in place to ensure accuracy and completeness (eg, checks and audits)?

Criteria: Are the outcome measures reasonable?
Questions: Are objective criteria used? Is the assessment done in a blinded fashion? Is the assessment the same in groups being compared (potential for bias)?

Criteria: What is the patient follow-up?
Questions: Are there missing data? Is the loss to follow-up stated?

Criteria: Are groups in the registry being compared, and is this potentially problematic?
Questions: Do the types of patients participating in the comparison groups differ (potential for bias)? Does the information collected on the patients differ between groups (potential for bias)? Are there important factors that either have not been collected or have not been used in the analysis that can affect both group membership and outcome (confounder bias)? Is the analysis appropriate?

We will discuss some of the limitations of registries when they are used to compare interventions. First, the allocation of patients to the intervention is not random. Therefore, the intervention and comparison groups may differ in ways that affect the study outcome, potentially leading to biased overestimates of benefit (ie, selection bias). Second, follow-up is generally not as active or as standardized as in randomized trials; therefore, ascertainment of outcomes may be incomplete or inaccurate. Moreover, in some cases, outcomes may not be assessed by individuals who are blinded to the intervention allocation, leading to further potential for bias. In other cases, follow-up is conducted passively through linkages to administrative databases. Because the coding of events in these data sets is not primarily for research purposes (and because coding is tied to other incentives such as reimbursement), these methods to ascertain outcomes, although powerful, may be prone to misclassification of outcomes as a result of coding errors and variations in coding practices. Third, because data collection for registries is often more passive than data collection in randomized trials, missing data may be a greater potential problem for registries. Finally, although registries are typically more generalizable to real-world practice because of their observational design, entry into a registry may not be as strictly monitored as entry into randomized trials. This creates the potential for ineligible patients to enter the registry and may weaken the generalizability of findings obtained from analysis of registry data.

To minimize these threats to validity, analyses of registry data should carefully select a comparison group that minimizes selection bias, describe differences in important prognostic factors between intervention and comparison groups, and statistically adjust for these differences.12-15 It is important to note, however, that even the most sophisticated statistical methods cannot correct for differences in unmeasured or unknown confounding factors. In addition, estimates of effectiveness may be highly sensitive to the analytic method used.16,17 Assessment of outcomes should be performed by individuals blinded to allocation where possible. This is often achieved through linkage to administrative data but, as discussed earlier, comes with its own set of limitations. To address these limitations, critical data elements from administrative databases can be validated through chart review. Finally, to optimize the quality of registry data (eg, minimize missing data and ensure adherence to eligibility criteria), regular random audits of registry data can be performed. Three recent publications in JCO provide examples of the potential pitfalls that can occur when registry data are used to suggest that an intervention works.18-20

In a recent edition of JCO, Hillner et al18 reported the initial results from the National Oncologic PET Registry (NOPR). Considerable thought and effort went into the design, creation, and implementation of this registry, in which data are supplied by participating centers via the internet.21,22 The primary research goal was to assess the effect of positron emission tomography (PET) on referring physicians’ plans of intended patient management. Some procedures were implemented to ensure that the data were of high quality. There was a series of case report forms that were completed according to a schedule. Briefly, the referring physician requested the PET scan and completed a pre-PET form, which included the reason for the PET imaging, cancer type, performance status, and planned management if PET was not available. Patients were asked to provide consent to have their data included in the research database. Once the PET scan was completed, the PET report was uploaded to the database. The final step was the completion of a post-PET form by the referring physician. Fifteen priority areas for early assessment based on tumor type and indication were determined. The NOPR working group established a plan to analyze the data in three phases. The current report presents the first phase of the evaluation, in which all cancers are aggregated together across all indications.

Hillner et al18 should be congratulated on their remarkable achievement. It is less than 2 years since the registry opened, and results on 34,000 patients registered in the first year have already been analyzed and published. Overall, physicians changed their intended management in 36.5% of patients after PET. Whether the initial indication for PET was diagnosis, initial staging, restaging, or suspected recurrence, there was a major change in management of the patient in approximately one third of the cases.

At first pass, the large numbers and magnitude of the benefit associated with PET are impressive. However, an important question is, “How do the results apply to my patient in clinic?” In methodology jargon, this is referred to as generalizability. The results of the registry are presented aggregated by PET indication and not by tumor type, stage, or clinical scenario. We do not know what other imaging tests were performed (or not) before PET. There is no control group in the registry. So the question is, “PET changed management compared with what?” It is not known whether the same changes in management would have occurred with simpler, cheaper tests. This is an important question in many environments concerned about health care costs. In patients with planned biopsy before PET, biopsy was avoided in approximately 70%. The avoidance of a biopsy is important because it saves a patient from potential risk. However, it is possible that, in some cases, a biopsy would have been necessary and the PET-driven strategy may have been wrong. We do not know how often this occurred. Finally, in the NOPR, the referring physician indicated his or her intended change in plan, not what he or she actually did. There is abundant evidence from the literature that what physicians say they will do and what they actually do in practice are not always the same.23 Hence, it is possible that the 30% change in management is an overestimate of the true effect. Hillner et al plan to link their results with Centers for Medicare and Medicaid Services billing records to address this issue, and a more complete understanding of the meaning of these initial published data may then emerge.

In the accompanying editorial to the NOPR article, Larson24 describes the strengths of the study and also points out the limitation that the end point was based on intended patient management rather than actual management. He argues that the NOPR is a clinical trial and an alternative to the randomized trial, which he deems to be impractical in the setting of evaluating imaging technology.24 However, as discussed earlier, studies using observational data can be subject to unrecognized biases. Equipoise is the underpinning of the question addressed in any randomized trial: “I don’t know if PET improves patient management.” In the NOPR, it is possible that physicians who participated were already convinced that PET changes clinical management, and this could have influenced their selection of patients and their recording of change in management, resulting in a systematic overestimate of the magnitude of the benefit. With more than 20,000 PET patients enrolled onto the NOPR, there would have been sufficient patient numbers to carry out a series of carefully controlled trials in a number of indications. To underscore that randomized trials should be part of the process of technology assessment, the Ontario Ministry of Health and Long-Term Care is funding randomized trials and prospective cohort trials to introduce PET into that province.25,26

Two studies based on registries are published in this issue of JCO. The BRiTE registry collected data on 1,953 patients with metastatic colon cancer who started chemotherapy plus bevacizumab between February 2004 and July 2005.19 The primary objective of BRiTE was to collect information on adverse events, and a secondary objective was to describe the effectiveness of bevacizumab in terms of progression-free survival (PFS) and overall survival. The BRiTE population includes a broader range of patients than in the pivotal randomized trial.27 There was no formal assessment of effectiveness end points according to prespecified guidelines.

The article by Grothey et al19 is a hypothesis-generating, unplanned analysis of the BRiTE trial that compares two subgroups of patients who experienced progression of their colon cancer: patients who continued on bevacizumab and those who did not. The survival of the patients who continued on bevacizumab was statistically significantly longer compared with the survival of those who did not. The authors conclude that “continued VEGF inhibition with bevacizumab beyond progressive disease could play an important role.” We would like to highlight a few important points. Yes, the registry did include a broader range of patients (including the elderly) and chemotherapy regimens than in the pivotal randomized trial.27 However, the characteristics of the patient population at the time of disease progression are unknown. Because time of disease progression was the start time for the analysis, it is important to know whether the overall disease burden of the two cohorts was similar at this time. Tumor burden could be reflected by sites of metastases, number of metastases, performance status, prothrombin time, and albumin. These characteristics of the two cohorts at the time of disease progression are not compared, and we do not know whether they differ in terms of important known or unknown confounders. Although the time from first metastasis, when patients were enrolled onto the registry, to progressive disease was the same for both cohorts, the time from first diagnosis of colon cancer to first metastasis for the two cohorts should be described to provide reassurance that there is no underlying difference in the natural history of tumor growth between the two cohorts. The authors used a propensity score to adjust for potential imbalances between groups for important characteristics. However, there are limitations to this approach.17

Thus, on the basis of such a strong likelihood of bias from confounders, the results of the analysis can only be considered as exploratory in nature. The duration of bevacizumab therapy is an important question and can only be answered by a randomized trial, which has already been initiated (Southwest Oncology Group 0600); in this trial, patients experiencing progression on oxaliplatin-based chemotherapy plus bevacizumab in first-line therapy will be randomly assigned to either stopping bevacizumab or continuing it with second-line irinotecan-based chemotherapy.

The Monoclonal Antibody Erbitux in a European Pre-License (MABEL) study in this issue of JCO can also be considered a registry.20 In this study, 1,147 patients with metastatic colorectal cancer who had recently experienced progression on irinotecan and who expressed epidermal growth factor receptor received cetuximab plus irinotecan. The goals for the study were to provide safety and efficacy data on cetuximab plus irinotecan and to confirm the findings of the Bowel Oncology with Cetuximab Antibody (BOND) trial in a community setting. In the BOND trial, 329 patients with epidermal growth factor receptor-expressing metastatic colorectal cancer that had progressed within the previous 3 months on irinotecan were randomly assigned to cetuximab (n = 111) or cetuximab plus irinotecan (n = 218) in the same dose and schedule as before random assignment.28

In the study by Grothey et al,19 two groups within the BRiTE registry are compared. In contrast, in the study by Wilke et al,20 the MABEL cohort of patients is compared with patients in the BOND randomized trial.28 Is it reasonable to compare the results of this registry to those from a randomized trial?

The first question we ask is, “Are the two groups comparable?” The median age in MABEL was 62 years compared with 59 years in BOND. In both studies, approximately 64% of patients were male, and 80% had had two or more prior chemotherapy regimens. In the MABEL study, 1% of patients had a Karnofsky performance score less than 80 compared with 12% of patients in BOND. The most common dose of the irinotecan regimen was 180 mg/m² every 2 weeks in both MABEL and BOND. Thus, the answer to the question on the comparability of the two patient populations is probably yes.

Our next question is, “Is it reasonable to compare the results of registry data with those from a randomized trial?” The answer to this question depends on the outcome measure. The eligibility criteria for a randomized trial can be restrictive. Accordingly, regulatory agencies are interested in additional safety information on a new agent after registration. Thus, a phase IV study can provide information on drug safety in a larger number of patients and in a broader range of patients. In addition, unrecognized toxicities can sometimes be identified. The MABEL investigators used standard criteria to document toxicity. It is reassuring that, as in BOND, the most common irinotecan-related toxicity was diarrhea, the most common cetuximab-related toxicity was acne-like rash, and the rates of these toxicities in the two cohorts were similar. The answer to the question of whether it is reasonable to compare MABEL with BOND for the outcomes of safety is yes.

However, there are major limitations in performing cross-study comparisons for the end point of efficacy. The two studies assessed response and PFS using different time intervals; these outcomes were adjudicated by a committee blinded to treatment allocation in the BOND trial compared with the treating physician in MABEL. The median PFS in MABEL was 3.2 months compared with 4.1 months in the BOND trial. These two rates cannot be compared statistically but only in a qualitative sense. The MABEL investigators interpret the PFS results from the two studies to be similar. However, there is an almost 25% difference between the two rates. Given the short PFS of the patient population and the imprecision in measurement of progression, it would seem that comparison of efficacy results between the registry trial and the randomized trial is limited and not a worthwhile endeavor. If the PFS for MABEL had been 2.1 months, what would the investigators have concluded? Given this result, would they still recommend use of the irinotecan plus cetuximab combination?

Finally, in an unplanned retrospective analysis, severe infusion-related reactions were examined. The rate of such reactions was reduced in patients receiving antihistamine plus corticosteroid compared with patients who received antihistamine alone. The administration of corticosteroids was not controlled, and thus, the two patient groups could differ in potentially important confounders, potentially leading to a biased comparison.14 Nonetheless, this potentially important observation requires validation in a larger prospective cohort.

The randomized controlled trial provides the highest level of evidence for the comparison of alternative types of interventions in clinical practice.29 However, these trials often take a long time to complete, are expensive, and frequently include highly selected patient populations. For these reasons, there is constant interest in finding alternatives to randomized trials. Registries can provide useful information but should be performed according to high methodologic standards to ensure completeness and minimize bias. Care must be taken when they go beyond their original purpose and attempt to answer effectiveness questions. This point, made in a seminal article by David Byar2 more than a quarter of a century ago, is still important today.

AUTHORS’ DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST
The author(s) indicated no potential conflicts of interest.

AUTHOR CONTRIBUTIONS
Manuscript writing: Mark N. Levine, Jim A. Julian
Final approval of manuscript: Mark N. Levine, Jim A. Julian

REFERENCES
1. Arts DGT, de Keizer NF, Scheffer GJ: Defining and improving data quality in medical registries: A literature review, case study, and generic framework. J Am Med Inform Assoc 9:600-611, 2002
2. Byar DP: Why data bases should not replace randomized clinical trials. Biometrics 36:337-342, 1980
3. Mack MJ: Clinical trials versus registries in coronary revascularization: Which are more relevant? Curr Opin Cardiol 22:524-528, 2007
4. Pronovost P, Angus DC: Using large-scale databases to measure outcomes in critical care. Crit Care Clin 15:615-631, 1999
5. Pronovost PJ, Berenholtz SM, Dorman T, et al: Evidence-based medicine in anesthesiology. Anesth Analg 92:787-794, 2001
6. Rubenfeld GD, Angus DC, Pinsky MR, et al: Outcomes research in critical care: Results of the American Thoracic Society Critical Care Assembly Workshop on Outcomes Research. Am J Respir Crit Care Med 160:358-367, 1999
7. National Cancer Institute: Surveillance Epidemiology and End Results (SEER). http://seer.cancer.gov/
8. Yabroff KR, Lamont EB, Mariotto A, et al: Cost of care for elderly cancer patients in the United States. J Natl Cancer Inst 100:630-641, 2008
9. Harlan LC, Hankey BF: The Surveillance, Epidemiology, and End-Results program database as a resource for conducting descriptive epidemiologic and clinical studies. J Clin Oncol 21:2232-2233, 2003
10. Cannistra SA: Gynecologic oncology or medical oncology: What’s in a name? J Clin Oncol 25:1157-1159, 2007
11. Schrag D: Enhancing cancer registry data to promote rational health system design. J Natl Cancer Inst 100:378-379, 2008
12. Feinstein A: The role of observational studies in the evaluation of therapy. Stat Med 3:341-345, 1984
13. Green S, Byar DP: Using observational data from registries to compare treatments: The fallacy of omnimetrics. Stat Med 3:361-373, 1984
14. Mamdani M, Sykora K, Li P, et al: Reader’s guide to critical appraisal of cohort studies: 2. Assessing potential for confounding. BMJ 330:960-962, 2005
15. Normand SL, Sykora K, Li P, et al: Readers guide to critical appraisal of cohort studies: 3. Analytical strategies to reduce confounding. BMJ 330:1021-1023, 2005
16. Stukel TA, Fisher ES, Wennberg DE, et al: Analysis of observational studies in the presence of treatment selection bias: Effects of invasive cardiac management on AMI survival using propensity score and instrumental variable methods. JAMA 297:278-285, 2007
17. D’Agostino RB Jr, D’Agostino RB Sr: Estimating treatment effects using observational data. JAMA 297:314-316, 2007
18. Hillner BE, Siegel BA, Liu D, et al: Impact of positron emission tomography/computed tomography and positron emission tomography (PET) alone on expected management of patients with cancer: Initial results from the National Oncologic PET Registry. J Clin Oncol 26:2155-2161, 2008
19. Grothey A, Sugrue MM, Purdie DM, et al: Bevacizumab beyond first progression is associated with prolonged overall survival in metastatic colorectal cancer: Results from a large observational cohort study (BRiTE). J Clin Oncol doi:10.1200/JCO.2008.16.3212
20. Wilke H, Glynne-Jones R, Thaler J, et al: Registrational study of cetuximab plus irinotecan in heavily pretreated metastatic colorectal cancer progressing on irinotecan: MABEL. J Clin Oncol doi:10.1200/JCO.2008.16.3758
21. Hillner BE, Liu D, Coleman RE, et al: The National Oncologic PET Registry (NOPR): Design and analysis plan. J Nucl Med 48:1901-1908, 2007
22. Lindsay MJ, Siegel BA, Tunis SR, et al: The National Oncologic PET Registry: Expanded Medicare coverage for PET under coverage with evidence development. AJR Am J Roentgenol 188:1109-1113, 2007
23. Siminoff LA, Fetting JH, Abeloff MD: Doctor-patient communication about breast cancer adjuvant therapy. J Clin Oncol 7:1192-1200, 1989
24. Larson SM: Practice-based evidence of the beneficial impact of positron emission tomography in clinical oncology. J Clin Oncol 26:2083-2084, 2008
25. Maziak D, Darling GE, Inculet RI, et al: A randomized controlled trial (RCT) of 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) versus conventional imaging (CI) in staging potentially resectable non-small cell lung cancer (NSCLC). J Clin Oncol 26:397s, 2008 (suppl, abstr 7502)
26. Pritchard KI, Julian J, McCready D, et al: A prospective study evaluating 18F-fluorodeoxyglucose (18FDG) positron emission tomography (PET) in the assessment of axillary nodal spread in women undergoing sentinel lymph node biopsy (SLNB) for breast cancer. J Clin Oncol 26:14s, 2008 (suppl, abstr 533)
27. Hurwitz H, Fehrenbacher L, Novotny W, et al: Bevacizumab plus irinotecan, fluorouracil, and leucovorin for metastatic colorectal cancer. N Engl J Med 350:2335-2342, 2004
28. Cunningham D, Humblet Y, Siena S, et al: Cetuximab monotherapy and cetuximab plus irinotecan in irinotecan-refractory metastatic colorectal cancer. N Engl J Med 351:337-345, 2004
29. Lachetti C, Guyatt G: Therapy and validity: Surprising results of randomized controlled trials, in Guyatt GH, Rennie D (eds): Users’ Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. Washington, DC, AMA Press, 2001, pp 247-265

Journal of Clinical Oncology, Vol 26, 2008. © 2008 by American Society of Clinical Oncology. DOI: 10.1200/JCO.2008.18.3996; published online ahead of print at www.jco.org on October 13, 2008.
