Professional Documents
Culture Documents
2 Books
2 Books
Data resource basics NCRAS, and either registered as a new cancer or linked to
an existing cancer registration as a new event. A wide
National Cancer Registration and Analysis Service
range of information is received and used to develop a rich
The National Cancer Registration and Analysis Service data resource over time. The data are accessed through the
(NCRAS), part of Public Health England (PHE), is the Cancer Analysis System (CAS), which creates a monthly
population-based cancer registry for England. It collects, qual- snapshot of data for analysis. A provisional quarterly snap-
ity assures and analyses data on all people living in England shot is generated to facilitate timely data release, and then
who are diagnosed with malignant and pre-malignant neo- a designated annual snapshot when each calendar year of
plasms, with national coverage since 1971. It produces the na- registrations is completed. The annual snapshot is used to
tional cancer registration dataset for England. The primary produce national statistics.
role of NCRAS is to provide near real-time, cost-effective,
comprehensive data collection and quality assurance over the
entire cancer care pathway. To achieve this, it receives data Protection of cancer registration data
from across the National Health Service (NHS). The provisions of Section 251 of the NHS Act 2006 pro-
There are around 300 000 malignant tumours1 diag- vide the legal basis to Public Health England for collecting
nosed each year in England, for which NCRAS receives patient-level data on cancer patients for specified purposes,
about 25 million records (Figure 1). These records are sub- without consent. This is reviewed annually by the
mitted by 162 health care providers, which incorporate Confidentiality Advisory Group of the Health Research
over 1700 multidisciplinary teams (MDTs). MDTs are Authority.2 Strict technical and contractual controls are
groups of professionals with expertise in different disci- put in place to prevent unauthorized access and use of the
plines, whose remit is to assess relevant information about data, with staff undergoing regular training on data protec-
the patient and their cancer and make diagnosis and treat- tion and information governance.
ment recommendations; they are fundamental to cancer
services in England.
Data collected
Registration model The data collected by NCRAS comes from several sources in-
NCRAS employs an event-based registration model. Data cluding: multidisciplinary team meetings, pathology reports,
on each primary tumour are sent from multiple sources to molecular testing results, treatment records, hospital activity
C The Author(s) 2019. Published by Oxford University Press on behalf of the International Epidemiological Association.
V 16
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/),
which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact
journals.permissions@oup.com
International Journal of Epidemiology, 2020, Vol. 49, No. 1 16a
8,000,000
7,000,000
6,000,000
5,000,000
Number of records
4,000,000
3,000,000
2,000,000
0
2010-4
2011-1
2011-2
2011-3
2011-4
2012-1
2012-2
2012-3
2012-4
2013-1
2013-2
2013-3
2013-4
2014-1
2014-2
2014-3
2014-4
2015-1
2015-2
2015-3
2015-4
2016-1
2016-2
2016-3
2016-4
2017-1
2017-2
2017-3
2017-4
Quarter
Figure 1. Number of records loaded (processed) by quarter by the National Cancer Registration and Analysis Service in ENCORE, as of December
2017.
records,3 hospital Patient Administration Systems, and opera- and covers clinical and pathological items. The data struc-
tional standards such as cancer waiting times.4 The cause and ture specifies the information to be sourced from a number
date of death for deceased patients are supplied by the Office of datasets, such as operational data on patient waiting
for National Statistics.5 The primary patient identifier is the times,4 and mortality data.5 The national cancer registra-
NHS number (a unique identifier used throughout the health tion dataset includes a subset of COSD. The data standard
care system in England), but date of birth, full name and ad- undergoes periodic review to reflect changing clinical prac-
dress are also used for patient identification and linkage. tice and incorporate new data sources.
Registration is framed at the tumour level. Data submit-
ted are reviewed by skilled cancer registration officers
(CROs) with the assistance of some automated tools for New data sources
data linkage and de-duplication of identical data sources.
The cancer registration dataset is updated to reflect new
Review includes manual extraction of data from text-
iterations of COSD and additional data feeds. As clinical
based pathology reports. CROs require detailed knowledge
practice continues to evolve, and more detailed questions
of cancer biology, coding and terminology. As cancer care
on cancer epidemiology and care postulated, new sources
is often delivered at different hospitals, information may
of relevant data are sought to both enrich and complement
relate to patient episodes at many organizations, which can
that collected by COSD. The most recent addition to the
introduce inconsistencies in the incoming data. The CRO
data collection undertaken by NCRAS is molecular data.
will examine all data sources and, if necessary, seek addi-
Figure 2 provides an overview of the current data flows.
tional source information from primary or secondary care
via correspondence or direct interrogation of hospital radi-
ology or electronic health records via remote secure access.
Quality assurance of cancer registration data
There are robust automated and manual quality control and
assurance checks embedded into the registration process,
The data standard for the national cancer both at the individual record level and for the set of registra-
registration dataset tions. As records are added to the system, input validation
In January 2013, the Cancer Outcomes and Services checks compare them with expected values and cross-
Dataset (COSD) was introduced as the new national stan- validate against other records, for example if treatment oc-
dard for reporting cancer in England.6 The COSD data curred after death. Quality control checks are then under-
structure is extensive, containing 489 items in version 8,7 taken during processing: each new registration is assessed by
16b International Journal of Epidemiology, 2020, Vol. 49, No. 1
Figure 2. Summary of the data flows from the point a patient was diagnosed with cancer to the analytical views of data in the Cancer Analysis System.
two CROs and any inconsistencies resolved. Finally, checks international indices of data quality, namely the proportion
across all finalized records are performed before the quar- of microscopically confirmed cases and death certificate-only
terly sign-off of data for release, to find unprocessed records. cases. English cancer registration data have been assessed for
Trend- and population-level checks are also performed. completeness and accuracy against randomized controlled tri-
Cancer registry data quality should be assessed in terms of als data,12 inpatient hospital data13 and simulations,14 with
completeness, validity, comparability and timeliness.8,9 There positive results. Validity, comparability and timeliness of the
are extensive published quantitative data which demonstrate national cancer registration data are tested and published
quality, for example the performance indicators published by alongside the national statistics of both cancer incidence and
the United Kingdom and Ireland Association of Cancer survival.15 Of note is that serious errors (for example an inva-
Registries10 and the Cancer Incidence in Five Continents lid date) have been found in fewer than 0.1% of registrations
(CI5) publications.11 The CI5 publications report systematic for more than 10 years.15
International Journal of Epidemiology, 2020, Vol. 49, No. 1 16c
The percentage of microscopically verified cancers is an A summary of the key data items available in the cancer
indicator of quality; for 2016, 85.3% of all malignant can- registration dataset for each cancer is in Table 1. Full
cers were microscopically verified.10 Cancers where the details can be found in the published data dictionary20
death certificate is the only information available indicate which is updated periodically. The cancer registration
a failure to capture or process relevant health services in- dataset comprises three main tables: the tumour table
formation, and a high percentage indicates incomplete holds information about each primary tumour diagnosed
population coverage. The number of registrations where in an individual; the patient table holds information about
death certification is the only information available is re- the individuals who are diagnosed with cancer; the treat-
duced by actively looking for additional supporting data. ment table holds the treatment received by the patient.
Whereas registrations based solely on death certificate in- Several data items in the tables are derived using algo-
formation represented more than 8% of cancers in the rithms and multiple data sources, for example the
1980s, due to the increasing availability of source docu- Charlson comorbidity index score21 derived using hospital
ments to cancer registration staff this has been reduced to admission data.3,22
Table 1. Summary of key data items available for each cancer registration
Patient identifier Tumour identifier Date of incidence Event identifier Date of death
NHS number Site, morphology and Basis of diagnosis Type of treatment event Full coded causes of death
behaviour of tumour (surgery/radiotherapy/ from death certificate
chemotherapy)
Date of birth Multifocal flag Route to diagnosis Date of event Coded underlying cause of
death
Sex Tumour size Health care provider at Treatment health care Location of death
initial contact provider
Ethnicity Stage: registry-derived Health care provider at Indicator for whether Post-mortem
stage at diagnosis and diagnosis patient in a clinical trial
other stages
Postcode at diagnosis Laterality Date of MDTa meeting Details of event (dependent
on type of event)
100
90
80
70
60
Completeness (%)
50
40
30
20
10
2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
Year of cancer diagnosis
Estrogen Receptor Status Ethnicity Performance status Stage at diagnosis Treatment intent
Unknown ethnicity includes 'Not stated' as selected by the individual. Performance status available for adults only. Estrogen receptor status for breast cancer only.
Figure 3. Trend in data completeness (%) from 2001 to 2016 of the recording of estrogen receptor status for breast cancer patients, self-reported eth-
nicity, the performance status at diagnosis for adults, the registry-derived stage at cancer diagnosis, and the intent of treatment.
International Journal of Epidemiology, 2020, Vol. 49, No. 1 16e
health care information be removed from the National duplicated from 2012 onwards. As an example, for years
Cancer Registration Dataset. Very few (<1 in 10 000) preceding this, around 1% of patients are flagged as dupli-
patients have opted out and therefore the impact is mini- cates (based on patient NHS number and date of birth). All
mal. Large numbers of opt-outs would impair the ability to duplicates are routinely excluded from analyses and data
produce meaningful statistics, particularly for rarer can- releases.
cers, so NCRAS is working with major cancer charities to
improve public and patient awareness of the value of
population-based cancer registration. Data resource access
Access to cancer registration data can be granted to indi-
viduals who demonstrate that either there is a justified pur-
Primary care information pose for the data release, and that there is an appropriate
Primary care information is available to link to cancer reg- legal basis with safeguards in place to protect the data, or
istration data for a subset of the population.40 It is not the data release is deemed to be anonymous (e.g. aggregate
CRO: cancer registration officer. These officers manu- diagnosis data [e.g. Tumour, Node, Metastasis
ally assess the multiple data sources submitted to NCRAS (TNM), FIGO]; cancer treatment data (e.g. procedure
to produce a cancer registration record for each diagnosis code); death information (e.g. date and cause of
of cancer. death); provider information (e.g. hospital of diagno-
ENCORE: English National Cancer Online sis, GP practice code); and geography at diagnosis
Registration Environment. This is a custom-built applica- (based on patient’s postcode of residence).
tion used by the cancer registration officers to process all • NCRAS data are accessible though applications to
data submitted by health care providers to produce a can- the Office for Data Release. They are also used to
cer registration record. enrich other data sources (such as the Clinical
ICD-10: International Statistical Classification of Practice Research Datalink) and generate national
Diseases and Related Health Problems, 10th Revision. statistics.
This is the international standard for classifying diseases
and is published by the World Health Organization.
11. Parkin DM, Ferlay J, Curado M-P et al. Fifty years of cancer inci- 18 cancers from 322 population-based registries in 71 countries.
dence: CI5 I–IX. Int J Cancer 2010;127:2918–27. Lancet 2018;391:1023–75.
12. Merriel SWD, Turner EL, Walsh E et al. Cross-sectional study 27. National Cancer Registration and Analysis Service, Office for
evaluating data quality of the National Cancer Registration and National Statistics. Cancer Survival by Stage at Diagnosis for
Analysis Service (NCRAS) prostate cancer registry data using the England (Experimental Statistics): Adults Diagnosed 2012, 2013
Cluster randomised trial of PSA testing for Prostate cancer and 2014 and Followed up to 2015. 2016. https://www.ons.gov.
(CAP). BMJ Open 2017;7:e015994. uk/peoplepopulationandcommunity/healthandsocialcare/condition
13. Møller H, Richards S, Hanchett N et al. Completeness of case sanddiseases/bulletins/cancersurvivalbystageatdiagnosisforenglan
ascertainment and survival time error in English cancer regis- dexperimentalstatistics/adultsdiagnosed20122013and2014and
tries: impact on 1-year survival estimates. Br J Cancer 2011; followedupto2015 (26 February 2018, date last accessed).
105:170–76. 28. National Cancer Registration and Analysis Service,
14. Woods LM, Coleman MP, Lawrence G, Rashbass J, Berrino F, Transforming Cancer Services Team for London. Segmented
Rachet B. Evidence against the proposition that “UK cancer sur- Analysis of Prostate Cancer Pathway from Referral to
vival statistics are misleading”: simulation study with National Treatment: 2013–2015. 2017. http://www.ncin.org.uk/view?
Cancer Registry data. BMJ 2011;342:d3399. rid¼3544 (26 February 2018, date last accessed).