Matching of controls may lead to biased estimates of specificity in the evaluation of cancer screening tests

Journal of Clinical Epidemiology 66 (2013) 202–208

Hermann Brenner a,*, Lutz Altenhofen b, Sha Tao a
a Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, INF 581, D-69120 Heidelberg, Germany
b Central Research Institute of Ambulatory Health Care in Germany, Herbert-Lewin-Platz 3, D-10623 Berlin, Germany
Accepted 24 September 2012
Abstract

Objectives: In the evaluation of cancer screening tests, cancer-free controls are often matched to cancer cases on factors such as sex and age. We assessed the potential merits and pitfalls of such matching using an example from colorectal cancer (CRC) screening.

Study Design and Setting: We compared sex and age distribution of CRC cases and cancer-free people undergoing screening colonoscopy in Germany in 2006 and 2007. We assessed specificity by sex and age of two immunochemical fecal occult blood tests (iFOBTs) in a study among screening colonoscopy participants conducted in the same years, and we assessed the expected impact of matching by sex and age on the validity of specificity estimates at various cut points.

Results: In the screening colonoscopy program, the proportion of men and mean age were 59.6% and 68.6 years among 10,324 CRC patients compared with 45.6% and 64.7 years, respectively, among 997,490 cancer-free participants. The specificity of the iFOBTs was higher among women than among men and decreased with age. Matching of cancer-free controls by age and sex would have led to the underestimation of specificity at all cut points assessed.

Conclusion: In the evaluation of cancer screening tests, matching of controls may lead to biased estimates of specificity. © 2013 Elsevier Inc. All rights reserved.

Keywords: Bias; Early detection; Matching; Screening; Specificity; Statistical methods
Conflict of interest statement: None.
Competing interests: There are no competing interests.
Grant support: The German screening colonoscopy registry is funded by the National Association of Statutory Health Insurance Physicians and the National Health Insurance Fund. The BLITZ study was supported in part by the German Research Foundation (Deutsche Forschungsgemeinschaft) within the framework of a PhD program (Graduiertenkolleg 793) and by a grant from the German Federal Ministry of Education and Research (O1ESO72). The test kits were provided free of charge by the manufacturer. The sponsor did not have any role in this study.
* Corresponding author. Tel.: +49-6221-421301; fax: +49-6221-421302. E-mail address: h.brenner@dkfz.de (H. Brenner).
0895-4356/$ - see front matter © 2013 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.jclinepi.2012.09.008

What is new?

Matching of controls to the sex and age distribution of cancer cases is commonly used, and the degree of matching is often perceived to be a quality criterion in studies evaluating cancer early detection tests.

The authors show that, in contrast to widespread belief, "perfect matching" may in fact lead to biased estimates of specificity, and they illustrate the potential merits and pitfalls of matching using the example of colorectal cancer screening studies.

1. Introduction

In the "omics" era, research on novel early detection markers for cancer is blooming [1,2]. In the evaluation of cancer early detection markers, cancer patients are recruited for determining sensitivity. For the evaluation of specificity, noncancer controls are needed. Ideally, cancer-free controls should be recruited from the screening population, but, given the difficulty of verification of the absence of cancer by an (often rather invasive) gold standard method in screening populations, cancer-free convenience samples recruited in clinical settings are often used in practice. The suitability of controls is often judged by their comparability with cases with respect to the distribution of key sociodemographic factors, such as sex and age. In some studies, matching by these factors is used to ensure full comparability. For example, in a recent systematic review on the performance of blood-based tests for early detection of colorectal cancer (CRC) [3], information on sex and age of cases and controls was given in approximately half of the studies. While many studies used convenience samples of controls (such as patients with benign diseases or blood donors) that were on average considerably younger than cases, matching of controls by age or sex and age was explicitly reported in four studies [4–7].

However, for valid judgment of the specificity of cancer early detection markers, the controls should be representative of cancer-free people from the screening population, who might differ from the cases with respect to sex and age. For example, because of the strong rise of both cancer incidence and prevalence with age [8,9], cases would almost always be expected to be on average older than noncases among screening participants. Likewise, the strong sex differences in incidence and prevalence seen for many forms of cancer, such as lung, stomach, and CRC or skin melanoma, would be expected to result in major differences in the sex distribution of cases and cancer-free controls [8,9]. Unsurprisingly, age and sex were found to be strongly associated with findings of cancer and precancerous lesions in many cancer screening studies [10–12].

Implications, pros, and cons of matching as a tool to prevent confounding or enhance precision and power have been widely addressed in the context of assessing risk factor effects in cohort and case–control studies, respectively [13–15]. To our knowledge, no previous article has addressed the implications of matching in studies aiming to assess the performance characteristics of screening tests. In this article, we aim to assess potential implications of such matching. We illustrate by an empirical example from the setting of CRC screening that matching of controls to the sex and age distribution of cases might lead to biased estimates of specificity if the sex and age distribution of people in the screening population varies between those with and without the disease and specificity likewise varies according to sex and age. Our illustration is based on data from the German national screening colonoscopy program [16,17] and from a study evaluating two immunochemical fecal occult blood tests (iFOBTs) among participants of this program [18,19].

2. Methods

2.1. Databases

Data from the German national screening colonoscopy registry were used to assess the sex and age distribution of CRC cases and cancer-free participants of screening colonoscopy. Colonoscopy is the current gold standard for diagnosis of CRC. In Germany, screening colonoscopy has been offered as a primary screening examination for early detection and prevention of CRC since October 2002. Women and men are eligible for a first screening colonoscopy from the age of 55 years. If this first screening colonoscopy is conducted before 65 years of age, a second screening examination will be offered 10 years later. Certification to conduct screening colonoscopy is tightly regulated on the basis of extensive previous training and experience, and maintenance of certification is contingent on conducting at least 200 colonoscopies and 10 polypectomies per year that are subject to rigorous quality control. Histopathologic examination is performed decentrally by certified pathology laboratories.

Details on the national German screening colonoscopy registry have previously been reported [16,17]. Briefly, the results of all screening colonoscopies are reported on a standardized form by the physicians. Reporting is considered virtually complete as it is a prerequisite for reimbursement for colonoscopies by the health insurance funds. The registry includes only colonoscopies conducted as primary screening examinations (i.e., colonoscopies conducted for work-up of results from other tests, such as positive results of fecal occult blood tests (FOBTs), because of symptoms or for surveillance of previous findings, which are separately reimbursed as "therapeutic colonoscopies", are not included). Items reported include basic sociodemographic variables and information on findings at colonoscopy. The reporting forms are scanned, processed, and checked for completeness and plausibility using standardized algorithms at regional data centers before anonymized transfer to the national data center is performed. Approximately 3% of eligible people participate in screening colonoscopy each year, which translates to an expected participation rate of 25–30% during the 10-year time window foreseen for this screening option. For this analysis, we used data from 1,007,814 first-time screening colonoscopies in 2006 and 2007. This time window was chosen as it corresponds to the time window of data collection for the early detection marker evaluation study described in the following.

iFOBTs are increasingly recommended and used for CRC screening because of advantages in test performance and acceptance over traditional guaiac-based FOBTs [20–24]. Estimates of the specificity of two iFOBTs were derived from the BLITZ study, a study among participants of screening colonoscopy in Southern Germany which has been described in detail elsewhere [18,19,25–27]. Briefly, 1,785 participants were recruited in 20 gastroenterology practices between January 2006 and December 2007 according to a protocol approved by the ethics committees of the Medical Faculty Heidelberg of the University of Heidelberg and physicians' chambers of Baden-Württemberg, Rheinland-Pfalz, and Hessen. Patients were informed about the study at a preparatory visit in the practice, typically about 1 week before colonoscopy. They were asked to provide a stool sample before bowel preparation for colonoscopy, which was used for the evaluation of multiple stool-based early detection markers, including multiple qualitative [18,25] and quantitative iFOBTs [19,26,27] in a central laboratory. Results for two quantitative tests (RIDASCREEN Haemoglobin and RIDASCREEN Haemo-/Haptoglobin Complex; R-Biopharm AG, Darmstadt, Germany [28]) are used for illustration here. The lower detection limit and the cut point for positivity given by the manufacturer are 0.42 and 2 µg/g stool for the hemoglobin test and 0.38 and 2 µg/g stool for the hemo-/haptoglobin complex test, respectively. Furthermore, patients were asked to fill out a standardized questionnaire. Colonoscopy and histology reports were collected, and relevant data were extracted in a standardized manner. The latter was done independently by two trained investigators who were blinded with respect to test results, and potential discrepancies were resolved by consensus.

2.2. Statistical analyses

The sex and age distribution (categories were 55–59, 60–64, 65–69, 70–74, 75–79, and 80+ years) and mean ages of screening colonoscopy participants with and without CRC were derived from the national registry by descriptive statistics. Specificities of the iFOBTs by sex and age were derived from the BLITZ study. Among 1,785 participants enrolled in BLITZ in 2006 and 2007, the following exclusions were made to ensure conditions of a screening setting and minimize potential misclassification because of imperfect colonoscopy: visible rectal bleeding or previous positive FOBT result (n = 111), inflammatory bowel disease (n = 13), previous colonoscopy in the past 5 years (n = 117), stool sampling after colonoscopy (n = 65), inadequate bowel preparation for colonoscopy (n = 79), and incomplete colonoscopy (n = 22). In addition, we excluded 48 patients with pseudopolyps or histologically undefined polyps. After further exclusion of 1 participant with missing information on age, 6 participants with missing iFOBT results, and 11 cases with CRC, 1,312 participants were retained for estimating specificities by sex and age. To ensure reasonably precise specificity estimates, three rather than six age categories were used (<60, 60–69, and 70+ years). Multiple logistic regression with test positivity as dependent variable and sex and age as independent variables was used to test for associations of both variables with specificity. Furthermore, specificities were also evaluated for a number of higher cut points (6, 10, 14 µg/g stool) besides the one recommended by the manufacturer (2 µg/g stool) because the latter yielded specificities for some subgroups that would typically be regarded as too low for population-based screening.

Finally, specificities were calculated for each cut point that would be expected with the following sampling strategies of controls:

1. The complete inclusion of all CRC-free screening colonoscopy participants meeting the criteria outlined in the preceding paragraph or selection of a random sample of these participants; we will refer to this strategy as "correct sampling" as specificity is derived from a sample that is expected to be representative of the CRC-free screening population.
2. The selection of a sample of CRC-free participants with matching to the sex and age distribution of CRC cases ("matched sampling").
3. "Convenience sampling" of controls without specific attention to their sex and age distribution; here, a range of possible values of specificity was derived from the range of specificity estimates for the various subgroups defined by sex and age. In addition, we provide an example of a specificity estimate expected from a relatively young convenience sample which was derived by including only men and women aged younger than 60 years in the control group.

3. Results

The sex and age distribution of participants of the German screening colonoscopy program in 2006 and 2007 with and without CRC is shown in Table 1. Forty-four percent of cancer cases were aged 70 years or older compared with 24% of participants without CRC. Conversely, only 12.7% of cancer patients were younger than 60 years of age compared with 28.6% of screening participants without CRC. These differences resulted in a mean age difference of almost 4 years (68.6 vs. 64.7 years, respectively). Also, the proportion of men was much higher (59.6%) among CRC cases than among participants without CRC (45.6%).

Table 1. Age and sex distribution of participants of screening colonoscopy with and without colorectal cancer

                      Colorectal cancer: yes    Colorectal cancer: no
Sex     Age (yr)           n         %               n         %
Men     55–59            767       7.4         119,878      12.0
        60–64          1,047      10.1         100,787      10.1
        65–69          1,694      16.4         120,485      12.1
        70–74          1,377      13.3          69,297       6.9
        75–79            862       8.3          33,203       3.3
        80+              404       3.9          11,481       1.2
        Total          6,151      59.6         455,131      45.6
        Mean (yr)       68.5                      65.0
Women   55–59            547       5.3         165,529      16.6
        60–64            665       6.4         118,013      11.8
        65–69          1,071      10.4         132,926      13.3
        70–74            871       8.4          73,760       7.4
        75–79            610       5.9          36,406       3.6
        80+              409       4.0          15,725       1.6
        Total          4,173      40.4         542,359      54.4
        Mean (yr)       68.9                      64.5
Total   55–59          1,314      12.7         285,407      28.6
        60–64          1,712      16.6         218,800      21.9
        65–69          2,765      26.8         253,411      25.4
        70–74          2,248      21.8         143,057      14.3
        75–79          1,472      14.3          69,609       7.0
        80+              813       7.9          27,206       2.7
        Total         10,324       100         997,490       100
        Mean (yr)       68.6                      64.7

German national screening colonoscopy registry, 2006–07.
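
The sampling strategies described in Section 2.2 differ only in the sex and age weights that are applied to the subgroup-specific specificities. The following Python sketch is an illustration added for clarity and is not the authors' code. It assumes that expected specificity under each strategy can be computed as a weighted average over the six sex and age subgroups, using the registry counts in Table 1 (collapsed to the three age groups used for specificity) and the hemoglobin-test specificities at the manufacturer's cut point in Table 2. All function and variable names are hypothetical.

```python
# Illustrative sketch (not the authors' code): expected specificity as a
# weighted average of sex- and age-specific specificities.

# Subgroups: (sex, age group); ages collapsed to <60, 60-69, 70+ years.
GROUPS = [("men", "<60"), ("men", "60-69"), ("men", "70+"),
          ("women", "<60"), ("women", "60-69"), ("women", "70+")]

# Table 1 counts, summed over the 5-year bands within each age group.
CRC_CASES = [767, 1047 + 1694, 1377 + 862 + 404,
             547, 665 + 1071, 871 + 610 + 409]
CRC_FREE = [119878, 100787 + 120485, 69297 + 33203 + 11481,
            165529, 118013 + 132926, 73760 + 36406 + 15725]

# Table 2: specificity (%) of the hemoglobin test at the manufacturer's cut point.
SPECIFICITY = [82.3, 81.3, 79.9, 91.5, 87.7, 83.3]

# Table 2: number of BLITZ participants per subgroup.
BLITZ_N = [215, 316, 130, 246, 309, 96]


def expected_specificity(specificities, weights):
    """Weighted mean of subgroup specificities (weights need not sum to 1)."""
    total = sum(weights)
    return sum(s * w for s, w in zip(specificities, weights)) / total


def relative_fpr_increase(true_spec, biased_spec):
    """Relative change (%) in the false-positive rate (100 - specificity)."""
    true_fpr = 100.0 - true_spec
    return 100.0 * ((100.0 - biased_spec) - true_fpr) / true_fpr


# "Correct sampling": weights follow the CRC-free screening population.
correct = expected_specificity(SPECIFICITY, CRC_FREE)
# "Matched sampling": weights follow the sex and age distribution of CRC cases.
matched = expected_specificity(SPECIFICITY, CRC_CASES)
# Young convenience sample: only BLITZ participants younger than 60 years.
young_weights = [n if age == "<60" else 0
                 for (_, age), n in zip(GROUPS, BLITZ_N)]
young = expected_specificity(SPECIFICITY, young_weights)

print(f"correct {correct:.1f}, matched {matched:.1f}, young {young:.1f}")
print(f"relative FPR increase by matching: "
      f"{relative_fpr_increase(correct, matched):.0f}%")
```

With these inputs, the sketch yields 84.8% for correct sampling, 83.0% for matched sampling, and 87.2% for the young convenience sample, and a relative increase in the false-positive rate of about 12% under matching. The same weighting could in principle be used to adjust subgroup estimates from any control sample to the sex and age distribution of the cancer-free screening population.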
Table 2. Specificity (%) of two immunologic fecal occult blood tests (RIDASCREEN Haemoglobin and RIDASCREEN Haemo-/Haptoglobin Complex)
by sex, age, and cut point of positivity
Test
Hemoglobin Hemo-/haptoglobin complex
Age (yr) Cut point (mg/g stool) Cut point (mg/g stool)
Sex Category Mean n 2 6 10 14 2 6 10 14
Men !60 55.8 215 82.3 92.6 91.5 96.3 86.5 94.9 96.7 96.7
60e69 64.6 316 81.3 86.7 87.7 90.2 83.2 92.4 95.9 97.8
70þ 73.6 130 79.9 80.0 83.3 86.9 81.5 90.0 92.3 92.3
Women !60 55.8 246 91.5 95.3 97.6 98.4 94.3 97.6 98.8 98.8
60e69 64.5 309 87.7 93.9 94.5 95.8 88.0 96.8 98.1 98.4
70þ 73.0 96 83.3 93.8 93.8 93.8 84.4 89.6 95.8 96.9
BLITZ study, Germany, 2006e07.
The specificity of the iFOBT by sex, age, and cut point correspond to a relative increase of the false-positive rate
of positivity among participants of the BLITZ study is by 12, 26, and 27%, respectively. Not caring about the
shown in Table 2. Overall, 1,312 participants without sex and age distribution in the selection of controls (conve-
CRC with a mean age of 63.0 years were included, of nience sampling) could result in under- or overestimation
whom 50.4% were males. Sex- and age-specific estimates of specificity. A range of values of specificity that might
of specificity are based on between 96 and 316 participants be obtained by convenience sampling are given by the
per subgroup. As expected, specificity increased with highest and lowest values of specificity observed in the
increasing cut points. For both iFOBTs and each cut point, subgroups defined by sex and age. Ranges were generally
specificity was generally lower in men than that in women rather wide, and specificities at a given cut point varied
and decreased with age between both sexes. Furthermore, by up to 15.3 percent units depending on the control group
both sex and age were independent significant predictors chosen. In the example of a relatively young convenience
of specificity in multiple logistic regression models with sample of controls (mean age, 55.8 years) which was
test result as dependent variable (P ! 0.05 in each case, obtained by restricting the analysis to men and women
data not shown). younger than 60 years of age, specificity would have been
As a result, matching of controls to the sex and age dis- overestimated at each cut point by between 2.4 and 4.5 per-
tribution of cases (matched sampling) would be expected to cent units with the hemoglobin test and between 0.6 and 4.1
lead to underestimation of specificity for all cut points as- percent units with the hemo-/haptoglobin complex test.
sessed. For example, specificities of 83.0, 88.4, and
89.6% of the hemoglobin test would be expected at cut
4. Discussion
points yielding true specificities of 84.8, 90.8, and 91.8%,
respectively (Table 3). This underestimation of specificity Our empirical example illustrates that matching of con-
in absolute terms by 1.8, 2.4, and 2.2 percent units would trols to the sex and age distribution of cases might lead to
Table 3. Expected specificity (%) of immunologic fecal occult blood test (RIDASCREEN Hemoglobin and RIDASCREEN Haemo-/Haptoglobin
Complex) according to the sampling of controls and cut points of positivity
Cut point (mg/g stool)
Test Sampling 2 6 10 14
a
Hemoglobin Correct sampling 84.8 90.8 91.8 93.8
Matched samplingb 83.0 88.4 89.6 91.8
Convenience sampling
Rangec 79.9e91.5 80.0e95.3 83.3e97.6 86.9e98.4
Exampled 87.2 94.4 96.3 97.4
Hemo-/haptoglobin complex Correct samplinga 86.6 94.0 96.6 97.2
Matched samplingb 84.6 92.5 95.5 96.9
Convenience sampling
Rangec 81.5e94.3 89.6e97.6 92.3e98.8 92.3e99.0
Exampled 90.7 96.3 97.8 97.8
German national screening colonoscopy registry and BLITZ study, Germany, 2006e07.
a
Sex and age distribution corresponds to that in carcinoma-free screening participants.
b
Matching to sex and age distribution of colorectal cancer cases.
c
Range of values observed in subgroups defined by sex and age (Table 2).
d
Example of a relatively young convenience sample (mean age, 55.8 years) that was derived by restricting the analysis to men and women
younger than 60 years of age.
biased estimates of the specificity of screening tests if the sex It should be noted, however, that the arguments outlined
and age distribution of cases and controls in the screening in this article refer to studies aiming to describe the sensi-
population vary and specificity likewise varies according to tivity and specificity of diagnostic tests in the screening set-
sex and age. The former condition is expected to commonly ting. In other contexts, preferences may be different. In
hold and has repeatedly been demonstrated in cancer screen- particular, there might be studies primarily aiming to assess
ing because of the strong increase in cancer incidence and to what extent a test, whose results vary by age and sex,
prevalence with age for almost all cancers and common var- might have any independent diagnostic value (beyond diag-
iation in cancer incidence and prevalence between men and nostic value mediated by its association with age and sex)
women [8e12]. The latter condition may also frequently to distinguish people with and without cancer. In such stud-
apply and was quite pronounced for the iFOBTs used for il- ies, matching for age and sex might be a method of choice
lustration in our analysis. Other common examples include indeed. Vice versa, the use of convenience samples of con-
measurements of pepsinogen and inflammatory markers in trols whose age and sex distribution strongly vary from that
peripheral blood, which have been suggested as screening of cases might inappropriately attribute diagnostic value to
markers for gastric and a variety of other cancers [29e31] tests whose results vary by age and sex independent of the
and which are well known to show a major variation by presence of cancer.
sex and age [32,33]. Therefore, the sex and age distribution Regarding the specific context of our empirical example,
of controls should reflect that of cancer-free screening partic- a few additional issues require further discussion. In many
ipants rather than that of cancer cases. studies on early detection markers of CRC, cases are re-
Ideally, the evaluation of cancer screening tests should cruited in the clinical setting rather than in a screening set-
be done in screening populations with direct sampling of ting, and the age distribution of those cases is likely to be
cases and controls from the subpopulations with and with- even further shifted toward higher ages compared with
out cancer. The German screening colonoscopy program CRC-free controls from the screening population. Accord-
provides such a setting for the evaluation of early detection ing to data from population-based cancer registries from
markers of CRC because subpopulations with and without Germany [39], the estimated proportion of patients diag-
CRC in the screening population can be reliably distin- nosed with CRC in Germany in 2006 and 2007 who were
guished by colonoscopy, which is considered the diagnostic 70 years or older at the time of diagnosis was 57.6%, that
gold standard in this context. Most studies aiming to eval- is, substantially higher than the corresponding proportion
uate cancer screening tests are conducted in different set- of 44% observed among CRC patients identified by screen-
tings, however, starting with the identification and ing colonoscopy (Table 1). On the other hand, cancer-free
sampling of cancer cases for which adequate controls are participants of screening colonoscopy in our empirical
then sought. Often, cases are identified in clinical settings example may be older on average than the typical cancer-
(e.g., after the diagnosis and before the start of therapy), free CRC screening population because screening colono-
and convenience samples are used as controls, such as scopy is offered from the age of 55 years only in Germany,
patients undergoing similar diagnostic procedures but whereas screening is recommended from the age of 50
found free of the cancer of interest or other healthy volun- years by expert panels [40e42], and screening by FOBT
teers, such as blood donors. Such convenience sampling is offered starting from the age of 50 years in Germany.
may sometimes lead to very large discrepancies in sex Therefore, the age gap between CRC cases and cancer-
and age distribution of cases and controls, far beyond those free controls may even be larger, and the potential bias
justified by true differences in the screening population. from matching by age may be more severe in other settings
For example, in the aforementioned review of studies on than in the one assessed in our empirical example. Further
blood-based tests for CRC early detection [3], the majority limitations to generalizability might arise from potential
of studies used healthy volunteers, patients with benign differences in self-selection of participants of screening
diseases, or a combination of both as controls. Where re- colonoscopy and users of other noninvasive options of pri-
ported, the mean age of controls was mostly much lower mary screening tests [43,44].
than that of CRC cases, with differences exceeding 20 years In our example, specificity was defined by the propor-
in several studies [34e38]. In such situations, matching of tion of negative tests among those without CRC. Ideally,
controls to the sex and age distribution of cases might have however, screening tests for CRC should also detect ad-
provided less biased estimates of specificity. Nevertheless, vanced adenomas, the precursors of CRC, and possibly
direct sampling from the cancer-free screening population, even other (nonadvanced) adenomas. It might therefore
which would avoid additional threats of validity that may be argued that specificity should be determined among
result from convenience sampling (a detailed discussion those free of advanced neoplasms (or even free of any neo-
of which is beyond the scope of this article) and which plasms) only. Repeating our analyses with these alternative
may typically yield some difference in sex and age distribu- definitions of specificity yielded substantially higher levels
tion of cases and controls, should be the preferred strategy of specificity at all cut points assessed, along with an even
to be applied whenever possible. slightly larger gap in mean age between CRC cases and
controls and a similar bias by matching of controls to the screening settings. Although extreme differences in such
sex and age distribution of cases. distribution may be indicative of the use of inappropriate
Specificity decreased with age and was lower among convenience samples, full agreement enforced by matching
men than among women for the specific tests evaluated is also often undesirable as it typically hinders controls
in our example, leading to underestimation of specificity from being representative of the cancer-free screening
by matched sampling and overestimation of specificity in population.
case of a relatively young convenience sample. The de- Ideally, controls should be a random sample of the
crease in specificity with age might be explained by an in- cancer-free screening population. In studies in which ran-
creasing risk of bleeding from other gastrointestinal lesions dom sampling of controls from the cancer-free study popu-
with older age. Likewise, lower specificity among men than lation is not possible, matching of controls to the age and
among women might result from higher prevalences of ul- sex distribution of the cancer-free screening population
cer or other gastrointestinal bleeding sources and higher (which can typically be closely approximated by the age
prevalences of aspirin use among men than among women and sex distribution of the total screening population) rather
[45]. Similar variation of specificity by sex and age would than matching to the age and sex distribution of cases might
be expected for other FOBTs. For other types of tests, inde- be the method of choice. In studies in which the age and sex
pendence of specificity from sex and age or even reverse distribution of controls differs from that of the cancer-free
patterns might also be conceivable. If sex and age are unre- screening participants, adjustment to the age and sex distri-
lated to specificity, then matching by these factors does not bution of the cancer-free screening population by appropri-
introduce bias, but it is also unnecessary as it then does not ate weighting of age- and sex-specific estimates of
have any impact on specificity estimates. specificity might be considered to prevent the type of pos-
In our study, performance of colonoscopy, which is con- sible bias outlined in this article.
sidered the gold standard for the detection of CRC, allowed
the distinction of screening participants with and without
CRC at high reliability. In many other screening settings, Acknowledgments
delineation of the cancer-free subpopulation may not be The authors acknowledge excellent contributions in the
as straightforward or not be possible at all. In such settings, conduction of the BLITZ study by Isabel Lerch, Sabrina
the age and sex distribution of the entire screening popula- Hundt, Ulrike Haug, and the cooperating gastroenterology
tion may often be a reasonable proxy for the age and sex practices. The authors are grateful to Labor Limbach
distribution of the cancer-free screening population, given (Heidelberg) for laboratory analyses of the iFOBT.
the low prevalence of cancer in most screening settings.
Our study has specific strengths and limitations. References
Strengths include the very large database of the German na-
[1] Bosch LJ, Carvalho B, Fijneman RJ, Jimenez CR, Pinedo HM, van
tional screening colonoscopy registry used to assess sex and
Engeland M, et al. Molecular tests for colorectal cancer screening.
age distribution of CRC cases and CRC-free controls and Clin Colorectal Cancer 2011;10:8e23.
the evaluation of the tests in a true screening setting. We [2] Luo X, Burwinkel B, Tao S, Brenner H. MicroRNA signaturesd
presented a detailed empirical illustration of variation of novel biomarker for colorectal cancer? Cancer Epidemiol Biomark
specificity by sex and age for two tests only. However, very Prev 2011;20:1272e86.
[3] Tao S, Hundt S, Haug U, Brenner H. Sensitivity estimates of blood
similar patterns were seen for both tests and in fact would
based tests for colorectal cancer detection: impact of overrepresenta-
be expected for FOBTs in general, the best-established noninvasive tests for CRC screening so far. In most previous evaluations of other CRC early detection markers, no stratification by sex and age was done, and for many of those markers, it is unknown to what extent their specificity may vary by sex and age.

Despite its limitations, our article illustrates the potential merits and pitfalls of matching of controls in the evaluation of cancer screening tests. Although matching of controls to the sex and age distribution of cases may help to avoid the potentially even larger bias introduced by convenience samples whose sex and age distribution may be even further from that of cancer-free people in the screening population, such matching may still result in biased estimates of specificity. Along the same lines, agreement of the sex and age distributions of cases and controls should not be regarded as a quality criterion of studies aiming to assess the sensitivity and specificity of tests in cancer screening.
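The direction of this bias can be illustrated with a small calculation. The following is a minimal sketch, not the analysis performed in the article: overall specificity is treated as a weighted average of stratum-specific specificities, with weights given by the sex distribution of the controls. The proportions of men (45.6% among cancer-free screening participants, 59.6% among CRC cases) are taken from the abstract. The stratum-specific specificities (0.93 for men, 0.96 for women) are hypothetical values chosen only to reflect the reported pattern of higher specificity among women, age is ignored for simplicity, and the function name is an assumption made for illustration.

```python
def marginal_specificity(spec_by_stratum, weights):
    """Overall specificity as a weighted average of stratum-specific values."""
    total = sum(weights.values())
    return sum(spec_by_stratum[s] * weights[s] for s in spec_by_stratum) / total


# HYPOTHETICAL stratum-specific specificities (not taken from the article);
# they only mimic the reported pattern of higher specificity among women.
spec_by_sex = {"men": 0.93, "women": 0.96}

# Sex distribution of cancer-free screening participants (abstract: 45.6% men).
screening_mix = {"men": 0.456, "women": 0.544}

# Sex distribution of controls matched to CRC cases (abstract: 59.6% men).
matched_mix = {"men": 0.596, "women": 0.404}

true_spec = marginal_specificity(spec_by_sex, screening_mix)
matched_spec = marginal_specificity(spec_by_sex, matched_mix)
bias = matched_spec - true_spec

print(f"Specificity in screening population: {true_spec:.4f}")
print(f"Specificity with matched controls:   {matched_spec:.4f}")
print(f"Bias due to matching:                {bias:+.4f}")
```

Under these assumed values, matched controls over-represent men, the stratum with lower specificity, so the matched estimate falls below the value that applies to the screening population. The size of the difference depends entirely on the assumed stratum-specific specificities and would be larger if age were also taken into account.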
[9] Forman D, Stockton D, Møller H, Quinn M, Babb P, De Angelis R, &product_class_two5Q29sb24gQ2FuY2VyIFByZXZlbnRpb245&,

et al. Cancer prevalence in the UK: results from the EUROPREVAL last accessed 3 January 2012.
Study. Ann Oncol 2003;14:648e54. [29] Shiotani A, Iishi H, Uedo N, Kumamoto M, Nakae Y, Ishiguro S,
[10] Segnan N, Senore C, Andreoni B, Aste H, Bonelli L, Crosta C, et al. et al. Histologic and serum risk markers for noncardia early gastric
Baseline findings of the Italian multicenter randomized controlled cancer. Int J Cancer 2005;115:463e9.
trial of ‘‘once only sigmoidoscopy’’dSCORE. J Natl Cancer Inst [30] Mukoubayashi C, Yanaoka K, Ohata H, Arii K, Tamai H, Oka M,
2002;94:1763e72. et al. Serum pepsinogen and gastric cancer screening. Intern Med
[11] Hocking WG, Hu P, Oken MM, Winslow SD, Kvale PA, Prorok PC, 2007;46:261e6.
et al. Lung cancer screening in the randomized Prostate, Lung, Colo- [31] Tao S, Haug U, Kuhn K, Brenner H. Comparison and combination of
rectal, and Ovarian (PLCO) Cancer Screening Trial. J Natl Cancer blood-based inflammatory markers with faecal occult blood tests
Inst 2010;102:722e31. for non-invasive colorectal cancer screening. Br J Cancer 2012;
[12] Maisonneuve P, Bagnardi V, Bellomi M, Spaggiari L, Pelosi G, 106:1424e30.
Rampinelli C, et al. Lung cancer risk prediction to select smokers [32] Sun LP, Gong YH, Wang L, Yuan Y. Serum pepsinogen levels and
for screening CTda model based on the Italian COSMOS trial. their influencing factors: a population-based study in 6990 Chinese
Cancer Prev Res 2011;4:1778e89. from North China. World J Gastroenterol 2007;13:6562e7.
[13] Kupper LL, Karon JM, Kleinbaum DG, Morgenstern H, Lewis DK. [33] Singh T, Newman AB. Inflammatory markers in population studies of
Matching in epidemiologic studies: validity and efficiency consider- aging. Ageing Res Rev 2011;10:319e29.
ations. Biometrics 1981;37:271e91. [34] van Kamp GJ, von Mensdorff-Pouilly S, Kenemans P, Verstraeten R,
[14] Costanza MC. Matching. Prev Med 1995;24:425e33. Yedema CA, Wobbes T, et al. Evaluation of colorectal
[15] St€urmer T, Brenner H. Degree of matching and gain in power and cancer-associated mucin CA M43 assay in serum. Clin Chem 1993;
efficiency in case-control studies. Epidemiology 2001;12:101e8. 39:1029e32.
[16] Brenner H, Hoffmeister M, Stegmaier C, Brenner G, Altenhofen L, [35] Huber K, Kirchheimer JC, Sedlmayer A, Bell C, Ermler D,
Haug U. Risk of progression of advanced adenomas to colorectal can- Binder BR. Clinical value of determination of urokinase-type plas-
cer by age and sex: estimates based on 840,149 screening colonos- minogen activator antigen in plasma for detection of colorectal
copies. Gut 2007;56:1585e9. cancer: comparison with circulating tumor-associated antigens
[17] Brenner H, Altenhofen L, Hoffmeister M. Sex, age and birth cohort CA 19-9 and carcinoembryonic antigen. Cancer Res 1993;53:
effects in colorectal neoplasms: a cohort analysis. Ann Intern Med 1788e93.
2010;152:697e703. [36] Hyodo I, Doi T, Endo H, Hosokawa Y, Nishikawa Y, Tanimizu M,
[18] Hundt S, Haug U, Brenner H. Comparative evaluation of immuno- et al. Clinical significance of plasma vascular endothelial growth fac-
chemical fecal occult blood tests for colorectal adenoma detection. tor in gastrointestinal cancer. Eur J Cancer 1998;34:2041e5.
Ann Intern Med 2009;150:162e9. [37] Douard R, Le Maire V, Wind P, Sales JP, Dumas F, Fayemendi L,
[19] Brenner H, Tao S, Haug U. Low-dose aspirin use and performance of et al. Carcinoembryonic gene member 2 mRNA expression as
immunochemical fecal occult blood tests. JAMA 2010;304:2513e20. a marker to detect circulating enterocytes in the blood of colorectal
[20] Guittet L, Bouvier V, Mariotte N, Vallee JP, Arsene D, Boutreux S, cancer patients. Surgery 2001;129:587e94.
et al. Comparison of a guaiac based and an immunochemical faecal [38] Douard R, Moutereau S, Serru V, Sales JP, Wind P, Cugnenc PH,
occult blood test in screening for colorectal cancer in a general aver- et al. Immunobead multiplex RT-PCR detection of carcinoembryonic
age risk population. Gut 2007;56:210e4. genes expressing cells in the blood of colorectal cancer patients. Clin
[21] van Rossum LG, van Rijn AF, Laheij RJ, van Oijen MG, Fockens P, Chem Lab Med 2005;43:127e32.
van Krieken HH, et al. Random comparison of guaiac and immuno- [39] Gesellschaft der epidemiologischen Krebsregister in Deutschland
chemical fecal occult blood tests for colorectal cancer in a screening e.V. (GEKID). GEKID-Atlas. http://www.gekid.de/, last accessed
population. Gastroenterology 2008;135:82e90. 2 January 2012.
[22] Park DI, Ryu S, Kim YH, Lee SH, Lee CK, Eun CS, et al. Comparison [40] U.S. Preventive Services Task Force. Screening for colorectal cancer:
of guaiac-based and quantitative immunochemical fecal occult blood U.S. preventive services task force recommendation statement. Ann
testing in a population at average risk undergoing colorectal cancer Intern Med 2008;149:627e37.
screening. Am J Gastroenterol 2010;105:2017e25. [41] Levin B, Lieberman DA, McFarland B, Andrews KS, Brooks D,
[23] van Dam L, Kuipers EJ, van Leerdam ME. Performance improve- Bond J, et al. Screening and surveillance for the early detection of
ments of stool-based screening tests. Best Pract Clin Gastroenterol colorectal cancer and adenomatous polyps, 2008: a joint guideline
2010;24:479e92. from the American Cancer Society, the US Multi-Society Task Force
[24] Duffy MJ, van Rossum LG, van Turenhout ST, Malminiemi O, on Colorectal Cancer, and the American College of Radiology. Gas-
Sturgeon C, Lamerz R, et al. Use of faecal markers in screening troenterology 2008;134:1570e95.
for colorectal neoplasia: a European group on tumor markers position [42] Schmiegel W, Reinacher-Schick A, Arnold D, Graeven U,
paper. Int J Cancer 2011;128:3e11. Heinemann V, Porschen R, et al. [Update S3 guidelines colorectal
[25] Brenner H, Haug U, Hundt S. Inter-test agreement and quantitative cancer 2008]. in German. Z Gastroenterol 2008;46:799e840.
cross-validation of immunochromatographical fecal occult blood [43] Powell AA, Burgess DJ, Vernon SW, Griffin JM, Grill JP,
tests. Int J Cancer 2010;127:1643e9. Noorbaloochi S, et al. Colorectal cancer screening mode preferences
[26] Brenner H, Haug U, Hundt S. Sex differences in performance of fecal among US veterans. Prev Med 2009;49:442e8.
occult blood testing. Am J Gastroenterol 2010;105:2457e64. [44] Stock C, Brenner H. Utilization of lower gastrointestinal endoscopy
[27] Haug U, Hundt S, Brenner H. Quantitative immunochemical fecal oc- and fecal occult blood test in 11 European countries: evidence from
cult blood testing for colorectal adenoma detection: evaluation in the the Survey of Health, Aging and Retirement in Europe (SHARE).
target population of screening and comparison with qualitative tests. Endoscopy 2010;42:546e56.
Am J Gastroenterol 2010;105:682e90. [45] Garcia Rodriguez LA, Hernandez-Diaz S. Risk of uncomplicated
[28] http://www.r-biopharm.com/product_site.php?product_range5Clinical peptic ulcer among users of aspirin and nonaspirin nonsteroidal
Diagnostics&product_class_one5R2FzdHJvZW50ZXJvbG9neQ55 anti-inflammatory drugs. Am J Epidemiol 2004;159:23e31.
