Abstract
Often, in epidemiologic research, classification of study participants with respect to the presence of a dichotomous condition (e.g.,
infection) is based on whether a quantitative measurement exceeds a specified cut point. The choice of a cut point involves a tradeoff
between sensitivity and specificity. When the classification is to be made for the purpose of estimating risk ratios (RRs) or odds ratios
(ORs), it might be argued that the best choice of cut point is one that maximizes the precision of estimates of the RRs or ORs. In this article,
two different approaches for estimating RRs and ORs are discussed. For each approach, formulae are derived that give the mean squared
error of the RR and OR estimates, for any choice of cut point. Based on these formulae, a cut point can be chosen that minimizes the
mean squared error of the estimate of interest. © 2003 Elsevier Inc. All rights reserved.
Keywords: Epidemiologic methods; Sensitivity and specificity; Diagnostic tests; Misclassification; Odds ratio; Risk ratio; Study design
cut point. However, the results have implications for choosing cut points when there is uncertainty about the sensitivity and specificity of the test at each cut point.

2. Statement of the problem

For linguistic simplicity, we will use the words “disease” to refer to the health condition of interest, and “exposure” to refer to a dichotomous risk factor. Let Z stand for the quantitative variable used to classify people with respect to the presence or absence of disease. The usefulness of Z for classifying people depends on the degree to which the distribution of Z among those with disease differs from the distribution of Z among those without disease. Fig. 1 illustrates hypothetical distributions of Z among the diseased and the nondiseased. In this illustration, the two distributions are both normal with standard deviation equal to 1, but with means that differ by three standard deviations. The horizontal axis is labeled with respect to the distance from the mean of Z in the nondiseased.

To classify people as diseased or nondiseased based on an observed value of Z, a cut point is chosen. If the value of Z for a person exceeds the cut point, then the person is classified as diseased (test result positive). Otherwise, the person is classified as nondiseased (test result negative). The two shaded areas in Fig. 1 illustrate the sensitivity (shaded area on the right) and specificity (shaded area on the left) that would result if a cut point of 2.0 were chosen.

How should we choose the value of the cut point when the classification is being made strictly for research purposes? We would argue that a cut point should be chosen that maximizes the precision of estimates of scientific interest. Here, we consider the cases in which the goal is to estimate RRs or ORs. More precisely, let $p_{D|E}$ and $p_{D|\bar{E}}$ stand for the probability of disease among the exposed and unexposed respectively. In these terms,

$$RR = \frac{p_{D|E}}{p_{D|\bar{E}}}$$

and

$$OR = \frac{p_{D|E}/(1-p_{D|E})}{p_{D|\bar{E}}/(1-p_{D|\bar{E}})}.$$

To quantify the precision of an estimate, it is common to use the mean squared error (MSE), the average squared distance between an estimate and the true value [2]. The MSE is equal to the variance of the estimate plus the square of the bias of the estimate. To determine the precision of estimates of the RR and the OR it is common and convenient to work on the log scale. Thus, the precision of an estimate of RR, say $\widehat{RR}$, can be quantified using

$$\mathrm{MSE} = E\left[(\log\widehat{RR} - \log RR)^2\right] = \mathrm{Var}(\log\widehat{RR}) + \left(\mathrm{bias}(\log\widehat{RR})\right)^2.$$

The analogous expression is used for the OR. Given an estimation approach, the problem reduces to finding the cut point that minimizes the MSE.
Fig. 1. Illustration of hypothetical distributions of Z among the diseased (right curve) and undiseased (left curve). The area in the shaded region to the
right of 2.0 represents the sensitivity that would result if 2.0 was chosen as the cut point. The area in the shaded region to the left of 2.0 represents
the specificity that would result.
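As a rough illustration of the setting in Fig. 1, the sketch below computes the sensitivity and specificity implied by a cut point when Z is normal with standard deviation 1 in both groups and the means differ by three standard deviations. The function name `sens_spec` and its parameter names are hypothetical and chosen only for this example.

```python
from statistics import NormalDist

def sens_spec(cut_point, mean_diseased=3.0, mean_nondiseased=0.0, sd=1.0):
    """Sensitivity and specificity when Z > cut_point is called test-positive."""
    diseased = NormalDist(mean_diseased, sd)
    nondiseased = NormalDist(mean_nondiseased, sd)
    sensitivity = 1.0 - diseased.cdf(cut_point)  # P(Z > cut | diseased)
    specificity = nondiseased.cdf(cut_point)     # P(Z <= cut | nondiseased)
    return sensitivity, specificity

sens, spec = sens_spec(2.0)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
# prints "sensitivity = 0.841, specificity = 0.977"
```

Raising the cut point in this sketch trades sensitivity for specificity, which is the tradeoff the choice of cut point has to resolve.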
958 L.S. Magder, A.D. Fix / Journal of Clinical Epidemiology 56 (2003) 956–962
3. Choosing a cut point when the RR or OR will be estimated in a standard manner

We assume independent observations are available from n study subjects. For each subject, information regarding the presence or absence of the exposure and the value of Z is known. Given such data, there are several approaches to estimating the RR and OR.

The standard approach would be to choose a cut point, classify the subjects with respect to disease based on this cut point, and estimate the RR or OR as if these classifications represented the true disease status of each person. To describe this approach more precisely, notation is provided in Table 1. This represents a hypothetical two-by-two table that can be constructed once a cut point for the quantitative measurement Z is chosen.

Using the notation in Table 1, let $\hat{p}_{T+|E} = a/n_E$ and $\hat{p}_{T+|\bar{E}} = c/n_{\bar{E}}$ denote the standard estimates of the probability of testing positive for the disease given exposure and nonexposure respectively. The standard approach to estimating the RR and the OR in this setting is to use

$$\widehat{RR}_{standard} = \frac{\hat{p}_{T+|E}}{\hat{p}_{T+|\bar{E}}}$$

and

To illustrate the results of applying these formulae, we consider the special case in which the distribution of Z among both the diseased and nondiseased is normal with the same variance but different means. Fig. 2a–d shows the MSE of the standard estimates calculated at a range of cut points under various scenarios. Fig. 2b and d are based on two distributions whose means are separated by three standard deviations, as illustrated in Fig. 1. The horizontal axes consist of an interval of possible cut points, labeled by their distance (in standard deviations) from the mean of the distribution of Z among the nondiseased, as in Fig. 1. The sensitivity and specificity corresponding to each possible cut point (calculated based on the normality assumptions) are given below the horizontal axis. The calculations are based on a sample size of 200 per group. It can be seen that in these scenarios, the optimum cut point occurs in a place of high specificity and moderate sensitivity.

The importance of high specificity in these scenarios is tied to the fact that the probability of disease in each group is relatively low. Given a low probability of disease and imperfect specificity, the number of true positives might be relatively low compared to the number of false positives. This will lead to relatively greater bias and variance.
and
Fig. 2. Asymptotic mean squared error of the log RRstandard (plots a and b), and log ORstandard (plots c and d) at a range of cut points, under various scenarios.
All plots assume 200 subjects per group. Plots (a) and (c) are based on the assumption that the distributions of the quantitative assessment have the same
standard deviation, but differ in their means by two standard deviations. Plots (b) and (d) assume they differ by three standard deviations. The probability
of disease in the unexposed was assumed to equal 0.1 (thick), or 0.3 (thin). The probability of disease in the exposed was set so that the RR = 2 (plots a
and b) or the OR = 2 (plots c and d). The numbers for the cut points on the horizontal axes refer to distances from the mean in the undiseased.
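The kind of calculation behind Fig. 2b can be sketched as follows, using expressions (1) to (4) of the Appendix under the normality assumptions described above. This is a minimal sketch, not the authors' code: the function name, the grid of cut points, and the chosen scenario (probability of disease 0.1 in the unexposed, RR = 2, 200 subjects per group, means three standard deviations apart) are assumptions made for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Z in the nondiseased; the diseased are shifted upward

def mse_log_rr_standard(cut, p_unexposed, rr, n_per_group, mean_shift):
    """Asymptotic MSE of log(RR_standard) at one cut point, via expressions (1)-(4)."""
    sens = 1.0 - STD_NORMAL.cdf(cut - mean_shift)
    spec = STD_NORMAL.cdf(cut)
    p_exposed = rr * p_unexposed
    pt_exposed = p_exposed * sens + (1 - p_exposed) * (1 - spec)        # (1)
    pt_unexposed = p_unexposed * sens + (1 - p_unexposed) * (1 - spec)  # (2)
    bias = math.log(pt_exposed / pt_unexposed) - math.log(rr)           # (3)
    variance = ((1 - pt_exposed) / (n_per_group * pt_exposed)
                + (1 - pt_unexposed) / (n_per_group * pt_unexposed))    # (4)
    return variance + bias ** 2

# Hypothetical grid of candidate cut points: 0.00, 0.05, ..., 4.00
cut_points = [i / 20 for i in range(81)]
best_cut = min(cut_points,
               key=lambda c: mse_log_rr_standard(c, 0.1, 2.0, 200, 3.0))
print(f"cut point with the smallest MSE on this grid: {best_cut:.2f}")
```

In this scenario the minimizing cut point sits well above the midpoint between the two means, in line with the observation that the optimum favors high specificity over high sensitivity.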
Using the same approach to derive $\hat{p}_{D|\bar{E}}$ results in the following adjusted estimates for the RR and OR:

$$\widehat{RR}_{adjusted} = \frac{\hat{p}_{D|E}}{\hat{p}_{D|\bar{E}}} = \frac{\hat{p}_{T+|E} - (1-\mathrm{spec})}{\hat{p}_{T+|\bar{E}} - (1-\mathrm{spec})}$$

and

$$\widehat{OR}_{adjusted} = \frac{\hat{p}_{D|E}/(1-\hat{p}_{D|E})}{\hat{p}_{D|\bar{E}}/(1-\hat{p}_{D|\bar{E}})} = \frac{\hat{p}_{T+|E} - (1-\mathrm{spec})}{\mathrm{sens} - \hat{p}_{T+|E}} \div \frac{\hat{p}_{T+|\bar{E}} - (1-\mathrm{spec})}{\mathrm{sens} - \hat{p}_{T+|\bar{E}}}.$$

For some data sets, these formulae can result in negative values for the estimate. In those cases, the parameter should be estimated with 0 or infinity depending on the situation. For example, if the denominator of the adjusted RR estimate, $\hat{p}_{T+|\bar{E}} - (1-\mathrm{spec})$, is less than 0, then there are fewer subjects testing positive than would be expected if all the unexposed subjects were truly nondiseased. In this case, the data are most consistent with no probability of disease in the unexposed, and the appropriate estimate of the RR is infinity.

Note that, if the specificity is 1.0, the adjusted estimate of the RR is equivalent to the standard estimate. This reflects the fact that when specificity is perfect, the standard estimate of the RR is asymptotically unbiased.

The asymptotic variances of these estimates are given by equations (7) and (8) in the appendix. These variances depend on the sensitivity and specificity of the test, the values of $p_{D|E}$ and $p_{D|\bar{E}}$, and the sample size in each group. Again, the validity of these formulae does not depend on assumptions about the normality of Z. Therefore, for given values of $p_{D|E}$ and $p_{D|\bar{E}}$ and sample size, a cut point can be chosen which results in the lowest variance. Because these estimates are asymptotically unbiased, their asymptotic MSE is equivalent to their asymptotic variance.

Fig. 3a–d shows the asymptotic MSE of the adjusted estimates calculated at a range of cut points under the same scenarios used for Fig. 2. Again, it can be seen that the optimal cut points occur for high values of specificity. Interestingly, despite the fact that these estimates are unbiased, the MSE of the adjusted estimate exceeds the MSE of the standard estimate for many cut points.
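The adjusted estimates and their boundary handling might be computed along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: `adjusted_estimates` and its helpers are hypothetical names, sens + spec is assumed to exceed 1, the estimated disease probabilities are truncated to [0, 1] as one way of producing the "0 or infinity" estimates described above, and returning NaN when both groups have an estimated disease probability of 0 is a convention chosen for this example only.

```python
import math

def _clip01(x):
    """Truncate an estimated probability to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)

def _ratio(num, den):
    """Divide, mapping x/0 to infinity for x > 0 and 0/0 to NaN (assumed convention)."""
    if den == 0.0:
        return math.inf if num > 0.0 else math.nan
    return num / den

def _odds(p):
    """Odds p/(1-p), with infinite odds when p is 1."""
    return math.inf if p >= 1.0 else p / (1.0 - p)

def adjusted_estimates(a, n_exposed, c, n_unexposed, sens, spec):
    """Return (RR_adjusted, OR_adjusted) from the test-positive counts a and c."""
    scale = sens + spec - 1.0  # assumed positive
    # Estimated probabilities of disease, corrected for misclassification
    pd_exposed = _clip01((a / n_exposed - (1.0 - spec)) / scale)
    pd_unexposed = _clip01((c / n_unexposed - (1.0 - spec)) / scale)
    rr = _ratio(pd_exposed, pd_unexposed)
    odds_ratio = _ratio(_odds(pd_exposed), _odds(pd_unexposed))
    return rr, odds_ratio

# Made-up counts: 40/200 exposed and 22/200 unexposed test positive
rr_hat, or_hat = adjusted_estimates(40, 200, 22, 200, sens=0.78, spec=0.98)
print(f"adjusted RR = {rr_hat:.3f}, adjusted OR = {or_hat:.3f}")

# Fewer positives among the unexposed (3/200) than false positives alone would produce
rr_boundary, or_boundary = adjusted_estimates(30, 200, 3, 200, sens=0.78, spec=0.98)
print(rr_boundary)
```

In the second call the unexposed test-positive proportion (0.015) is below 1 − spec (0.02), so the estimated probability of disease in the unexposed is set to 0 and the RR estimate is infinite, matching the case discussed in the text.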
Fig. 3. Asymptotic mean squared error of the log RRadj (plots a and b), and log ORadj (plots c and d) at a range of cut points, under various scenarios. All
plots assume 200 subjects per group. Plots (a) and (c) are based on the assumption that the distributions of the quantitative assessment have the same
standard deviation, but differ in their means by two standard deviations. Plots (b) and (d) assume they differ by three standard deviations. The probability
of disease in the unexposed was assumed to equal 0.1 (thick lines), or 0.3 (thin lines). The probability of disease in the exposed was set so that the RR = 2
(plots a and b) or the OR = 2 (plots c and d). The numbers for the cut points on the horizontal axes refer to distances from the mean in the undiseased.
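To see how the adjusted and standard estimates compare at individual cut points, the two asymptotic MSEs can be evaluated side by side. The sketch below does this for the RR using expressions (1) to (4) and (7) of the Appendix; the function name and the default scenario (probability of disease 0.1 in the unexposed, RR = 2, 200 subjects per group, means three standard deviations apart) are assumptions made for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def asymptotic_mse_log_rr(cut, p_unexposed=0.1, rr=2.0, n_per_group=200, mean_shift=3.0):
    """Return (MSE of log RR_standard, MSE of log RR_adjusted) at one cut point."""
    sens = 1.0 - STD_NORMAL.cdf(cut - mean_shift)
    spec = STD_NORMAL.cdf(cut)
    var_standard = 0.0
    var_adjusted = 0.0
    p_positive = []
    for p_disease in (rr * p_unexposed, p_unexposed):  # exposed, then unexposed
        p_pos = p_disease * sens + (1 - p_disease) * (1 - spec)     # (1), (2)
        p_positive.append(p_pos)
        var_standard += (1 - p_pos) / (n_per_group * p_pos)          # (4)
        var_adjusted += (p_pos * (1 - p_pos)
                         / ((p_pos + spec - 1) ** 2 * n_per_group))  # (7)
    bias = math.log(p_positive[0] / p_positive[1]) - math.log(rr)    # (3)
    # The adjusted estimate is asymptotically unbiased, so its MSE is its variance
    return var_standard + bias ** 2, var_adjusted

for cut in (1.0, 1.5, 2.0, 2.5, 3.0):
    mse_standard, mse_adjusted = asymptotic_mse_log_rr(cut)
    print(f"cut = {cut:.1f}: standard {mse_standard:.3f}, adjusted {mse_adjusted:.3f}")
```

Under these assumptions the adjusted MSE is the larger of the two at a cut point of 2.0, even though the adjusted estimate carries no asymptotic bias.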
for one of the enzyme-linked immunoassays designed to measure antibodies to the lytic phase glycoprotein K8.1. Table 2 also shows the MSE of estimates of the risk ratio under different scenarios, calculated using the formulae in the appendices of this article. It can be seen that if the standard estimate is used and the prevalence in the unexposed is 5%, using a cut point of 1.5 leads to a far more precise estimate than using a cut point of 0.8 (MSE = 0.16 compared to MSE = 0.33). A similar advantage of the larger cut point is seen when the prevalence is 20% and when the adjusted estimate is used.

Table 2
                                                     Optical density cut points
                                                     0.80      1.00      1.50
Sensitivity (a)                                      90%       85%       78%
Specificity (a)                                      83%       90%       98%
MSE (b) of log standard estimate of the risk ratio
  Assuming prevalence in unexposed = 5%              0.33      0.26      0.16
  Assuming prevalence in unexposed = 20%             0.114     0.072     0.038
MSE (b) of log adjusted estimate of the risk ratio
  Assuming prevalence in unexposed = 5%              0.79      0.55      0.26
  Assuming prevalence in unexposed = 20%             0.065     0.055     0.042
(a) From Engels et al. [6].
(b) Assuming 200 patients per group and a true risk ratio of 2.0.
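The entries of Table 2 can be approximated directly from the sensitivities and specificities listed there, using expressions (1) to (4) and (7) of the Appendix. The sketch below is illustrative only (the function and variable names are hypothetical); because it starts from the rounded percentages in the table, its output may differ slightly from the published values.

```python
import math

def mse_log_rr(sens, spec, p_unexposed, rr=2.0, n_per_group=200):
    """Return (MSE of log RR_standard, MSE of log RR_adjusted)."""
    p_exposed = rr * p_unexposed
    pt_e = p_exposed * sens + (1 - p_exposed) * (1 - spec)        # (1)
    pt_u = p_unexposed * sens + (1 - p_unexposed) * (1 - spec)    # (2)
    bias = math.log(pt_e / pt_u) - math.log(rr)                   # (3)
    var_standard = ((1 - pt_e) / (n_per_group * pt_e)
                    + (1 - pt_u) / (n_per_group * pt_u))          # (4)
    var_adjusted = (pt_e * (1 - pt_e) / ((pt_e + spec - 1) ** 2 * n_per_group)
                    + pt_u * (1 - pt_u) / ((pt_u + spec - 1) ** 2 * n_per_group))  # (7)
    return var_standard + bias ** 2, var_adjusted

# Optical density cut point -> (sensitivity, specificity), as listed in Table 2
assay = {0.80: (0.90, 0.83), 1.00: (0.85, 0.90), 1.50: (0.78, 0.98)}

for cut, (sens, spec) in assay.items():
    for prevalence in (0.05, 0.20):
        standard, adjusted = mse_log_rr(sens, spec, prevalence)
        print(f"cut {cut:.2f}, prevalence {prevalence:.0%}: "
              f"standard {standard:.3f}, adjusted {adjusted:.3f}")

mse_std_15, mse_adj_15 = mse_log_rr(0.78, 0.98, 0.05)
```

For the cut point of 1.5 with 5% prevalence in the unexposed, this gives roughly 0.16 for the standard estimate and 0.26 for the adjusted estimate.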
sensitivity and specificity of the diagnostic test at various cut points. For another thing, the formulae given in the Appendix provide only approximate MSEs, the quality of which depends on the sample size. However, decisions often have to be made in the presence of uncertainty, and the formulae and the graphs in this article can still provide guidance in choosing cut points. It is clear, for example, that for relatively rare outcomes, good precision in estimation requires high specificity, but is somewhat robust to departures from high sensitivity. This would suggest that one should choose relatively high cut points in this context.

Interestingly, we found that in certain settings, the MSEs of the standard estimates are lower than the MSEs of the adjusted estimates. This occurs because the standard estimates are biased towards 1.0, reducing the probability of getting extremely large or small estimates. In fact, it can be shown that the variances of the standard estimates are always lower than the variances of the adjusted estimates. Thus, for small sample sizes (where the MSE is predominantly determined by the variance), the MSE will be lower for the standard estimates than for the adjusted estimates.

However, the lower MSE of the standard method does not mean that it is preferable to the adjusted estimates in these settings. It can be argued based on the likelihood principle [7] that if the sensitivity and specificity are known, then the adjusted estimate of association is a more accurate representation of the information in the data with respect to the true value of the association. For example, consider a data set in which the observed number of positive tests in the exposed group is less than would be expected even if the true disease risk in the exposed were 0, given the imperfect specificity of the test. With such data it is arguable that the data are most consistent with an RR of 0. This is what the adjusted estimate would be, whereas the standard estimate would not equal 0. Thus, if the goal of the analysis is to report what the data say regarding the value of the association, it is best to use the adjusted estimate.

There is a third approach to estimating RRs and ORs in this context that obviates the need to choose a specific cut point. In brief, using methods described in Magder and Hughes [5], the exact value of Z can be used in risk assessment using a probabilistic approach. Thus, for example, those with a very high value of Z might be classified as having a higher probability of disease than those with a borderline value of Z. These probabilities can be incorporated into an algorithm to compute maximum likelihood estimates of risk ratios or odds ratios. To use this approach the sensitivity and specificity of the assay must be known (or assumed) for multiple cut points.

Many studies seek estimates of the association of exposure and disease, while controlling for potential confounders. The adjusted method described above can be extended to this context. A SAS macro is available on the Internet that extends logistic regression to adjust for imperfect sensitivity and specificity of a diagnostic test [8].

The methods described in this article are meant to be used when disease status is a true dichotomy (e.g., infected/uninfected). This should be distinguished from the situation when disease status is a matter of degree, and Z is a measure of the degree of disease (e.g., when the disease is obesity and Z is body-mass index). In the latter case, the use of a cut point is mainly for the purpose of providing simpler summaries of the data, and the notions of sensitivity and specificity do not apply. Considerations for choosing a cut point in that setting are discussed by Ragland [9].

The methods described in this article make the implicit assumption that the sensitivity and specificity of the diagnostic test are the same in both study groups. Although generally reasonable, this assumption can be relaxed by using different values for sensitivity and specificity in the terms of the formulae that relate to each study group.

Acknowledgments

This work was supported by research grant R0-1 AR 43727 of the National Institutes of Health.

Appendix

Formulae for the bias and variance of the standard estimates based on a given cut point.

Let $p_{D|E}$ and $p_{D|\bar{E}}$ stand for the probability of disease in the exposed and unexposed respectively. Similarly, let $p_{T+|E}$ and $p_{T+|\bar{E}}$ stand for the probability of testing positive in the exposed and unexposed respectively based on a given cut point, T. Then,

$$p_{T+|E} = p_{D|E}\,\mathrm{sens} + (1-p_{D|E})(1-\mathrm{spec}) \qquad (1)$$

and

$$p_{T+|\bar{E}} = p_{D|\bar{E}}\,\mathrm{sens} + (1-p_{D|\bar{E}})(1-\mathrm{spec}) \qquad (2)$$

where “sens” and “spec” refer to the sensitivity and specificity of the diagnostic test based on the chosen cut point. The standard estimate for the RR, $\hat{p}_{T+|E}/\hat{p}_{T+|\bar{E}}$, is an unbiased estimate of $p_{T+|E}/p_{T+|\bar{E}}$. Therefore,

$$\mathrm{bias}(\log\widehat{RR}_{standard}) = \log\left(\frac{p_{T+|E}}{p_{T+|\bar{E}}}\right) - \log\left(\frac{p_{D|E}}{p_{D|\bar{E}}}\right) \qquad (3)$$

Expression (3) can be rewritten in terms of sens, spec, $p_{D|E}$ and $p_{D|\bar{E}}$ by substituting for $p_{T+|E}$ and $p_{T+|\bar{E}}$ based on expressions (1) and (2).

Also, using the “delta” method [3],

$$\mathrm{var}(\log\widehat{RR}_{standard}) \approx \frac{1-p_{T+|E}}{n_E\,p_{T+|E}} + \frac{1-p_{T+|\bar{E}}}{n_{\bar{E}}\,p_{T+|\bar{E}}} \qquad (4)$$

where $n_E$ and $n_{\bar{E}}$ are the number of study subjects in the exposed and unexposed groups respectively. Again, expression
(4) can be written in terms of sens, spec, $p_{D|E}$ and $p_{D|\bar{E}}$ by substituting from expressions (1) and (2).

Using analogous reasoning,

$$\mathrm{bias}(\log\widehat{OR}_{standard}) = \log\left(\frac{p_{T+|E}/(1-p_{T+|E})}{p_{T+|\bar{E}}/(1-p_{T+|\bar{E}})}\right) - \log\left(\frac{p_{D|E}/(1-p_{D|E})}{p_{D|\bar{E}}/(1-p_{D|\bar{E}})}\right) \qquad (5)$$

and

$$\mathrm{var}(\log\widehat{OR}_{standard}) = \frac{1}{n_E\,p_{T+|E}} + \frac{1}{n_E(1-p_{T+|E})} + \frac{1}{n_{\bar{E}}\,p_{T+|\bar{E}}} + \frac{1}{n_{\bar{E}}(1-p_{T+|\bar{E}})} \qquad (6)$$

Formulae for the variance of the adjusted estimates based on a given cut point.

Using the delta method, the asymptotic variances of the adjusted estimates can be derived. These are as follows:

$$\mathrm{var}(\log\widehat{RR}_{adj}) = \frac{p_{T+|E}(1-p_{T+|E})}{(p_{T+|E}+\mathrm{spec}-1)^2\,n_E} + \frac{p_{T+|\bar{E}}(1-p_{T+|\bar{E}})}{(p_{T+|\bar{E}}+\mathrm{spec}-1)^2\,n_{\bar{E}}} \qquad (7)$$

and

$$\mathrm{var}(\log\widehat{OR}_{adj}) = \left(\frac{\mathrm{sens}+\mathrm{spec}-1}{(p_{T+|E}+\mathrm{spec}-1)(\mathrm{sens}-p_{T+|E})}\right)^2 \frac{p_{T+|E}(1-p_{T+|E})}{n_E} + \left(\frac{\mathrm{sens}+\mathrm{spec}-1}{(p_{T+|\bar{E}}+\mathrm{spec}-1)(\mathrm{sens}-p_{T+|\bar{E}})}\right)^2 \frac{p_{T+|\bar{E}}(1-p_{T+|\bar{E}})}{n_{\bar{E}}} \qquad (8)$$

References

[1] Sox HC Jr, Blatt MA, Higgins MC, Marton KI. Medical decision making. Boston, MA: Butterworth-Heinemann; 1988.
[2] Bickel PJ, Doksum KA. Mathematical statistics: basic ideas and selected topics. Oakland, CA: Holden-Day Inc.; 1977.
[3] Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis: theory and practice. Cambridge, MA: MIT Press; 1975.
[4] Copeland KT, Checkoway H, McMichael AJ, Holbrook RH. Bias due to misclassification in the estimation of relative risk. Am J Epidemiol 1977;105:488–95.
[5] Magder LS, Hughes JP. Logistic regression when the outcome is measured with uncertainty. Am J Epidemiol 1997;146:195–203.
[6] Engels EA, Whitby D, Goebel PB, Stossel A, Waters D, Pintus A, Contu L, Biggar RJ, Goedert JJ. Identifying human herpesvirus 8 infection: performance characteristics of serologic assays. J Acquir Immune Defic Syndr 2000;23:346–54.
[7] Royall RM. Statistical evidence. A likelihood paradigm. London: Chapman & Hall; 1997.
[8] Web site. http://medschool.umaryland.edu/departments/Epidemiology/software.html
[9] Ragland DR. Dichotomizing continuous outcome variables: Dependence of the magnitude of association and statistical power on the cutpoint. Epidemiology 1992;3:434–40.